Older blog entries for rodrigo (starting at number 59)

Netlink-based D-Bus

As stated in my last blog post, we have been looking at different options for optimizing D-Bus. After some internal discussion and a review of the feedback we got, we think the best solution is to take the best ideas from AF_DBUS, but without creating a new socket family, which wasn’t well received by the Linux kernel developers. That meant choosing an existing transport that lets us do this, so we decided on Netlink (an IPC mechanism for kernel to user-space communication).

Below is a detailed description of the architecture we are planning.

Netlink sockets
The Netlink protocol is a family of socket-based IPC mechanisms that can be used to communicate between the kernel and user-space processes, and between user-space processes themselves. It was created as a replacement for ioctl and as a way to receive messages sent by the kernel. It is a datagram-oriented service with both SOCK_RAW and SOCK_DGRAM as valid socket types. It is based on the Berkeley sockets API and uses the AF_NETLINK address family. Netlink supports different Netlink families such as NETLINK_ROUTE, NETLINK_FIREWALL and NETLINK_SELINUX, each of which is used to communicate with a specific kernel service.

Since Netlink is used as an IPC mechanism for processes (and the kernel) on the same machine, its address only has a port number that identifies each peer (nl_pid). Netlink supports both unicast and multicast communication, so a message can also be sent to a group (nl_groups), but only processes with uid 0 are allowed to send multicast messages from user-space. A Netlink address is represented using the sockaddr_nl data structure:

struct sockaddr_nl {
        __kernel_sa_family_t    nl_family;      /* AF_NETLINK   */
        unsigned short  nl_pad;         /* zero         */
        __u32           nl_pid;         /* port ID      */
        __u32           nl_groups;      /* multicast groups mask */
};
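As a minimal sketch of how such an address is usually filled in (the helper name nl_addr_init is made up for this example, it is not part of any library): a port ID of 0 addresses the kernel when used as a destination, and lets the kernel pick a port ID when used with bind().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: fill in a Netlink address.
 * port_id 0 means "the kernel" as a destination, or "let the kernel
 * choose" when the address is passed to bind().
 * groups is a bitmask of multicast groups (0 for plain unicast). */
static void nl_addr_init(struct sockaddr_nl *addr, uint32_t port_id,
			 uint32_t groups)
{
	assert(addr != NULL);

	memset(addr, 0, sizeof(*addr));	/* also clears nl_pad */
	addr->nl_family = AF_NETLINK;
	addr->nl_pid = port_id;
	addr->nl_groups = groups;
}
```

The resulting structure is what gets passed, cast to struct sockaddr *, to bind() or sendto().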

A Netlink message header consists of the fields:

struct nlmsghdr {
        __u32           nlmsg_len;      /* Length of message including header */
        __u16           nlmsg_type;     /* Message content */
        __u16           nlmsg_flags;    /* Additional flags */
        __u32           nlmsg_seq;      /* Sequence number */
        __u32           nlmsg_pid;      /* Sending process port ID */
};
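To make the header fields a bit more concrete, here is a rough sketch of writing one Netlink message (header plus payload) into a buffer using the standard NLMSG_* macros from linux/netlink.h. The function name nl_put_message is invented for this example, and the buffer is assumed to be 4-byte aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: write a Netlink header followed by payload into buf.
 * Returns the padded number of bytes used, or 0 if buf is too small. */
static size_t nl_put_message(void *buf, size_t buf_len, uint16_t type,
			     uint32_t seq, uint32_t port_id,
			     const void *payload, size_t payload_len)
{
	struct nlmsghdr *nlh = buf;
	size_t total = NLMSG_SPACE(payload_len);  /* header + payload, padded */

	assert(buf != NULL);
	if (total > buf_len)
		return 0;

	memset(buf, 0, total);
	nlh->nlmsg_len = NLMSG_LENGTH(payload_len);  /* unpadded length */
	nlh->nlmsg_type = type;
	nlh->nlmsg_flags = NLM_F_REQUEST;
	nlh->nlmsg_seq = seq;
	nlh->nlmsg_pid = port_id;
	if (payload_len > 0)
		memcpy(NLMSG_DATA(nlh), payload, payload_len);
	return total;
}
```

Note that nlmsg_len holds the unpadded length, while the next message in a buffer starts at the padded offset.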

The Netlink protocol is explained in detail here.

Generic Netlink subsystem
Every Netlink family is identified by an integer number that allows using different Netlink services. Currently there are 21 assigned Netlink families from a maximum of 32. To avoid a shortage of Netlink families the Generic Netlink subsystem was created.

The Generic Netlink subsystem can multiplex different communication channels on a single Netlink family, NETLINK_GENERIC. Besides simplifying Netlink usage, it allows communication channels to be registered at run-time without modifying core kernel code or headers.

The Generic Netlink subsystem is implemented as a service bus inside the kernel and users communicate with each other over it. The users can reside either in user-space or inside the kernel. The bus supports a number of Generic Netlink communication channels that are dynamically allocated by a Generic Netlink controller. This controller is itself a kernel Generic Netlink user that listens on a special pre-allocated Generic Netlink channel “nlctrl” (GENL_ID_CTRL), where users send requests to create, remove and learn about available channels.

Communication channels are uniquely identified by a channel number that is allocated by the Generic Netlink controller. Users that want to provide services over the Generic Netlink bus have to communicate with the Generic Netlink controller and ask it to create a new communication channel. Likewise, users that want to access those services have to query the Generic Netlink controller to find out whether the services are available and which channel number they are currently using.

Every channel is identified by a Generic Netlink family and defines a set of commands that users can trigger. Each command is associated with a function handler that gets executed when a user sends a message specifying this command.
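The query to the controller can be sketched as follows: a user-space client builds a CTRL_CMD_GETFAMILY request carrying the family name in a CTRL_ATTR_FAMILY_NAME attribute and sends it to GENL_ID_CTRL. This is only an illustration under some assumptions: genl_build_getfamily is a made-up helper name, the buffer is assumed to be 4-byte aligned, and the reply parsing (which would yield the channel number in CTRL_ATTR_FAMILY_ID) is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>

/* Hypothetical helper: build a CTRL_CMD_GETFAMILY request asking the
 * controller about the family called name.
 * Returns the message length, or 0 if the buffer is too small. */
static size_t genl_build_getfamily(void *buf, size_t buf_len,
				   const char *name, uint32_t seq)
{
	struct nlmsghdr *nlh = buf;
	struct genlmsghdr *gh;
	struct nlattr *na;
	size_t name_len, attr_len, total;

	assert(buf != NULL && name != NULL);
	name_len = strlen(name) + 1;		/* include the trailing NUL */
	assert(name_len <= GENL_NAMSIZ);
	attr_len = NLA_HDRLEN + name_len;
	total = NLMSG_LENGTH(GENL_HDRLEN) + NLA_ALIGN(attr_len);
	if (total > buf_len)
		return 0;

	memset(buf, 0, total);
	nlh->nlmsg_len = total;
	nlh->nlmsg_type = GENL_ID_CTRL;		/* the "nlctrl" channel */
	nlh->nlmsg_flags = NLM_F_REQUEST;
	nlh->nlmsg_seq = seq;

	gh = NLMSG_DATA(nlh);
	gh->cmd = CTRL_CMD_GETFAMILY;
	gh->version = 1;

	na = (struct nlattr *)((char *)gh + GENL_HDRLEN);
	na->nla_type = CTRL_ATTR_FAMILY_NAME;
	na->nla_len = attr_len;
	memcpy((char *)na + NLA_HDRLEN, name, name_len);
	return total;
}
```

For instance, genl_build_getfamily(buf, sizeof(buf), "dbus", 1) would prepare the lookup for a family named “dbus”, assuming such a family had been registered.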

A Generic Netlink message header consists of the fields:

struct genlmsghdr {
        __u8    cmd;
        __u8    version;
        __u16   reserved;
};

Generic Netlink uses the standard Netlink system as a transport, so its message format is defined as follows:

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                Netlink message header (nlmsghdr)              |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |           Generic Netlink message header (genlmsghdr)         |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |             Optional user specific message header             |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |           Optional Generic Netlink message payload            |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The family (communication channel) used is specified using the Netlink message header (nlmsghdr) type field (nlmsg_type).

Each Generic Netlink family can use a family specific header to be used by the service provided in that channel.
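Going the other way, the layering above can be peeled off a received buffer roughly like this. The struct genl_view type and the genl_parse function are names invented for this sketch, and whatever follows the Generic Netlink header (user specific header, attributes) is simply handed back as an opaque payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>

/* Hypothetical view of one received Generic Netlink message. */
struct genl_view {
	uint16_t family;	/* nlmsg_type: the channel number */
	uint8_t cmd;		/* command from the genlmsghdr */
	const void *payload;	/* user header and/or attributes */
	size_t payload_len;
};

/* Split a buffer holding one message into its layers.
 * Returns 0 on success, -1 if the buffer is too short or inconsistent. */
static int genl_parse(const void *buf, size_t len, struct genl_view *out)
{
	const struct nlmsghdr *nlh = buf;
	const struct genlmsghdr *gh;
	const size_t min_len = NLMSG_LENGTH(GENL_HDRLEN);

	assert(buf != NULL && out != NULL);
	if (len < min_len || nlh->nlmsg_len < min_len || nlh->nlmsg_len > len)
		return -1;

	gh = (const struct genlmsghdr *)((const char *)nlh + NLMSG_HDRLEN);
	out->family = nlh->nlmsg_type;
	out->cmd = gh->cmd;
	out->payload = (const char *)gh + GENL_HDRLEN;
	out->payload_len = nlh->nlmsg_len - min_len;
	return 0;
}
```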

D-Bus as a Generic Netlink service
D-Bus can be implemented as a Generic Netlink service by creating a new Generic Netlink family (communication channel) “dbus”. Applications will use this communication channel to send and receive D-Bus messages.

In this scenario, most of the work that is currently done by the dbus-daemon will take place in the D-Bus Netlink service: adding applications to the bus when they gain ownership of a name (NameAcquired signal), routing messages to their destination based on the application’s unique name, and maintaining match rules (AddMatch method).

The D-Bus daemon will be just a special user of the Generic Netlink D-Bus service, although it will still have some responsibilities, such as authentication and, of course, implementing the org.freedesktop.DBus service.

The other D-Bus users (apart from dbus-daemon itself) will work as they do now, using the D-Bus wire protocol on top of the Netlink transport, although they will have to perform some extra steps, as explained below.

The Generic Netlink D-Bus service will provide the following services to applications:

Mechanism to create and delete D-Bus buses: Since we need to separate the traffic for the different buses (system, user, etc) in the kernel module, we need a way for dbus-daemon instances to register buses in the kernel module. To do so, we can specify the D-Bus family commands DBUS_CMD_NEWBUS and DBUS_CMD_DELBUS. The process that creates the bus will be the D-Bus daemon implementing that bus, and all the messages that have org.freedesktop.DBus as destination will be routed to it.

Besides the commands, we have to define a way to specify the name and type of the bus to be added. We can either store that information in a user defined header or define Generic Netlink family attributes to pass that information to the D-Bus Generic Netlink service. In any case, the dbus-daemon will be responsible for choosing a unique name (in the form netlink:name=unique_name), so that the kernel doesn’t have to read any configuration at all, and just has to associate the unique addresses with each bus.

Another option would be to map a D-Bus bus to a multicast group and use the Generic Netlink controller CTRL_CMD_NEWMCAST_GRP and CTRL_CMD_DELMCAST_GRP commands. But we need more fine-grained control over the routing of the messages. We can’t just use genlmsg_multicast() and send the message to every application on the bus. A signal message sent to a bus is not received by all the applications, since AddMatch rules can prevent some applications from receiving the message. So, we have to maintain our own multicast groups based on match rules.

Connect and disconnect from buses: To allow applications to connect to a bus we can define another set of D-Bus family commands, DBUS_CMD_CONN_BUS and DBUS_CMD_DISC_BUS. When an application wants to connect to a bus, the bus type is checked first: if it is a session bus, then only processes running with the same uid as the D-Bus daemon are allowed. This restriction does not apply to the system bus, which allows connections from processes running as any user. Connection requests are routed to the D-Bus daemon, which does the authentication.

As in the bus create/delete case, we need to specify which bus we are trying to connect to. Again, we can either store that information in the user defined header or define a set of Generic Netlink family attributes.
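The bus type check described above boils down to something like the following sketch. All the names here (enum dbus_bus_type, struct dbus_bus, dbus_bus_may_connect) are hypothetical, since none of this is implemented yet, and in the real service the check would live in the kernel module rather than in user-space C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical bus types, mirroring the system/session distinction. */
enum dbus_bus_type {
	DBUS_BUS_TYPE_SYSTEM,
	DBUS_BUS_TYPE_SESSION,
};

/* Hypothetical record kept for each registered bus. */
struct dbus_bus {
	enum dbus_bus_type type;
	uid_t owner_uid;	/* uid of the dbus-daemon that created the bus */
};

/* Session buses only accept peers running as the daemon's user;
 * the system bus accepts connections from any user. */
static bool dbus_bus_may_connect(const struct dbus_bus *bus, uid_t peer_uid)
{
	assert(bus != NULL);

	if (bus->type == DBUS_BUS_TYPE_SESSION)
		return peer_uid == bus->owner_uid;
	return true;
}
```

A request that passes this check would then still be routed to the daemon for authentication.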

Transport to send and receive messages: Receiving messages is straightforward. You only have to create a socket with:

int sd = create_nl_socket(NETLINK_GENERIC, 0);

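and use the standard BSD socket API, such as recv(). Note that create_nl_socket() is not a system call but a small application-side helper, presumably wrapping socket() and bind().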

To send messages to the bus we have to define a Generic Netlink D-Bus family command DBUS_SEND_MSG and fill in the Netlink header, the Generic Netlink header and, if applicable, a D-Bus service specific header:

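/* Helper macros commonly defined by Generic Netlink user-space code;
 * they are not provided by the kernel headers. */
#define GENLMSG_DATA(glh)	((void *)((char *)NLMSG_DATA(glh) + GENL_HDRLEN))
#define NLA_DATA(na)		((void *)((char *)(na) + NLA_HDRLEN))

struct sockaddr_nl nladdr;
struct {
	struct nlmsghdr n;
	struct genlmsghdr g;
	char buf[256];
} req;
struct nlattr *na;
char *dbus_message;		/* marshalled D-Bus message to send */
size_t dbus_message_len;	/* its length in bytes; D-Bus messages are
				 * binary, so strlen() cannot be used here */
int ret;

/* Destination: the kernel (nl_pid 0), no multicast groups */
memset(&nladdr, 0, sizeof(nladdr));
nladdr.nl_family = AF_NETLINK;

/* Netlink and Generic Netlink headers */
memset(&req, 0, sizeof(req));
req.n.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN);
req.n.nlmsg_type = dbus_family_id;
req.n.nlmsg_flags = NLM_F_REQUEST;
req.g.cmd = DBUS_SEND_MSG;

/* The D-Bus message travels as a single attribute; the caller must make
 * sure NLA_HDRLEN + dbus_message_len fits in req.buf */
na = (struct nlattr *) GENLMSG_DATA(&req);
na->nla_type = DBUS_ATTR_PAYLOAD;
na->nla_len = NLA_HDRLEN + dbus_message_len;
memcpy(NLA_DATA(na), dbus_message, dbus_message_len);
req.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);

ret = sendto(sd, (char *)&req, req.n.nlmsg_len, 0, (struct sockaddr *) &nladdr,
	     sizeof(nladdr));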

All this is even easier when using libnl, a library that greatly simplifies the use of Netlink in user-space applications. This library is used in other system services, like NetworkManager, so adding a dependency on it to D-Bus shouldn’t be a problem.

The Generic Netlink D-Bus service will parse the D-Bus message, add the sender field and, in the case of a unicast message, route it to the correct destination. If the message is a signal, the service will build the recipient list according to the match rules.

Also, it will process the NameAcquired and NameLost signals as well as the AddMatch method calls, so that it can keep track of where the messages need to go to.
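To give an idea of what building the recipient list from match rules means, here is a deliberately simplified sketch. Real D-Bus match rules have more keys (type, sender, path, arguments, etc); this example only looks at interface and member, and every type and function name in it is invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified match rule: a NULL field matches anything (hypothetical). */
struct match_rule {
	const char *interface;
	const char *member;
};

/* The fields of a signal that the rules are compared against. */
struct signal_info {
	const char *interface;
	const char *member;
};

/* A peer on the bus: its Netlink port ID and the rules it registered. */
struct peer {
	uint32_t port_id;
	const struct match_rule *rules;
	size_t n_rules;
};

static bool field_matches(const char *want, const char *have)
{
	return want == NULL || strcmp(want, have) == 0;
}

/* A peer gets the signal if at least one of its rules matches. */
static bool peer_wants_signal(const struct peer *p,
			      const struct signal_info *sig)
{
	for (size_t i = 0; i < p->n_rules; i++) {
		if (field_matches(p->rules[i].interface, sig->interface) &&
		    field_matches(p->rules[i].member, sig->member))
			return true;
	}
	return false;
}

/* Collect the port IDs of all interested peers; returns how many. */
static size_t select_recipients(const struct peer *peers, size_t n_peers,
				const struct signal_info *sig,
				uint32_t *out, size_t out_cap)
{
	size_t n = 0;

	assert(peers != NULL && sig != NULL && out != NULL);
	for (size_t i = 0; i < n_peers && n < out_cap; i++) {
		if (peer_wants_signal(&peers[i], sig))
			out[n++] = peers[i].port_id;
	}
	return n;
}
```

A peer without any rule never receives signals in this sketch, which is why the service has to track AddMatch calls.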

Security framework: In the previous sections, authentication was mentioned as one of the responsibilities of the dbus-daemon itself. This is indeed what it does right now, but with the kernel Netlink module doing the routing based on user id, as explained above, maybe no authentication is needed on the dbus-daemon side. The question is whether the dbus-daemon should trust all that comes from the kernel or just do an extra check.

For some more fine-grained security, D-Bus services can use PolicyKit to prompt the user requesting the operation for extra authentication.

Support sending large messages: Some D-Bus users complain about bad performance when sending large chunks of data over D-Bus, which is the reason file descriptor passing was added to D-Bus. One can argue that those applications shouldn’t be sending that much data over the bus, and that it is the application’s responsibility, but the truth is that the problem exists.

Netlink provides the ability to send large messages by using multipart messages, so that the data to be sent can be sliced into chunks (no bigger than the kernel socket buffers’ size), resulting in better performance.
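The slicing could look roughly like this: every chunk becomes its own Netlink message flagged with NLM_F_MULTI, and a final NLMSG_DONE message tells the receiver that the sequence is complete. The function name nl_put_multipart is made up, the chunk size is left to the caller, and for simplicity all the parts are written to one (4-byte aligned) buffer instead of being sent one by one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: split data into NLM_F_MULTI messages of at most
 * chunk payload bytes each, followed by an NLMSG_DONE terminator.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t nl_put_multipart(void *buf, size_t buf_len, uint16_t type,
			       uint32_t seq, const void *data,
			       size_t data_len, size_t chunk)
{
	char *out = buf;
	const char *in = data;
	size_t used = 0, off = 0;
	struct nlmsghdr *nlh;

	assert(buf != NULL && chunk > 0);
	while (off < data_len) {
		size_t n = data_len - off < chunk ? data_len - off : chunk;

		if (used + NLMSG_SPACE(n) > buf_len)
			return 0;
		nlh = (struct nlmsghdr *)(out + used);
		memset(nlh, 0, NLMSG_SPACE(n));
		nlh->nlmsg_len = NLMSG_LENGTH(n);
		nlh->nlmsg_type = type;
		nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_MULTI;
		nlh->nlmsg_seq = seq;
		memcpy(NLMSG_DATA(nlh), in + off, n);
		used += NLMSG_SPACE(n);
		off += n;
	}

	/* Terminator: tells the receiver that no more parts follow */
	if (used + NLMSG_SPACE(0) > buf_len)
		return 0;
	nlh = (struct nlmsghdr *)(out + used);
	memset(nlh, 0, NLMSG_SPACE(0));
	nlh->nlmsg_len = NLMSG_LENGTH(0);
	nlh->nlmsg_type = NLMSG_DONE;
	nlh->nlmsg_flags = NLM_F_MULTI;
	nlh->nlmsg_seq = seq;
	return used + NLMSG_SPACE(0);
}
```

On the receiving side, the parts can be walked with the NLMSG_OK() and NLMSG_NEXT() macros until the NLMSG_DONE message shows up.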

Implementation details
All this needs a bigger change to libdbus/bindings than in our initial plan, since the Netlink messages, as explained before, carry an extra header that needs to be parsed before the real D-Bus message is processed.

So, for libdbus, we will implement DBusServerNetlink object for the implementation of a Netlink-based D-Bus server, and DBusTransportNetlink for the actual implementation of the wire protocol to be used when using Netlink as a transport. DBusTransportNetlink will be responsible for getting and parsing messages from the Netlink D-Bus kernel service.

For bindings, similar work will be needed to add support for reading and writing Netlink messages, but using the libnl library should make this easier, and anyway, it is part of our plan to add whatever code is needed to the most popular bindings.

And that’s all for now. Any comments/feedback are appreciated.

Syndicated 2012-03-20 15:34:10 from Rodrigo Moya

D-Bus optimizations II

As explained in my previous post, we are working on optimizing D-Bus for its usage on embedded systems (more precisely on GENIVI).

We started the discussion on the linux-netdev mailing list about getting our patch to add multicast on UNIX sockets accepted, but, unfortunately, the feedback hasn’t been very good. Since one of the premises from GENIVI is to get all the work we are doing accepted upstream, we have spent the last few days thinking about what else we could use to fix the main issue we have found in D-Bus performance, as stated in my previous post: the number of context switches needed when all traffic on the bus goes through dbus-daemon. So, here’s a summary of the stuff we have been looking at:

  • Use TIPC, a socket family already available in modern kernel versions and which is targeted at clustering environments, although it can be used for communications inside a single node.
  • Use ZeroMQ, which is a library that, from a first look, provides the stuff we need for D-Bus, that is multicast on local sockets.
  • Provide the multicast on UNIX sockets as a new socket family (AF_MCAST), although this wasn’t well received either in the linux-netdev discussion. This would contain a trimmed down version of AF_UNIX with only the stuff needed for doing multicast.
  • Extend the AF_DBUS work from Alban to include what we have learnt from the multicast AF_UNIX architecture. This would mean having a patch to the kernel that, as with the AF_MCAST solution, would have to be maintained by distributors, as the linux-netdev people didn’t like this solution either.
  • Use Netlink, which has all that we need for D-Bus, that is, multicast and unicast, plus it is an established IPC mechanism in the kernel (from kernel space to user space), and is even used for other services similar to D-Bus. We would create a new Netlink subfamily for D-Bus, that would contain code to do the routing, as Netlink, for security reasons, does not allow direct connection between user space apps.
  • Use KBUS, which is a lightweight messaging system, provided as a kernel module.

Right now, we have working code for AF_MCAST, and are looking at Netlink, TIPC and KBUS, so we will be blogging more details on what we find out in our experiments. But any feedback would be appreciated since, as stated before, we want to have all this work accepted upstream. So, comments, suggestions?

Syndicated 2012-03-07 18:08:22 from Rodrigo Moya

D-Bus optimizations

In the last month and a half, I have been working, as part of my work at Collabora, on optimizing D-Bus, which, even though it is a great piece of software, has some performance problems that affect its further adoption (especially on embedded devices).

Fortunately, we didn’t have to start from scratch, since this has been an ongoing project at Collabora, where previous research and upstream discussions had been taking place.

Based on this great work (by Alban Créquy and Ian Molton, BTW), we started our work, looking first at the possible solutions for the biggest problems (context switches, as all traffic in the bus goes through the D-Bus daemon, as well as multiple copies of messages in their trip from one peer, via the kernel, then to the daemon, to end up in the peer the message is targeted to), which were:

  • AF_DBUS work from Alban/Ian: while it improved the performance of the bus by a big margin, the solution wasn’t very well accepted in the upstream kernel mailing list, as it involved having lots of D-Bus-specific code in the kernel (all the routing).
  • Shared memory: this has no proof-of-concept code to look at, but was a (maybe) good idea, as it would mean peers in the bus would use shared memory segments to send messages to each other. But this would mean mostly a rewrite of most of the current D-Bus code, so maybe an option for the future, but not for the short term.
  • Using some sort of multicast IPC that would allow peers in the bus to send messages to each other without having all messages go through the daemon, which, as found out by several performance tests, is the biggest bottleneck in current D-Bus performance. We had a look at different options, one of them being AF_NETCAST, which mostly provides all that is needed, although it has some limitations, the biggest one being that it drops packets when the receiver queue is full, which is not an option for the D-Bus case.
    UDP/IP multicast has also been mentioned in some of the discussions, but this seems to be too much overhead for the D-Bus use case, as we would have to use eth0 or similar, since multicast on the loopback device doesn’t exist (hence no D-Bus in computers without a network card). Also, losing packets is another caveat of this solution, as is the lack of a message ordering guarantee.

So, the solution we have come up with is to implement multicast on UNIX sockets, and make it support what we need for it in D-Bus, and, of course, make use of that in the D-Bus implementation itself. So, here’s what we have right now (please note that this is still a work in progress):

The way this works is better seen on a diagram. First, how the current D-Bus architecture works:

[Diagram: current D-Bus architecture]

and how this would be changed:

[Diagram: proposed multicast architecture]

That is, when a peer wants to join a bus, it would connect to the daemon (exactly as it does today), authenticate, and, once the daemon knows the peer is authenticated, the daemon would join the accept()ed socket to the multicast group (this is important, as we don’t want peers to join the multicast group by themselves, so it’s the daemon’s job to do that). Once the peer has joined the multicast group, it would use socket filters to determine what traffic it wants to receive, so that it only gets, from the kernel, the messages it is really interested in. The daemon would do the same, just setting its filters so that it only gets traffic addressed to the bus itself (org.freedesktop.DBus well-known name).

In this multicast solution, we might have to prevent unauthorized eavesdropping, even though peers need to authenticate through the daemon to join the multicast group. For this, we have been thinking about using Linux Security Modules. It is still not 100% clear how this would be done, so more information on this soon.

The above-mentioned branches work right now, but as I said before, they are still a work in progress, so they still need several things before we can call this work finalized. For now, we have succeeded in making the daemon not get any traffic at all apart from what it really needs to get, so a big win there already as we are avoiding the expensive context switches, but the socket filters still need a lot of work, apart from other minor and not so minor things.

Right now, we are in the process of getting the kernel part accepted, and of finishing the D-Bus branch so that it is in an upstreamable form. Apart from that, we will provide patches for all the D-Bus bindings we know about (GLib, QtDBus, python, etc).

Comments/suggestions/ideas welcome.

Syndicated 2012-02-27 12:02:42 from Rodrigo Moya

New beginning

I guess it is time to announce that since yesterday I am working at Collabora, a UK-based company very well known for its work in several free software projects, like Telepathy, Farstream, GStreamer and others.

Haven’t had much time really to transition (and relax) from Canonical to Collabora, apart from last week, which I spent skiing, but hey, new year, new life, as we say in Spain, so the sooner you start with your new life, the better.

Syndicated 2012-01-03 15:17:10 from Rodrigo Moya

Leaving Canonical

Today marks the beginning of my last week at Canonical, where I’ve been working for the last 2.5 years. Because of the conflicts between the direction the company is heading in and my personal interests (GNOME), I have decided it is time for me to move on.

Since I am a positive person, I will just remember the good things of these 2.5 years, which have been, mainly, the nice people I’ve been working with, with a special mention to the Ubuntu Desktop team, composed of really great people. Also, some good projects I’ve worked on, like the Ubuntu One music store or the work at the Desktop team.

I can’t say publicly yet where I’ll be working next, but I’ll continue being around GNOME.

Syndicated 2011-12-19 13:28:35 from Rodrigo Moya

Fix PDFs hack

I recently bought a new ebook reader (Wolder miBuk ALFA 7.0 Color) because my previous one was very bad at reading comics. It looked really great in the shop, but as soon as I copied my entire e-book collection to a memory card and inserted it into the reader, I found its first problem: it doesn’t have the option to display the book collection by file name; instead it reads the PDF metadata and uses that. So, since lots of my books didn’t have correct metadata, it was very hard to find books in the library view.

But thanks to the help of Carlos García Campos (famous Evince/poppler hacker), I cooked up a patch for Poppler to add API to be able to set the metadata, and, right after that, wrote a very simple GTK program to allow me to “fix” my ebook collection.

The Poppler patch is still not ready to be pushed upstream (my fault, lack of time in the last couple of weeks, but will fix it soon), but posting this now just in case it is useful for someone.

Syndicated 2011-05-06 12:18:08 from Rodrigo Moya

Unofficial GNOME3 on Ubuntu PPA

A friend of mine was having problems with the GNOME3 packages in Ubuntu, and after some questioning, he told me he was using a PPA from this Launchpad team:

https://launchpad.net/ubuntugnome

The GNOME3 PPA for that team seems to be just a copy of the official GNOME3 PPA, but just in case, this is a public announcement to let people know that they shouldn’t use that PPA (unless they really want to, of course), but use the official one instead, which is at:

https://launchpad.net/~gnome3-team/+archive/gnome3

That is, the official team is the gnome3-team, so please make sure to check your sources.list if you really want to use the official one.

Syndicated 2011-04-11 11:56:34 from Rodrigo Moya

Internet hoaxes

As the number of my computer-illiterate friends that get an email address grows and grows, the number of mails containing hoaxes that I receive from them increases every day (things like “please forward this mail or the child would die”, “this music group helps financing a terrorist group”, “Mars will be as big in the sky as the moon”, etc). So, yesterday I got one about a restaurant charging 250€ instead of 2.50€ for the recipe of some cookies, giving the name of a real restaurant in Spain. Yolanda did a quick search for that restaurant and found a forum where people were complaining about that, and where one (clever) person pointed everyone to a page explaining the same hoax (word by word) for some restaurant in the US.

So yeah, a typical Internet hoax, you would say, but I’m blogging about it because I wondered yesterday what the purpose of these hoaxes is. Is it really just making fun of people? Sociological studies? Or using this as revenge against a restaurant/shop/etc? There are clear cases, where you are asked to keep all people in the CC when answering, which seem, to me, related to getting email addresses for spammers, but all these hoaxes where people are just asked to forward the mail to their friends, what’s the purpose of them?

Please answer quickly, as I couldn’t sleep last night because of this existential doubt :-D Another thing for further study would be how it is that so many people believe those hoaxes, but I’ll leave that for another time…

Syndicated 2011-03-24 11:01:28 from Rodrigo Moya

GNOME3 on Ubuntu

I already blogged about this some time ago, but since some people keep asking, I thought I’d give it more publicity.

So, in case you don’t know, next Ubuntu version won’t ship GNOME 3, but we have been working in the last few months on providing GNOME 3 packages for anyone interested in running GNOME 3 on Ubuntu. The packages are in the GNOME 3 PPA, and although it still doesn’t include everything GNOME 3ish, it includes the stuff that has changed the most, like the new control center, gnome-shell and other core desktop things and some applications. Thanks to Allan Day, here are some instructions on how to use a PPA.

It still misses lots of apps and some core desktop things, like gnome-session, but should be ready for daily usage (using it myself on my systems).

You can report any problem you find on the PPA via the GNOME 3 team mailing list or directly to me, as you like.

Syndicated 2011-03-03 13:16:26 from Rodrigo Moya

“GNOME 3” on Ubuntu

With the great work from Robert Ancell and Sebastien Bacher, who worked on packaging the new GLib/GTK3 stack, and with the recent packaging of a few GNOME 3 applications (eog, Nautilus, the new control center, …), you can start testing what will be GNOME 3 on Ubuntu (Natty) by using this PPA.

Please note that this is very much a work in progress, which means that, apart from the usual problems of running unstable software, it’s got the instability of newly added packages, so please USE WITH CARE. I would suggest using a virtual machine for testing this, but please test it and report any problems you might find. It seems to be running ok for me (on a virtual machine), but please don’t risk your every day desktop :-D

Syndicated 2010-11-12 11:15:45 from Rodrigo Moya
