Simple Multiplexing Protocol

Posted 14 Dec 2000 at 15:43 UTC by itamar

(The original version of this article is at http://itamarst.org/multiplex/.)

I'm trying to design a protocol that allows multiple channels to run over the same TCP connection. Why would we want to do that? It saves the time spent opening new connections, and it lets us have multiple requests to the server waiting for replies at once. Unlike pipelining, even if one of our requests returns a lot of data, we don't have to wait until all of it is sent - we can receive replies to other requests while that data is still arriving.

The sMUX protocol allows multiple channels of communication to be run over the same TCP connection, for client/server protocols. A channel allows two entities to send messages back and forth, and the receiver gets the complete message the sender sent. Therefore, unlike a regular protocol, we don't need to use \r\n to signify the end of a request, since the object that receives a message knows it got the complete contents of the message. For example, if the sender sent the message HELLO, the receiver will be notified that it got a message of length 5 with the content HELLO.

Protocol Definition

A message is made up of two parts - the channel ID, and the actual data of the message. The channel ID is two bytes chosen by the client. Each channel has a unique ID, so we can have 65536 channels. The message data is some number of bytes, the contents of which depend on the protocol.

To send a message we prepend the channel ID (two characters) to the message data, and the resulting string is encoded as a netstring and added to a sending queue. At the same time the messages in the queue are taken one by one and sent over the TCP connection.

What exactly is a netstring? To quote Dan Bernstein's article:

A netstring is a self-delimiting encoding of a string. Netstrings are very easy to generate and to parse. Any string may be encoded as a netstring; there are no restrictions on length or on allowed bytes. Another virtue of a netstring is that it declares the string size up front. Thus an application can check in advance whether it has enough space to store the entire string.

For example, the string hello world! is encoded as <31 32 3a 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 2c>, i.e., 12:hello world!,. The empty string is encoded as 0:,.
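Netstring encoding and decoding are easy to sketch in Python (the function names here are mine, not from Bernstein's specification):

```python
def netstring_encode(data):
    # <length in decimal ASCII> ":" <data> ","
    return str(len(data)).encode() + b":" + data + b","

def netstring_decode(s):
    # Parse one netstring from the front of s; return (payload, remainder).
    length_str, rest = s.split(b":", 1)
    length = int(length_str)
    if rest[length:length + 1] != b",":
        raise ValueError("malformed netstring")
    return rest[:length], rest[length + 1:]

# The examples from the quoted article:
assert netstring_encode(b"hello world!") == b"12:hello world!,"
assert netstring_encode(b"") == b"0:,"
assert netstring_decode(b"12:hello world!,") == (b"hello world!", b"")
```

Because the length is declared up front, the decoder also works on a buffer that holds more than one netstring: it consumes exactly one and returns the rest, which is what a stream parser needs.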

When the other side receives the message, it knows how much to read since the message is a netstring. Once the receiver has read in the whole message, it extracts the message data and channel ID from the netstring and hands the message to the object that is using that channel.

If the server receives a message with a channel ID that was not in use before, this implicitly means the client has opened a new channel.
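Putting the pieces together, a message on the wire is just the two-byte channel ID prepended to the data and wrapped in a netstring. A minimal sketch (the helper names are my own):

```python
def encode_message(channel_id, data):
    # Two-byte channel ID prepended to the data, wrapped in a netstring.
    assert len(channel_id) == 2
    payload = channel_id + data
    return str(len(payload)).encode() + b":" + payload + b","

def split_message(payload):
    # Inverse, given an already-decoded netstring payload:
    # first two bytes are the channel ID, the rest is the message data.
    return payload[:2], payload[2:]

assert encode_message(b"ab", b"INCREMENT 100") == b"15:abINCREMENT 100,"
assert split_message(b"abINCREMENT 100") == (b"ab", b"INCREMENT 100")
```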

Example

Let's look at an example - a counter application. Whenever a new channel is opened, the server creates a new counter with value 0. The client can then increment the counter, and the server will send back its new value. Let's look at an example session:

Client on channel ab sends: INCREMENT 100
Client on channel zx sends: INCREMENT 50
Client on channel ab sends: INCREMENT 20
Client on channel ab sends: VALUE?
Client on channel zx sends: VALUE?

Server on channel ab responds: 120
Server on channel zx responds: 50

So what was actually sent was:

15:abINCREMENT 100,14:zxINCREMENT 50,14:abINCREMENT
20,8:abVALUE?,8:zxVALUE?,

Notice that we do not know what order the server will respond in - perhaps ab's response will be returned first, perhaps zx's. So, the response from the server can be either:

5:ab120,4:zx50,

or, alternatively:

4:zx50,5:ab120,

Techniques

Sending large messages is not a good idea, since this will prevent other channels from sending messages while the large message is being sent. The solution is to break up the data into multiple pieces of a small size, e.g. 10kb. If there are more pieces of the data to be sent after this one, we prepend "M" to the data, and send that as the message. If this is the last piece, we prepend "S" to the data before sending it as the message.

The receiver, knowing this data being sent is broken up (because it just issued a download command for a file, for example) examines the data of the messages as it gets them. If the data starts with "M", it knows there are more pieces of data coming, but if it starts with "S" it knows this is the last piece of data.

Another benefit of this technique is that it can be used to send large amounts of data when the sender does not know how much data will be generated (e.g. when sending the results of a long-running program.)
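The chunking scheme above can be sketched as a pair of generators (the 10kb piece size and the helper names are my own choices):

```python
PIECE_SIZE = 10 * 1024

def chunk(data):
    # Yield message payloads: "M" + piece while more follow, "S" + last piece.
    pieces = [data[i:i + PIECE_SIZE]
              for i in range(0, len(data), PIECE_SIZE)] or [b""]
    for piece in pieces[:-1]:
        yield b"M" + piece
    yield b"S" + pieces[-1]

def reassemble(messages):
    # Join pieces until the "S" (last) piece arrives.
    out = b""
    for message in messages:
        out += message[1:]
        if message[:1] == b"S":
            break
    return out

data = bytes(25000)  # spans three pieces at 10kb each
assert reassemble(chunk(data)) == data
```

Note that the sender never needs to know the total size in advance - it just keeps emitting "M" pieces until the data runs out, which is what makes this work for long-running program output.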

Sample Implementation

You can download a sample implementation written in Python - multiplex-0.2.tgz.


Why not use BXXP?, posted 14 Dec 2000 at 18:30 UTC by Tv » (Journeyer)

BXXP invented that wheel already. And a few other basic geometric figures, too. There's also a python library.

Yes, it seems a bit heavy at first, but after reading some of the specs, I'm starting to think they found a pretty good balance between features and simplicity.

Just ignore the hype and you might like it.

Reconsider multiple TCP streams?, posted 14 Dec 2000 at 19:32 UTC by egnor » (Journeyer)

First, you should indeed look at BXXP.

But second, you quickly skip past your rationale for not using multiple TCP streams. Multiplexing reliable, sequenced streams over the same physical connection is, after all, what TCP/IP is all about. You claim that you want to "save time opening new connections", but I fear that might be a case of premature optimization. Have you actually measured the cost of opening a connection, or looked into optimizations that can reduce that cost? Do you understand what your performance requirements (for latency and throughput) are in the first place, and the conditions under which this protocol will be operating?

(I'm not saying it's a bad decision, I'm just saying it's an insufficiently justified decision.)

Either way, this is not a wheel that's worth reinventing.

Re: cost of opening TCP stream, posted 14 Dec 2000 at 21:44 UTC by jmg » (Master)

TCP_NODELAY anyone?

Minimize open/close overhead with T/TCP, posted 15 Dec 2000 at 02:29 UTC by ncm » (Master)

Also don't forget T/TCP. It includes data in the SYN packet (open request) and a FIN flag on the final (possibly first) packet. These minimize connection setup/teardown overhead. See pp. 369-371 of Stevens, "Unix Network Programming".

To use it you just have to use sendto() with the right flags.

SMB multiplexing id, posted 15 Dec 2000 at 11:15 UTC by lkcl » (Master)

BXXP looks just like SMB multiplexing ids (except that SMB is a mess).

multiplexing, posted 15 Dec 2000 at 11:20 UTC by lkcl » (Master)

microsoft will love these. why? because their servers are hit badly by clients that issue multiple TCP connections. for server, read IIS. for client, read netscape.

if anyone hears of any microsoft noises to extend any existing protocols (they're already trying with HTTP) from text-based to binary-streams-based, please tell them, politely, to look at things like BXXP or SMP or whatever, as a semi-"transport"/semi-"application" layer, and if they won't listen, then feel free to try impolitely.

thanks for advocating netstring format!, posted 15 Dec 2000 at 19:14 UTC by splork » (Master)

While I think you're re-inventing the wheel possibly for no good reason (as others have pointed out), thanks for using the netstring format for your data.

We use it as a major part of our protocols in Mojo Nation, along with Ron Rivest's s-expressions for grouping/structure of data.

CRLF or < > delimited protocols really suck as they can't represent binary data efficiently and are easy to unintentionally write bad buffer-overflow ridden parsers for.

Some answers, posted 17 Dec 2000 at 09:05 UTC by itamar » (Master)

First of all, BXXP was what inspired me to use multiplexing :)

Why multiplexing?

  1. BXXP docs have a few reasons why, IIRC.
  2. "The overhead for opening a connection" - I wasn't thinking so much of the TCP overhead as the overhead for authentication and encryption. I don't know much about SSL, but http://isglabs.rainbow.com/isglabs/shawn/SSL_Perf/otpssl8.html claims "Given these figures, one should estimate at least a 500ms addition to any user's HTTP service time under SSLv3." And that's for decent connections - from Israel where I live it'd be more like 800ms.

Why not BXXP?

  1. For educational purposes - designing something from scratch can teach you a lot.
  2. BXXP CVS was down when I visited their site, and the Python implementation was not described as being stable.
  3. Unlike BXXP, my protocol is very simple and easy to implement - a first prototype using asyncore took a couple of hours to write, and reimplementing from scratch using threads took another day. Writing a BXXP implementation is not trivial - PyBXXP has 1900 lines of code, my code is 500 lines of code. Of course, BXXP offers a lot more.

Given all that, when there exists a working, stable Python BXXP implementation I might switch to using it. Till then, at least I have some working code to use.

What happened to the tcpmux service?, posted 17 Dec 2000 at 14:05 UTC by jay » (Journeyer)

The tcpmux service is TCP port 1 (see your /etc/services file). What happened to that?

tcpmux is something else, posted 17 Dec 2000 at 14:15 UTC by itamar » (Master)

See RFC 1078:

   This RFC defines a protocol to contact multiple services on a single
   well-known TCP port using a service name instead of a well-known
   number.  In addition, private protocols can make use of the service
   without needing an official TCP port assignment.

More multiplexing protocols, posted 17 Dec 2000 at 14:55 UTC by itamar » (Master)

BXXP isn't the only one:

Basically, your idea sucks ;-), posted 17 Dec 2000 at 20:59 UTC by Fefe » (Master)

Sorry for being so blunt, but your idea sucks. First, you don't save any cost on the server, because it has to parse the packets and distribute the data anyway. It does not matter whether the OS does it or whether some user space program does it. If it is too slow, rewrite the software.

Second, TCP has a large window, i.e. when you multiplex one channel for bulk data and one channel for interactive data, the interactive data will receive a huge latency because it has to wait for the bulk data transfer buffer to be sent before it hits the wire.

So, the only situation where this could be any useful are several bulk transfers and several interactive connections. Bulk transfers can normally be pipelined just fine (saving the multiplexing overhead) and interactive connections don't cause enough system load to warrant this kind of complexity.

I suggest you find yourself a different problem. ;-)

Yup, that sounds like my protocol, Fefe, posted 18 Dec 2000 at 08:57 UTC by itamar » (Master)

No pipelining possible, and both bulk and interactive communication needed.
