dreier is currently certified at Journeyer level.

Name: Roland Dreier
Member since: 2000-05-04 21:54:41
Last Login: 2007-05-17 02:43:10

Homepage: http://www.digitalvampire.org/

Notes: I'm currently a software engineer at Cisco. I spend most of my time hacking on the Linux kernel, mostly working on InfiniBand support.

Recent blog entries by dreier

10 Jun 2009 »

Lazyweb: best Verizon data card?

I currently have Verizon mobile data service with a Kyocera PC card, and it works well with recent distros using NetworkManager.  However, my venerable laptop is being replaced with a Lenovo X200, which has no PC card slot, so I’ll have to replace my Verizon data card as well.  According to the Verizon Wireless web site, my choices seem to be the Novatel V740 for ExpressCard, or for USB the UTStarcom UM175 or the Novatel USB760.

My question for the lazyweb is: which data card/EV-DO modem should I get (assume that I’ll be running Linux 99.9% of the time when I use it)?  The ExpressCard is substantially more expensive and less flexible (since I may want to use this card on a system without an ExpressCard slot someday), so I’d probably go with one of the USB cards if it’s left up to me.  The USB760 doubles as a micro SD reader, which is not useful to me, and complicates things with a mass storage interface that probably just causes confusion, so my first choice would probably be the UM175.  However, if someone with first-hand knowledge knows why that’s a bad decision, I’d love to hear about it in the comments.

(And I put a very high value on not having to boot into Windows periodically to update cell tower locations or anything like that, for what it’s worth.)

Syndicated 2009-06-10 00:15:30 from Roland's Blog

26 Mar 2009 »

RDMA on Converged Ethernet

I recently read Andy Grover’s post about converged fabrics, and since I participated in the OpenFabrics panel in Sonoma that he alluded to, I thought it might be worth sharing my (somewhat different) thoughts.

The question that Andy is dealing with is how to run RDMA on “Converged Ethernet.” I’ve already explained what RDMA is, so I won’t go into that here, but it’s probably worth talking about Ethernet, since I think the latest developments are not that familiar to many people.  The IEEE has been developing a few standards they collectively refer to as “Data Center Bridging” (DCB) and that are also sometimes referred to as “Converged Enhanced Ethernet” (CEE).  This refers to high speed Ethernet (currently 10 Gb/sec, with a clear path to 40 Gb/sec and 100 Gb/sec), plus new features.  The main new features are:

  • Priority-Based Flow Control (802.1Qbb), sometimes called “per-priority pause”
  • Enhanced Transmission Selection (802.1Qaz)
  • Congestion Notification (802.1Qau)

The first two features let an Ethernet link be split into multiple “virtual links” that operate pretty independently — bandwidth can be reserved for a given virtual link so that it can’t be starved, and by having per-virtual-link flow control, we can make sure certain traffic classes don’t overrun their buffers and avoid dropping packets.  Then congestion notification means that we can tell senders to slow down to avoid congestion spreading caused by that flow control.

The main use case that DCB was developed for was Fibre Channel over Ethernet (FCoE).  FC requires a very reliable network — it simply doesn’t work if packets are dropped because of congestion — and so DCB provides the ability to segregate FCoE traffic onto a “no drop” virtual link.  However, I think Andy misjudges the real motivation for FCoE; the TCP/IP overhead of iSCSI was not really an issue (and indeed there are many people running iSCSI with very high performance on 10 Gb/sec Ethernet).

The real motivation for FCoE is to give a way for users to continue using all the FC storage they already have, while not requiring every server that wants to talk to the storage to have both a NIC and an FC HBA.  With a gateway that’s easy to build and scale, legacy FC storage can be connected to an FCoE fabric, and now servers with a “converged network adapter” that functions as both an Ethernet NIC and an FCoE HBA can talk to network and storage over one (Ethernet) wire.

Now, of course for servers that want to do RDMA, it makes sense that they want a triple-threat converged adapter that does Ethernet NIC, FCoE HBA, and RDMA.  The way that people are running RDMA over Ethernet today is via iWARP, which runs an RDMA protocol layered on top of TCP.  The idea that Andy and several other people in Sonoma are pushing is to do something analogous to FCoE instead, that is, take the InfiniBand transport layer and stick it into Ethernet somehow.  I see a number of problems with this idea.

First, one of the big reasons given for wanting to use InfiniBand on Ethernet instead of iWARP is that it’s the fastest path forward.  The argument is, “we just scribble down a spec, and everyone can ship it easily.”  That ignores the fact that iWARP adapters are already shipping from multiple vendors (although, to be fair, none with support for the proposed IEEE DCB standards yet; but DCB support should soon be ubiquitous in all 10 gigE NICs, iWARP and non-iWARP alike).  And the idea that an IBoE spec is going to be quick or easy to write flies in the face of the experience with FCoE.  FCoE sounded dead simple in theory (just stick an Ethernet header on FC frames, what more could there be?), but it turns out that the standards work has taken at least 3 years, and a final spec is still not done.  I believe that IBoE would be more complicated to specify, and fewer resources are available for the job, so a realistic view is that a true standard is very far away.

Andy points at a TOE page to say why running TCP on an iWARP NIC sucks.  But when I look at that page, pretty much all the issues are still there with running the IB transport on a NIC.  Just to take the first few on that page (without quibbling about the fact that many of the issues are just wrong even about TCP offload):

  • Security updates: yup, still there for IB
  • Point-in-time solution: yup, same for IB
  • Different network behavior: a hundred times worse if you’re running IB instead of TCP
  • Performance: yup
  • Hardware-specific limits: yup

And so on…

Certainly, given infinite resources, one could design an RDMA protocol that was cleaner than iWARP and took advantage of all the spiffy DCB features.  But worse is better and iWARP mostly works well right now; fixing the worst warts of iWARP has a much better chance of success than trying to shoehorn IB onto Ethernet and ending up with a whole bunch of unforeseen problems to solve.

Syndicated 2009-03-26 04:11:08 from Roland's Blog

20 Feb 2009 »

Know anyone at Coverity?

The recent mention of scan.coverity.com at lwn.net reminded me that the Coverity results for the kernel (what they call “linux-2.6”) have become pretty useless lately.  The number of “results” that their checker produces jumped by a factor of 10 a month or so ago, with all of the new results apparently warning about nonsensical things.  For example, CID 8429 is a warning about a resource leak, where the code is:

      req = kzalloc(sizeof *req, GFP_KERNEL);
      if (!req)
              return -ENOMEM;

and the checker thinks that req can be leaked here if we hit the return statement.

The reason for this seems to be that the checker is run with all config options enabled (which is sensible to get maximum code coverage), and in particular it seems to be because the config variable CONFIG_PROFILE_ALL_BRANCHES is enabled, which leads to a complex C macro redefinition of if() that fatally confuses the scanner.

I’ve sent email to scan-admin about this but not gotten any reply (or had any effect on the scan). So I’m appealing to the lazyweb to find someone at Coverity who can fix this and make the scanner useful for the kernel again; having nine-tenths or more of the results be false positives makes it really hard to use the current scans. What needs to be done to fix this is simply to make sure CONFIG_PROFILE_ALL_BRANCHES is not set; in fact it may be a good idea to set CONFIG_TRACE_BRANCH_PROFILING to n as well, since enabling that option causes all if statements annotated with likely() or unlikely() to be obfuscated by a complex macro, which will probably lead to a similar level of false positives.
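
To give a feel for what the scanner is up against, here is a heavily simplified sketch of an if() redefinition.  This is not the kernel's actual macro, which is considerably more involved; the names branch_stats, record_branch() and alloc_request() are made up for illustration.  The point is only that once if is a macro, the condition the programmer wrote ends up buried inside generated code before any analysis tool gets to look at it:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counters; the kernel's real bookkeeping is per call site. */
struct branch_stats {
        unsigned long taken;
        unsigned long not_taken;
};

static struct branch_stats stats;

/* Count the outcome of a condition and hand its value back unchanged. */
static int record_branch(int cond)
{
        if (cond)
                stats.taken++;
        else
                stats.not_taken++;
        return cond;
}

/* From here on, every if () is routed through record_branch(). */
#define if(cond) if (record_branch(!!(cond)))

/* Same shape as the snippet above: allocate, bail out on failure. */
static int alloc_request(size_t size)
{
        void *req = calloc(1, size);
        if (!req)
                return -1;
        free(req);
        return 0;
}

/* Restore the plain keyword for any code that follows. */
#undef if
```

After preprocessing, the check in alloc_request() reads if (record_branch(!!(!req))), so a tool that reasons about the preprocessed source no longer sees a bare NULL test guarding the return.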

Syndicated 2009-02-20 06:38:54 from Roland's Blog

1 Dec 2008 »

Signs you may not be dealing with a straight shooter

A play in two scenes.

DRAMATIS PERSONAE

  • X - a senior manager
  • X’s admin - keeper of X’s schedule
  • Y - a hard-working developer

SCENE I

setting: X’s admin’s desk

X’s admin: Hi, Y.

Y: Hi.  I’d like to get some time with X when there’s an opening.

X’s admin: How about next Tuesday at 1?

Y: Perfect.  Please put me on the schedule then.

SCENE II

setting: X’s office, next Tuesday at 1

X: Hi, Y.

Y: Hi, good to see you.

X: Thanks for coming by.  We haven’t talked for a while and I just wanted to touch base.

Y: …

Syndicated 2008-12-01 06:09:42 from Roland's Blog

19 Nov 2008 »

On over-engineering

I’ve been trying to get a udev rule added to Ubuntu so that /dev/infiniband/rdma_cm is owned by group “rdma” instead of by root, so that unprivileged user applications can be given permission to use it by adding the user to the group rdma.  This matches the practice in the Debian udev rules and is a simple way to allow unprivileged use of RDMA while still giving the administrator some control over who exactly uses it.
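
From the application side, the group-ownership approach needs nothing more than an ordinary open().  A minimal sketch of that, assuming a hypothetical helper called try_open_node() (real RDMA applications would normally go through librdmacm rather than opening the node by hand):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open a device node read-write, relying only on the node's
 * ordinary owner/group/mode bits.  Returns 0 on success, or the errno
 * value from open() on failure (for example EACCES when the caller is
 * not in the owning group, ENOENT when the node does not exist).
 */
static int try_open_node(const char *path)
{
        int fd = open(path, O_RDWR);

        if (fd < 0)
                return errno;
        close(fd);
        return 0;
}
```

Calling try_open_node("/dev/infiniband/rdma_cm") as a user in group rdma would be expected to return 0 once the rule is in place, and EACCES for a user outside the group, with no daemon involved in either case.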

I created a patch to the Ubuntu librdmacm package containing the appropriate rule and opened a Launchpad bug report requesting that it be applied.  After two months of waiting, I got a response that basically said, “no, we don’t want to do that.”  After another month of asking, I finally found out what solution Ubuntu would rather have:

Access to system devices is provided through the HAL or DeviceKit interface. Permission to access is managed through the PolicyKit layer, where the D-Bus system bus service providing the device access negotiates privilege with the application requesting it.

Because of course, rather than having an application simply open a special device node, mediated by standard Unix permissions, we’d rather have to run a daemon (bonus points for using DBus activation, I guess) and have applications ask that daemon to open the node for them.  More work to implement, harder to administer, less reliable for users — everyone wins!

Sigh….

Syndicated 2008-11-19 17:33:51 from Roland's Blog

dreier certified others as follows:

  • dreier certified dmarti as Journeyer
  • dreier certified fusion94 as Journeyer
  • dreier certified bwoodard as Journeyer
  • dreier certified jallison as Master
  • dreier certified dtype as Journeyer

Others have certified dreier as follows:

  • dmarti certified dreier as Journeyer
  • taral certified dreier as Journeyer
  • benson certified dreier as Journeyer
  • dtype certified dreier as Journeyer
  • jallison certified dreier as Journeyer
  • fusion94 certified dreier as Journeyer
  • khazad certified dreier as Journeyer
  • ole certified dreier as Journeyer
