11 Jun 2000 tmattox   » (Apprentice)

It's been a busy week and a half since my last entry.

  • On Friday, Hank and I finished writing an article about how we got over 64 GFLOPS on KLAT2. Hopefully it will be appearing on Ars Technica in the next week.
  • I sent off the PCB design files for a new AFN based on a revised PAPERS 960801 module. So, hopefully, in a few weeks, we can have an AFN up and working on KLAT2. The new design files will be posted once I've verified that none of my changes/tweaks messed up the functionality/reliability of the PAPERS 960801 design. I didn't strictly need to make revisions, but since we needed to have a new run of PCBs made anyway, it was a good time to correct some annoyances with the old design.
  • And now for the ugly events of the past two weeks or so:

The 32-port Fast Ethernet switches that we purchased for KLAT2 had a design flaw, causing a 60% failure rate after a few weeks of use. The manufacturer says it is a latent thermal problem. I suspect that the failure rate will approach 100% within another month or two. The company is sending us replacements for the entire set, which will have a design revision that supposedly fixes the problem. Yet they won't get here for another two weeks! I've set the thermostat in the lab down to 65 F, making it rather unpleasant for me to work in the room. Hopefully, the remaining 4 switches will keep working until the replacements arrive.

The other recent unpleasant event is our discovery that the "marketroids" have again redefined a technical term/phrase into oblivion. The phrase "wire-speed switching on all ports" used to have a technical meaning: the backplane bandwidth of a switch is large enough to handle all ports going at full speed in full duplex mode continuously, as long as the communication pattern is a permutation. The key here is that "wire-speed" should mean that as long as I am the only processor/NIC sending to another particular processor/NIC, I should have full wire-speed bandwidth available for my use, regardless of what other traffic is in the switch. The marketroids seem to have modified this definition to mean that for some permutations, you can achieve wire-speed, but not for all permutations. ACK!

So, if we can get more specific details on the internal structure of common switches, we will try to modify our GA to accommodate the restrictions when designing an FNN. Most switches seem to be built with 8+1 switch-on-a-chip modules, where the +1 ports are tied together in a unidirectional ring of various bandwidths. The key is that, unless the ring's bandwidth is high enough, the overall switch cannot achieve wire-speed for permutations that must go almost all the way around the ring. This will also affect the observed latency of your connection patterns, possibly dramatically.
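
To make the ring problem concrete, here is a rough sketch (mine, not anything from a vendor datasheet) that counts how many port-speed flows cross each ring link for a given permutation. It assumes a 32-port switch built from four 8-port chips, a ring that only forwards in one direction, and a hypothetical `ring_capacity` measured in units of one port's wire speed. The function names are made up for illustration.

```python
def ring_link_loads(perm, ports_per_chip=8):
    """Count the flows crossing each unidirectional ring link.

    perm[src] is the destination port for source port src.
    Link i carries traffic from chip i to chip (i + 1) % n_chips.
    """
    n_chips = len(perm) // ports_per_chip
    loads = [0] * n_chips
    for src, dst in enumerate(perm):
        chip = src // ports_per_chip
        dst_chip = dst // ports_per_chip
        # Walk forward around the ring until we reach the destination chip.
        while chip != dst_chip:
            loads[chip] += 1
            chip = (chip + 1) % n_chips
    return loads


def is_wire_speed(perm, ring_capacity, ports_per_chip=8):
    """True if no ring link carries more flows than ring_capacity (assumed units: port speeds)."""
    return max(ring_link_loads(perm, ports_per_chip)) <= ring_capacity


n_ports = 32
local = list(range(n_ports))                           # every flow stays on its own chip
forward = [(p + 8) % n_ports for p in range(n_ports)]  # one hop around the ring
backward = [(p - 8) % n_ports for p in range(n_ports)] # three hops around the ring

print(ring_link_loads(local))     # [0, 0, 0, 0]
print(ring_link_loads(forward))   # [8, 8, 8, 8]
print(ring_link_loads(backward))  # [24, 24, 24, 24]
```

Under these assumptions, a ring that can carry 8 port-speeds per link handles the "next chip" permutation at wire-speed, while the "previous chip" permutation needs three times that because every flow has to go almost all the way around.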

P.S. - We did NOT want to know this. But too late now... What happened to crossbars, fat-trees, and star topologies for internal switch fabrics? (I know: economics...) Addendum: I just read through a document from Allayer, a switch-on-a-chip maker, that reasonably explains the choice of a ring.
