Older blog entries for robogato (starting at number 25)

Advogato Status Report

A new rev of mod_virgule code is live on Advogato. See the changelog for the details.

Mostly minor stuff. Setting a project staff relation to none now consistently removes the relation from your user profile. Thanks to Gary Benson for noticing the bug. I upgraded the server from CentOS 4.4 to 4.5. This was just a maintenance update and shouldn't cause any changes. We're having another wave of account spam lately but the new flagging system has largely controlled it. One of the spammers discovered a way of circumventing the code which strips anchor tags posted in the notes field of untrusted accounts. I've fixed the bug that allowed this.

GPL v3 Release Party in Dallas?

The GPLv3 is supposed to be released on 29 June. I saw joolean mention a GPLv3 release party in Brooklyn and figured, why not here in Dallas too? If there are any other Advogatoans in the DFW area who'd like to get together to celebrate the release of the new and improved GPL, let me know.

Trust Metric Growing Pains

The good news is that Advogato is growing again. The bad news is that this is bringing to light some issues with the trust metrics. First, there is a growing number of new users who have multiple certs but are still rated as Observer. Second, there was the related incident with user OpenSpecies. Many people thought his blog posts looked spammy and flagged him as spam. Other users trusted him at Apprentice or Journeyer level, but even with six or seven certs he never acquired enough gato-juice to reach Apprentice level. Because he stayed at Observer level, his account was always at risk of being classified as spam. This happened once, resulting in the decision to increase the spam score required to delete an account. I reinstated his account from a backup. A few months later it had been flagged as spam enough times to get deleted again. I restored it; however, OpenSpecies opted to move elsewhere and requested the account be permanently deleted.

The lack of gato-juice available for certifying people can be traced back to an issue with the trust metric seed users. Of the four original seed users, only raph is actively visiting Advogato and certifying users. Federico has visited in the last year but no longer certifies any users. Miguel hasn't visited in many years and only certified a handful of users. Alan has certified many users but no longer seems to be an active user himself (hopefully I'm wrong about that). This means there are really only two seeds and almost all the trust flowing to new users through certification is at best several generations removed from them.

To improve the situation, I'm going to add a few new seed users. This will need to be done gradually so that we can make sure it fixes the problem without resulting in cert inflation. My criteria for selecting new seed users will be: 1) Must be currently rated as a master by at least one of the original seed users 2) Must be rated as master by other non-seed users 3) Must be an active Advogato user who visits the site regularly and has posted at least one article 4) Must be reasonably well known within the community and have occasion to meet and interact with many other Free Software developers in person.

I talked with Raph about possible ways of handling this: elections, nominations, automated selection by the trust metric itself, or just picking someone. Eventually, I think it would be interesting to have the trust metric select new seeds automatically as needed but that will take more time for testing and experimenting than I've got right now. So, initially I've opted for picking someone who meets the qualifications to save time. Our first new seed is: mako. By a handy coincidence, he's traveling to several European conferences over the next few weeks, giving him a chance to meet more people who may need certifying.

This is one of several things that I think should start pumping some new life into the trust metrics. Another issue I'm looking at is what to do with inactive users who have become stagnant sources in the trust metric network flow. These include users who will not return for one reason or another such as ettore, sisob or lilo. Trust passing through these nodes is essentially unchangeable, which is a problem because trust in the real world is dynamic. Sometimes we trust a person today that we didn't yesterday. Sometimes we no longer trust someone that we trusted in the past. If enough certs become stagnant and cannot be removed, this tends to make the trust metrics inaccurate. One way of dealing with this is to identify users who are inactive and expire their outbound certs automatically after enough time has elapsed. The tricky part is deciding how long a user has to go without visiting the site before being considered inactive. DV, for example, is an active user yet has gone for as much as a year between logins. Federico, one of our seed users, hasn't logged in for seven months. Right now, I'm thinking that exceeding one year without a login is a pretty good indication of inactivity.
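
To make the expiration idea concrete, here is a minimal sketch in Python. It is not mod_virgule code (mod_virgule is written in C). The function name, the data layout, and the exact 365-day cutoff are assumptions for illustration only.

```python
from datetime import datetime, timedelta

# Assumed cutoff, based on the "one year without a login" idea above.
INACTIVITY_LIMIT = timedelta(days=365)

def active_certs(certs, last_login, now):
    """Return only the certs whose issuer has logged in recently enough.

    certs      -- list of (issuer, subject, level) tuples (hypothetical layout)
    last_login -- dict mapping username to the datetime of the last login
    now        -- the reference time for the check
    """
    kept = []
    for issuer, subject, level in certs:
        seen = last_login.get(issuer)
        # An issuer with no recorded login is treated as inactive.
        if seen is not None and now - seen <= INACTIVITY_LIMIT:
            kept.append((issuer, subject, level))
    return kept
```

Filtering the certs before the trust metric runs, rather than deleting them, would let an expired cert come back as soon as its issuer logs in again. That choice is also an assumption of this sketch.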

Advogato buzz

Advogato showed up on a list of social network site statistics at the X2iN blog: Social Network Marketing, the Sky is the Limit.

Advogato's founder Raph Levien will be giving a talk titled Advogato: Lessons Learned at 6:30 PM on Monday, June 25 as part of Google's Open Source Developers @ Google series. The talk will be at Google's Mountain View campus. Guests are welcome and should sign in at Building 43.

Advogato Status Report

A new rev of mod_virgule code is live today on Advogato. See the changelog for the details.

The mod_virgule config.xml file now supports a list of authorized "editors". Article posting privileges can be limited to these editors. Don't worry, this feature isn't intended for Advogato, where all certified members will continue to be able to post articles. It will be used on robots.net. In the past robots.net was configured such that only the users who were trust metric seeds could post stories. As robots.net has grown, the need has arisen to make a clear distinction between the list of trust metric seed users and the article editors. I think this feature will be useful on other sites that use mod_virgule as well.

I've tweaked the HTML layout of the diary entries, replacing the older style markup with divs. At the request of trs80, the div wrappers on each diary entry now include the username as a second class. While not needed for CSS, this additional class designator can be used by screen-scrapers to easily identify the author of each entry in the recentlog. Screen-scraping aggregators can use this as part of a dupe-control mechanism. This same username as class convention is used on many Planet sites, so it should make Advogato's recentlog more easily parsable by existing Planet scrapers. The fun part was the slight difference between legal mod_virgule usernames and legal CSS1/2 class names. This prompted the creation of a new utility function, virgule_force_legal_css_name(). Supplied with an arbitrary string of text, this function will return a properly escaped CSS1 class name.
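
For readers curious what forcing a string into a legal class name might involve, here is a rough Python sketch. It does not reproduce the logic of virgule_force_legal_css_name(), which is C code not shown here. The escaping scheme (hex code points after a hyphen, a leading "u" when the name would not start with a letter) is purely an assumption chosen to stay inside the most conservative set of characters.

```python
import re

def force_legal_css_name(text):
    """Map an arbitrary string to a conservative CSS class name.

    Only ASCII letters, digits and hyphens are kept; every other
    character becomes '-' followed by its hex code point. A leading 'u'
    is added when the result would not start with a letter.
    """
    out = re.sub(r"[^A-Za-z0-9-]",
                 lambda m: "-%x" % ord(m.group(0)), text)
    if not out or not out[0].isalpha():
        out = "u" + out
    return out
```

With this scheme a username that is already a legal class name passes through unchanged, which is what a scraper matching on the username would want in the common case.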

More good Advogato buzz

Andrey Golub of Milan IN recently discovered Advogato and gave a nice mention in his blog. He also added Advogato to Milan-IN's listing of Online Social Networking Platforms. Perhaps this will bring a few other new Advogato members from the Italian free software community our way.

Dan York also gave us a great mention in his blog. He's an Advogatoan from way back who left Advogato for LiveJournal during the extended Advogato server outage back in 2004. He was writing to commemorate his 7th year of blogging and rediscovered Advogato in the process. His entry summarizes the recent changes on Advogato and suggests dyork may be making an appearance in our recentlog again soon.

During a recent discussion on the Extreme Programming mailing list about the possibility of a certification mechanism for XP programmers, Martijn Meijering suggested that a community trust metric system similar to Advogato's might be a desirable alternative to certification based on traditional knowledge-based testing.

Advogato Status Report

A new rev of mod_virgule code went live today on Advogato. See the changelog for the details.

I improved the ATOM feed handling of the aggregator code. Feeds that include only a <summary> tag and no <content> tag are now handled correctly. Also, feeds that include an <updated> tag but no <published> tag are handled correctly. Both these variations, while technically legal according to RFC 4287, seem to be very rare in the real world (not to mention a bit odd). Why include the datestamp of the last update but not the original publication date? Why include the full content of the blog but call it the summary instead of the content? Both these weird-but-legal annoyances were apparently generated by a "django powered" site. Not sure if that means the problem stems from django or just how it was used in this case.
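
The two fallbacks can be sketched in a few lines of Python using the standard library's ElementTree. This is only an illustration of the idea, not the aggregator's actual C code, and the function name is made up.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def entry_body_and_date(entry):
    """Pick the body text and date of an Atom <entry> element.

    Prefers <content> and <published>, falling back to <summary>
    and <updated> when the preferred elements are absent.
    """
    body = entry.find(ATOM + "content")
    if body is None:
        body = entry.find(ATOM + "summary")
    date = entry.find(ATOM + "published")
    if date is None:
        date = entry.find(ATOM + "updated")
    return (body.text if body is not None else None,
            date.text if date is not None else None)
```

Since RFC 4287 requires <updated> on every entry but makes <published> optional, falling back in this direction should always find a date in a valid feed.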

The last few sections of mod_virgule still using hard-coded pages now use templates. This allowed another nasty chunk of hard-coded, site specific markup to be removed from the mod_virgule codebase. It was nice to see the code and binary get smaller for a change! Even though you probably won't notice any huge change in how the site looks, this is a major milestone for mod_virgule. It's finally possible to use it for a new website without having to modify the C source to remove Advogato or robots.net specific HTML. A few more changes are needed to group all the templates together with the CSS files to create an easily themable layout.

Despite the report that Advogato has failed, things continue to look better each month. We've set new records for user logins three months running (at least since I started keeping records six months ago). More than 70 Advogato users have returned to the recentlog via blog aggregation so far. The founding gato himself even stopped by this week to post an article on the new browser wars.

Advogato got a positive mention in a recent comparison of sites for software developers in John Manoogian's blog Inventing What's Next.

10 Apr 2007 (updated 10 Apr 2007 at 21:28 UTC) »

Advogato Status Report

New mod_virgule code went live today on Advogato. See the changelog for the details.

I've refactored some of the page rendering code to simplify the problem of pre-rendering page content for use in template-based pages. This should make the job of converting mod_virgule's hard-coded pages to template-based pages as easy as swapping out three or four lines of code. All the profile pages are now template based, as are the project pages. The new header has been added to all these pages. There are still a handful of hard-coded forms and form result pages. They're next up on the ToDo list.

You may have noticed some experimental social bookmarking links I've added to the article headers. Three social bookmarking sites are supported: Digg, del.icio.us, and Reddit. If you have an account at one or more of these services, try it out and let me know if it works for you. I'd like to get some feedback on this idea. Would you like to see additional bookmarking services included? Which ones? Also, would you like to see this idea extended to blog entries as well as articles? If this turns out to be a handy feature, I may encapsulate all the bookmark icons in some sort of little popup window, something like Alex King's "share this". Then we'd just have one little icon instead of a whole string of them - probably the emerging social bookmarking icon.

Advogato Status Report

New mod_virgule code is live today on Advogato. See the changelog for the details. No new release yet, though. I'm hoping I'll find time to finish up a couple of additional things before the next release.

The feed aggregator can now handle RSS/ATOM feeds that include the blog content as unescaped XHTML within the feed XML tree instead of as escaped content within a single XML node. This seems like a risky approach since the slightest markup error in the blog's XHTML renders the whole feed invalid and unparsable. Worse, the particular ATOM feed that brought this problem to light, generated by blogger, appears to randomly alternate between the two methods. One post is carried as normal escaped content within the entry node and the next is shoved in as an unescaped tree of XHTML tags. But who am I to argue with blogger? If it exists in the wild and doesn't appear to violate the standards, I'll try to make mod_virgule handle it correctly.
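
In Atom terms, the two methods correspond to a content element with type="html" (escaped text) and one with type="xhtml" (a real subtree). A small Python sketch of handling both follows. The function name is hypothetical and the real aggregator, written in C on top of libxml2, may work quite differently.

```python
import xml.etree.ElementTree as ET

def content_markup(content):
    """Return the markup carried by an Atom <content> element as a string."""
    if content.get("type") == "xhtml":
        # Inline XHTML: serialize each child element back to text.
        return "".join(ET.tostring(child, encoding="unicode")
                       for child in content)
    # Escaped content: the XML parser has already unescaped it into .text.
    return content.text or ""
```

Note that ElementTree adds its own namespace prefixes when it serializes the XHTML subtree, so the output of the first branch would still need cleaning before being shown on a page.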

I've added support for the foaf:mbox_sha1sum field in the FOAF files output by mod_virgule. This field is an SHA-1 hash of the user email address. It's used as an identifier by some FOAF applications. There is also a group working on a SpamAssassin plugin and email whitelist database that will use trust metrics and FOAF data collected from community sites like Advogato. The email field in the user profile used to be optional, so if you're an old time Advogato user, check your profile and make sure your email address is included. Actually, everyone ought to make sure their email address is current, just in case you need to use the password reminder some day.
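
If you want to check what your own value should look like, the FOAF vocabulary defines mbox_sha1sum as the SHA-1 of the mailbox URI, that is, the address with its mailto: prefix. A minimal Python sketch follows; the function name is made up, and stripping surrounding whitespace is an assumption of the sketch rather than part of the definition.

```python
import hashlib

def mbox_sha1sum(email):
    """Return the foaf:mbox_sha1sum value for an email address."""
    # The hash covers the full mailto: URI, not the bare address.
    uri = "mailto:" + email.strip()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()
```

The result is a 40-character hex string, so applications can match accounts across sites without the address itself being published.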

Blog (diary) pages are now template based rather than hard coded HTML generated by mod_virgule. The blog page template includes the new page header.

Barbara Irwin of the Victoria Linux Users Group emailed to let us know they've added Advogato to the Loads of Linux Links (LOLL) directory. The LOLL directory looks like an interesting collection of Linux links. Check it out.

Google turned down Advogato's Summer of Code mentor application. While disappointing, this didn't come as a total shock. There's no official organization behind mod_virgule, it's a very small project, and it still seems to be viewed as dead or dying by a few people. That's okay, maybe next year. In the meantime, I'm going to continue working to bring mod_virgule up to date.

There are several badly needed features that are going to require some major code refactoring and code cleanup. One of the Summer of Code ideas was directly related to this. The existing code base desperately needs improved commenting and documentation. I'd really like to see the comments normalized to Doxygen style and comments added to all the currently uncommented sections of the code. Having better comments and documentation would really help with future refactoring of the code and would also lower the barrier for new developers who need to understand how mod_virgule works. Any volunteers? Adding and rewriting code comments doesn't require extensive programming skill (though you will need to be able to read and understand some less than beautiful C code).

There are other SoC mod_virgule ideas that I'd still like to see someone help with. Even without Google funding, it's still good experience and might even be fun. If you think you might be interested in helping out, take a look at the ideas list and let me know.

Advogato Status Report

A new rev of mod_virgule went live yesterday on Advogato. See the changelog for the details.

With all the articles being posted lately, the need to edit an article to correct mistakes and typos resurfaced. The article code is a bit scary and looks way overdue for a complete rewrite. But until then, I've added one more kludge to allow editing. Articles are now editable by the author for a period of 30 days after they're posted. (If you can't fix your typos in 30 days, you probably never will!) Articles that have been edited will include a revision date in the article header.

Otherwise, mostly small changes this time around. The much maligned certification dialog text inherited from robots.net has been toned down to something more minimal. I made a few very minor security enhancements to the new accounts page. A CSS clear:both style was added to the recentlog post headers. This fixes the bug that allowed floated images in a post to overlap the next post. I've migrated a few more pages to the new header style.

I made a few minor tweaks to the profile pages to help control bandwidth wastage and security problems. Untrusted users no longer have RSS feeds or FOAF RDF support on their user profiles. This is to prevent abuse by spammers but will also help cut down on bandwidth slightly. The biggest change is that RSS feeds don't exist until an account has at least one diary entry. This removes about 9,000 RSS feeds that were empty (but still being checked several times an hour by a hundred different aggregators).

I've banned a misbehaving web robot, named VoilaBot, used by a French search engine. Despite retrieving our robots.txt file several thousand times per day, it appears to ignore it. This robot was using gigabits of our bandwidth (up to 10% of the total so far this month). We get no inbound traffic from this search engine in return (which isn't surprising since Advogato isn't a French language site).

I've also banned several other robots that appeared to be harvesting email addresses for spammers. One of these had an agent string only one character different than pipeman's XML-RPC client. A typo on my part blocked him for a few hours. Sorry about that.

Google Summer of Code Mentor Application

I filed a mentor application in Advogato's name for the 2007 Google Summer of Code. If Google accepts it, I'm hoping maybe we can recruit a student or two to help with some of the mod_virgule work.

18 Feb 2007 (updated 18 Feb 2007 at 14:41 UTC) »
Advogato Status Report

A new rev of mod_virgule code went live today. See the changelog for the details. Lots of minor bug fixes and a couple of more interesting changes. Even one hardware note: I've doubled the RAM on the server from 1GB to 2GB.

Long Lost Trust Certifications Restored

You may have noticed some additional inbound or outbound trust certifications on your page or slight changes in your certification level this week thanks to some repairs done to the XML datastore. This would be a good time to go through your certs and make sure you've certified everyone you want to and no one you don't want to.

Over the 8 years Advogato has been online, it has suffered through several semi-catastrophic events including disk failures and power supply failures. There was also a mod_virgule bug triggered under disk-full conditions that truncated many user account profiles a year or so ago. The result of these past catastrophes was the complete loss of a few user profiles and minor corruption of many others. Usually, enough of the profile XML file remained (or could be restored) to allow a user to log in but some or all of the trust metric certifications and other data were lost. For a while the corrupt profiles could cause mod_virgule to segfault during a trust metric update (that bug has been fixed for a while). The most noticeable side-effect is missing or incorrect certs on the profile page.

One of the interesting things about the way the trust certs are stored in the XML database is that each cert is recorded in the profile of both the issuer and subject. This means it's possible to reconstruct a lost cert provided one of the two records still exists. Well, I finally got a chance to write some code to do that. I've written a new mod_virgule function to analyze the user profiles, find these sorts of problems, and repair them when possible. In addition to restoring lost certs, the new code also looks for invalid XML, missing profiles, certs to or from non-existent accounts, and a few other forms of corruption that are known to occur occasionally.

The result?

1115 missing outbound certs records restored
1264 missing inbound certs records restored
17 other misc profile corruption problems fixed

One side effect of all this is that all those missing certs will now be included in the trust metric computations again. So there have probably been a few changes in certification levels.
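
The cert reconstruction described above boils down to making the two sets of records agree with each other. Here is a minimal Python sketch of that idea. It is not the mod_virgule function itself (which is C and works on the XML profiles), and the dictionary layout and function name are assumptions for illustration.

```python
def repair_certs(outbound, inbound):
    """Restore cert records that exist on only one side.

    outbound -- dict: issuer  -> {subject: level}
    inbound  -- dict: subject -> {issuer: level}
    Both dicts are updated in place. Returns a tuple of
    (outbound records restored, inbound records restored).
    """
    restored_out = restored_in = 0
    # Every outbound record should have a matching inbound record.
    for issuer, certs in list(outbound.items()):
        for subject, level in certs.items():
            if issuer not in inbound.setdefault(subject, {}):
                inbound[subject][issuer] = level
                restored_in += 1
    # Every inbound record should have a matching outbound record.
    for subject, certs in list(inbound.items()):
        for issuer, level in certs.items():
            if subject not in outbound.setdefault(issuer, {}):
                outbound[issuer][subject] = level
                restored_out += 1
    return restored_out, restored_in
```

The sketch ignores the case where both records exist but disagree on the level, and it does not check that the accounts involved still exist; both would need handling in a real repair pass.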

Consistent Page Headers on the way

One persistent category of Advogato complaints I get is about the inconsistent page layout. Some pages have menus at the top, other pages have the menu at the bottom. Sometimes the menu is centered, sometimes it's right justified. Most pages don't have a logo or even the name of the site on them, which makes it confusing if you arrive from a search engine anywhere but the index page. On the other hand, I feel like I have to balance the need for an updated, consistent page layout with Advogato's historically minimal design. So I'll try to take things slow and not make any major changes overnight. I've created a standard page header and page layout that should address the consistency issues without drastically altering the appearance of the site.

Over time, I'll try to get the new header on every page so the site begins to look a little more consistent. There are still a few pages with hard-coded HTML generated by mod_virgule. Making these remaining pages template-based will require code changes. One other nice result of finally getting the last few parts of mod_virgule fully template-based is that we should be able to purge the last non-standard HTML and maybe even bring the site up to full XHTML standards compliance.

As part of the page header improvements, I've converted the Advogato logo from GIF to PNG. The new logo has the same dimensions but the filesize is about 20% smaller, saving us a little bandwidth. I've also added a Google Coop AJAX-based search widget to provide a site search function, another frequent request. The new layout can be seen on the people page and a few other pages so far. You may also notice some new stats on the people page - this is another handy use of the new user account analysis code.

Advogato Articles

I was pleased to get all the emails and comments on my GNU/FSF news summary. I'd still like to find a volunteer who's willing to put together a summary like this every month.

I was also very pleased to see other new articles posted by mjg59, fxn, and lkcl. The ACPI article got picked up by Linux Today and generated more hits than any other article in the last several months. If we could generate a few articles like that every month, we'd be well on the way to making Advogato a more interesting and useful site.

PyCon and Advogato

PyCon is coming to Dallas, where the Advogato site is hosted. Is anyone up for some type of Advogato get-together during the conference (Feb 23-25)? If you'll be in town and want to meet some fellow Advogato users, email me and we'll work out the details.

Advogato's Aggravation

I've been pondering the problem of what to do about Advogato's article section on the main page. Aside from the various bugs and feature requests I've been working on, the single most common complaint I've seen about the site is the low quality of the articles. As I mentioned in an earlier post, this problem has been brought up before.

It seems to me that rather than worry too much about how to prevent the occasional bad articles, we should focus on how to encourage useful and interesting articles. The first step is to find a definition of what useful and interesting mean in the context of Advogato.

Obviously, articles about software design, standards, or related topics are always interesting. If you're working on a paper or a talk for an upcoming FOSS conference, consider posting a freely licensed draft as an article to get feedback. The occasional interview, question, insight, or advice from someone in the community can also be interesting. Unfortunately, past experience shows we can't expect many of these types of articles. That still leaves a pretty big gap that will likely be filled by noise if it isn't used for something more interesting.

There are already plenty of sites like Slashdot where one can find vaguely FOSS-related links to news stories. I don't think Advogato should go the route of becoming yet another aggregator of recycled news stories. While that's an easy solution and would probably generate a lot of traffic, it's not why we're here. In one of Raph's early postings about Advogato, he said the purpose of the site is "to bring a group of people closer together, not to generate hits."

Robogato's Revelation

What is it that makes Advogato different from other Free Software/Open Source web communities? Most sites focus on a very particular FOSS sub-community: GNU, Apache, BSD, KDE, Mozilla, RedHat, Debian, FreeDesktop/X.Org, Perl, Python (to name just a few). Often, members of each community aggregate around each other, ignoring or forgetting what's going on in the larger FOSS community. Advogato, on the other hand, has active members from almost all these communities. This is one place where we can read each other's blogs and find out what's going on in other parts of the FOSS community.

When I realized what a unique position Advogato is in, it became obvious to me that one useful and interesting thing we can do is use the articles section to inform each other of what our respective communities have been doing on a weekly or monthly basis. Often the volume of news, blogs, and websites in each sub-community makes it difficult for an outsider to stay up to date.

As an illustration of this, I'm reminded of the LKML. The volume of the list makes it impossible for me to keep up - I simply don't have the time. However, I used to enjoy reading the Kernel Traffic summaries regularly so I'd have some idea of what the Linux developers were up to. Sadly, Kernel Traffic is no more. Likewise, there have been similar efforts to summarize activity in other communities (e.g. Brave GNU World, This Month in BSD, the gcc newsletter, WineHQ news, etc). Most of these are defunct, being replaced by dozens of individual websites, blogs, and mailing lists.

What I propose is recruiting Advogato users from each of the many FOSS communities to write and post a periodic summary of significant events in their respective groups. I'm willing to work with these volunteers to devise a useful format and a system for assembling the reports. This will take some time to get going so I think the best plan is to focus on the communities one by one, working out the system and getting things started, then moving on to the next group. As a start, I've written an example summary of the GNU project's activities this month. I've worked out where to get the information and how to assemble it into a simple format. I'll post it shortly as an article. What I need now is just one volunteer willing to contribute an hour of their time once a month to assemble and post a GNU update. Who's up for the job?

The next question is what FOSS community would you like to see a monthly summary of next? Ruby? Perl? BSD? I need suggestions and volunteers. gato@advogato.org

2 Feb 2007 (updated 3 Feb 2007 at 21:38 UTC) »

Advogato Status Report

A new rev of mod_virgule code went live today. See the changelog for the details.

This rev adds FOAF files to our user profiles, helping to make Advogato part of the Semantic Web. Each account profile page has a visible FOAF link as well as an auto-discovery meta link that points to a foaf.rdf file for that account. At present the FOAF files have minimal properties. The FOAF standard allows for some additional features that will probably be added over time. At present, outbound trust certifications are converted to foaf:knows properties. Inbound certs are ignored. Project relations are exported as foaf:currentProject properties. To get an idea of what you can do with FOAF, try using the DISCO Hyperdata Browser to view the FOAF file of an Advogato seed account such as Raph's (see also the FOAFer result for the same file).

In addition to the new FOAF badge, you may have noticed some other very minor changes on the user profile. I've done a little HTML clean up and correction. The old, ugly RSS image has been replaced with the standard feed icon established by the Mozilla Foundation. Combined with our new RSS 2.0 feeds, this almost makes it look like Advogato is a modern website. :-)

Among other minor changes, trust certifications now include a date stamp. This will allow the future addition of date-dependent trust features such as age-based certificate expiration for inactive users.

All of the admin functionality of mod_virgule has been moved to a single base URL where it can be password protected. This includes the diagnostics page and crank pages for diary ratings, trust metrics, and the aggregator. Several of these pages were security risks either by leaking information about the server configuration or by being CPU intensive enough to be useful for DoS attacks.

Certification dialog

cdfrey notes in his blog:

"I just noticed something new in the advogato pages. When looking at a user, you get the following warning:

Note: By certifying a user you are making a public statment that you know this person and can vouch for their identity.

When did this happen?

I must disagree with this sudden pseudo-gpg keysigning level of certification, especially since this warning is now retroactively applied to people's previous certifications, by mere virtue of being tacked on the bottom of the list."

The new text appeared on Oct 1, 2006 when Advogato was migrated to the newer version of mod_virgule. The message is hard-coded in the module that creates the user profile page and was originally added, not for Advogato, but for robots.net some years before.

On robots.net, the users are not all programmers and many don't have previous experience with any sort of trust metrics. As a whole, the user base had begun to view the trust metric system as nothing more than a group-powered method of allowing other users to post on the site. As a result there was a huge amount of cert inflation (even compared to Advogato) with a large percentage of the user base tending toward Master certification. Many users were automatically certifying all new users as Masters, assuming this would allow them to post and therefore improve the community. In reality, it just increased the noise and spam level, of course.

I experimented with a variety of short messages under the cert dialog to impress upon people that by certifying someone, they bore some responsibility for the results. This particular message seemed to have the most dramatic effect and, over time, solved our problem.

I agree it's unnecessary for Advogato since most users here understand to one degree or another what the trust metric is for. I'll take a look at making this page more easily configurable on a site-by-site basis. That will allow us to use different text on Advogato or remove the message altogether.

With regard to the actual meaning, I didn't intend for "know this person" to mean only that you've met them in person, in meatspace. You might also know them in some other online capacity outside of Advogato. You might know them through email, IRC, another website, etc. In some cases, you might even get to know them by reading their blog on Advogato long enough to feel comfortable expressing some trust for them. I assume Raph meant something similar in his original cert instructions when he says to certify "free software developers you know". My understanding of the trust metric is that you're certifying to the community that you trust the subject really is who they claim to be (at least to the extent that they claim to be a member of the free software community).

