Recent blog entries for Skud

Read this interview with me about leading AdaCamp Berlin and Bangalore

As I mentioned earlier today, I’m off to Europe shortly for AdaCamp Berlin, then in November I’m going to India for AdaCamp Bangalore. I’ll be leading both events, which means I get to welcome everyone and set the stage for the unconference, make sure the sessions and workshops run smoothly, and that the culture of AdaCamp meets its usual high standards.

The Ada Initiative just posted this announcement and interview where I talk a bit about my experience with AdaCamp, running various community events, and what I’ll bring to these ones.

Syndicated 2014-10-02 11:52:33 from Infotropism

Travels: London and Berlin (Oct 7th-20th, ish)

I haven’t mentioned this on here yet so I thought I’d better do so before I actually, you know, board the plane.

I’m heading over to Europe next week and the week after. The main reason I’m going is AdaCamp in Berlin, which I will be helping run, but before and after that I’ll also be spending some time in the UK. While I’m there I’ll be running this Growstuff event in London on Oct 18-19, to get stuck into some serious code with some of our UK-based developers.

If you are in the UK and are interested in food innovation, open data, technology for social good, sustainability, inclusive open source projects, or related fields, I would love to meet you! If you can’t make it to the Growstuff code sprint but would like to catch up for a coffee or something, drop me a line.

Syndicated 2014-10-02 04:29:13 from Infotropism

Why I just stopped using IM (hint: fucking Google)

tl;dr – if we usually talk on IM/GTalk you won’t see me around any more. Use IRC, email, or other mechanisms (listed at bottom of this post) to contact me.


Background: Google stopped supporting open standards for IM a few years ago.

Other background: when I changed my name in 2011 I grabbed a GMail account with that name, just in case it would be useful. I didn’t use it, though — instead I forwarded any mail from it to my actual email address, the one I’ve had since the turn of the century: skud@infotrope.net, and set that address as my default for everything I could find.

Unfortunately Google didn’t honour those preferences, and kept exposing my unused GMail address to people. When I signed up for Google Groups, it would be exposed. When I shared Google Docs, it would be exposed. I presume it was being exposed all kinds of other ways, too, because people kept seeing my GMail address and thinking it was the right way to contact me. So in addition to the forwarding I also set up a vacation reminder telling anyone who emailed me there to use my actual address and not to use the Google one.

But Google wasn’t done yet. They kept dropping stuff into my GMail account and not forwarding it. Comments on Google docs. Invitations. Administrative notices. IM logs that I most definitely did not want archived. These were all piling up silently in an account I never logged into.

Eventually, after I missed out on several messages from a volunteer offering to help with Growstuff, I got fed up and found out how to completely delete a GMail account. I did this a few weeks ago.

Fast forward to last night, when my Internet connection flaked out right before I went to bed. I looked at all my disconnected, blank windows, shrugged, and crashed for the night. This morning, everything was better and all my apps set about reconnecting.

Except that Adium, the app I use for instant messaging, was asking me for the GTalk password for skud@infotrope.net. Weird, I thought, but I had the password saved in my keychain and resubmitted it. Adium, or more properly GTalk, didn’t like it. I tried a few more times, including resetting my app password (I use two-factor auth). No luck.

Eventually I found the problem. Via this Adium bug report I learned that a GMail account is required to use GTalk. Even if you don’t use (and have never used) your GMail address to login to it, and don’t give people a GMail address to add you as a contact.

So, my choices at this point are:

  1. Sign up again for GMail, continue to have an unused and unwanted email address exposed to the public, miss important messages, and risk security/privacy problems with archiving of stuff I don’t want archived; or,
  2. Set up Jabber/XMPP, which will take a fair amount of messing around (advice NOT wanted, I know what is involved), and which will only let me talk to friends who don’t use GMail/GTalk (a small minority); or,
  3. Not be available on IM.

For now I am going with option 3. If you are used to talking to me via IM at my skud@infotrope.net address, you can now contact me as follows.

IRC: I am Skud on irc.freenode.net and on some other specialist networks. On Freenode I habitually hang around on #growstuff and intermittently on other channels. Message me any time; if I’m not awake/online I’ll see it when I return.

Email: skud@infotrope.net as ever, or skud@growstuff.org for Growstuff and related work.

Social media: I’m on social media hiatus and won’t be using it to chat at length, but still check mentions/messages semi-regularly.

Text/SMS: If you have my number, you know where to find me.

Voice/video (including phone, Skype, etc): By arrangement. Email me if you want to set something up.

To my good friends who I used to chat to all the time and now won’t see around so much: please let me know if you use Jabber/XMPP and if so what your address is; if you do, then I’ll prioritise getting that set up.

Syndicated 2014-09-30 23:57:30 from Infotropism

Open food interoperability: entities, unique IDs, and semantic equivalence

This is a post I made on Growstuff Talk to propose some initial steps towards interoperability for open food projects. If you have comments, probably best to make them on that post.


I wanted to post about some concepts from my past open data work which have been very much in my mind when working on Growstuff, but which I’m not sure I’ve ever expressed in a way that helps everyone understand their importance.

Just for background: from 2007-2011 I worked on Freebase, a massive general-purpose open data repository which was acquired by Google in 2010 and now forms part of their “Knowledge” area. While working at Google I also worked as a liaison between Google search/knowledge and the Wikimedia Foundation, and presented at a Wikimedia data summit where we proposed the first stages of what would become Wikidata — an entity-based data store for all of Wikimedia’s other projects.

Freebase and Wikidata are part of what is broadly known as the Semantic Web, which has to do with providing data and meaning via web technologies, using common data formats etc.


The Semantic Web movement has several different branches, ranging from the extremely abstract and academic, to the quite mundane and pragmatic. Some of the more common bits of Semantic Web technology you might have come across are microformats, for instance, which let you add semantic meaning to your HTML markup, for instance for defining the meanings of links to things like licenses or for marking up recipes on food blogs and the like. There is also Semantic Mediawiki which adds some semantic features on top of a wiki, to allow you to query for information in interesting ways; Practical Plants uses SMW and its search is based on this semantic data.

At the more academic end of the Semantic Web world are things like RDF which creates a directed graph of semantic data which can be queried via a language called SPARQL, and attempts to define data standards and ontologies for a wide range of purposes. These are generally heavyweight and mostly of interest to researchers, academics, etc, though some aspects of this work are starting to seep through into consumer technology.

This is all background, however. What I wanted to talk about was the single most important thing we learned while working on Freebase, which is this:

Entities must have unique identifiers.

Here’s what I mean. Let’s say you know three people all called Mary Smith. Then someone says, “It’s Mary Smith’s birthday today.” Which one are they referring to? You don’t know. In any system based around knowledge, you need to have some kind of unique ID for each entity to avoid ambiguity. So instead you might say, “Mary Smith, whose employee number is E453425” or “Mary Smith, whose email address is mary@example.com”, or “Mary Smith, whose primary key in our database is 789”.
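To make the Mary Smith problem concrete, here is a minimal Python sketch. The `Person` class, the two extra employee numbers, and all the birthdays are made up for illustration; only E453425 comes from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    employee_id: str  # the unique identifier (hypothetical values)
    name: str         # not unique: several people can share it
    birthday: str     # "MM-DD", made-up values


people = [
    Person("E453425", "Mary Smith", "10-02"),
    Person("E100001", "Mary Smith", "03-15"),
    Person("E100002", "Mary Smith", "07-30"),
]


def find_by_name(name):
    """Return every person with this name; there may be more than one."""
    return [p for p in people if p.name == name]


# An index keyed on the unique ID gives exactly one answer per key.
by_id = {p.employee_id: p for p in people}

print(len(find_by_name("Mary Smith")))  # ambiguous: three candidates
print(by_id["E453425"].birthday)        # unambiguous: one person
```

Looking up by name returns a list you then have to disambiguate somehow; looking up by ID returns one entity or nothing.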

When working on our proposal for phase 1 of Wikidata, one of the things we realised is that the Wikimedia community — all the languages of Wikipedia, the Wikimedia Commons, etc — lacked unique identifiers for real-world entities. For instance, Barack Obama was http://en.wikipedia.org/wiki/Barack_Obama on English Wikipedia and http://de.wikipedia.org/wiki/Barack_Obama on German Wikipedia and http://commons.wikimedia.org/wiki/Barack_Obama on Wikimedia Commons and http://en.wikinews.org/wiki/Category:Barack_Obama on Wikinews, but none of these was his definitive identifier.

Meanwhile, interwiki links, the links between English and German and French and Swahili and Korean wikipedias, were maintained by hand (or, actually, by a bot), which had to update every wikipedia whenever a page was added or changed on any of them. This was a combinatoric exercise: with 2 wikis, there are two links (A -> B and B -> A). With 5 wikis there are (4 + 3 + 2 + 1) * 2 = 20 links. With N wikis, there are N * (N-1) links, or to put it another way, 50 wikis would mean 2,450 links between them for each topic. This was wildly inefficient to maintain!

Wikidata’s “phase 1” was to create an entity store for Wikimedia projects, where each concept or entity (“Barack Obama” or “semantic web” or “tomato”) would have a central identity which could be linked to. Then, each Wikimedia project could say “This page describes entity XYZ”, or conversely Wikidata could say “this entity is described on these pages”, and suddenly the work of the interwiki bot became much easier: it meant that each new wiki added would only mean one new link, not a quadratically-growing web of links.
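Here is a rough sketch of the arithmetic, for one topic that has a page on every wiki. Counting one link in each direction between every pair of wikis gives n * (n - 1) links; with a central entity store, each wiki needs only its own single link to the shared entity. The function names are mine, not anything from Wikidata.

```python
def pairwise_links(n):
    """Directed links needed when every wiki links to every other wiki."""
    return n * (n - 1)


def hub_links(n):
    """Links needed when each wiki points only at one central entity."""
    return n


for n in (2, 5, 50):
    print(f"{n} wikis: {pairwise_links(n)} pairwise links, "
          f"{hub_links(n)} links via a central store")
```

Adding one more wiki to a set of 50 means 100 new pairwise links (50 outgoing, 50 incoming), versus one new link to the central store.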

We are in a similar position with open food data at present. There are dozens of open source food projects and that list doesn’t even touch on the ones that are more connected to recipes/eating/nutrition. We’re talking about how to interoperate between our various projects, but the key to interoperability is entity identification. If someone wants to mash up Growstuff’s harvest data with Openrecipes recipe search or the US FDA’s nutrition data, they need to know that Growstuff’s tomato is the same as the tomato you use in spaghetti sauce or the tomato that contains some percent of your RDA of potassium.

So how do we do this? None of our projects are sufficiently established, mature, or complete to claim the right to be the central ID repository. Apart from that, many of us have different focuses — edible plants, all types of plants, all types of living things, and all types of food (including non-animal/non-plant food) are some of the scopes I can mention offhand. Even the wide-ranging species databases like the Encyclopedia of Life don’t capture such information as crop varieties (eg. roma tomato, habanero pepper) that are important to veggie gardeners like Growstuff’s members.

Here’s what I would propose as an interim measure.

All open food projects need to link their major entities (eg. “crops” in Growstuff’s case) to one or more large, open, API-accessible data stores.

Examples of these include:

  • Wikipedia (any language, but English has the most articles)
  • Wikidata
  • Freebase
  • Encyclopedia of Life

By doing this, we can match data between projects. For instance, if Growstuff’s “tomato” links to the same entity as OpenFarm’s “tomato” and OpenFoodNetwork’s “tomato” and OpenRecipes’ “tomato” then we can reasonably assume they’re all talking about the same thing.
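As a sketch of what that matching might look like in practice, assume each project exposes its records with a field holding the English Wikipedia page title. The records, field names (`wikipedia_en`, `name`, `title`), and the second project's data are all hypothetical.

```python
# Hypothetical records from two projects. They name things differently,
# but both carry a link to the same external entity.
growstuff_crops = [
    {"name": "tomato", "wikipedia_en": "Tomato"},
    {"name": "capsicum", "wikipedia_en": "Bell_pepper"},
]
other_project_crops = [
    {"title": "Tomatoes", "wikipedia_en": "Tomato"},
    {"title": "Maize", "wikipedia_en": "Maize"},
]


def match_on(key, left, right):
    """Pair up records from two lists that share the same external ID."""
    index = {rec[key]: rec for rec in right if rec.get(key)}
    return [(rec, index[rec[key]]) for rec in left if rec.get(key) in index]


matches = match_on("wikipedia_en", growstuff_crops, other_project_crops)
for ours, theirs in matches:
    print(ours["name"], "is the same entity as", theirs["title"])
```

Note that the match never compares "tomato" with "Tomatoes" directly; it relies entirely on the shared external link.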

Also, some of the above data sources provide APIs which allow us to pivot easily between data sets. For instance, Freebase’s query language allows you to ask questions like “given an entity that is identified as ‘tomato’ on English Wikipedia, what is its identity on the Encyclopedia of Life?”

To see this in action, paste the following query into Freebase’s interactive query editor:

    [{
      "a:key": [{
        "namespace": "/wikipedia/en",
        "value": "Tomato"
      }],
      "b:key": [{
        "namespace": "/biology/eol",
        "value": null
      }]
    }]

As you’ll see, the result is “392557” or to put it another way http://eol.org/pages/392557 — the EOL page on tomatoes.
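If you wanted to generate queries like this for every crop rather than typing them by hand, a small helper could build the same structure. This is only a sketch: it constructs the query JSON and the EOL page URL, and deliberately makes no network calls, since I'm not assuming anything here about how you submit the query. The function names are hypothetical.

```python
import json


def pivot_query(from_namespace, from_key, to_namespace):
    """Build a query asking: given this key in one namespace,
    what is the entity's key in another namespace?"""
    return [{
        "a:key": [{"namespace": from_namespace, "value": from_key}],
        # None serialises to JSON null, meaning "fill this in for me".
        "b:key": [{"namespace": to_namespace, "value": None}],
    }]


def eol_url(eol_id):
    """Turn an EOL page ID into the URL form used above."""
    return f"http://eol.org/pages/{eol_id}"


query = pivot_query("/wikipedia/en", "Tomato", "/biology/eol")
print(json.dumps(query, indent=2))
print(eol_url("392557"))
```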

From day 1, Growstuff has been tracking Wikipedia links for all our crops, to enable this sort of query against Freebase and so easily pivot to other data sets that Freebase knows about. If other projects take similar steps, this means that we are well on our way toward interoperability.

(As an aside, this is why we’re also having this other discussion about what to do about crop varieties that don’t have their own Wikipedia page, as this messes up the 1-to-1 relationship between Wikipedia entities and Growstuff entities. This may be something we just have to deal with, however, as no external data set will exactly match ours.)

Next steps

  1. I strongly encourage all open food projects to link their “crops” or similar entities to one or more major, open-licensed, API-accessible data sources (ideally ones which have their keys in Freebase).
  2. We should all expose these links via our APIs, data dumps, or whatever other mechanisms we use to make our open data available.
  3. Developers should be able to request data from our APIs based on these identifiers, either through query parameters or through REST API resources like eg. /crops/eol/392557.json
  4. We should use semantic markup/links to denote this entity equivalence on our webpages, eg. if Growstuff links to a Practical Plants page on the same crop, there should be a standard way to say “we consider these pages to refer to the same entity”. I’m not sure exactly what this is, yet, but if we do this it will benefit web crawlers, search engines, and other non-API consumers of our websites.
  5. We should look into developing a microformat for expressing crop information on a webpage, in collaboration with microformats.org. I expect, however, that it will be very hard to develop a workable ontology, since (for instance) some of our projects are interested in planting information and some aren’t, some are interested in sale and distribution and others aren’t, some are dealing with non-edible plants and others aren’t, etc. It may have to be as simple as “this is a crop and here are the names we have for it”.
  6. It would be great to put together some kind of visualisation like the linked open data cloud to show which open food projects are providing interoperable identities and how they connect to each other.
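To illustrate step 3 above, here is a minimal sketch of resolving a path like /crops/eol/392557.json to a crop record. It is not Growstuff's actual API code; the `CROPS` data, the `ids` field, and the `lookup` function are all assumptions made for the example.

```python
import re

# Hypothetical crop records, each carrying its external identifiers.
CROPS = [
    {"name": "tomato", "ids": {"eol": "392557", "wikipedia_en": "Tomato"}},
]

# Matches paths of the form /crops/<source>/<key>.json
PATH = re.compile(r"^/crops/(?P<source>[a-z_]+)/(?P<key>[^/]+)\.json$")


def lookup(path):
    """Return the crop whose external ID matches the path, or None."""
    m = PATH.match(path)
    if m is None:
        return None
    for crop in CROPS:
        if crop["ids"].get(m["source"]) == m["key"]:
            return crop
    return None


print(lookup("/crops/eol/392557.json"))
print(lookup("/crops/wikipedia_en/Tomato.json"))
```

Both paths resolve to the same record, which is the point: a developer who only knows the EOL ID or the Wikipedia title can still find our crop.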

I’d like to get buy-in from other open food data projects on at least the general idea of matching our “crop” entities (whatever we call them) against some of the big databases. Who’s in?

Syndicated 2014-09-30 02:11:13 from Infotropism

Two frogs in a bowl of cream

A story I got from someone who says she got it from an older Dutch woman. I wouldn’t mention the Dutch woman thing except that this story just seems so Dutch to me. Anyway.

Two frogs fell into a bowl of cream. They swam and swam trying to get out, round and around in the cream, for hours.

Eventually one frog gave up, stopped swimming, and drowned.

The other frog kept swimming, refusing to give up. Finally the frog’s activity, splashing around in the cream, turned it to butter. It became solid in the bowl, and the frog was able to climb out.

The moral, I’m told, is that sometimes if you just keep kicking things will magically solidify under you and you can step up out of the trouble and move on. Also, apparently I’m frog #2. Trust me when I say it’s exhausting.

Syndicated 2014-09-30 01:16:37 from Infotropism

Dinner, aka, too impatient to wait for a real loaf to rise

Testing Instagram/ifttt/wordpress/DW integration

The Pathway to Inclusion

Lately I’ve been working on how to make groups, events, and projects more inclusive. This goes beyond diversity — having a demographic mix of participants — and gets to the heart of how and why people get involved, or don’t get involved, with things.

As I see it, there are six steps everyone needs to pass through, to get from never having heard of a thing to being deeply involved in it.

pathway to inclusion - see below for transcript and more details

These six steps happen in chronological order, starting from someone who knows nothing about your thing.

Awareness

“I’ve heard of this thing.” Perhaps I’ve seen mention of it on social media, or heard a friend talking about it. This is the first step to becoming involved: I have to be aware of your thing to move on to the following stages.

Understanding

“I understand what this is about.” The next step is for me to understand what your thing is, and what it might be like for me to be involved. Here’s where you get to be descriptive. Anything from your thing’s name, to the information on the website, to the language and visuals you use in your promotional materials can help me understand.

Identification

“I can see myself doing this.” Once I understand what your thing is, I’ll make a decision about whether or not it’s for me. If you want to be inclusive, your job here is to make sure that I can imagine myself as part of your group/event/project, by showing how I could use or benefit from what it offers, or by showing me other people like me who are already involved.

Access

“I can physically, logistically, and financially do this.” Here we’re looking at where and when your thing occurs, how much it costs, how much advance notice is given, physical accessibility (for people with disabilities or other such needs), childcare, transportation, how I would actually sign up for the thing, and how all of these interact with my own needs, schedule, finances, and so on.

Belonging

“I feel like I fit in here.” Assuming I get to this stage and join your thing, will I feel like I belong and am part of it? This is distinct from “identification” because identification is about imagining the future, while belonging is about my experience of the present. Are the organisers and other participants welcoming? Is the space safe? Are activities and facilities designed to support all participants? Am I feeling comfortable and having a good time?

Ownership

“I care enough to take responsibility for this.” If I belong, and have been involved for a while, I may begin to take ownership or responsibility. For instance, I might volunteer my time or skills, serve on the leadership team, or offer to run an activity. People in ownership roles are well placed to make sure that others make it through the inclusion pathway, to belonging and ownership.


If you’re interested in participating in an inclusivity workshop or would like to hire me to help your group, project, or event be more inclusive, get in touch.

Syndicated 2014-08-12 00:42:32 from Infotropism

Grace Hopper prints now available

I’ve been making linocuts.

Meet Grace Hopper. She’s a complete badass.

Grace Hopper print by Alex Skud Bayley 2014


She was 37 years old and working as a mathematics professor when Pearl Harbour happened. She joined the Navy and was set to work on the first ever general-purpose electro-mechanical computer, the Harvard Mark I. She invented the compiler (used to translate computer programs written by humans into ones and zeroes that the computer can understand), created one of the most widely used programming languages of the 20th century, and was the first to use the term “bug” to describe computer errors, after a literal bug was caught in the relays of the machine she was working on.

After WW2 she left the Navy and worked for various tech companies, but kept serving in the Naval Reserve. As was usual, she retired from the Reserves at 60, but she was recalled to active duty by special executive order, and eventually rose to the rank of Rear Admiral. When she retired (again) she kept working as a consultant until the age of 85. She also did this great Letterman interview at the age of 80.

Don’t ever let anyone tell you women can’t computer, or that you’re too old to computer. Grace knows better.

Buy a print

I’m selling these prints as a fundraiser over on Indiegogo, in part to offset this Gittip bullshit and the costs associated with attending a bunch of tech/feminist conferences in the US just recently.

The basic print (black on white) is $40 including international shipping, and there are other options available. If you’d like one you’d better get in quick — there’s only 10 standard prints left (though the other options are still wide open).

Syndicated 2014-08-05 02:59:52 from Infotropism
