4 Sep 2012 ralsina   » (Master)

Improved Wordpress.com Importer and a Question

Thanks to the cooperation of Humitos, who gave me his WordPress backup, I made some improvements to the wordpress.com import feature of Nikola, my static website/blog generator.

So, if you were to try to use nikola_wordpress_importer from master now, it would:

  1. Not crash ;-)
  2. Download attachments
  3. Fix links to attachments so they work on the new site
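
To make the link-fixing step concrete, here is a minimal sketch of the idea. It is not Nikola's actual code: the function names and the `/wp-content/uploads` target prefix are made up for illustration, and it assumes the importer already has the list of attachment URLs it downloaded.

```python
import posixpath
from urllib.parse import urlparse


def local_path_for(url, prefix="/wp-content/uploads"):
    """Map a remote attachment URL to a site-relative path.

    The prefix is a hypothetical layout, not necessarily what Nikola uses.
    """
    filename = posixpath.basename(urlparse(url).path)
    return posixpath.join(prefix, filename)


def fix_attachment_links(content, attachment_urls):
    """Replace every known attachment URL in a post body with its local path."""
    # Longest URLs first, so a URL that is a prefix of another one
    # does not clobber the longer match.
    for url in sorted(attachment_urls, key=len, reverse=True):
        content = content.replace(url, local_path_for(url))
    return content
```

A plain string replacement like this only works when the post body contains the attachment URL exactly as it appears in the export; resized-image variants would need extra handling.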

However, I am now unsure of what exactly is in wordpress.com's export XML file. The posts themselves are in this form:

Muchas gracias Nico por hacer el video este. Groso, quedó muy bueno.


Two things jump out at me:

  1. That's not HTML
  2. WTF is that youtube thing?

I am having some success processing it as markdown, since that handles the paragraph breaks and some other stuff. Maybe the youtube embedding is done with a markdown extension?

Does anyone know?
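
For comparison, the paragraph-break part can also be handled without markdown. This is only a sketch under the assumption that blank lines in the export separate paragraphs and single newlines are line breaks; `paragraphs_to_html` is a made-up helper, not part of Nikola, and it does nothing about the youtube thing.

```python
import re


def paragraphs_to_html(text):
    """Wrap blank-line-separated blocks in <p> tags.

    Inline HTML inside the text (links, images) is left untouched,
    so the input is deliberately not escaped.
    """
    blocks = re.split(r"\n\s*\n", text.strip())
    return "\n".join(
        "<p>{}</p>".format(block.replace("\n", "<br />\n"))
        for block in blocks
        if block.strip()
    )
```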

Syndicated 2012-09-03 21:47:01 from Lateral Opinion
