<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Advogato blog for markpasc</title>
    <link>http://www.advogato.org/person/markpasc/</link>
    <description>Advogato blog for markpasc</description>
    <language>en-us</language>
    <generator>mod_virgule</generator>
    <pubDate>Tue, 7 Oct 2008 06:57:47 GMT</pubDate>
    <item>
      <pubDate>Mon, 19 Aug 2002 19:42:32 GMT</pubDate>
      <title>19 Aug 2002</title>
      <link>http://www.advogato.org/person/markpasc/diary.html?start=14</link>
      <guid>http://www.advogato.org/person/markpasc/diary.html?start=14</guid>
      <description>I don't think I've mentioned it, but my current project is &lt;a href="http://markpasc.org/code/winget/winget.html" &gt;winget&lt;/a&gt;, a Windows port of GNU wget with an actual Windows interface. This is about the biggest thing I've done to date, being an actual software project, so I'm very pleased with even just how much I've done so far!</description>
    </item>
    <item>
      <pubDate>Sun, 16 Jun 2002 08:06:21 GMT</pubDate>
      <title>16 Jun 2002</title>
      <link>http://www.advogato.org/person/markpasc/diary.html?start=13</link>
      <guid>http://www.advogato.org/person/markpasc/diary.html?start=13</guid>
      <description>&lt;p&gt;&lt;a href="http://markpasc.org/code/kit/" &gt;Kit&lt;/a&gt; 1.1.6 is out. It adds a Radio to the Past form bit to the weblog post page, and incorporates a couple minor fixes I'm going to let the Kit page claim I released as 1.1.5.

&lt;p&gt; &lt;p&gt;I've not been spending a lot of time in Radio-land lately, and will have to carefully consider it, since I may be ditching Windows in the not too distant future. I've invested enough in Radio that I should probably keep using it, but &lt;a href="http://www.dnalounge.com/backstage/log/2002/05.html#27-may-2002" &gt;sunk time is a bad decision-making factor&lt;/a&gt;.</description>
    </item>
    <item>
      <pubDate>Fri, 24 May 2002 06:18:10 GMT</pubDate>
      <title>24 May 2002</title>
      <link>http://www.advogato.org/person/markpasc/diary.html?start=12</link>
      <guid>http://www.advogato.org/person/markpasc/diary.html?start=12</guid>
      <description>&lt;p&gt;The next version of Stapler is chock-full
(&lt;i&gt;chockful?&lt;/i&gt;) of HTTP headery goodness.

&lt;p&gt; &lt;p&gt;So find some more bugs so I can put it out.

&lt;p&gt; &lt;p&gt;The headers in question are If-Modified-Since and
User-Agent. Stapler identifies itself to the server as
&lt;tt&gt;Stapler/x.y.z&lt;/tt&gt;, and remembers the Last-Modified and
Date headers (actually, all of them) so it can parrot them
back for a 304 Not Modified as the spec suggests. Voila:

&lt;p&gt; &lt;blockquote cite="uri:thingy"&gt;&lt;p&gt;&lt;tt&gt;x.y.z.w - -
[24/May/2002:01:58:29 -0400] "GET / HTTP/1.0" 304 - "-"
"Stapler/2.0.1"&lt;/tt&gt;&lt;/blockquote&gt;

&lt;p&gt; &lt;p&gt;Next step would be to honor robots.txt files. Suppose I
should put a referrer in, too, hmm. Might also be nice to
say I'm using HTTP/1.1, but I'm not sure if I can.</description>
    </item>
    <item>
      <pubDate>Thu, 23 May 2002 05:20:11 GMT</pubDate>
      <title>23 May 2002</title>
      <link>http://www.advogato.org/person/markpasc/diary.html?start=11</link>
      <guid>http://www.advogato.org/person/markpasc/diary.html?start=11</guid>
      <description>&lt;b&gt;Radio to the Whatever&lt;/b&gt;

&lt;p&gt; &lt;p&gt;It's rather depressing to find such a showstopping bug in
Kit's Radio to the Past tool. I hadn't heard about it and
didn't realize it was there, so that means &lt;em&gt;no one
whatsoever&lt;/em&gt; used the thing and had the decency to drop me
a note about it. After all the noise in the groups about it
I figured &lt;em&gt;someone&lt;/em&gt; might at least &lt;em&gt;try&lt;/em&gt; the
thing... but not so.</description>
    </item>
    <item>
      <pubDate>Mon, 6 May 2002 07:57:59 GMT</pubDate>
      <title>6 May 2002</title>
      <link>http://www.advogato.org/person/markpasc/diary.html?start=10</link>
      <guid>http://www.advogato.org/person/markpasc/diary.html?start=10</guid>
      <description>&lt;p&gt;I've started planning for the next version of &lt;a href="http://markpasc.org/code/stapler/" &gt;Stapler&lt;/a&gt;, in which everything old is new again under a different name and in a different place. Meanwhile the version of Stapler on my desktop and the one on the website are different, so I release the former as a "bugfix" version, 1.7.4.

&lt;p&gt; &lt;p&gt;One big idea (as in "What's the big idea?") will cause most of the change and provide a convenient excuse for the rest: eliminating the source-feed dichotomy. Since this is quite a big change, the next version of Stapler will, at least for now, be numbered 2.0 (0 as in "oh, boy").

&lt;p&gt; &lt;p&gt;Most sources required a corresponding feed, which I obviously realized since I added a "Make feed for this source" button not too long ago. However, the entire difference is a holdover from Stapler's &lt;em&gt;original&lt;/em&gt; purpose being a feed of web comics, one of the &lt;em&gt;few&lt;/em&gt; cases where it's better to have multiple sources in one feed.

&lt;p&gt; &lt;p&gt;So out go sources vs feeds--but you'll still be able to do the same thing, of course. (I'm not giving up my web comics feed yet.) Stapler 2.0 will allow users to disable writing feeds to disk independently of toggling their actual updating, and will include an "aggregate" scraper that aggregates the items of other feeds--presumably ones with disk writing turned off--into one feed. Literally where you had a feed for one source because of Stapler's design, you'll have one feed, and where you aggregated four sources into one feed for some value of &lt;dfn&gt;four&lt;/dfn&gt;, you'll have 4+1 feeds, only one of which has disk-writing enabled.

&lt;p&gt; &lt;p&gt;So maybe it's not such a hot idea, having a sourcefeed that can be sourcelike or feedlike or both; but it seems like a good idea at the moment.

&lt;p&gt; &lt;p&gt;In addition to that change, some things are changing names to make for (I hope) clearer nomenclature. Instead of the antiquated and scary &lt;dfn&gt;scrapers&lt;/dfn&gt;, feeds will have &lt;dfn&gt;extractors&lt;/dfn&gt;. Instead of having &lt;dfn&gt;document types&lt;/dfn&gt;, feeds will have &lt;dfn&gt;formats&lt;/dfn&gt;. Those are the name changes I foresee now, but I'm sure one or two more will sneak in.

&lt;p&gt; &lt;p&gt;Oh, and the "ByNumbers" extractor becomes "By selector." Duh.

&lt;p&gt; &lt;p&gt;Ideally, of course, I would write a script that converts a 1.7.4 StaplerData table to a 2.0 one. In fact, that's how I refined the new data model, figuring out how to turn the old into the new. But I'd really rather not, since it's complicated, and anyone with custom scrapers or document types will have work to do anyway. (But then, I suppose that's actually very few people, so perhaps it is worthwhile.)

&lt;p&gt; &lt;p&gt;As is apparent, 2.0 is still very much in the planning stage, though it would be nice to have a copy to release 17 May, since that's the day I released version 1.0.1 last year. (I'm not sure when I released 1.0; I guess I could look it up in my blog archives, but I can't be arsed just now.) Just a heads up for y'all who actually care.</description>
    </item>
    <item>
      <pubDate>Fri, 3 May 2002 15:54:45 GMT</pubDate>
      <title>3 May 2002</title>
      <link>http://www.advogato.org/person/markpasc/diary.html?start=9</link>
      <guid>http://www.advogato.org/person/markpasc/diary.html?start=9</guid>
      <description>&lt;p&gt;Huh, so Kit is "popular" &lt;a
href="http://scriptingnews.userland.com/backissues/2002/05/03#l0b4f1453cbf69291b7944b3f426dd9f0"&gt;now&lt;/a&gt;:

&lt;p&gt; &lt;blockquote
cite="http://scriptingnews.userland.com/backissues/2002/05/03#l0b4f1453cbf69291b7944b3f426dd9f0"&gt;&lt;p&gt;&lt;a
href="http://radio.userland.com/discuss/msgReader$14239?y=2002&amp;m=5&amp;d=3"&gt;Mark
Paschal released&lt;/a&gt; Kit 1.0.1, a popular set of interfaces
and utilities for Radio 8.&lt;/blockquote&gt;

&lt;p&gt; &lt;p&gt;That's good to know.</description>
    </item>
    <item>
      <pubDate>Fri, 19 Apr 2002 02:55:20 GMT</pubDate>
      <title>19 Apr 2002</title>
      <link>http://www.advogato.org/person/markpasc/diary.html?start=8</link>
      <guid>http://www.advogato.org/person/markpasc/diary.html?start=8</guid>
      <description>&lt;p&gt;&lt;a href="http://markpasc.org/code/kit/" &gt;Kit 0.9.6.&lt;/a&gt; Two bugs fixed and &lt;a href="http://groups.yahoo.com/group/radio-dev/message/5881" &gt;a feature&lt;/a&gt;.</description>
    </item>
    <item>
      <pubDate>Wed, 17 Apr 2002 18:23:19 GMT</pubDate>
      <title>17 Apr 2002</title>
      <link>http://www.advogato.org/person/markpasc/diary.html?start=7</link>
      <guid>http://www.advogato.org/person/markpasc/diary.html?start=7</guid>
      <description>&lt;a
href="http://www.advogato.org/person/markpasc/diary.html?start=4"&gt;I
mentioned I&lt;/a&gt; was installing Debian 2.2. It was actually
pretty easy, because of some combination of the CD-ROM drive
actually working, more experience since I installed Linux
last, not having (read: bothering) to repartition the disk,
and Debian being awesome.

&lt;p&gt; I'm trying to install LiveJournal server next, but one of
the early steps is to make sure one's CPAN module is up to
date, and it caught me using the perl 5.005 that came with
Debian 2.2. Yadda yadda, now I'm trying to find a server
from which I can just apt-get it.

&lt;p&gt; I talked more about my Linux-reinstalling experience &lt;a
href="http://www.livejournal.com/talkread.bml?journal=markpasc&amp;itemid=29750"&gt;here&lt;/a&gt;,
&lt;a
href="http://www.livejournal.com/talkread.bml?journal=markpasc&amp;itemid=30108"&gt;here&lt;/a&gt;,
 &lt;a
href="http://www.livejournal.com/talkread.bml?journal=markpasc&amp;itemid=30398"&gt;here&lt;/a&gt;,
and &lt;a
href="http://markpasc.org/blog/2002/04/14.html#i32417PM"&gt;here&lt;/a&gt;.</description>
    </item>
    <item>
      <pubDate>Wed, 17 Apr 2002 17:51:39 GMT</pubDate>
      <title>17 Apr 2002</title>
      <link>http://www.advogato.org/person/markpasc/diary.html?start=6</link>
      <guid>http://www.advogato.org/person/markpasc/diary.html?start=6</guid>
      <description>&lt;b&gt;The Google API&lt;/b&gt;

&lt;p&gt; &lt;p&gt;Yeah, I've not mentioned &lt;a
href="http://www.google.com/apis/"&gt;it&lt;/a&gt; yet, but suddenly
several articles herd my thinking thataway.

&lt;p&gt; &lt;p&gt; &lt;p&gt;&lt;a
href="http://TheFlangyNews.editthispage.com/2002/04/16"&gt;Adam
Vandenberg discusses XHTML&lt;/a&gt; (look for &lt;cite&gt;See, there's
this Web, and it has these "standards"&lt;/cite&gt;). His argument
is eminently reasonable, though I've done my share of
rah-rahing for XHTML and whatnot.

&lt;p&gt; &lt;p&gt; &lt;p&gt;Here's the slightly-snuck assumption:

&lt;p&gt; &lt;p&gt; &lt;blockquote
cite="http://TheFlangyNews.editthispage.com/2002/04/16"&gt;&lt;p&gt;Well,
people are people and computers are computers, and Webpages
are primarily meant for communicating with other people and
&lt;em&gt;not&lt;/em&gt; communication with machines.&lt;/blockquote&gt;

&lt;p&gt; &lt;p&gt; &lt;p&gt;Web pages &lt;em&gt;have been&lt;/em&gt; for people to
communicate with people, but the whole point of XML, CSS,
and XHTML is that web documents &lt;em&gt;should&lt;/em&gt; be
communicable to machines. For example, if I only had to
specify particular paths along the DOMs of XHTML documents,
Stapler would be much simpler software (an alarm clock,
database, web fetcher, and the path walker). Also, machines
have to communicate this content to people; that's all well
and good if you have a standard way of doing that, such as
the visual web browser, but what if the human can't see? The
machine needs to be able to understand enough about the
content to convert it between different media--so that's how
the accessibility argument relates.

&lt;p&gt; &lt;p&gt; &lt;p&gt;It's a good argument and certainly nothing to ignore,
but the important part is:

&lt;p&gt; &lt;p&gt; &lt;blockquote
cite="http://TheFlangyNews.editthispage.com/2002/04/16"&gt;&lt;p&gt;The
web browser as a universal client is still a very powerful
idea. ... [N]on-HTML Internet APIs... are going to
complement web browsing, not replace it.&lt;/blockquote&gt;

&lt;p&gt; &lt;p&gt; &lt;p&gt;I certainly don't read all the web in RSS. Even if I
&lt;em&gt;could&lt;/em&gt; add everything in there, &lt;em&gt;would&lt;/em&gt; I?
Probably not, though I would read more there than most people.

&lt;p&gt; &lt;p&gt; &lt;p&gt;So, first off, HTML isn't going away any time soon.
Meanwhile, &lt;a
href="http://www.disenchanted.com/dis/technology/prefab.html"&gt;this
week's &lt;cite&gt;Disenchanted&lt;/cite&gt; article&lt;/a&gt; is specifically
on Google's SOAP API... by way of construction toys:

&lt;p&gt; &lt;p&gt; &lt;blockquote
cite="http://www.disenchanted.com/dis/technology/prefab.html"&gt;&lt;p&gt;Where
have all the young and amateur engineers gone? Apparently to
computers, where the philosophy of olde-time Lego, Meccano
and Heathkit is in super-overdrive.

&lt;p&gt; &lt;p&gt; &lt;p&gt;This philosophy is all about building personal
projects with easily understandable, easily connectable,
&lt;em&gt;pre-made&lt;/em&gt; parts, and the world of software is now
awash with hundreds of thousands of them.&lt;/blockquote&gt;

&lt;p&gt; &lt;p&gt; &lt;p&gt;The article is a comprehensive guide to where the
Google SOAP API came from, and while it doesn't explicitly say
that this only throws the doors open to the web services
world, that's what it does. Here I unveil my cynicism (or, perhaps,
optimism): specifically I agree with &lt;a
href="http://aaronland.info/weblog/archive/4231"&gt;Aaron
Straup Cope in that&lt;/a&gt; the Google API isn't
earth-shattering in and of itself. Gee, people can put
"top-ten Google hits for &lt;dfn&gt;foo&lt;/dfn&gt;" search
boxen on their Radio pages. Couldn't you do that before?

&lt;p&gt; &lt;p&gt; &lt;p&gt;Yeah, but it's &lt;em&gt;qualitatively easier&lt;/em&gt; now.
After all the moaning about how &lt;a
href="http://www.userland.com/"&gt;no one&lt;/a&gt; is deploying web
services, this throws the door wide open to them, full stop.
Now that Google's done it, will Dictionary.com do it?
Aaron's weblog &lt;a
href="http://aaronland.info/weblog/archive/4233"&gt;yields an
example&lt;/a&gt; of the utility of such a service, even though
you could do that with a more complex API too.

&lt;p&gt; &lt;p&gt; &lt;p class="note"&gt;(Aside: probably not, since the revenue
model remains to be seen. Might they start selling product
placement in example usage text?)

&lt;p&gt; &lt;p&gt; &lt;p&gt;I'd like to think this is, as I said, optimism. Maybe
I'm a victim of the hype, but if this is only the beginning
of web services, there are going to be so many even
&lt;em&gt;more&lt;/em&gt; amazing services, and they're all in the
future, awaiting invention.
</description>
    </item>
    <item>
      <pubDate>Tue, 9 Apr 2002 12:17:17 GMT</pubDate>
      <title>9 Apr 2002</title>
      <link>http://www.advogato.org/person/markpasc/diary.html?start=5</link>
      <guid>http://www.advogato.org/person/markpasc/diary.html?start=5</guid>
      <description>&lt;p&gt;&lt;b&gt;Stapler grousing&lt;/b&gt;

&lt;p&gt; &lt;p&gt;Should I be &lt;em&gt;bitter&lt;/em&gt; that &lt;a
href="http://radiotools.evectors.it/itstories/story$num=16&amp;sec=3&amp;data=stories"&gt;RssDistiller&lt;/a&gt;
gets more noise than &lt;a
href="http://markpasc.org/code/stapler/"&gt;Stapler&lt;/a&gt;? (I am,
at times.) Should I have chosen a better name, one that more
obviously screams "I TURN STUFF INTO RSS!"? Is it a design
and documentation issue? Is it because Stapler isn't
&lt;em&gt;pretty&lt;/em&gt; like RssDistiller, with its tabbed interface
and eVectors' bumblebee colors?

&lt;p&gt; &lt;p&gt;Am I wrong that it's difficult to specify what one wants
out of a page (i.e., how hard is that in RssDistiller)? Was I
wrong to have a feeds concept that aggregates sources? Should
I reimplement feeds as a special source scraper?

&lt;p&gt; &lt;p&gt;Obviously I should figure out how to share sources, since
RssDistiller does that. I should probably make Stapler not
autonumber new feeds and sources. I haven't worked on the
refined interface yet (and if I make a radical change to the
feeds thing, I shouldn't, yet).</description>
    </item>
  </channel>
</rss>
