Older blog entries for titus (starting at number 294)

SciPy '07: General Report

Last week, SciPy 2007 came to Caltech. Unfortunately, I wasn't able to attend many of the talks because I was busy with lab work and other deadlines, and because SciPy was held immediately upstairs from my lab I could just duck out to go back to work. However, I did attend a few of the talks and found all of them interesting -- and I heard that I seriously missed out on one or two. Hopefully other people will blog about those.

The talks I am qualified to comment on are the two other biology-based ones that I could attend. One was on pygr, given by Chris Lee, and the other was on Galaxy, given by James Taylor. Together with my talk on Cartwheel, I think they illustrated three entertainingly diverse approaches to bioinformatics and biology. Chris focused on addressing query paradigms (pygr implements a graph database) as well as data size issues in alignment and annotation of large sequences. James talked a lot about building workflow interfaces to packages, thereby presenting a possible solution to many of the current problems in bioinformatics such as formatting, reproducibility, and software interaction. And I talked about how Cartwheel provided biologists with one specific way to answer one specific kind of problem, and did so in about as simple a manner as you can imagine. I already have plans for cross-fertilization...

Hopefully we can attract other bio talks to SciPy next year!

I also gave a rather low-key tutorial on "idiomatic Python", which went OK but was, well, low-key. I attended two birds-of-a-feather, one on testing (organized by fperez) and one on biology (organized by me). I'll write more about those later.

The BoFs and the after-talk interactions really convinced me that SciPy is a very useful conference for meeting other people who have problems similar to yours (perhaps in slightly different fields). While I feel the talks are widely scattered by topic, ultimately a lot of the issues -- data handling, interfaces, visualization, parallelization -- are shared by everyone, and because we all use Python we can actually use each other's technology! Very cool stuff. My only suggestion for next year is to have more mixers and shorter presentations, so that we can get more of a sense of what's going on out there.


Syndicated 2007-08-20 07:56:56 from Titus Brown

Commenting on posts

I wanted to write a comment on timing unittests, but that blog does not allow anonymous comments and there is no obvious place to e-mail the author.


(The short version of my comment is that getting the basic data out with something like nose is trivial; see my pinocchio extensions.)
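To make "getting the basic data out" concrete, here is a minimal sketch that uses only the standard library's unittest rather than nose or pinocchio. The names `TimingResult`, `ExampleTests`, and `time_tests` are hypothetical and exist only for illustration; the idea is simply to record a timestamp when each test starts and stops.

```python
import time
import unittest


class TimingResult(unittest.TestResult):
    """A TestResult that also records how long each test took (hypothetical helper)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.timings = {}   # test id -> elapsed seconds
        self._started = {}  # test id -> start timestamp

    def startTest(self, test):
        self._started[test.id()] = time.perf_counter()
        super().startTest(test)

    def stopTest(self, test):
        super().stopTest(test)
        elapsed = time.perf_counter() - self._started.pop(test.id())
        self.timings[test.id()] = elapsed


class ExampleTests(unittest.TestCase):
    """Two throwaway tests, one deliberately slow."""

    def test_fast(self):
        self.assertEqual(1 + 1, 2)

    def test_slow(self):
        time.sleep(0.05)
        self.assertTrue(True)


def time_tests(case_class):
    """Run every test in case_class and return the TimingResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case_class)
    result = TimingResult()
    suite.run(result)
    return result


result = time_tests(ExampleTests)
# Print the slowest tests first.
for test_id, seconds in sorted(result.timings.items(), key=lambda kv: -kv[1]):
    print("%-50s %.3fs" % (test_id, seconds))
```

A test runner plugin can collect the same data through its own start/stop hooks; the sketch above only shows that the per-test numbers are cheap to get.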


Syndicated 2007-08-10 22:03:06 from Titus Brown

Seen on a Ruby On Rails IRC channel

There was another mildly amusing incident during the recent SoCal Piggies meeting.

Michael Carter was showing us his incredibly neat Web 3.0 / HTTP PUSH software, orbited, by demoing an interactive IRC client on the Web. He signed onto the ruby-on-rails IRC channel and (this being a Python meeting) asked "Is Python better than Ruby?"

The response, interestingly enough, was (and I paraphrase): "No. Ruby is more modern, and because it doesn't use whitespace, you don't have to use templates."

Well, that didn't make any sense to me, so I asked Michael to follow up with "what are templates?"

Next response? "In Python/Django, you need to use templates because Python isn't good for this stuff."



p.s. Should I hedge? Sure. "No, I don't think every Ruby person is so silly as to conflate 'Ruby' with 'Ruby on Rails' or 'Python' with 'Django'."

Syndicated 2007-07-28 17:03:07 from Titus Brown

A "Biology in Python" mailing list

To get people talking, I've created a "biology-in-python" mailing list. You can subscribe here: http://lists.idyll.org/listinfo/biology-in-python, and you can post to it at bip@lists.idyll.org once you're a member.

This list is a tool/package/library-agnostic list, for people who use Python to work in the biological sciences. Join us!


Syndicated 2007-07-19 03:03:08 from Titus Brown

A BoF session on biological sequence analysis at SciPy?

Chris Lee and I would like to set up a Birds-of-a-Feather gathering at SciPy '07. We'll probably have an initial meeting on Thursday, August 16th, and then maybe work into sprint mode for that Saturday.

Contact me if you're interested. No reservations needed, but we should probably all plan to meet at the same time ;).


Syndicated 2007-07-06 14:03:03 from Titus Brown

Miscellaneous other SciPy stuff

While I'm thinking about SciPy '07, here are a few other notes:

  • Chris Lasher and I are thinking about doing a Software Carpentry sprint of some kind. Interested?
  • I'd be up for doing a half-day tutorial on "Testing for Scientists" or "Idiomatic Python". Interested?


Syndicated 2007-07-06 14:03:03 from Titus Brown

MSU Lab Website: Up

I've just put up a simple lab Web site for my future lab at Michigan State U.; I'm calling it the Lab of Genomics, Evolution, and Development.


Syndicated 2007-07-03 03:03:08 from Titus Brown

Reject Software Engineering?

Eric Wise asks, and I mostly agree.


Syndicated 2007-06-28 15:03:05 from Titus Brown

Python Global Interpreter Lock approach: Validated?

It's nice to see Python come out on top for threading.


Syndicated 2007-06-28 14:03:03 from Titus Brown

LLNL Course: Done!

On Tuesday (June 12), Wednesday, and Thursday I taught the course "Intermediate and Advanced Software Carpentry in Python" at Lawrence Livermore National Labs. This was intended to be an extension of some of the ideas from the Software Carpentry course.

The pre-course advert, the handouts distributed at the course (day 1, day 2, and day 3), the corrected and revised handouts, and associated source code are all available for your perusal and use.

As you can see from the outline and handouts, I took a rather fast-paced and shallow approach to teaching this class. This was by direction and intent; because this was an intermediate class, I assumed basic familiarity with most things Python, and gave a tour of various features rather than an in-depth tutorial. It's by no means a complete exploration of the topics, and it's not intended to be! In particular, I felt free to offer up really stupid examples rather than justifying each Python feature, because I assumed that people would fit features to their needs. (This is why there are a lot of stupid examples!)

A few thoughts:

There are a lot of people that use Python at LLNL! Alas, very few of them are biologists, who mostly use Perl.

After a slightly rough first day, I switched to introducing and demoing features rather than requesting some hands-on work. Apparently it is difficult for most people to switch from listening to coding on short notice! My (badly planned) exercises were kinda bogging the class down, and someone pointed out that these people were going to take what I taught and use it immediately anyway, and that I should just point them in the right direction rather than making them do exercises.

As I saw with my PyCon'07 talk, demoing coding in front of people conveys more of a sense of the process and content than having prepackaged demos.

Teaching smart, motivated people is a joy. I don't know how this will compare to teaching undergrads at MSU, but my guess is that motivation will generally be lower!

Talking all day, every day, for three days, is bloody exhausting. Really, really, really, really exhausting.

I generated over 75 pages of text (no images, and not counting the code that I wrote but did not insert into the text). Doctests rock. So does reStructuredText. Combining them is synergistic.

Doctests in tutorials keep me honest.
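For readers who have not seen a doctest, a minimal sketch follows. The function `gc_content` and the helper `run_docstring` are hypothetical and are not taken from the course handouts; the point is only that the interactive examples in the docstring are both the documentation and the test.

```python
import doctest


def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence.

    >>> gc_content("GGCC")
    1.0
    >>> gc_content("ATGC")
    0.5
    >>> gc_content("")
    0.0
    """
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def run_docstring(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


failed, attempted = run_docstring(gc_content)
print("%d of %d examples failed" % (failed, attempted))
```

The same `>>>` examples can live directly in a reStructuredText handout, where `doctest.testfile("handout.txt")` will run them (the filename here is an assumption). If the prose and the code drift apart, the examples fail, which is what keeps a tutorial honest.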

subprocess rocks. I would generalize and say that os.system is one of the big failure points in people's use of Python, and subprocess can solve the simple problems really easily. Unfortunately the current subprocess module documentation sucks. I have been inspired.
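A minimal sketch of the simple case, assuming the goal is to run a command and capture its output. It uses `subprocess.run`, which was added to the module well after this post was written, and it launches the current Python interpreter so the example does not depend on any particular shell utility. The helper name `run_and_capture` is hypothetical.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command given as an argument list and return its stdout as text.

    check=True turns a non-zero exit status into a CalledProcessError
    instead of letting the failure pass silently.
    """
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout


output = run_and_capture(
    [sys.executable, "-c", "print('hello from a child process')"]
)
print(output.strip())
```

Two differences from `os.system` matter here: the arguments are passed as a list, so nothing is re-parsed by a shell and quoting problems go away, and the caller gets the output and a real error on failure rather than only an exit status.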

My discussion of multiprocessing was, I think, a hit. It was easy to make fun of Python's default "you don't need threading that much!" in front of a crowd of people that work on massively parallel CPU intensive applications! I explained how and why the Python approach was actually pretty good, and went through actually converting one of my library functions into a threadsafe & threadaware Python function.
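The library function converted in class is not shown in this post, so the following is a hypothetical stand-in: a small sketch of what making shared state thread-safe usually involves. The names `Tally` and `count_in_threads` are assumptions for illustration. The global interpreter lock does not make a compound read-modify-write update atomic, so the update is guarded by an explicit lock.

```python
import threading


class Tally:
    """A per-key counter whose updates are guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def add(self, key):
        # Read, increment, and store happen as one unit under the lock.
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1

    def get(self, key):
        with self._lock:
            return self._counts.get(key, 0)


def count_in_threads(tally, key, n_threads=4, n_updates=1000):
    """Increment tally[key] from several threads and return the final count."""

    def worker():
        for _ in range(n_updates):
            tally.add(key)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return tally.get(key)


total = count_in_threads(Tally(), "ACGT")
print(total)
```

With the lock in place the final count equals the number of threads times the number of updates per thread; without it, updates from different threads can overwrite each other.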

parallelpython is neat although I find that the example code chafes my sense of aesthetics.

pyrex rocks. ctypes rocks. SWIG is a tad obsolete and buggy (at least for Python) when you have SIP and Boost!

I was introduced briefly to Babel, a cross-language interoperability system built at LLNL.

pyMPI is also built at LLNL, and they use it fairly extensively.

I didn't get to talk about a few things in my original outline, either because I forgot or because I couldn't fit them in. The biggest omission was the planned discussion of data presentation via the Web and data storage via databases. I think this combination is underutilized in scientific circles.

I tried to push testing, testing, testing. I don't know how successful I was.


Syndicated 2007-06-24 16:03:04 from Titus Brown
