wlach is currently certified at Master level.

Name: William Lachance
Member since: 2001-07-29 07:53:10
Last Login: 2011-12-02 04:37:49


Notes:

28-year-old software developer in Halifax, Nova Scotia. Worked on a whole gamut of projects, mostly desktop and/or network related. You can reach me at wrlach@gmail.com


Recent blog entries by wlach


mozregression: New maintainer, issues tracked in bugzilla

Just wanted to give some quick updates on mozregression, your favorite regression-finding tool for Firefox:

  1. I moved all issue tracking for mozregression from github issues to bugzilla. Github unfortunately doesn’t really scale to handle notifications sensibly when you’re part of a large organization like Mozilla, which meant many problems were flying past me unseen. File your new bugs in bugzilla; they’re now much more likely to be acted upon.
  2. Sam Garrett has stepped up to be co-maintainer of the project with me. He’s been doing a great job whacking out a bunch of bugs and keeping things running reliably, and it was time to give him some recognition and power to keep things moving forward. :)
  3. On that note, I just released mozregression 0.17, which now shows the revision number when running a build (a request from the graphics team, bug 1007238) and handles respins of nightly builds correctly (bug 1000422). Both of these were fixed by Sam.

If you’re interested in contributing to Mozilla and are somewhat familiar with python, mozregression is a great place to start. The codebase is quite approachable and the impact will be high — as I’ve found out over the last few months, people all over the Mozilla organization (managers, developers, QA …) use it in the course of their work and it saves them a ton of time. A list of currently open bugs is here.

Syndicated 2014-05-08 22:31:50 from William Lachance's Log

PyCon 2014 impressions: ipython notebook is the future & more

This year’s PyCon US (Python Conference) was in my city of residence (Montréal) so I took the opportunity to go and see what was up in the world of the language I use the most at Mozilla. It was pretty great!

ipython

The highlight for me was learning about the possibilities of ipython notebooks, an absolutely fantastic interactive tool for debugging python in a live browser-based environment. I’d heard about it before, but it wasn’t immediately apparent how it would really improve things — it seemed to be just a less convenient interface to the python console that required me to futz around with my web browser. Watching a few presentations on the topic made me realize how wrong I was. It’s already changed the way I work with Eideticker data, for the better.

Using ipython to analyze some eideticker data

I think the basic premise is really quite simple: a better interface for typing in, experimenting with, and running python code. If you stop and think about it, the modern web interface supports a much richer vocabulary of interactive concepts than the console (or even text editors like emacs): there’s no reason we shouldn’t take advantage of it.

Here are the (IMO) killer features that make it worth using:

  • The ability to immediately re-execute a block of code after seeing an error and editing it (essentially merging the immediacy of the python console with the permanency / cut & pastability of an actual script)
  • Live-printing out graphs of numerical results using matplotlib. ZOMG this is so handy. Especially in conjunction with the live-editing outlined above, there’s no better tool for fine-tuning mathematical/statistical analysis (see the short sketch after this list).
  • The shareability of the results. Any ipython notebook can be saved and then published to a public website. Many presentations at PyCon 2014, in fact, were done entirely with ipython notebooks. So handy for answering questions like “how did you get that?”
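
To make the live-plotting point a bit more concrete, here’s a minimal sketch of the kind of thing you might type into a notebook cell; the data here is made up purely for illustration:

# Notebook-cell sketch: inline matplotlib plotting against throwaway data.
# The %matplotlib magic tells the notebook to render figures inline.
%matplotlib inline
import numpy
import matplotlib.pyplot as plt

samples = numpy.random.normal(loc=100, scale=15, size=500)  # fake "frame time" data
plt.hist(samples, bins=30)
plt.xlabel("frame time (ms)")
plt.ylabel("count")
plt.title("Distribution of (made-up) frame times")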

To learn more about how to use ipython notebooks for data analysis, I highly recommend Julia Evans’ talk Diving into Open Data with IPython Notebook & Pandas, which you can find on pyvideo.org.

Other Good Talks

I saw some other good talks at the conference; here are some of them:

  • All Your Ducks In A Row: Data Structures in the Standard Library and Beyond – A useful talk by Brandon Rhodes on the implementation of basic data structures in Python, and how to select the right ones for optimal performance. It turns out that lists aren’t the best thing to use for long sequences of numerical data (who knew? see the quick sketch after this list).
  • Fast Python, Slow Python – An interesting talk by Alex Gaynor about how to write decently-performing pure-python code in a single-threaded context. Lots of intelligent stuff about producing robust code that matches your intention and your data structures, and a caution against doing fancy things in the name of being “pythonic” or “general”.
  • Analyzing Rap Lyrics with Python – Another data analysis talk, this one about a subject I knew almost nothing about. The best part of it (for me anyway) was learning how the speaker (Julie Lavoie) narrowed the focus of her research to the exact aspects of the problem that would let her answer the question she was interested in (“Can we automatically find out which rap lyrics are the most sexist?”) as opposed to interesting but tangential problems (“how can I design the most general scraping library possible?”) that don’t answer the question. In my opinion, this ability to focus is one of the key things that separates successful projects from unsuccessful ones.
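
To illustrate that point about lists and long numerical sequences, here’s a rough sketch (not from the talk) comparing the approximate memory footprint of a list of floats with an array.array holding the same values; the exact numbers will vary by platform and Python version:

# Rough comparison: memory used by a list of boxed floats vs. a packed array.array.
# sys.getsizeof(list) only counts the container, so add the float objects separately.
import sys
from array import array

values = [float(i) for i in range(1000000)]
packed = array('d', values)

list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
array_bytes = sys.getsizeof(packed)

print("list of floats: %d bytes" % list_bytes)
print("array('d'):     %d bytes" % array_bytes)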

Syndicated 2014-04-22 21:36:39 from William Lachance's Log

Upcoming travels to Japan and Taiwan

Just a quick note that I’ll shortly be travelling from the frozen land of Montreal, Canada to Japan and Taiwan over the next week, with no particular agenda other than to explore and meet people. If any Mozillians are interested in meeting up for food or drink, and discussion of FirefoxOS performance, Eideticker, entropy or anything else… feel free to contact me at wrlach@gmail.com.

Exact itinerary:

  • Thu Mar 20 – Sat Mar 22: Tokyo, Japan
  • Sat Mar 22 – Tue Mar 25: Kyoto, Japan
  • Tue Mar 25 – Thu Mar 27: Tokyo, Japan
  • Thu Mar 27 – Sun Mar 30: Taipei, Taiwan

I will also be in Taipei the week of March 31st, though I expect most of my time to be occupied with discussions/activities inside the Taipei office about FirefoxOS performance matters (the Firefox performance team is having a work week there, and I’m tagging along to talk about / hack on Eideticker and other automation stuff).

Syndicated 2014-03-16 21:19:58 from William Lachance's Log

It’s all about the entropy

[ For more information on the Eideticker software I'm referring to, see this entry ]

So recently I’ve been exploring new and different methods of measuring things that we care about on FirefoxOS — like startup time or amount of checkerboarding. With Android, where we have a mostly clean signal, these measurements were pretty straightforward. Want to measure startup times? Just capture a video of Firefox starting, then compare the frames pixel by pixel to see how much they differ. When the pixels aren’t that different anymore, we’re “done”. Likewise, to measure checkerboarding we just calculated the areas of the screen where things were not completely drawn yet, frame-by-frame.
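
For what it’s worth, the core of that pixel-comparison idea can be sketched in a few lines. This is just an illustration under my own assumptions (greyscale frames as numpy arrays, made-up thresholds), not the actual Eideticker implementation:

# Sketch of pixel-difference-based "startup finished" detection.
# `frames` is assumed to be a sequence of greyscale frames as 2D numpy arrays.
import numpy

def first_stable_frame(frames, changed_pixel_threshold=8, stable_fraction=0.001):
    """Return the index of the first frame that barely differs from the next one."""
    for i in range(len(frames) - 1):
        diff = numpy.abs(frames[i + 1].astype('float') - frames[i].astype('float'))
        changed = numpy.mean(diff > changed_pixel_threshold)  # fraction of pixels that changed
        if changed < stable_fraction:
            return i
    return len(frames) - 1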

On FirefoxOS, where we’re using a camera to measure these things, it has not been so simple. I’ve already discussed this with respect to startup time in a previous post. One of the ideas I talk about there is “entropy” (or the amount of unique information in the frame). It turns out that this is a pretty deep concept, and is useful for even more things than I thought of at the time. Since this is probably a concept that people are going to be thinking/talking about for a while, it’s worth going into a little more detail about the math behind it.

The wikipedia article on information theoretic entropy is a pretty good introduction. You should read it. It all boils down to this formula:

[The Shannon entropy formula, from the wikipedia article: H(X) = −Σᵢ p(xᵢ) log₂ p(xᵢ)]

You can see this section of the wikipedia article (and the various articles that it links to) if you want to break down where that comes from, but the short answer is that given a set of random samples, the more distinct values there are, the higher the entropy will be. Look at it from a probabilistic point of view: suppose you take a random set of data and want to predict what future data will look like. If the data is highly random, it will be hard to predict what comes next; conversely, if it is more uniform, it is easy to predict what form it will take.

Another, possibly more accessible way of thinking about the entropy of a given set of data would be “how well would it compress?”. For example, a bitmap image with nothing but black in it could compress very well, as there’s essentially only one piece of unique information in it repeated many times — the black pixel. On the other hand, a bitmap image of completely randomly generated pixels would probably compress very badly, as almost every pixel represents several dimensions of unique information. For all the statistics terminology, that’s really all the above formula is trying to say.
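
To see the formula in action on those two extremes, here’s a toy calculation (separate from the Eideticker code below) comparing an all-black “image” with one full of random pixel values:

# Toy example: histogram-based entropy of a uniform image vs. a random-noise image.
import math
import numpy

def histogram_entropy(pixels, bins=256):
    histogram = numpy.histogram(pixels, bins=bins)[0]
    total = float(sum(histogram))
    probabilities = [h / total for h in histogram if h > 0]
    return -sum(p * math.log(p, 2) for p in probabilities)

black = numpy.zeros((100, 100))                   # every pixel identical
noise = numpy.random.randint(0, 256, (100, 100))  # every pixel random

print(histogram_entropy(black))  # 0.0: a single repeated value carries no information
print(histogram_entropy(noise))  # close to 8 bits: nearly incompressible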

So we have a model of entropy, now what? For Eideticker, the question is — how can we break the frame data we’re gathering down into a form that’s amenable to this kind of analysis? The approach I took (on the recommendation of this article) was to create a histogram with 256 bins (representing the number of distinct possibilities in a black & white capture) out of all the pixels in the frame, then run the formula over that. The exact function I wound up using looks like this:


# Imports needed by this excerpt (from the surrounding module):
import math

import numpy
from scipy import ndimage


def _get_frame_entropy((i, capture, sobelized)):
    # Note: Python 2 tuple-parameter syntax; the function takes a single
    # (index, capture, sobelized) tuple.
    frame = capture.get_frame(i, True).astype('float')
    if sobelized:
        # Smooth out single-pixel noise, then keep only the edges.
        frame = ndimage.median_filter(frame, 3)

        dx = ndimage.sobel(frame, 0)  # horizontal derivative
        dy = ndimage.sobel(frame, 1)  # vertical derivative
        frame = numpy.hypot(dx, dy)  # gradient magnitude
        frame *= 255.0 / numpy.max(frame)  # normalize to 0-255 (quick & dirty)

    # Histogram the pixel values into 256 bins, then apply the entropy formula.
    histogram = numpy.histogram(frame, bins=256)[0]
    histogram_length = sum(histogram)
    samples_probability = [float(h) / histogram_length for h in histogram]
    entropy = -sum([p * math.log(p, 2) for p in samples_probability if p != 0])

    return entropy

[Context]
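
The tuple-style argument suggests the function is meant to be mapped over a list of per-frame descriptions (for example with a process pool). Here’s a hedged usage sketch, where capture.num_frames is an assumed attribute name for illustration rather than the real Eideticker API:

# Hypothetical usage: compute an entropy value for every frame of a capture,
# with sobelization turned on. `capture.num_frames` is an assumption.
frame_args = [(i, capture, True) for i in range(capture.num_frames)]
entropies = [_get_frame_entropy(args) for args in frame_args]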

The “sobelized” bit allows us to optionally convolve the frame with a sobel filter before running the entropy calculation, which removes most of the data in the capture except for the edges. This is especially useful for FirefoxOS, where the signal has quite a bit of random noise from ambient lighting that artificially inflates the entropy values even in places where there is little actual “information”.

This type of transformation often reveals very interesting information about what’s going on in an eideticker test. For example, take this video of the user panning down in the contacts app:

[Video: panning down in the contacts app]

If you graph the entropies of the frames of the capture using the formula above, you get a graph like this:

contacts scrolling entropy graph
[Link to original]

The Y axis represents entropy, as calculated by the code above. There is no inherently “right” value for this — it all depends on the application you’re testing and what you expect to see displayed on the screen. In general though, higher values are better, as they indicate that more frames of the capture are “complete”.

The region at the beginning, where the entropy sits at about 5.0, represents the contacts app with a set of contacts fully displayed (at startup). The “flat” regions where the entropy is at roughly 4.25? Those are the areas where the app is “checkerboarding” (blanking out while waiting for the graphics or layout engine to draw contact information). Click through to the original and swipe over the graph to see what I mean.

It’s easy to see what a hypothetical ideal end state would be for this capture: a graph with a smooth entropy of about 5.0 (similar to the start state, where all contacts are fully drawn in). We can track our progress towards this goal (or our deviation from it) by watching the eideticker b2g dashboard and seeing whether the sum of the entropy values across all frames of the test increases or decreases over time. If we see it generally increase, that probably means we’re seeing less checkerboarding in the capture. If we see it decrease, that might mean we’re now seeing checkerboarding where we weren’t before.

It’s too early to say for sure, but over the past few days the trend has been positive:

entropy-levels-climbing
[Link to original]

(note that there were some problems in the way the tests were being run before, so results before the 12th should not be considered valid)

So that’s one concept, and at least two relevant metrics we can measure with it (startup time and checkerboarding). Are there any more? Almost certainly; let’s find them!

Syndicated 2014-03-14 23:52:58 from William Lachance's Log

Eideticker for FirefoxOS: Becoming more useful

[ For more information on the Eideticker software I'm referring to, see this entry ]

Time for a long overdue eideticker-for-firefoxos update. Last time we were here (almost 5 months ago! Man, time flies), I was discussing methodologies for measuring startup performance. Since then, Dave Hunt and I have been doing lots of work to make Eideticker more robust and useful. Notably, we now have a setup in London running a suite of Eideticker tests on the latest version of FirefoxOS on the Inari on a daily basis, reporting to http://eideticker.mozilla.org/b2g.

b2g-contacts-startup-dashboard

There were more than a few false starts with it, and some of the earlier data is not to be entirely trusted… but it now seems to be chugging along nicely, hopefully providing startup numbers that serve as a useful counterpoint to the datazilla startup numbers we’ve already been collecting for some time. There still seem to be some minor problems, but in general I am becoming more and more confident in it as time goes on.

One feature that I am particularly proud of is the detail view, which enables you to see frame-by-frame what’s going on. Click on any datapoint on the graph, then open up the view that gives an account of what eideticker is measuring. Hover over the graph and you can see what the video looks like at any point in the capture. This not only lets you know that something regressed, but how. For example, in the messages app, you can scan through this view to see exactly when the first message shows up, and what exact state the application is in when Eideticker says it’s “done loading”.

Capture Detail View
[link to original]

(apologies for the low quality of the video — should be fixed with this bug next week)

As it turns out, this view has also proven to be particularly useful when working with the new entropy measurements in Eideticker which I’ve been using to measure checkerboarding (redraw delay) on FirefoxOS. More on that next week.

Syndicated 2014-03-09 20:31:18 from William Lachance's Log


 

wlach certified others as follows:

  • wlach certified hub as Master
  • wlach certified cinamod as Master
  • wlach certified msevior as Master
  • wlach certified Uraeus as Journeyer
  • wlach certified ariya as Journeyer
  • wlach certified voltron as Journeyer
  • wlach certified Bram as Master
  • wlach certified tromey as Master
  • wlach certified AlanHorkan as Apprentice
  • wlach certified DaveMalcolm as Journeyer
  • wlach certified caolan as Master
  • wlach certified apenwarr as Master
  • wlach certified pphaneuf as Journeyer
  • wlach certified ppatters as Journeyer
  • wlach certified dcoombs as Journeyer
  • wlach certified pcolijn as Journeyer
  • wlach certified uwog as Journeyer
  • wlach certified glasseyes as Journeyer
  • wlach certified CharlesGoodwin as Apprentice
  • wlach certified sfllaw as Journeyer
  • wlach certified andrewmp as Journeyer
  • wlach certified dag as Journeyer
  • wlach certified saul as Apprentice
  • wlach certified JoeNotCharles as Journeyer
  • wlach certified pvanhoof as Journeyer
  • wlach certified lkcl as Master
  • wlach certified Burgundavia as Journeyer
  • wlach certified Alphax as Apprentice
  • wlach certified robsta as Journeyer
  • wlach certified dobey as Master
  • wlach certified louie as Master
  • wlach certified bradfitz as Master

Others have certified wlach as follows:

  • fxn certified wlach as Journeyer
  • hub certified wlach as Journeyer
  • maragato certified wlach as Apprentice
  • ztf certified wlach as Apprentice
  • DraX certified wlach as Apprentice
  • sl0th certified wlach as Apprentice
  • stevej certified wlach as Journeyer
  • neurogato certified wlach as Apprentice
  • cinamod certified wlach as Journeyer
  • bgeiger certified wlach as Journeyer
  • ariya certified wlach as Journeyer
  • bjgm certified wlach as Apprentice
  • Uraeus certified wlach as Journeyer
  • voltron certified wlach as Journeyer
  • cwinters certified wlach as Journeyer
  • AlanHorkan certified wlach as Master
  • caolan certified wlach as Master
  • mdupont certified wlach as Journeyer
  • dcoombs certified wlach as Journeyer
  • sfllaw certified wlach as Journeyer
  • pphaneuf certified wlach as Journeyer
  • apenwarr certified wlach as Journeyer
  • pcolijn certified wlach as Journeyer
  • glasseyes certified wlach as Journeyer
  • uwog certified wlach as Journeyer
  • CharlesGoodwin certified wlach as Apprentice
  • andrewmp certified wlach as Journeyer
  • saul certified wlach as Journeyer
  • freax certified wlach as Journeyer
  • pvanhoof certified wlach as Journeyer
  • dangermaus certified wlach as Master
