Name: Robert Collins
Member since: 2003-07-13 13:41:43
Last Login: 2008-07-23 00:41:50
Homepage: http://www.robertcollins.net/
Notes:
So advogato seems to have stuck....
I spend most of my time hacking OSS/FS, with the primary projects being:
bzr
ubuntu
I've created a couple of utility projects:
libgetopt++ (C++ getopt adapter)
freegen (a freeswan config generator)
A few other projects I've created shall remain nameless, as they have been obsoleted or are otherwise currently non-viable.
And projects contributed to (non-complete, not always source, top of head list..):
libtool
automake
arch
Cygwin XFree86
libxml
libxslt
Squid (core developer but mostly idle)
Cygwin (ex core developer - idling for 4-5 years so I can't really claim core status :P (yay much time on linux :}))
Cygwin setup (ex project maintainer)
What the git vs bzr discussion is about IMO is usability. The following blog post about DTrace on linux talks about the same issue, and I'd like to use Bryan's words:
"Over and over again, we made architectural and technical design decisions that would yield an instrumentation framework that would be not just safe, powerful and flexible, but also usable. The subtle bit here is that many of those decisions were not at the surface of the system (where the discussion on the Linux list seems to be currently mired), but in its guts."
->
"Over and over again, we made architectural and technical design decisions that would yield a Distributed VCS that would be not just safe, powerful and flexible, but also usable. The subtle bit here is that many of those decisions were not at the surface of the system (where the discussions going on at the moment seem to be currently mired), but in its guts."
I keep running into folk whom I knew of and who use bzr, though I did not know that they did.
Right now there is a lot of discussion going on about DVCS in various projects. While I imagine most bzr users just want to get on with their coding (after all that's what bzr is good at :))... it would be fantastic if you could blog that you use it, and if folk at GUADEC wore the T-shirt!
Also, I'm at GUADEC, and I'm extremely happy to answer questions from anyone, bzr user, git user, or even svn user :)
4 Jul 2008 (updated 4 Jul 2008 at 10:10 UTC) »
Well, the gauntlet is down (BTW - desktop power integration. Cool!). The use case Ted talks about is actually quite interesting - we were at UDS last month, waiting on an SVN server that was apparently so slow we could have walked to it and copied stuff onto a hard disk more quickly. (Really. No kidding). bzr was idling and blocked on network IO the whole time... kudos for the plugin Ted!
For my response, may I present a new index format, (branch url) 70% smaller than bzr's current default, equally fast at most workloads, up to 20 times faster at others. I started this this week, and John jumped in in overlapping time periods, but I think it counts!
Note that the performance wins are a component improvement - other things we haven't addressed yet can make the index improvements less visible. But several early adopters have told me that they see a 25-30% reduction in 'time bzr log > /dev/null' or other commands.
To install:
bzr branch http://bazaar.launchpad.net/~lifeless/+junk/bzr-index2 ~/.bazaar/plugins/index2
bzr branch https://bazaar.launchpad.net/~jameinel/+junk/pybloom ~/.bazaar/plugins/pybloom
To use:
cd <repository you want to experiment on>
bzr upgrade --btree-plain
(or --btree-rich-root for bzr-svn users).
A version of this will be going to trunk soon, and it will be able to upgrade from any repository that you have that uses the plugin as long as you keep the plugin installed.
Dear lazyweb number 3.
So far, I've asked:
high latency net simulations - great answers.
python friendly back-end accessible search engines - many answers, none that fit the bill. So I wrote my own :).
Today, I shall ask - is there a python-accessible persistent b+tree (or hashtable, or ...) module? Key considerations:
- scaling: millions of nodes are needed, with low latency access to a node's value and to determine a node's absence
- indices are write once. (e.g. a group of indices are queried, and data is expired or altered by some generational tactic such as combining existing indices into one larger one and discarding the old ones)
- reading and writing are suitable for sharply memory constrained environments. ideally only a few hundred KB of memory are needed to write a 100K node index, or to read those same 100K nodes back out of a million node index. temporary files during writing are fine.
- backend access must either be via a well defined minimal api (e.g. 'needs read, readv, write, rename, delete') or customisable in python
- easy installation - if C libraries etc are needed they must be already pervasively available to windows users and Ubuntu/Suse/Redhat/*BSD systems
- ideally sorted iteration is available as well, though it could be layered on top
- fast, did I mention fast?
- stable formats - these indices may last for years unaltered after being written, so any libraries involved need to ensure that the format will be accessible for a long time. (e.g. python's dump/marshal facility fails)
sqlite and bdb already fail this requirements list.
snakesql, gadfly, buzhug and rbtree fail too.
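To make the requirements above concrete, here is a minimal sketch of the kind of thing being asked for. It is not an existing module and not the bzr index format: the file layout and the names write_index, IndexReader and combine are all made up for illustration. It assumes keys and values are bytes, that the writer is handed unique keys already in sorted order, and that the backend is any seekable file-like object. A lookup is a binary search over an offset table stored at the end of the file, so reads touch only a handful of small records. For simplicity the writer keeps the offset table in memory; a real version could spill it to a temporary file.

```python
import heapq
import io
import struct

_U64 = struct.Struct("<Q")    # one offset-table entry, or the record count
_LENS = struct.Struct("<II")  # key length, value length

# Hypothetical layout: [records...][offset table: count * 8 bytes][count: 8 bytes]


def write_index(out, sorted_items):
    """Write (key, value) byte pairs, already sorted by unique key, to `out`."""
    offsets = []
    pos = 0
    for key, value in sorted_items:
        offsets.append(pos)
        record = _LENS.pack(len(key), len(value)) + key + value
        out.write(record)
        pos += len(record)
    for offset in offsets:
        out.write(_U64.pack(offset))
    out.write(_U64.pack(len(offsets)))


class IndexReader:
    """Read-only view of a file produced by write_index."""

    def __init__(self, f):
        self._f = f
        f.seek(-_U64.size, io.SEEK_END)
        (self._count,) = _U64.unpack(f.read(_U64.size))
        # f.tell() is now the file size; the table sits just before the count.
        self._table = f.tell() - _U64.size * (self._count + 1)

    def _record(self, i):
        self._f.seek(self._table + i * _U64.size)
        (offset,) = _U64.unpack(self._f.read(_U64.size))
        self._f.seek(offset)
        klen, vlen = _LENS.unpack(self._f.read(_LENS.size))
        key = self._f.read(klen)
        return key, self._f.read(vlen)

    def get(self, key, default=None):
        """Binary search; returns `default` when the key is absent."""
        lo, hi = 0, self._count
        while lo < hi:
            mid = (lo + hi) // 2
            found, value = self._record(mid)
            if found == key:
                return value
            if found < key:
                lo = mid + 1
            else:
                hi = mid
        return default

    def __iter__(self):
        # Sorted iteration comes for free because records are stored in key order.
        for i in range(self._count):
            yield self._record(i)


def combine(readers):
    """Merge several indices, newest first; the newest value for a key wins."""
    last = None
    for key, value in heapq.merge(*readers, key=lambda kv: kv[0]):
        if key != last:
            yield key, value
            last = key
```

The generational tactic then amounts to write_index(new_file, combine([newest, ..., oldest])) followed by discarding the old files. Whether something this naive is fast enough at millions of nodes is exactly the open question.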
Launchpad, please stop mailing me mine own comments on bugs. I know what I said.
kthxbye