Python Consulting
This is an announcement that I will be doing Python
consulting from now on. My expertise covers Python, wxPython,
NumPy and SQLAlchemy, and my primary area of work is
numerical analysis / statistics, though of course you get a
PhD in human-computer interaction thrown in if you want
interfaces made.
If anyone has any Python work they would like help with, I
can offer a discount on open source code. I can work
internationally as long as requirements can be sent
electronically. The best way to contact me is salmoni - at -
gmail.com
Apart from that, all is well here in the Philippines! The
coding on the new project is going well and I'm considering
farming out the database viewer/importer tool as a separate
component for database management. I'm not exactly sure what
functionality would be necessary for this, but suffice it to
say that the basics should be easy to implement (and the
middling / advanced stuff a nightmare!).
Factorial ANOVA of large sets
I've also solved all the problems concerning factorial
analysis of variance for extremely large datasets (i.e.,
those too large to fit into memory). I will crack on with
this code now to get it done and to make an
industrial-quality, heavyweight data analysis tool. This
will be open sourced in time, after testing anyway. The real
problems that I have are a) getting hold of an environment
(i.e., a machine with a massive database on it), and b)
getting comparison results, though SAS should be able to
deliver on this. I understand that SPSS will face problems
if the data are too big for memory, but SAS can work around
this just as my code can. Moore's Law makes this of
decreasing utility, but it's nice to have software that you
know can handle any task.
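The underlying idea can be sketched as streaming accumulation: only per-cell counts and totals need to live in memory, never the raw data. Here is a minimal sketch of that approach for a two-factor design; the function name, the chunk format, and the balanced-design formulas are my own illustration, not the actual code described above.

```python
from collections import defaultdict

def streaming_two_way_ss(chunks):
    """Accumulate per-cell totals over data chunks, then derive the
    sums of squares for a two-factor ANOVA. Memory use is O(cells),
    however large the dataset is."""
    n = defaultdict(int)      # per-cell counts
    t = defaultdict(float)    # per-cell sums
    sumsq = 0.0               # running sum of x**2
    for chunk in chunks:      # each chunk: iterable of (a, b, x) rows
        for a, b, x in chunk:
            n[(a, b)] += 1
            t[(a, b)] += x
            sumsq += x * x
    N = sum(n.values())
    T = sum(t.values())
    cf = T * T / N                                    # correction factor
    ss_total = sumsq - cf
    ss_cells = sum(t[c] ** 2 / n[c] for c in n) - cf
    # marginal totals for each factor, derived from the cell totals
    na, ta = defaultdict(int), defaultdict(float)
    nb, tb = defaultdict(int), defaultdict(float)
    for (a, b), count in n.items():
        na[a] += count; ta[a] += t[(a, b)]
        nb[b] += count; tb[b] += t[(a, b)]
    ss_a = sum(ta[a] ** 2 / na[a] for a in na) - cf
    ss_b = sum(tb[b] ** 2 / nb[b] for b in nb) - cf
    return {"A": ss_a, "B": ss_b, "AxB": ss_cells - ss_a - ss_b,
            "within": ss_total - ss_cells, "total": ss_total}
```

Because each chunk is discarded after its totals are folded in, the chunks could just as easily come from a database cursor as from a list.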
Article
I've also enquired about submitting an article to a Python
journal about how to use the code module to implement an
interactive interpreter and embed it within a Python
program. This comes from work on the statistics program
where I wrote one for quick debugging and found it so good
that I extended it a little to be used as a permanent tool.
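The standard-library route is pleasantly small. A minimal sketch of embedding such an interpreter follows; the namespace contents are illustrative, and in a real tool you would call console.interact() to hand the prompt to the user rather than feeding lines programmatically.

```python
import code

# The host program exposes a namespace to the embedded interpreter.
shared = {"x": 10}
console = code.InteractiveConsole(locals=shared)

# In an application, console.interact() would give the user a live
# prompt; here we push a line of source directly instead.
console.push("y = x * 2")   # executes in the shared namespace
```

Anything the user defines lands in the shared dictionary, so the host program can inspect it afterwards.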
One problem we found is that when declaring and using a
variable, a user would have to write:
x = newvar()
or
newvar("x")
x.data([3,4,5,6])
It would make more sense (to novices) to write:
newvar(x)
x.data(3,4,5,6)
It does this now. What I did was override the
code.InteractiveInterpreter.showtraceback method to catch
NameErrors (which are raised when x is sent to newvar because
x doesn't exist). The code then works out the command and
sends it to the newvar method again, but with the x in
quotes. It's minor stuff, but less annoying for users.
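The mechanism might look roughly like this. This is a sketch only: the newvar function, the regular expression, and the console subclass are illustrative stand-ins for the real code, not the actual implementation.

```python
import code
import re
import sys

created = {}

def newvar(name):
    """Stand-in for the application's variable-creation method."""
    created[name] = []

class ForgivingConsole(code.InteractiveConsole):
    """Retries newvar(x) as newvar("x") when x is undefined."""
    NEWVAR = re.compile(r"^newvar\((\w+)\)\s*$")

    def push(self, line):
        self.lastcmd = line            # remember the raw command
        return super().push(line)

    def showtraceback(self):
        exc = sys.exc_info()[1]
        match = self.NEWVAR.match(getattr(self, "lastcmd", ""))
        if isinstance(exc, NameError) and match:
            # re-issue the command with the bare name quoted
            self.runsource('newvar("%s")' % match.group(1))
        else:
            super().showtraceback()
```

With this in place, pushing "newvar(x)" creates the variable even though x was never defined: the NameError is intercepted instead of being printed as a traceback.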
And if a database has awkward column names that are not
valid variable names in Python, they cannot be used
directly: so I added a catcher to showtraceback that catches
AttributeErrors and tests whether a string has been issued
with a program method:
"Variable 1 (2000)".variance()
This would never work normally within Python without
overriding the string class (which is another possibility).
However, the catcher above can catch this attribute error
and redirect the 'variance()' bit to the proper variable
definition.
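A sketch of that second catcher, assuming a registry that maps awkward database column names to the actual variable objects; the Column class, the registry, and the regular expression are all illustrative, not the real implementation.

```python
import code
import re
import sys

class Column:
    """Stand-in for the application's variable object."""
    def __init__(self, data):
        self.data = list(data)
    def variance(self):
        m = sum(self.data) / len(self.data)
        return sum((x - m) ** 2 for x in self.data) / (len(self.data) - 1)

# maps awkward database column names to variable objects
registry = {"Variable 1 (2000)": Column([3, 4, 5, 6])}

class RedirectingConsole(code.InteractiveConsole):
    """Redirects "name".method() calls on strings to the registry."""
    CALL = re.compile(r'^"(.+)"\.(\w+)\(\)\s*$')

    def push(self, line):
        self.lastcmd = line
        return super().push(line)

    def showtraceback(self):
        exc = sys.exc_info()[1]
        match = self.CALL.match(getattr(self, "lastcmd", ""))
        if isinstance(exc, AttributeError) and match and match.group(1) in registry:
            # call the method on the real variable object instead
            self.result = getattr(registry[match.group(1)], match.group(2))()
        else:
            super().showtraceback()
```

Pushing '"Variable 1 (2000)".variance()' then raises an AttributeError on the string, which the catcher redirects to the registered Column object.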
All this just means that the application is beginning to
work around its users instead of demanding that they work
around it.
I also added lots of alternative names for descriptive
tests so:
x.samplestandarddeviation()
x.standarddeviation()
x.stddev()
x.stdev()
x.sd()
all call the same function. This helps because whenever I've
used a new statistics program, I've had to look up the exact
names of its functions. This way, I don't have to remember
which one: I just pick a common one, and away I go! :-)
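In Python this costs almost nothing: each alias can simply be an extra class attribute bound to the same function. A sketch, where the Variable class is illustrative but the alias names are the ones listed above:

```python
class Variable:
    def __init__(self, data):
        self.data = list(data)

    def samplestandarddeviation(self):
        """Sample standard deviation (n - 1 denominator)."""
        m = sum(self.data) / len(self.data)
        var = sum((x - m) ** 2 for x in self.data) / (len(self.data) - 1)
        return var ** 0.5

    # every alias is just another name bound to the same function
    standarddeviation = samplestandarddeviation
    stddev = samplestandarddeviation
    stdev = samplestandarddeviation
    sd = samplestandarddeviation
```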