pyv8: python v8 bindings and a python-javascript compiler

Posted 28 Sep 2008 at 14:03 UTC (updated 28 Sep 2008 at 16:33 UTC) by lkcl

pyv8 is an experimental project to combine two-way python bindings to v8 with the python-to-javascript compiler from pyjamas. a simple test has shown a ten times performance increase of python code converted and executed as javascript, when compared to running the same program as python. (to be fair, cython gives a 100 times performance increase).

sloisel has created two-way python bindings to v8: pyv8 download. a version with a Makefile for linux users is here: pyv8 with Makefile.

pyv8 is the project which ties pyjamas' (pyjd.org) python-to-javascript compiler (called pyjs.py) together with python bindings to v8, and, due to v8 being able to call back out, it's also been possible to bind back in to standard c python modules and third party c-based python modules.

the simple test, test-pyjs.py, containing a deliberately inefficient fibonacci series algorithm, gives a ten times performance increase compared to running python fib.py. the output is shown via a python-tk popup window, to demonstrate that it's possible to call out to c-based python modules.
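The source of test-pyjs.py and fib.py is not reproduced in this post, so the following is only a minimal sketch of what a "deliberately inefficient" fibonacci benchmark typically looks like. The names `fib` and `timed` are hypothetical and chosen for illustration; the python-tk popup is left out.

```python
import time


def fib(n):
    """Naive recursive fibonacci: recomputes the same subproblems many times over."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


def timed(func, arg):
    """Return (result, elapsed seconds) for a single call of func(arg)."""
    start = time.perf_counter()
    result = func(arg)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    value, seconds = timed(fib, 20)
    print("fib(20) = %d in %.4f seconds" % (value, seconds))
```

The exponential number of function calls is the point: it stresses call overhead and integer arithmetic, which is where a machine-code-generating engine would be expected to help most.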

this is an experimental project. a much better version of pyv8 would do away with the need to use python to perform the conversion of its incoming programs into javascript - presumably by bootstrap-compiling pyjs.py itself into javascript, and running that natively through something based on the v8 "shell" example rather than using python-boost to wrap libv8. under these circumstances, python-boost would only really be required to perform the bootstrap-compilation process, required for pyv8 developers and maintainers only: standard _use_ of pyv8 as a drop-in replacement for /usr/bin/python would use the precompiled javascript and the "shell".

so whilst pyv8 is a way off from being a formal release, the significance of the performance gain, the benefits of being able to bind to standard c-based python modules and the outright simplicity of pyjs.py (only 1200 lines) make it worthwhile letting people know that the experiment was a resounding success.

it's also worth mentioning that pypy with its javascript back-end would also definitely benefit from using v8.


A question and an open request, posted 30 Sep 2008 at 08:53 UTC by chalst » (Master)

First, to lkcl, to clarify: two-way bindings means that the runtime supports both calls out from the compiled code and calls back? It's impressive, then, that the whole python runtime can be ported to the browser through light hackery, but I guess that this must be non-portable?

Second, I'm losing track of *-language->javascript translators. Ideally I'd like to see a comprehensive survey of all such notable bridges, with careful attention paid to the space of design choices, but since that document does not exist, it might help to assemble a list. I'll start with six languages chosen by theme:

  1. Lkcl's compiler pyjamas;
  2. pyv8 is a compiler together with a fixed bridge: is this article currently the goto document on this?
  3. Idiomatic representations of js: eg. ParenScript, a lispish representation of javascript which attempts to model common lisp idioms;
  4. Compilers for other languages: Ruby seems the most active, and rb2js seems to be the most important;
  5. Bridges to js engines exist for several languages; eg. the Perl JavaScript::SpiderMonkey module;
  6. Language interpreters inside js: eg. Oberon Script; this LtU story has some interesting further links.
I have researched only a few of the above in any depth, so doubtless there are mistakes; I'm more interested in hearing of more projects, and what makes them distinctive.

Mistake #1, posted 30 Sep 2008 at 08:58 UTC by chalst » (Master)

I described Oberon Script as an interpreter; it is, in fact, a compiler. I was actually thinking of jsscheme, an interpreter for a near approximation of R5RS scheme.

OberonScript is an interesting ->js compiler, in that Oberon paradigmatically takes a runtime more like C than the scripting languages, such as Ruby, usually compiled to js.

clarification, posted 30 Sep 2008 at 09:07 UTC by lkcl » (Master)

First, to lkcl, to clarify: two-way bindings means that the runtime supports both calls out from the compiled code and calls back?

two-way bindings means that not only has sloisel managed to bind python to v8, but also, thanks to v8's callback mechanism and boost, the converted javascript is capable of calling _back_ out to anything.... where in this case, that means "other python modules that are still c-based".

e.g. python-numeric, or python-tkinter.

It's impressive, then, that the whole python runtime can be ported to the browser through light hackery, but I guess that this must be non-portable?

this has nothing to do with browsers. absolutely nothing.

_one_ file has been borrowed from pyjamas - the python-to-javascript compiler. pyjamas comprises a python-to-javascript compiler, some DOM model libraries and a "build" script which glues the two together in an easy fashion, providing some builtin classes (such as List, Dict, set, math module).

so, sloisel's experiment is a trifle oversimplistic for real-world usage, because he didn't include any of the builtin libraries from pyjamas. if anyone wants to progress this further they will need to look at build.py, note the way that includes are done by extending the library path, and make sure that sprintf.js, builtins.py and math.py are all in the path.

regarding portability: portability is actually restricted not by pyjs.py but by google's v8 engine. v8 supports 32-bit x86 and arm processors only.

# Lkcl's compiler pyjamas;

it's not mine, i'm just someone who's interested in making sure it works and is useful, and, again, it's worth re-emphasising: pyjamas is a combination of a DOM model based widget set library _and_ pyjs, a python-to-javascript compiler.

javascript as an intermediate language, posted 30 Sep 2008 at 10:49 UTC by lkcl » (Master)

so - to emphasise: the sole exclusive purpose of converting to javascript is to get at the machine code executor in v8. that makes the javascript kind-of the equivalent of ".S" assembler files in gcc (albeit a pretty damn high level assembly language).
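To make the "javascript as assembler" idea concrete, here is a toy python-to-javascript translator built on Python's standard `ast` module. It is an illustration only: it is not how pyjs.py is implemented, it handles just the handful of node types needed for a recursive fibonacci function, and the names `expr`, `stmt` and `PYTHON_SRC` are made up for this sketch.

```python
import ast

# Operator tables for the small subset of python this toy understands.
BIN_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}
CMP_OPS = {ast.Lt: "<", ast.LtE: "<=", ast.Gt: ">", ast.GtE: ">=", ast.Eq: "==="}


def expr(node):
    """Translate one python expression node into javascript source text."""
    if (isinstance(node, ast.Constant) and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return repr(node.value)
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
        return "(%s %s %s)" % (expr(node.left), BIN_OPS[type(node.op)], expr(node.right))
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in CMP_OPS):
        return "(%s %s %s)" % (expr(node.left), CMP_OPS[type(node.ops[0])],
                               expr(node.comparators[0]))
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and not node.keywords:
        return "%s(%s)" % (node.func.id, ", ".join(expr(a) for a in node.args))
    raise NotImplementedError(ast.dump(node))


def stmt(node, indent=""):
    """Translate one python statement node into javascript source text."""
    if isinstance(node, ast.FunctionDef):
        args = ", ".join(a.arg for a in node.args.args)
        body = "\n".join(stmt(s, indent + "  ") for s in node.body)
        return "%sfunction %s(%s) {\n%s\n%s}" % (indent, node.name, args, body, indent)
    if isinstance(node, ast.Return) and node.value is not None:
        return "%sreturn %s;" % (indent, expr(node.value))
    if isinstance(node, ast.If) and not node.orelse:
        body = "\n".join(stmt(s, indent + "  ") for s in node.body)
        return "%sif %s {\n%s\n%s}" % (indent, expr(node.test), body, indent)
    raise NotImplementedError(ast.dump(node))


PYTHON_SRC = (
    "def fib(n):\n"
    "    if n < 2:\n"
    "        return n\n"
    "    return fib(n - 1) + fib(n - 2)\n"
)

js = "\n".join(stmt(s) for s in ast.parse(PYTHON_SRC).body)
print(js)
```

A real compiler also has to deal with scoping, classes, builtins and the differences between python and javascript numbers, none of which this sketch attempts.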

one thing which would be reaaally interesting to try out would be to make v8 take in python bytecode as input, instead of javascript. that would mean being able to dump pyjs.py entirely, which would be nice.

pyv8 is a compiler together with a fixed bridge: is this article currently the goto document on this?

yep. that and pyjamas-dev. pyv8 is definitely an early-days experiment (with a startlingly impressive empirical result).

Mistakes #2 to #N, posted 30 Sep 2008 at 12:43 UTC by chalst » (Master)

Thanks for the answers. Some minor points:

  • I meant js engine when I said browser, but there is no reason I can think of why one couldn't create a custom Chrome browser that supports pyv8's extensions. When I said non-portable, the point is that only such a modified browser could handle js that makes calls out to the non-js runtime.

    To be more concrete, a call to python_tkinter, say, is certainly code that will need specialist support in either the js engine or browser;

  • Credit where credit is due:
    1. Who started the pyjamas project?
    2. I'm right that you are behind Pyjamas Desktop?
    3. I've at least got a handle for the developer of pyv8. But who is sloisel? Google won't tell me.
  • I guess my list of projects would benefit from talking about what they make of DOM.

:), posted 30 Sep 2008 at 13:02 UTC by lkcl » (Master)

credits:

1) james tauber.
2) pyjd.org. yes. and the glib bindings to webkit, without which pyjamas-desktop doesn't happen.
3) see pyv8 0.1 release notice on pyjamas-dev.

ok so let's ask the original question as intended :)

It's impressive, then, that the whole python runtime can be ported to the v8 javascript engine through light hackery, but I guess that this must be non-portable?

none of the python runtime has been ported - not a single bit. what _has_ been done is that (in pyjamas, not the pyjs.py compiler) some javascript implementations of various classes, functions and modules have been provided - and there are more, better ones in the llpamies branch of pyjamas than there are in the mainline version, but that's another story.

so, the pyjamas builtins include a javascript implementation of sprintf, List, Dict, set, string and a little bit of the math module - enough to "get most applications working as web front-ends".

the neat thing about the v8 callback bindings is that there is far more available to pyv8 developers than there is for pyjamas developers, because pyv8 can access c-based python modules (whereas pyjamas apps are restricted to running in web browsers).

# I meant js engine when I said browser, but there is no reason I can think of why one couldn't create a custom Chrome browser that supports pyv8's extensions. When I said non-portable, the point is that only such a modified browser could handle js that makes calls out to the non-js runtime.

so... what you're saying is that (leaving aside the fact that pyv8 demonstrates that you can call out to python from command-line, demonstrating that standard python apps can be speeded up by a factor of ten) pyv8 shows that it would be possible to access standard c-based python modules, amongst other things, from inside a browser that used v8 (e.g. google chrome).

i can see that it would be incredibly powerful and useful to allow access to other languages. i kinda worry somewhat at the security risks associated with doing so, though, particularly as __class__.__new__ (or something like that) was added to python at about python 2.2, which totally broke rexec.py (an escape-route, via the optimised c implementation, was implemented in such a way that the standard restricted python module was irrevocably broken).

i do think it says a lot about the design of v8 that it allows this kind of mix-and-matching, though. it's fantastic.

pyv8 is a command-line tool (nothing to do with browsers), posted 30 Sep 2008 at 13:11 UTC by lkcl » (Master)

i think it's worth re-emphasising that pyv8 is a command-line tool, not a browser-based javascript engine. there's absolutely no involvement, dependency or link to browsers, at all, other than "where the code came from".

google's v8 engine is a stand-alone library, implemented in c++, where _one_ usage of v8 _happens_ to be in google chrome's browser.

pyjs.py is a stand-alone python-to-javascript compiler, which _happens_ to be a necessary tool required to make pyjamas be an effective browser-based applications widget set (which can be written and developed in python, not javascript).

ultimately, with quite a lot of wizardry, pyv8 could become a drop-in replacement for /usr/bin/python (or python.exe if you use windows).

personally i'd much rather see the v8 code executor / compiler integrated into python itself (doing away with the javascript intermediary) working off python's FORTH-engine-inspired bytecode.
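For readers who have not looked at python bytecode, the standard `dis` module shows the stack-oriented instructions that such an engine would have to consume. This is a small sketch using CPython's own tooling; the function `add` is a made-up example, and the exact opcode names differ between python versions, so no output is shown.

```python
import dis


def add(a, b):
    return a + b


# Each instruction pushes operands onto, or pops them from, an evaluation
# stack, which is the FORTH-like flavour referred to above.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)

# dis.dis prints a fuller listing, including operand names and offsets.
dis.dis(add)
```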

compilers for other languages, posted 30 Sep 2008 at 15:15 UTC by lkcl » (Master)

so - chalst - all of those other languages, which have -to-javascript compilers, could benefit from the performance gain of v8, and the means to link back in to "real" code. i don't know if jsscheme runs _purely_ in a web browser, but it could conceivably be extracted from that environment and made standalone, if it isn't already.

pyv8's callback mechanism benefits somewhat in the extreme from the existence of python-boost: i wouldn't know where to begin to suggest to other language writers how they should go about doing external callbacks, but the more of the language that is written _in_ that language (unlike python, which has a lot of c-based modules) the better.

Thanks..., posted 30 Sep 2008 at 15:48 UTC by chalst » (Master)

...for your patient explanations. You wrote, of the idea of the custom Chrome browser (let's call it pychrome): "i can see that it would be incredibly powerful and useful to allow access to other languages. i kinda worry somewhat at the security risks associated with doing so."

Well, quite. It would need some sort of security model to be anything other than a dangerous toy.

questions, posted 30 Sep 2008 at 16:33 UTC by lkcl » (Master)

...for your patient explanations.

no problem - thanks for voicing the questions: if you hadn't asked, then there would be other people - who didn't ask - getting a mistaken impression.

Just curious, why Javascript (V8)?, posted 30 Sep 2008 at 17:54 UTC by atai » (Journeyer)

Just a curious question: while V8 is great for making Javascript fast, is Javascript really suitable as an intermediate language? For example, there must be other compilers (or JITs) faster than V8 that can serve as the intermediates? You mentioned "cython", but can Java or C# or Lisp be better choices as well?

because it's there, posted 30 Sep 2008 at 18:52 UTC by lkcl » (Master)

atai - i'll be honest: i think other people are better qualified to answer the questions you raise. i think it honestly doesn't matter what language is picked, as long as there are "one-to-one and onto" mappings for language concepts. bizarrely, other than the scoping rules (which pyjs really has to work quite hard at to avoid breaking), javascript seems to be a pretty good fit as a conversion target for python.

have you ever seen that visual studio tool which does language conversion? i wouldn't believe it if i hadn't seen it: it's a plugin for visual studio, and it literally converts any language to any language. my friend demonstrated conversion of java to c sharp to b to ruby - the only significant language missing was python. from what i can gather, it's as simple as converting to-and-from CLR.

overall, however, the answer to "why javascript, and why v8?" is simple: because it's there - because someone _has_ written a blindingly quick javascript-to-machine-code converter/engine. it would be genuinely daft to _not_ make python - or any other language - take advantage of that.

connect the two chains together.

cython is a different starting point: it's not _quite_ "real python", it's a subset of python, with extensions. for a fair comparison, pyjs.py definitely isn't perfect either: there are bugs in the use of the sprintf.js library that stop you from having more than one % substitution in the specifier string - "%s %d" % ("hello", 5) doesn't work, but "%s " % "hello" + "%d" % 5 does - and there are little niggling incompatibilities that make it really awkward to work with (which is why i did pyjamas-desktop, so that you could actually develop your application with the "real" python interpreter).
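In the "real" python interpreter the two spellings mentioned above produce the same string, which is what makes the split form a usable workaround. The snippet below only demonstrates the CPython behaviour; it does not reproduce the sprintf.js bug itself, and the variable names are made up for illustration.

```python
# One format string with two substitutions: the form reported above as
# failing when compiled with pyjs.py and sprintf.js.
single_tuple = "%s %d" % ("hello", 5)

# The workaround: one substitution per format string, concatenated.
# % binds tighter than +, so this is ("%s " % "hello") + ("%d" % 5).
split_up = "%s " % "hello" + "%d" % 5

print(single_tuple)
print(split_up)
print(single_tuple == split_up)
```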

the cython compiler therefore is kinda cheating, as this really good cython tutorial demonstrates. it shows you how you should manually convert your initial python source over to cython, what sorts of things you should look for; avoid calling out to the standard python libraries (those that cython supports) in inner loops, that sort of thing. many people simply don't want to go to that kind of effort, which is why in many cases they simply plump for a c-based module for the inner loops - or just avoid python altogether.

regarding "faster compilers / JITs", i did look on codespeak.net's site, for the pypy project - and found LLVM. LLVM is one of the back-end targets supported by pypy (as is javascript, java runtime, CLR and one other, i forget what it is).

so yes, there are definitely a lot of options - they've just been joined by one more.

my gut feeling tells me that java and c# would definitely be intermediate languages to avoid using, but that lisp might be a good one. ultimately, though, it boils down to whether someone's spent the time and money going to the effort of making a particular engine/compiler. there's a clear need and benefit to making javascript run blindingly quick, and, there's a clear need for google to have v8 generate ARM assembler for example (android, anyone? 200mhz and 400mhz embedded processors with only 128mb of memory?). if, through pyv8, other people get to leverage that work, thus making their lives easier, _great_.

diary post containing links to other javascript compilers, posted 3 Oct 2008 at 14:38 UTC by lkcl » (Master)

mr blueish coder writes about some javascript converters he's found.

interesting article on javascript, posted 8 Oct 2008 at 20:40 UTC by lkcl » (Master)

john resig on extreme javascript performance.

the vast amount of attention that javascript is getting makes it a _really_ good choice of intermediate language.

this is an additional answer to the question raised a couple of days ago: "why javascript, not c# or java or lisp, as a conversion target for python?"

RJS, posted 12 Oct 2008 at 11:31 UTC by lkcl » (Master)

not sure if RJS is the same as rb2js - see RJS, here

Thinking alike, posted 5 Nov 2008 at 17:41 UTC by cananian » (Master)

I thought of hooking python to v8 as soon as I saw the v8 announcement, too. Glad that sloisel actually did it!

I think going from Python bytecode is the way to go long-term: that way you don't have to worry about parsing python, and the python bytecode is actually pretty small and straightforward.

To answer other posters' questions: why javascript? why v8? --- The simple answer is that Javascript's dynamic object model is a very good fit to Python's, and v8 was explicitly written to take advantage of this. As a basic example: in python I can add methods and fields to an object "on the fly" after it has been created. You can't do this in (say) Java. A JVM is built to assume that method dispatch is fixed at object creation time, and targeting Python towards such a virtual machine is complicated and inefficient. v8 was explicitly built to allow these types of object mutation, and to still use efficient method and slot dispatch: basically, the type of the object is invisibly changed by v8 as you mutate the object, so that you always get efficient method dispatch, even for dynamically-added methods. So taking advantage of this feature of v8 allows you to make a really efficient Python runtime.
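As a small illustration of the "on the fly" mutation described above, the following python sketch adds a field and a method to an object after it has been created, and then adds a method to the class itself. The class `Point` and the names `label`, `norm_squared` and `swapped` are invented for this example.

```python
import types


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(3, 4)

# Add a new field to this one instance after creation.
p.label = "corner"


def norm_squared(self):
    return self.x * self.x + self.y * self.y


# Bind a new method to this one instance only.
p.norm_squared = types.MethodType(norm_squared, p)

# Add a method to the class: existing instances see it immediately.
Point.swapped = lambda self: Point(self.y, self.x)

print(p.label, p.norm_squared(), p.swapped().x)
print(hasattr(Point(0, 0), "label"))
```

Javascript objects can be extended in the same way, which is the fit between the two object models that the paragraph above is pointing at.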
