PyPy goes Squeak: October 2007

Sunday, October 28, 2007

More Sprint Pictures

Some more pictures of the Bern sprint.

Discussing Shadows: Oscar Nierstrasz, Lukas Renggli, Marcus Denker, Tudor Girba, Niko Matsakis, Armin Rigo.

Marcus Denker, Adrian Kuhn, Armin Rigo, Toon Verwaest

Armin Rigo, Toon Verwaest, Carl Friedrich Bolz

At about one in the night, finally trying to leave: Toon, Armin, Adrian

Working on primitives: Oscar, Niko, Lukas

Trying to translate the Smalltalk interpreter: Carl Friedrich, Armin, Toon, Adrian and Lukas (in the front).

Saturday, October 27, 2007

The Bern sprint is finished, all the non-local sprinters have gone home and everybody is resting. The week was amazingly intense and productive, but also lots of fun. Thanks to all the participants! Many thanks also to the University of Bern for hosting the sprint, especially to Adrian Kuhn for putting lots of effort into the organization.

I am not quite awake yet to completely summarize the sprint, but some of the things we managed to do are:

define a simple representation of the Smalltalk object model in Python
implement the bytecode dispatcher and all the Squeak bytecodes
implement helper functionality for defining primitives in Python
implement many of the essential primitives
implement an image loader that can load Squeak images
translate all of the above to C (or .NET, or Java, for that matter)

We managed to load the Squeak mini-image and successfully run the tiny benchmark at around a tenth of the speed of Squeak.

The area where the most work is left is obviously the primitives. We haven't even started on the graphical primitives yet and we cannot really start an image (just load it and call selected methods on some functions from the outside). On the other hand, this is all rather straightforward work, so it probably won't require too much hard thinking. Another big question is image saving, where it is unclear whether we should define our own format or try to be able to write Squeak images again.

I may post some more detailed information about what we did during the week, so if you have specific questions, just ask them. Also, I hope some of the sprinters (including myself) will work a bit more on the project in the near future. We are considering a follow-up sprint; let's see where that is going.

Friday, October 26, 2007

Translating the Smalltalk interpreter to C, Java or .NET

Yesterday, we started to point PyPy's translation tool-chain at the Smalltalk interpreter loop. In practice, this means starting the tool, waiting for a few seconds while it performs type inference, and looking at the error message that we get. The error message is sometimes a bit obscure (it's not a trivial job to report good error messages from type inferencers). Once understood, we fix the corresponding place in the RPython source (i.e. in pypy/lang/smalltalk/*) and try again. This trial-and-error process can go on for half an hour, at the end of which the RPython source is eventually accepted by the translation tool-chain. We then get a nice executable file, produced by gcc compiling the generated C source. The executable does the same job as the original RPython source running on top of the standard CPython - it is just much faster.

First a WARNING: as we are actively hacking on the Smalltalk code, all the examples below might be broken in some revisions. It is a bit too early in the project to have tests for translatability, given that not everybody hacking on the source knows how to use RPython or understands the error messages. For now a subgroup of people is trying to run translation from time to time and fixing the problems. (It works in revision 48070.)

The first thing we translated is an interpreter with an "entry point" that contains, hard-coded, the Squeak bytecode for the Fibonacci series:

    cd pypy/translator/goal
    ./translate.py --gc=generation targetfibsmalltalk.py
    (lots of output...)
    (Ctrl-D to exit the debugger prompt at the end of translation)
    ./targetfibsmalltalk-c 25
    121393

Yay! In initial benchmarks, this ran some 10-15 times slower than the same bytecode in Squeak, which is not too bad for a first try. Since yesterday evening we added some tweaks here and there, so the current numbers might be better.
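For readers who have never seen a translation target, here is a rough sketch of the general shape of such a module. It is not the actual targetfibsmalltalk.py: the real target embeds Squeak bytecode, while this one computes the same numbers in plain Python. The `fib` helper and its base case are assumptions, chosen so that 25 gives the 121393 shown above. The `entry_point`/`target` pair follows the usual convention of translate.py targets.

```python
def fib(n):
    # Assumed base case: values below 2 answer 1, so that fib(25) == 121393.
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)


def entry_point(argv):
    # argv[0] is the program name, argv[1] the number to compute.
    if len(argv) != 2:
        print("usage: %s <n>" % argv[0])
        return 1
    print(fib(int(argv[1])))
    return 0


def target(driver, args):
    # Hook called by translate.py to find the function to translate.
    # The second item (None) means: no explicit argument annotations.
    return entry_point, None
```

The same module also runs untranslated: calling `entry_point(["fib", "25"])` on top of CPython prints 121393, just more slowly.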

We can also translate the image loader code:

    cd pypy/translator/goal
    ./translate.py --gc=generation targetimageloadingsmalltalk.py
    (lots of output...)
    (Ctrl-D to exit the debugger prompt at the end of translation)
    ./targetimageloadingsmalltalk-c ../../lang/smalltalk/mini.image
    (lots of #)

The final executable loads the mini.image and runs the tinyBenchmark from there - and as of one hour ago, the tinyBenchmark even runs to completion :-)

An interesting feature of the translation tool-chain is that it is happy with targets that contain prebuilt data - even a LOT of prebuilt data. This means that we can also preload a Smalltalk image into the Python process - simply by calling the image loader before translating - and translate only the interpreter, as an RPython program, with all the Smalltalk objects from the image already loaded. The objects turn into static C data, which gives a large executable that contains essentially a built-in "image" - tons of static data that the OS will just mmap into the new process and only lazily load from the disk when the interpreter accesses the memory pages. Example:

    cd pypy/translator/goal
    ./translate.py --gc=generation targettinybenchsmalltalk.py
    (lots of output...)
    (Ctrl-D to exit the debugger prompt at the end of translation)

Be sure to look at the size of targettinybenchsmalltalk-c. It runs without taking any input argument - the image is already in there.
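The prebuilt-data trick can be pictured with a toy target like the one below. This is only a sketch under assumptions: `ToyImage` and `load_toy_image` are hypothetical stand-ins, not the classes or the loader from pypy/lang/smalltalk. The point is merely that the loading happens at import time, before translation, so the entry point only ever sees an already-built object graph.

```python
class ToyImage(object):
    """Hypothetical stand-in for a loaded Smalltalk image."""

    def __init__(self, objects):
        self.objects = objects


def load_toy_image():
    # A real loader would parse mini.image; here we just build some data.
    return ToyImage(["Object", "Class", "SmallInteger"])


# Executed at import time, i.e. before the tool-chain looks at entry_point.
# A translator would then see PRELOADED as constant, prebuilt data.
PRELOADED = load_toy_image()


def entry_point(argv):
    # No file access and no input argument: the "image" is already there.
    print(len(PRELOADED.objects))
    return 0


def target(driver, args):
    # Same hook convention as other translate.py targets.
    return entry_point, None
```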

By the way, don't take any performance numbers too seriously so far. The point is that we actually managed to write a reasonably good base for a Smalltalk virtual machine in just five days of work, with a team of 4 to 10 people, depending on the hour of the day or night :-)

Oh, by the way, all the examples above can also be translated to Java bytecode or .NET bytecode (use the --backend=jvm or --backend=cli option to translate.py, respectively). For Java you need to install the Jasmin bytecode assembler (and make sure that "jasmin" is in your $PATH).

Armin Rigo

Toolchain applied to interpreter, Fibonacci running

Just a quick note (we were busy hacking and are now too tired for anything more): A few hours ago the interpreter was successfully translated to C and was running the Fibonacci function. It is around 15 times slower than Squeak, which is amazingly fast for the first try (PyPy's Python interpreter was at least 200 times slower than CPython when we first managed to translate it).

More details will follow tomorrow (in particular how to try it yourself).

Carl Friedrich & AA

Thursday, October 25, 2007

Shadows

Today we report on the internal representation of objects and classes in the VM we are working on. Since everything is an object in Smalltalk, even classes, we are confronted with two diverging forces. From the Smalltalk point of view all objects are created equal, hence it would be most natural to model them using one class W_SqueakObject. From the VM point of view we would like to represent some objects using special classes: classes, stack frames, compiled methods, method dictionaries, method contexts, block contexts and so on. If those objects were not exposed to the Smalltalk view, that would not be a problem at all. However, as they are exposed to plain Smalltalk, we had to find a better solution.

This post will quickly describe the outcome of various discussions and refactorings that we had yesterday and this morning addressing this issue. After writing a prototype yesterday, several (heated) discussions with all sprinters and, finally, a complete rewrite of the prototype, we arrived at the following solution.

Every Smalltalk object may have an associated "shadow" object. These shadow objects are not exposed to the Smalltalk world; they are used by the VM as an internal representation and can hold arbitrary information about the actual object. If an object has a shadow, the shadow is notified whenever the state of the actual object changes, to keep them in sync. One way of looking at shadows is as a general caching mechanism. However, the approach is far more powerful: by hooking into the notifications a shadow receives, arbitrary meta-behaviour may be attached to objects. As an example, think of immutable objects which reject any modification.
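To make the idea a bit more concrete, here is a minimal sketch of how such a notification hook could look. All names (`W_Object`, `Shadow`, `ImmutableShadow`, `notify_store`) are made up for this illustration and do not claim to match the sprint code. Notifying the shadow before the store is also just one possible choice, made here so that the immutability example can veto the change.

```python
class Shadow(object):
    """VM-internal companion of an object; never visible from Smalltalk."""

    def __init__(self, w_object):
        self.w_object = w_object

    def notify_store(self, index, value):
        # Called before a slot of the shadowed object changes.
        # The default shadow does nothing; subclasses hook in here.
        pass


class ImmutableShadow(Shadow):
    def notify_store(self, index, value):
        # Meta-behaviour attached via the shadow: reject any modification.
        raise TypeError("object is immutable")


class W_Object(object):
    def __init__(self, size):
        self.slots = [None] * size
        self.shadow = None  # most objects never get a shadow

    def store(self, index, value):
        if self.shadow is not None:
            self.shadow.notify_store(index, value)
        self.slots[index] = value
```

An object without a shadow pays only for one `None` check per store, which fits the idea that shadows are optional.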

In the current implementation, shadows are used to attach nicely decoded information about classes to all objects which are used as classes. This allows any object to be used as a class, even if it is not a subclass of Smalltalk.Class, which is very Smalltalkish. The shadow of the class stores all required information about the class in a nice, easily accessible data structure (as opposed to the obscure bit format used at the Smalltalk level). The class shadow mirrors the format and size of instances, holds a Python dictionary containing the compiled methods (mirroring the method dictionary), the name of the class (if it has one), etc. Storing the methods in a Python dictionary instead of the Smalltalk method dictionary allows the tool chain to generate better code for method lookups, taking advantage of its highly optimized builtin dictionary implementation.
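A class shadow along these lines might be sketched as follows. This is a hedged illustration, not the actual implementation: `ClassShadow`, `install` and `lookup` are hypothetical names, and compiled methods are represented by plain strings to keep the example self-contained.

```python
class ClassShadow(object):
    """Decoded, VM-friendly view of an object that is used as a class."""

    def __init__(self, name, superclass=None, instance_size=0):
        self.name = name                    # may be None for anonymous classes
        self.superclass = superclass        # another ClassShadow, or None
        self.instance_size = instance_size  # number of slots per instance
        self.methods = {}                   # selector -> compiled method

    def install(self, selector, method):
        # Would be called when the Smalltalk-level method dictionary changes.
        self.methods[selector] = method

    def lookup(self, selector):
        # Walk up the superclass chain using plain dictionary lookups.
        shadow = self
        while shadow is not None:
            method = shadow.methods.get(selector)
            if method is not None:
                return method
            shadow = shadow.superclass
        raise LookupError("doesNotUnderstand: " + selector)
```

Because `methods` is an ordinary dictionary, the lookup loop contains nothing Smalltalk-specific, which is what lets the tool chain use its own dictionary implementation for it.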

Wednesday, October 24, 2007

Third day, work in progress

It is 19:36 local time and the sprint is still running. While I am taking a break, Armin Rigo and Carl Friedrich Bolz are working on the internal representation of classes for the VM. This is ongoing and tricky work, as in Smalltalk any object can (potentially) be used as a class. Lukas Renggli just left; he continued today on the implementation of primitives. Meanwhile, Toon Verwaest is poking around in the loaded mini.image, printing all strings in the image and trying to execute random methods using pypy.lang.smalltalk.interpreter.Interpreter. He selects all compiled methods having no arguments and tries to execute them; some of the methods even run successfully.

You can find the image poking hacks in

    dist/pypy/lang/smalltalk/tool

I will now join Toon to pair up, thinking about and hopefully implementing some benchmarks for the loaded image.

cheers,
Adrian AA Kuhn

Second Day, Fibonacci Numbers running

Yesterday was a day of excessive coding. After finishing the W_Object model together on the projector, we worked in four teams on three parallel tracks:

loading a real Squeak image (Adrian K, Carl Friedrich)
interpreter loop and bytecode dispatch (Armin, Adrian L, Lukas, Toon)
implementing numbered primitives (arithmetic) (Niko, Tudor, Oscar)

Image Loading

There are certainly worse save file formats than Smalltalk images; however, conserving bits must have been very fashionable back in the 80s. There are three formats for object headers, the compact one being the most crazy:

(2 bits) header type
(6 bits) word size
(4 bits) object format
(5 bits) compact class id
(12 bits) identity hash
(3 bits) used by gc

With that, the memory footprint of empty instances can be as low as 4 bytes, but at the expense of having only 12 bits for identity hashes! We are thinking about integrating a transition from 12-bit to full-word hashes in the VM. Legacy instances from the Squeak image file will retain their 12 bits (so as not to corrupt existing hash tables), but any newly allocated object will be initialized with a full 32-bit hash. The limitation is that we lose backwards compatibility with the Smalltalk-80 save format.
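The six fields listed above add up to exactly 32 bits, so a compact header fits in one word. As a small sketch, it could be unpacked with shifts and masks as below. One assumption is made loudly here: the fields are taken to be laid out in the listed order starting from the least significant bit. The function and key names are invented for this example.

```python
# (name, width in bits), assumed to start at the least significant bit.
COMPACT_HEADER_FIELDS = [
    ("header_type", 2),
    ("word_size", 6),
    ("object_format", 4),
    ("compact_class_id", 5),
    ("identity_hash", 12),
    ("gc_bits", 3),
]


def decode_compact_header(word):
    """Split a 32-bit header word into its named bit fields."""
    fields = {}
    shift = 0
    for name, width in COMPACT_HEADER_FIELDS:
        fields[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return fields


def encode_compact_header(fields):
    """Inverse of decode_compact_header, handy for round-trip checks."""
    word = 0
    shift = 0
    for name, width in COMPACT_HEADER_FIELDS:
        value = fields[name]
        if value >> width:
            raise ValueError("%s does not fit in %d bits" % (name, width))
        word |= value << shift
        shift += width
    return word
```

The 12-bit limit on `identity_hash` is visible directly: any value above 4095 is rejected by `encode_compact_header`.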

At the end of the day, we were able to load in the raw data of all objects in the image, and to verify that all header information and pointers are correctly set. Today the image loader will be finished by instantiating one W_Object (or subclasses) for each of these image chunks.

Interpreter and Bytecodes

Armin, Adrian L, Lukas and Toon started implementing the bytecode dispatch loop and the context frame class. Once this was in place, they split into two groups and started implementing the bytecodes (most of which are there by now, apart from some which call primitives). At one point during the afternoon they managed to copy Squeak's bytecode for the Fibonacci function and run it.
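For readers unfamiliar with bytecode interpreters, the general shape of a dispatch loop over a context frame is sketched below. The three opcodes and their numbers are invented for this illustration and are not Squeak's bytecode set; the `Frame` class is likewise a toy, not the sprint's context frame class.

```python
# Hypothetical opcodes, for illustration only (not Squeak's encoding).
PUSH_CONST = 0   # followed by one operand byte
ADD = 1
RETURN_TOP = 2


class Frame(object):
    """Toy context frame: bytecode, program counter and operand stack."""

    def __init__(self, bytecode):
        self.bytecode = bytecode
        self.pc = 0
        self.stack = []

    def fetch(self):
        byte = self.bytecode[self.pc]
        self.pc += 1
        return byte


def interpret(frame):
    # Fetch one opcode at a time and dispatch on it until a return.
    while True:
        opcode = frame.fetch()
        if opcode == PUSH_CONST:
            frame.stack.append(frame.fetch())
        elif opcode == ADD:
            right = frame.stack.pop()
            left = frame.stack.pop()
            frame.stack.append(left + right)
        elif opcode == RETURN_TOP:
            return frame.stack.pop()
        else:
            raise ValueError("unknown bytecode %d" % opcode)
```

With these toy opcodes, `interpret(Frame([PUSH_CONST, 3, PUSH_CONST, 4, ADD, RETURN_TOP]))` returns 7.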

Primitives and Base Image

The trio of Niko, Oscar and Tudor began implementing the base image, including class objects. They implemented templates for the math operations. This part is more or less complete, except for promoting from small to large integers.
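One way to picture a "template" for the math operations is a small factory that registers one numbered primitive per operator, as in the sketch below. This is an assumption-laden illustration, not the sprint code: the `primitive` decorator, the `prim_table` dictionary, the primitive numbers and the 31-bit small integer range are all chosen for the example. Since promotion to large integers is the missing piece, an overflowing result simply makes the primitive fail here.

```python
import operator

# Assumed range of a 31-bit tagged small integer.
SMALLINT_MIN = -2 ** 30
SMALLINT_MAX = 2 ** 30 - 1


class PrimitiveFailed(Exception):
    """Signals that the caller must fall back to regular Smalltalk code."""


prim_table = {}  # primitive number -> Python function


def primitive(number):
    # Decorator that registers a function under a primitive number.
    def decorator(func):
        prim_table[number] = func
        return func
    return decorator


def make_arithmetic_primitive(number, operation):
    # Template shared by all binary small-integer operations.
    @primitive(number)
    def prim(receiver, argument):
        result = operation(receiver, argument)
        if not SMALLINT_MIN <= result <= SMALLINT_MAX:
            # No promotion to large integers yet: just fail.
            raise PrimitiveFailed()
        return result
    return prim


# Illustrative numbering; treat these numbers as placeholders.
make_arithmetic_primitive(1, operator.add)
make_arithmetic_primitive(2, operator.sub)
make_arithmetic_primitive(9, operator.mul)
```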

However, there are still plenty of primitives to go in the coming days.

Tuesday, October 23, 2007

Some sprint pictures

Here are some pictures of the sprint today (of course all sprint pictures are really pretty similar, it's people in rooms around laptops).

Above, people are (clockwise, starting lower left corner): Niko Matsakis, Armin Rigo, Toon Verwaest, Tudor Girba, Oscar Nierstrasz, Adrian Lienhard.

Above, Armin Rigo, Lukas Renggli, Toon Verwaest, Adrian Kuhn, Carl Friedrich Bolz (absent, taking the photo), Adrian Lienhard, Niko Matsakis, Oscar Nierstrasz.

This important picture is showing the attempt of teaching the Swiss German speakers to pronounce "interpreter" correctly.

How to check out and try PySqueak

To try it out, you need subversion and Python version 2.4 or 2.5. You first have to check out PyPy:

    svn co http://codespeak.net/svn/pypy/dist pypy-dist

Warning: this checkout is relatively big. If you only want to look at the sources, you can also browse them at https://codespeak.net/viewvc/pypy/dist/pypy/lang/smalltalk/.

The Smalltalk code is in the pypy/lang/smalltalk subdirectory; you can run its tests (72 so far) as follows:

    cd pypy/lang/smalltalk
    ../../../py/bin/py.test

Happy poking around :-)

Codespeak is back up again

Codespeak's repository is back up again and we imported the work that we did today into the pypy/lang/smalltalk directory.

Anonymous' questions answered

The questions that an anonymous commenter asked on yesterday's post are worth answering, so I decided to put them into a post of their own.

If someone could explain the differences between the three goal options in terms of what they yield, that would assist my understanding of PyPy no end!

* to implement a Squeak-bytecode interpreter in RPython

This yields an interpreter, on PyPy's back ends, that executes user level Squeak code (compiled in Squeak) -yes?

Yes. The interpreter will be runnable after translating it with PyPy's backends, but will also be executable on top of CPython, of course (which is slow, but good for testing). It seems we will be able to load (at least rudimentary) Squeak images.

The Squeak bytecode would be PyPy jitable -yes?

Very likely, yes.

Question: Would this interpreter necessarily need to implement the interpreter-objectspace separation (plus flowspace)? Or is this separation only required if we want to translate (including jit) Squeak bytecode?

No, we are not implementing this separation in the Smalltalk interpreter. The separation is specific to PyPy's Python interpreter because we want to use it to analyze RPython. The Smalltalk interpreter (as most other custom interpreters) will be jittable without this separation.

* to define and implement an RSqueak as PyPy frontend

This allows translation of RSqueak code to PyPy backends -much as slang does to C (but allowing more dynamism than slang). Yes?

Absolutely. Of course it really depends on lots of factors, so who knows, without trying :-). The RSqueak would have to be really object-oriented as opposed to disguised C (like slang).

One might then write an interpreter for Squeak in RSqueak -and the result would, ceteris paribus, be identical to 1 above in terms of capabilities. Yes?

Probably, yes. Of course for some Smalltalkers it will probably seem preferable, since they don't have to deal with this icky Python language too much :-).

This second option could also be used as a starting point for a completely Smalltalk-based toolchain.

* to write a Squeak backend for PyPy

So we would have a Python interpreter running on the Squeak VM? -executing Python bytecode on top of Squeak (which would be pretty slow) -or is there a way of getting from Python source to *Squeak* bytecode a la Jython/IronPython on JVM/CLR?

The former. And yes, the main reason for not doing this is because it would be too slow.

Codespeak repository down

The codespeak server is currently down, so it's not possible to look at the code at the moment (and also not at the images and talks linked to in the last post). The sprinters have switched to a non-public repository; we will post something here if the situation changes.

Monday, October 22, 2007

First day, Discussions

Bern Sprint, First Day, October 22

So, it seems customary for the Squeak community to use such incredibly advanced technology as RSS feeds and blogs to disseminate their sprint reports, which means us old-fashioned PyPyers just have to follow suit.

Today started the first ever PyPy-Squeak collaboration sprint, kindly hosted in Bern by the Software Composition Group. The sprint aims to bring people from two different communities together (not without the unavoidable clash of cultures; the first wars on zero- versus one-based indexing were already fought today) to learn about each other's projects and to explore collaboration possibilities.

Morning was spent setting up the sprint room and general computer infrastructure like Squeak's VMMaker and PyPy. It seems that bootstrapping Squeak is certainly faster than bootstrapping PyPy (and uses way less memory), but not without certain problems. Adrian Kuhn tried to distribute a USB stick containing a zip archive created under Windows. This turned out to be a bad idea, as the notion of execute permission is obviously not known to Windows but crucial for Unices. In the end we finally succeeded. The PyPy checkouts of the Squeak guys, on the other hand, went quite smoothly. Nobody tried to translate PyPy yet, though.

After lunch, Armin and Carl Friedrich gave various talks and demos for the SCG crowd (including some increasingly confused students). The first talk covered sprint-driven development in a two-slide presentation, including visualizations of software evolution; then we delved into the technical topics by re-giving our Dyla talk and demos.

After a break, we started brainstorming. We identified the following options as possible main goals for this week's sprint

* to implement a Squeak-bytecode interpreter in RPython

* to define and implement an RSqueak as PyPy frontend

* to write a Squeak backend for PyPy

Even though option 2 was favoured as a long-term goal, we decided to go for the first option. We felt that we do not yet understand the second option's trade-offs well enough to complete it. So for now we do option one, to gain insight into and a feel for the second option. We broke the first option down into smaller steps, as shown on this picture

Next Carl Friedrich quickly introduced the Squeak crowd to RPython, whereas Adrian Kuhn gave a quick and dirty introduction into the Smalltalk object model: "objects all the way down".

Finally, everyone gathered around the laptop of Armin, who started hacking the very beginning of said object model into a set of RPython classes, always writing tests first, true to Test Driven Development (while the rest of the crowd watched in amazement how the svn comments are announced live on PyPy's IRC channel). If you are interested in this very first prototype, please refer to the svn repo.

Please stay tuned for more news tomorrow.

Carl Friedrich Bolz and Adrian Kuhn
