Archive for March, 2005

Mar 31 2005

A little compiler

Published by matt under Uncategorized

During the last two weekends, along with some good help from Christian and Damian, I wrote a compiler.

The compiler takes a subset of the occam programming language, and compiles it to Transputer bytecode. SEQ, PAR, ALT, channel communication, and some basic arithmetic and logical expressions can be dealt with by this compiler. Written as “one to throw away,” the compiler was an exploration into some design decisions and their potential impact in a larger (full) compiler. Furthermore, it provides a starting point for the discussion of what languages and tools are most appropriate for authoring a compiler today.

A lab seminar and tech report on the compiler will follow shortly.

Related, but not quite, to this are a number of links I don’t want to loose track of. Mostly, they’re just the results of a Google search that this two-weekend project inspired. No doubt, they’ll inspire other searches as well.

Comments Off

Mar 30 2005

Zire 72?

Published by matt under Uncategorized

Lets do a little math today.

If I wanted to purchase a Zire 72 from the Palm UK store, it would cost £189. If I purchased the same product from Palm US, it would cost $300, minus a $50 rebate. Multiply $300 by .531276, and you get £159. Additionally, if you purchase in the US, you

  • Receive a 128MB SD card with your purchase (free)
  • Can mail in an old handheld for a $50 gift certificate
  • Can get another $100 off the handheld by joining Audible for a year ($150 cost, so this is a net $50 loss)

That is, the Zire 72 in the US is $250 (£132), I can get a $50 gift certificate for the Zire 21 I paid $40 for at Amazon, and can consider paying $50 for a 1-year subscription to Audible.com, which is an MP3 audiobook download site.

Or, I can pay £189 in the UK for the same device.

(This is, of course, assuming I even decide to upgrade.)

Comments Off

Mar 20 2005

More R/C aircraft

Published by matt under Uncategorized

So, I’m enjoying surfing the web for this. Given that I’ve flown twice in my life—once was a failed flight of a Kyosho Serenede, and the second was a powered foam delta wedge. In the second case, I had to claim I was an experienced flyer, but I did quite well, actually…

  • The Mugi Evo is a twinwall craft. This is made out of twinwall plastic/sheeting, and it’s a cut/fold/glue operation to build the craft. So, the cost (outside of the radio/servos) is minimal. I’m guessing they’re unstable in the air, and not an ideal choice for a beginner. That is, they’re probably relatively fast flyers and unstable in strong winds.
  • The Wattage Micro Flyer
    is just funny. It flies on an oversize capacitor (no doubt) for 6 minutes, and then you “recharge” it with the R/C transmitter/battery pack. Cute.
  • The Indoor Flying Object is an interesting little craft. Made of flexed carbon fiber and… kevlar, this little craft looks like it could be fun and take a beating. Granted, I imagine that it’s not an easy beginner craft. As an “acrobatic slowflyer,” I suspect that means “dynamically unstable, even at low speeds.” However, cool looking little aircraft.
  • A Carbon Falcon is an appealing style of craft in many ways—although, the rudderless design seems to imply some potentially dangerous flight behavior. It’s an appealing design because the carbon-fiber construction can be broken down and rolled up. As long as I’m living abroad, that has a certain appeal. That, and the kite-like construction is, apparently, nearly indestructible.
  • Handlaunch gliders look fun—limited in scope, challenging to learn, and certainly don’t require much infrastructure. However, once I start looking at gliders, I find myself lusting after a 2m sailplane… Looking at a few handlaunches (the Zuni, for example) and DLGs (discus launch gliders, like the Osiris) I’m impressed with the minimalism of the construction. DLGs are especially appealing to me, I must admit, as I threw discuss for 7 years, through middle/high school and college. Lots of links here and here.

Ah. Fantasies of flight. And, after watching a few videos of sport gliders flying at 60-80 MPH, I know I don’t need one of those!

Comments Off

Mar 20 2005

Toy

Published by matt under Uncategorized

I think I’d like a toy. In particular, I think a remote control powered parachute would be very cool.

Not that I imagine this in the immediate future, but… well, anyway.

Comments Off

Mar 16 2005

Earthly destruction

Published by matt under Uncategorized

This means it is time for bed.

Comments Off

Mar 14 2005

O(n) operations…

Published by matt under Uncategorized

And I quote:

“Dude, if your examiners ever find out you were surprised that O(n) operations, like indexing into a list, were slow, your PhD is totally revoked! ;-P”

In light of yesterday’s post, this is a fair comment, but let’s put it in context, yes? (Fear not, gentle reader: ninjas have already been dispatched to deal with the commenter. And, I think the commenter was just teasing me… but, the ninjas will torture that fact out of him at some point or another, no doubt.)

The Transterpreter project has, from the start, been intended to be as simple as it can be in execution and implementation. Our primary goal is to create a portable runtime that can be easily ported to any architecture we encounter. Therefore, throughout the development of the runtime, we have consistently made semantically clear but often computationally expensive choices.

Our architecture is, generally speaking, easily optimized locally, and in some cases globally. However, optimizing sooner (rather than later) is a good way to be left with code that A) you don’t understand, or B) is potentially inflexible or inappropriate in light of later extension. In the long run, our goal is to support an evolving programming language; how do you design tools to support that? Guy Steel gave a brilliant talk on this at Indiana while I was there, and this short article on the Sun website touches on some of the issues involved. In short, the number of factors involved in supporting a growing and evolving programming language are many, and complex.

So, no, we weren’t surprised as such that indexing into a list was expensive. Rarely have I written code where the time/space difference between referencing into a linked list is significantly different than indexing into a vector. If anything surprised me, it was the immediate and obvious disparity between the two, and how (as a Scheme programmer) I often completely forget about the underlying performance concerns. Put another way, I am more often concerned with correctness than performance, or am writing code that A) is evolving rapidly and B) has limited application. Therefore, a list (which is, in a dynamically-typed language like Scheme) becomes a “universal type” that gives me flexibility and freedom to evolve interfaces quickly and without large changes to my core program logic.

In the end, you could argue that we should have known better from the start. However, you could also argue that we’ve used appropriate tools in appropriate ways to get us quite far into a project that otherwise might have become bogged down in details long ago. It is flexible, runs most everywhere, and while it did have a 60x speedup lying around, that speedup took … two, three hours of time to realize fully? This implies that our design was simple and clear enough that we could pluck that 60x off the tree as low-hanging fruit, and not a major architectural overhaul. While some of our local choices may have been naive, our global, or architectural choices have been, I think, sound.

Comments Off

Mar 12 2005

The Power of Little Languages, Good Design

Published by matt under Uncategorized

For over a year now, I’ve been collaborating with Christian on a lightweight and portable runtime for the occam programming language which we call the Transterpreter. This has been a fun project, because occam is a clean little language, and I enjoy architecting and writing compilers, interpreters, and similar kinds of software.

We wrote our first version of the bytecode interpreter in Scheme. Because we were reverse engineering the Transputer (a processor that ran compiled ocacm bytecodes) from a variety of sources, we felt that we had enough concerns without worrying about, say, memory management or pointer arithmetic. Our first version was slow, but it was clear, and the first version of the rewrite into C was almost a line-for-line translation.

Early in the project, Christian had written a Python script to clean up the bytecodes emitted by the existing compiler. It did the job, but with the rewrite into C we began thinking about the infrastructure for supporting a virtual machine that should be able to run anywhere. The occam compiler runs under any *NIX-like environment, and can be compiled for Windows using Cygwin or an old Borland toolchain, so any extensions to the toolchain should be at least as portable, if not more-so.

We considered a number of alternatives, but because of my experience programming in Scheme, we decided that PLT Scheme would be the implementation platform of choice. PLT Scheme (which encompasses MzScheme and DrScheme) runs damn near everywhere; there is excellent support for executing Scheme programs (interpreted or bytecode compiled with a run-time support library) on Windows, Mac (OS 9 and X), most flavours of *NIX, and even some PDAs. This way, the backend of our toolchain would be very portable, and let us program in a clean, well-understood and well-used language.

So, the slinker was born. Although, it is more than a linker—it’s like a compiler backend-ish, really. It started off with a 15-20 micropass (pipelined) architecture (inspired by this work), has grown in reasonable ways, currently needs some cleanup, but the architecture has proved to be flexible enough that we’ve been able to accommodate interesting additions without having to do ugly things like, say, scrap the codebase. We successfully designed for extension, and that’s good.

Unfortunately, when we started running the CG test suite, we suffered; the slinker was taking as much as 20-30 minutes on some tests. The compiler took seconds, the slinker took half-an-hour, and then running the test took a few seconds. This was, we agreed, bad, but optimization takes time, and we didn’t have that. We did have machines laying around that could run the test suite over night, which was good enough.

An exercise in optimizing Scheme

Yesterday, we started optimizing. Now, I learned some interesting things (I should have known) about PLT Scheme in the process. For example, we had a loop that looked like

(while (

This code was killing us. Slaughtering us. On 150,000 element lists, we would run for 200,000ms or more (and this was a large, but not huge, bytecode array). All the other passes in the slinker ran in 0 to 60 ms, and the DrScheme profiler was really quite clear that our problem was buried in this loop---not elsewhere.

This unrolled to a 'named let' in Scheme, which should be every bit as efficient as a for in most other languages, so we assumed our loop was not a problem. Or, put another way, we had perfectly fast code elsewhere that used the same macro, so that wasn't the problem.

Immediately, I decided the consing to build a list was certainly a problem. So, we got rid of that. To find out what was a good structure to use, I wrote a little "benchmarking" script to test insertions into vectors and hash-tables. It was faster to do an insert into vectors, so the structure build-up was modified.

(while (

Now, this code ran faster. We got a barely perceptible speedup at this point. This struck me as odd, as we had just benchmarked this, and clearly we should have seen a large speedup.

So, that wasn't the bottleneck. Keep in mind now, we only have a few things going on in this loop:

  1. The test (,
  2. A vector-set!
  3. A list-ref on the bytecode list
  4. A variable increment (the set! on pos)

Lets eliminate a few things right now. First, the increment on the variable is not the source of the problem. The vector-set! had just been benchmarked, and we revisited our benchmarking script to verify, but decided that we certainly were OK on the vector insert. The test, also, isn't a problem: even with dynamic typing, the check to see if one integer is smaller than another is still fast. Which left our list lookup.

We achieved a 5x speedup by changing one data structure and one function call in our code above.

(while (

See it? vector-ref instead of list-ref. Why? I haven't looked at the MzScheme source yet (I will, as I'd like to know what's going on) but I have a good idea as to why this accounted for a massive speedup in our code. Vectors are probably treated as contiguous locations in memory, and so the entire structure is allocated on the heap in one big chunk. When you operate on it, and in particular, reference into a vector, it's a simple index into memory from the head pointer.

Lists in Scheme are different; they're a bunch of cons pairs (or, they can be). I suspect that MzScheme handles lists very differently from vectors, and when we build a list with 150,000 elements, the cons pairs end up all over the heap. Referencing into the list is certainly not a constant-time operation; in fact, we were being badly, badly abused by our references into the list. I have no idea how many trips out to main memory had to be made to reference into element 126449 of our list... perhaps dozens.

This morning, Christian worked on a few other passes that we suspected had similar ugliness---as they, too, took many minutes to run. Now, our largest test case, which previously took 30 minutes to link, now links in approximately 30 seconds. That's a 60x speedup.

Lessons learned

I don't think we've done anything wrong here; we wrote simple, clean code to start that allowed for flexibility while developing. When we finally reached a good place to stop and look at the entire project, we made some key optimizations that, perhaps, could have been implemented from the start. However, early in the project, speed didn't didn't matter. Good design and sound implementation mattered. The software worked. And, we think, it continues to prove itself as extensible and maintainable.

And, now, we see that our linker is easily optimized as well. Our entire compiler test suite of several thousand cases runs in 30 minutes, and perhaps even less before the day is over. A few days ago, we had to run it overnight because it took so long; now you can start a run before going to lunch, and have results when you get back. Sweet

But yes. I believe it's true: premature optimization continues to be the root of all evil.

(PS. I'll drop a note here when I take some time out to look at how MzScheme handles vectors vs. lists; that was an interesting revelation.)

Comments Off

Mar 09 2005

Chaos

Published by matt under Uncategorized

I’ve missed two meetings already this week. The chaos of moving has not been kind. I think the chaos is exacerbated by the fact that I returned from two weeks travel three days before moving.

However, I’m doing well on my daily writing regime. For the moment, I’ll console myself with the fact that every day (typically morning, occasionally evening) I’ve sat down and written several pages of the dissertation. If I keep writing, every day, it will get done.

It still doesn’t make me feel better about missing meetings, though.

Comments Off

Mar 07 2005

Missed meeting

Published by matt under Uncategorized

I was very upset to miss the LISP meetup this weekend, where I could have met Bill face-to-face. Bugger, as they say over here.

Travel in the SouthEast has been miserable with the snow that hit, as there is absolutely no infrastructure for dealing with it. The third-rail system (where the train draws power from a third rail on the ground for power) tends to give up the ghost in all kinds of weather. For example, in the autumn, leaf accumulation can bring trains to a halt, as they have no way of obtaining power…

So, the snow and ice did a number on travel in Kent, and I was unsure and unable to verify if I’d be able to get up and back to London. Given that I had to move out of my place Sunday, and I couldn’t get guarantees, I was in a bad way.

Next time, Bill.

In the meantime, us Schemers have a new xml-match macro to play with thanks to Jim Bender’s fine work. I downloaded it this morning and took it for a quick spin. I must admit, the quasiquoted syntax made me feel like I was back home programming in the IUB matcher. Thanks, Jim!

Comments Off

Mar 04 2005

Re: On syncing AirPort Expresses

Published by matt under Uncategorized

Sorry, Geoff… I don’t have my RSS reader set back up yet, and was travelling, so generally was out of touch.

Over at Tip of the Quill, Geoff asks “[Why can't a] utility be created that sends a four-note pattern to each set of AirPort Express boxes in turn, hooks into the microphone on the machine and autolags the signals until the four beat patterns match up, then automatically caches the video or plays the audio a second or two ahead of the video, as they mention above?”

It could. However, there’s probably a few technical problems that get in your way:

  1. Streaming to the Airport Express likely involves recoding the audio stream in Apples loss-less codec. This is a computationally expensive operation. Streaming to multiple Expresses at the same time is probably not (generally) feasible, and certainly doesn’t scale well. Somewhere, something has to give (CPU limits, bandwidth on the wireless network, etc.)
  2. Autolagging would need to be done by each individual app. Just considering a few obvious pieces of software with this problem (VLC, MPlayer, iMovie, iDVD, Quicktime Player, Apple DVD Player), how do we solve this problem for all the apps? Each needs the ability to play video ahead of audio if you’re going to stream to a lagged audio device. Apple DVD player almost certainly was not written to do this: it was instead written to simply read, buffer, and output DVD content. Do you really think Apple is going to modify all their production apps to be flexible in this way?

In short, most apps assume that when you ask for an audio and video event to take place at time X, you get them both at the same time. As a result, most software is not written to to separately request audio events at time X, and video events at time X+1. Why, exactly, VLC has this feature, I don’t know.

There are a few other solutions that they hadn’t considered over at Rogue Amoeba, however, that solve these problems; in some cases, they cost as much as the software, and most all of them cost less than the cost of an Airport Express and AirFoil:

  1. Use an audio cable. This lets you transmit audio from your Mac over long distances without lag.
  2. Use a splitter. This lets you transmit one audio stream from your Mac to several different audio output points.
  3. Use an amp. Combined with a cable (#1), you often get the features of a splitter (#2), but it is more robust, and often lets you mix-and-match the output devices you are using (eg. “Speaker Set A”, “Speaker Set B”, and so on).
  4. Use an FM transmitter. Radio travels, by-and-large, at the speed of light, being electromagnetic radiation and all. Combined with an amplifier (#3), you can get wireless operation and a selection of output devices.

OK. So now I’m being a full-on smartass. However, my point is that the Airport Express is a useful gadget: it provides a base station, wireless printer port, and wireless 1/8″ audio jack in one little package. Streaming the audio portion of video was not one of the use-cases for the device, clearly. Perhaps Steve Jobs is going to do a “one more thing” with the next rev, selling us on how great it is that they fixed this problem, and he won’t even admit that the first version is, for an obvious use case, broken.

In another 6 months, someone will release a version that lets you stream audio over HTTP (or some other open, well-established protocol; it will be a relatively open knock-off of the AE), and will likely be fast enough to let you do just what you want: stream any app to the remote station without lag.

I don’t know if any of that helps.

Comments Off

Next »

  • Ancient History

  • Meta

  • CC License