Archive for February, 2007

Feb 28 2007

The sorry state of embedded development under Linux

Published by matt under Uncategorized

This… this is a rant.

It is incredibly frustrating being a Mac and Linux user if you’re interested in embedded systems development, particularly if you are working with the MSP430, one of the only viable parts if you’re really, really concerned about power consumption. There simply aren’t, at this time, any parts (that I know of) that come close in terms of power-saving modes.

On the Mac and under Linux, it is possible to build the MSP430 GCC toolchain. That’s a big step, and it integrates nicely into build systems like make and Scons. However, that’s only part of the problem. Once you’ve built the executable for your embedded platform, you need to make the code go from the PC to the embedded device.

By far, the most common way for this to take place is via JTAG. This multi-wire protocol provides a mechanism by which the flash in these little processors can be rewritten from the host PC, as well as (if the tools are available) in-circuit emulation. Of course, both of these assume certain hardware and software connected to and on the host PC. The biggest problem I’m running into right now is that the JTAG programmer MUST be USB-based, or I can’t use it—because my Mac doesn’t have a parallel port, so even if I’m going to build a Linux virtual machine, it still needs to be a USB-based solution.

For example, both Rowley and Softbaugh produce USB-based MSP430 JTAG programmers; neither provides support for Linux. They have Windows XP and 2000 drivers, but they don’t support any open-source platforms; I guess embedded systems developers don’t use Linux. (As long as you’re paying £150 for a programmer, you might as well pay for your OS as well.) I’m not having much luck discovering which platforms the MSP-FET430UIF does or does not work with; are there Linux drivers? Aren’t there?

The problem, quite simply, is that we are an open-source project that builds under Windows, Mac, and Linux; somehow, I’d like to easily build the MSP430 version of the Transterpreter and install code on a development board. Currently, I’ve been working on the Tmote Sky simply because it has USB—but that’s not acceptable looking forward. And, more specifically, I travel a fair bit—I have to have a mobile solution that can work from my MacBook.

It seems to me that most embedded systems developers must not travel much, and have a big workbench where everything is set up just so for the work that they do. These developers all use Windows, and they pay a lot for their tools—because they are not trying to do embedded development under the auspices of an open-source research project. They are, instead, charging what their time is worth, and they can (therefore) afford the tools and bench space to do what they need to do.

I can run a Windows virtual machine on my Mac, and I’ll have to; it will run the lite tools provided by Softbaugh, and I’ll be able to program MSP430 boards using Tom’s (Windows only) programmer over the Texas Instruments flavor of JTAG. (Why, oh why, do embedded developers put up with the lack of standards on flash programmers? Every single chip should be programmable and inspectable over the same protocol.) I can mount my MacBook via Samba, and my development process will look like this:

  1. Edit source on Mac. (Using freely available, open-source tools.)
  2. Compile on Mac. (Using freely available, open-source tools.)
  3. Flip to Windows, and upload the code to the MSP430. (Using expensive, closed-source tools.)

Rinse and repeat.

If anyone has any thoughts on how we can improve this situation, please let me know. It seems to me we should build a device that has USB on one side, a 20-pin header on the other, and we build cables for each chip family we want to support. There is one protocol on the USB side, and the MPU in the middle of those two connectors handles the JTAG translation for each device family on the other side. Then, we demand that TI, AVR, and every other embedded chip developer provide the driver for that device that sits in the middle, and nothing else. If they don’t provide that software, we don’t use their product. This kills the JTAG programmer market (proprietary hardware that is often tied not only to a single OS, but also a single compiler), but that’s fine by me.

This is the end of this rant… for now.


Feb 27 2007

Ketchup

Published by matt under Uncategorized

I’ve won an award! My housemate Poul has decided I am an excellent Ketchup Shopper!

20070227-Ketchup

I’m going to go find out what it is…

Goes downstairs, outside, inside, and back again…

They’re Greenfoot t-shirts! SIGCSE must be coming up…


Feb 23 2007

A DOSterpreter?

Published by matt under Uncategorized

Ohthehumanity-1

Although it sounds odd, Damian built a DOS version of the Transterpreter last week.

Oh, the humanity!

This may sound wacky, but the reasoning is sound. We have a large suite of tests that we run the Transterpreter through on a regular basis. Actually, we use buildbot to run all of the tests anytime someone checks code into the trunk. However, this only runs the tests on “big” platforms—it runs them on a Linux box or two, and a Solaris/SPARC machine. We cannot, at this time, automatically run all of our tests on an embedded 16-bit target; while we do have a development board or two we could dedicate to this task, I’m not interested (right now) in doing the scripting to upload code over a JTAG port, run the tests, and check (on the device) to see if we are doing the right thing. It’s a lot of scripting.

H8300

The H8/300, the processor at the heart of the LEGO RCX

So, when Jon was having a hard time with the LEGO RCX recently, it became clear that we really needed to run our tests on 16-bit platforms. This would help shake out whether or not the compiler, linker, or run-time were having a hard time when there were fewer ones and zeros running around. As a result, Damian set up the Open Watcom 16-bit DOS compiler, built the VM, booted up a FreeDOS image in QEMU, and ran a bunch of the tests through the VM. We passed all the tests we could easily run, and are working to get the rest compiling. It turns out we need to build some libraries that we’ve never needed before…

So, yes, the Transterpreter can run in 16-bit mode in DOS. There are a number of features that are untested at this time, but hopefully we’ll find a way to automate the testing of the VM under the FreeDOS image.


Feb 23 2007

The iLiad Improves

Published by matt under Uncategorized

Slowly, the iLiad continues to improve.

I can now rotate an A4, zoom, and set my viewer to a continuous scroll. This means I can easily zoom in to half of the A4 (that is, A5), and then thumb through the document. This is almost good enough for viewing research content (articles from the ACM and so forth).

And, I now can annotate PDFs.

20070223-Iliad-Annotations

The PDF annotations are saved on the iLiad in the same place as the PDF; no doubt I could write the program that merges those annotations into the PDF myself, and I may look into it. However, for the moment, I’m content that it saves my annotations, and someday those annotations will be automatically merged into my PDF.

Carrie and I are off to London for the weekend; we’re seeing Patrick Stewart in The Tempest. Hence, I need a copy of the Tempest to read on the train on my way up to London. Whee!


Feb 21 2007

Floating point is in da house

Published by matt under Uncategorized

I have a backlog of things I can write about in this space, so I’ll do my best to do a bit of catching up.

To start, Damian has been working on floating point for some time, and we now have full floating-point support in the runtime on large, little-endian targets. To be fair, the Transterpreter has always supported floating-point operations from day one. Unfortunately, it has carried out floating point operations in emulation. So, whenever the Transterpreter did floating point operations in the past, it would be emulating floating-point support and be doing all of the operations using integer maths at the interpreted level.

This was slow.

Words like molasses, glacial, and election reform come to mind as good words to describe the pace at which we used to carry out floating-point operations in the VM.

Now, due to Damian’s efforts, we have full floating-point support in the VM. This was one of the few remaining parts of the instruction set that had not been implemented yet. Now, the floating-point instructions that were introduced in later generations of the Transputer (T8 and above, I think?) are supported natively by the runtime, and we don’t have to go out for coffee every time we touch a number with a decimal point.
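For a sense of what "emulation" means here, this is a rough sketch of a single-precision multiply done purely with integer operations. It is not the Transterpreter's code; the function names are made up for illustration, and it cuts corners (normal numbers only, truncation instead of rounding, no zero, infinity, NaN, overflow, or underflow handling). Even so, one multiply turns into a dozen or so integer steps, each of which an interpreter has to dispatch.

```python
import struct


def float_to_bits(x: float) -> int:
    """Reinterpret a number as its IEEE 754 single-precision bit pattern."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision number."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def soft_mul(a: int, b: int) -> int:
    """Multiply two single-precision bit patterns using integer maths only.

    Illustrative and simplified: normal numbers only, result truncated.
    """
    sign = (a >> 31) ^ (b >> 31)
    # Exponents are stored with a bias of 127; adding them counts it twice.
    exp = ((a >> 23) & 0xFF) + ((b >> 23) & 0xFF) - 127
    # Restore the implicit leading 1 of each 23-bit fraction.
    mant_a = (a & 0x7FFFFF) | 0x800000
    mant_b = (b & 0x7FFFFF) | 0x800000
    product = mant_a * mant_b  # up to 48 bits wide
    if product & (1 << 47):
        # The product is 2.0 or more: shift one extra place and bump the exponent.
        product >>= 24
        exp += 1
    else:
        product >>= 23
    return (sign << 31) | (exp << 23) | (product & 0x7FFFFF)


print(bits_to_float(soft_mul(float_to_bits(1.5), float_to_bits(2.0))))
```

With native support, all of that collapses into a single floating-point instruction on the host.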


Feb 16 2007

The Busy Writer: Backups (followup)

Published by matt under Uncategorized

Tom responded to my post on keeping backups; between the two posts, I think there’s a nice combination of information. As he points out, my comments are quite technical. I do my best not to get caught in the details, but it sometimes comes with the profession. It is also interesting to compare the posts: Tom’s is more discursive, while mine was more analytical, focusing on the details and mechanisms by which you could back up your work, and very little on how those mechanisms might fit into a writer’s workflow.

I like the combination of the two posts. I think we could continue by exploring just backup strategies further, but it is probably best that we move on, unless there are questions and comments from People Out In The World. Between Tom’s post and my post, there is a lot of information to work with, and a lot of fertile ground for questions to be born in.

… actually, on further thought, I think there is a prescriptive post to be done between these two. As it stands, we’ve left the Busy Writer to figure out, for themselves, how to create their own (good) backup strategy. That just won’t do. So, I’ll go ahead and give you a bit of a prescription for keeping your data safe. If you don’t have a backup strategy, this is a good starting point—you can read Tom’s account of how he keeps his data safe for some additional insight into how this fits into his workflow as a writer.

20070216-Backup-Flowchart

  1. Save all of your work in one place

    What does “one place” mean? It means you have a single folder on your hard drive called “work”. Everything you do that matters to you, in any way, shape, or form, goes under that folder. For example, you are allowed to create as many folders inside of your “work” folder as you like… but they can’t go anywhere else on the hard drive. You see, if you have more than one folder that matters to you, you’ll have to remember to back them all up. And any time that you enter the equation, there is a big point of failure. So, one folder for all of your work.

  2. Create a DVD every (week, month)

    How much do you care about your work? A little? A lot? Regardless, back your work up to a CDR or DVDR every week. You should make two copies, and one of them should be taken Somewhere Else. Keeping both at home doesn’t count as a “backup”. You can, if you like, stop right here and do nothing else, as long as you are making weekly copies of your work, in duplicate, and keeping one DVD at home and one at work (or Somewhere Else, wherever that may be). Keep in mind, you can still lose seven days’ worth of work if you are only keeping a weekly backup.

    If you only want to make DVD backups on a monthly basis, then you MUST read on.

  3. Set up off-site, on-line backup

    Your data will not be “safe” if you buy an external hard drive and make copies to it. That hard drive will be sitting next to your computer, and will be destroyed in the same fire as the PC, or it will be stolen by the same thief who wants your laptop. Therefore, the only way to protect your data is to get it away from you!

    You MUST subscribe to an online backup service. Tom points to two different services that I’ve heard good things about: Mozy and Carbonite. If you are a bit more technically minded, you might consider Bingo! or Amazon S3 via either JungleDisk or S3 Backup. If you are not 100% comfortable researching these options yourself, ask a friend for help. Hell, pay someone if you have to. Your work is worth far more than paying a local 16-year-old $20-$50 to help you out. At that rate of pay, you can even require that they work with you to produce written documentation that guides you through every step of the process. It will be educational for them, and invaluable for you.

    Off-site backup is not optional if you care about your data. And, it MUST run on a daily basis.

  4. Test your backups

    This is the hardest part. You must, on a regular basis, make sure that your backups are working. At the least, once a month, download one or two folders from your backup. Make sure that they’re intact. You should do this, though, after you have finished making a DVD. That is, if you screw up, you could (inadvertently) download an old version of something over a more recent version. This would be bad. Therefore, on a monthly basis (after making your DVD backup of your work), do a quick check to see that your daily, off-site backup is behaving the way you expect.

This post is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 License.


Feb 14 2007

The cost of backups

Published by matt under Uncategorized

Yojimbo-Crumb

I recently started using Yojimbo for managing bookmarks, PDFs, and other things that I find around the net. It seemed like a good choice of tool for the quick archival of websites and data that I’d like to be able to find later.

Unfortunately, my Yojimbo database has already grown to 50MB in size. Now, this is not such a problem—I have a 160GB disk in my MacBook. I could have 3000 of those files on my hard drive, and still have space for… well, a few things. However, my commitment to backup makes me realize that 50MB is really quite substantial when your backups run over a slow broadband connection. By “quite substantial,” I mean it takes 30 minutes and costs 1 US cent.

The problem is, if I add one bookmark to Yojimbo on a given day, it will force the upload of the entire database. This means that adding one bookmark requires me to copy 50MB to Amazon’s servers, and they’ll charge me a penny. If I add one bookmark a day, it will cost me roughly $4 in upload costs over the year (assuming the database doesn’t grow substantially in size, which is a bad assumption).
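The arithmetic is easy enough to check. This little sketch assumes a transfer price of $0.20 per GB (inferred from the one-cent figure above), 1000 MB to the GB, and a database that stays at 50MB all year, which, as noted, it won't.

```python
UPLOAD_PRICE_PER_GB = 0.20  # US dollars per GB uploaded (assumed rate)
DATABASE_SIZE_MB = 50       # current size of the database


def upload_cost(size_mb: float, price_per_gb: float = UPLOAD_PRICE_PER_GB) -> float:
    """Cost in dollars of uploading size_mb megabytes once."""
    return (size_mb / 1000) * price_per_gb


daily = upload_cost(DATABASE_SIZE_MB)  # one full re-upload per day
yearly = daily * 365
print(f"per upload: ${daily:.2f}, per year: ${yearly:.2f}")
```

Every extra 50MB of database adds another penny a day, so the real figure only goes up from here.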

I dropped a note to Bare Bones Software to see if they would consider storing data in a different way to accommodate this kind of backup pattern. More likely, I’ll just have to create a weekly backup schedule for things like the Yojimbo DB, and accept that if something happens to my machine, I’ll lose a few days’ worth of data ninjaing.


Feb 13 2007

The Busy Writer: Backups

Published by matt under Uncategorized

This is the second in a series of posts where Tom Colvin and I explore how The Busy Writer can make their process more robust in the face of uncertain technology.

Scenario

In this series of posts, I’m going to take Tom’s situation as a case study (unless some other writers chime in with scenarios they feel are substantially different):

For several years, I’ve been researching, and now finally writing, a rather huge, little-known story about a scientific/medical expedition sponsored by King Carlos IV of Spain. While I’ve been writing professionally all my life, I’ve never attempted anything of this scale before.

And Tom’s specific question regarding data backup:

How to be sure I’ve backed up all my research and writing, which, without vigilance, gets scattered all over my hard disk and into some online repositories.

Here, I’ll explore the philosophy of a good backup strategy, a series of increasingly robust solutions to the challenge of backing up digital data, and leave some teasers for things that are actually tricky to archive that I might come back to in a later post.

Backups are critical

In reply to my last post, perhaps the most critical question Tom asked about the writing process in the digital age has to do with backups. Your computer is no more reliable than your backup strategy. If you have no backup strategy, then your data is toast when the computer is toast. Crying, at that point, is a good strategy.

I don’t know exactly what kind of data lives on a writer’s computer, but I can guess. There are documents in a variety of formats, be it Word, or plain text, or AppleWorks, or Pages, or Framemaker, or Pagemaker, or InDesign… any of a host of possible applications generated the content that lives on The Author’s computer. Furthermore, there are webpages and PDFs that are saved all over to support that writing process, all references and notes of one sort or another.

What makes for good backups?

The first rule is they MUST be automatic. Why? Because you cannot forget to do automatic things: the computer remembers for you. If, every day at 9PM, your computer backs up all of your critical data to a Magic Archive (a magical computer Somewhere Else where your data is safe for all time) you can go to bed early and sleep soundly knowing that you aren’t in danger of losing any work from that day. (Somewhere Else is a magical place that is removed from the worries and dangers of the world as we know it.)

Second, they MUST be off-site. If your house burns down, and your backups are in your closet… you didn’t achieve anything. Tornados, hurricanes, floods, locusts… anything that can destroy your computer can destroy your backups as well (if they are in the same place). Therefore, having a backup of your data that lives in the same room as the computer is only a partial solution, and doesn’t really represent a good backup strategy.

Third, good backups MUST be redundant. A single backup is unsafe for a variety of reasons, and therefore redundancy is one of your best strategies for ensuring the recoverability of digital data. Let’s pretend you have your computer and one backup, perhaps a CDR or DVDR in a safe-deposit box in a bank Somewhere Else. Although Somewhere Else is technically safe from the dangers of the world, your DVDR might have been faulty from the start. Therefore, little did you know that your data was actually never backed up properly: there are errors in the DVDR that you created. So, when your computer crashes, and you send for your DVDR, you can’t actually restore.

Finally, a good backup strategy MUST be tested. This means two things. First, you should test the integrity of the data after it is backed up. Second (and perhaps more importantly), you should know you can recover your workspace from the backups that exist. For example, you could print all of your data to paper as zeros and ones, and your recovery strategy would be to type those zeros and ones back into the computer. This would be fairly robust, if you used good paper, with good inks, and stored the paper under ideal conditions. However, recovery would take decades.
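As a rough sketch of what an integrity test might look like, the following compares a folder against a copy restored from backup, file by file, using SHA-256 checksums. The function names are my own, and it assumes you have restored the backup into a separate folder (never on top of your live work).

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(original: Path, restored: Path) -> list:
    """List files that are missing from, or differ in, the restored copy."""
    problems = []
    for src in sorted(original.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original)
        dst = restored / rel
        if not dst.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif file_digest(src) != file_digest(dst):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

An empty list means every file in the original has an identical twin in the restored copy. Files you edited after the backup ran will show up as "differs", so this works best right after a backup finishes.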

Strategies: From quick-and-dirty to somewhat-robust

In light of these requirements for safe and reliable backups, I think there are several ways for The Author to proceed with backing up their data. I’ve used many of them at one point or another, and will try and highlight my frustrations with each as I go. I’ll also discuss the requirements in light of each different strategy, because it is often the case that automatic, off-site, redundant, and tested backups are difficult for us wee human beings to achieve on our own.

CDR / DVDR

Tom can back his data up to CDRs and DVDRs. They’re cheap, they’re easy to obtain, and there are lots of potential problems with them as a storage medium.

First, they’re not automatic. You need to load a disc into your computer and then tell it to burn a backup of all your data. Second, it is difficult to move the discs off-site. You can easily burn multiple discs to make things redundant, and testing of your strategy is difficult at best.

Part of Tom’s problem in using CDRs and DVDRs is this statement regarding how his data “… without vigilance, gets scattered all over my hard disk and into some online repositories.” If your data truly gets scattered everywhere, then there is only one acceptable backup solution: back up everything. If you cannot be sure that you didn’t save things in a weird, random place, then you must back up your entire computer, always, to be sure that you didn’t miss anything that is important. In this regard, we cannot use DVDRs; although they hold 5GB of data, a typical computer hard drive now is 80GB or more. Even if the computer only had a 40GB drive, it would still require 8 DVDs to back it up entirely, or 16 DVDs to back it up redundantly.
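The disc counts are simple to work out. This sketch uses the round 5GB figure from above (a single-layer DVD-R really holds about 4.7GB, so actual counts come out slightly higher); the function name is just for illustration.

```python
import math

DVD_CAPACITY_GB = 5  # round figure used in the text; real discs hold about 4.7 GB


def discs_needed(data_gb: float, copies: int = 1,
                 capacity_gb: float = DVD_CAPACITY_GB) -> int:
    """Number of discs needed to hold data_gb of data, copies times over."""
    return math.ceil(data_gb / capacity_gb) * copies


print(discs_needed(40))            # one full set for a 40GB drive
print(discs_needed(40, copies=2))  # two full sets, for redundancy
print(discs_needed(80, copies=2))  # the same for a typical 80GB drive
```

Passing `capacity_gb=4.7` gives 9 discs rather than 8 for the 40GB case, which only makes the point stronger.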

If we assume a little bit of vigilance, things get better. For example, Tom could do some organization of his data, creating a system that might look like this:

writing /
    20060315 /
        Article /
            Files for Article ...
    20070101 /
        Book /
            Files for Book ...
research /
    20070214 /
        Stuff found and saved on Valentine’s Day 2007 ...

First, he could create a series of dated folders, where each project ends up in a folder that is dated with the day it was started. Likewise, the research supporting a project is dated as well. This might not work in practice, exactly, as the supporting research for many projects will get scattered throughout all time. However, not all research is specific to one project… therefore, there is no one, good taxonomy for all of this supporting data. I’ll come back to this later.
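If typing dates by hand sounds like exactly the kind of vigilance that fails, a few lines of script can do it. This is only a sketch of the layout above; the function name and arguments are my own invention, not part of any tool.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def new_dated_folder(root: Path, category: str, name: str,
                     started: Optional[date] = None) -> Path:
    """Create root/category/YYYYMMDD/name and return the new folder's path."""
    started = started or date.today()
    folder = root / category / started.strftime("%Y%m%d") / name
    # exist_ok means running this twice on the same day is harmless.
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

For example, `new_dated_folder(Path.home() / "work", "writing", "Book")` would create a folder for a book started today, under a single top-level folder that is easy to back up.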

At the least, though, if all of the writing and supporting material is under one or two folders on your computer, then you know that backing up the “writing” folder and the “research” folder captures everything you can’t live without. This is the minimum level of vigilance that you really need to get to. Now, things like bookmarks in a browser are problematic… but I’ll include those in due time.

Typical CDR/DVDR Backups

Criterion   Assessment
Automatic   No
Off-site    No
Redundant   With Effort
Recovery    Likely tedious
Cost        Drive ($20-$50) + $0.05/GB

(Cost based on ukdvdr.co.uk, 20070215)

Experience Report

I have a lot of backups on CDR/DVDR. These are hard to search and hard to manage. One of my projects for the next few months is to go through them, one at a time, and copy them onto my laptop. Then, I’ll synchronize that data to an online archive. Then, I’ll delete the data, and throw away the disk. The next time I want to “own” a copy of all of it, in one place, I’ll buy a hard drive (or whatever technology we have then), and copy it all off of the online storage site.

Regardless, these are hard to manage in the long run.

An extra hard drive

It is possible to set up automatic backups with an extra hard drive. You can buy one for cheap (compared to the cost of rewriting everything on your computer, from scratch, and from memory), plug it in, and have a second drive that is as large (or larger) than the one in your computer to begin with.

So, using software like that described at free-backup.info (for Windows) or at pure-mac.com/backup (for Mac), you could set up a nightly job that copies your entire internal hard drive to your external hard drive. You now have automatic, redundant backups, and can be pretty sure that you can easily restore from it, giving reasonable recovery even if you don’t test the process of main drive failure. However, when someone breaks in, they’ll take that external drive as well as the computer itself, so that doesn’t help you with the problem of having off-site backups.

Experience Report

Either way, an extra hard drive is a good step. If the main drive dies, you’re not out-of-luck. And, recovering should be straight-forward: get a new computer, plug in the external drive, and drag-and-drop stuff from your last backup. I say this from experience; when my Powerbook was stolen, I was fortunate to have a full disk image made with Carbon Copy Cloner from just a few weeks before. This, combined with some other tools that I use, meant that I lost no data that I could think of. It was still tedious to have lost the Powerbook, and I am still in the process of sifting through the external drive for things that I might want several months later… but it was some protection. Keep in mind, though, that the external drive was in the office, and the theft happened at home, meaning that it was technically an off-site backup.

Extra Hard Drive

Criterion   Assessment
Automatic   Yes
Off-site    No
Redundant   No
Recovery    Manual, straight-forward
Cost        $0.35/GB

(Cost based on pricewatch.com, 20070215)

Network Attached Storage (NAS)

Here’s one that’s growing in popularity: a special little box that you plug into your home network that just contains hard drives. In fact, some of these things are even wireless, and you just copy data to them.

NAS units are largely the same as the single, external hard drive, with one exception: you can get some redundancy from a good NAS box. You see, an external hard drive typically has one disk in it; a NAS unit can have two drives (or more), and those drives can be set up as mirror images of each other. This is great, because (in theory), if one of the drives in the NAS unit fails, you can go buy another, insert it into the box, and it will automatically recopy everything onto the new drive, leaving you with two drives (in one box) that have two copies of your data.

Obviously, you still have the unit in your house (not off-site), but you can stick it somewhere other than your desk, meaning that it is less likely to be found by thieves when they break in to steal things. And it is somewhat redundant; while everything is in one little box, you do have your data on two drives instead of one. This provides some protection, but trust me: there are still ways for both drives to fail at once, killing your backup solution.

Amazon.co.uk has a whole section of their store dedicated to network attached storage. For example, the Buffalo TeraStation is one unit that allows multiple drives to be grouped together and turned into mirror images of each other. There are, though, many others, and I am not about to go into all of the things you should concern yourself with when purchasing a NAS unit for your own use (not right here, right now anyway). For example, here’s one story of how a NAS might be used, including some off-site rotation. However, this requires commitment.

NAS (mirrored)

Criterion   Assessment
Automatic   Yes
Off-site    No
Redundant   Yes*
Recovery    Manual, straight-forward
Cost        $1.25/GB

(Cost based on Amazon.com, 20070215)

* By “redundant”, I mean “on more than one hard drive.” Obviously, it isn’t in more than one location in the world, so you can still lose everything in a house fire or tornado.

Online Backup: Bingo! and Amazon S3

There are lots of ways to back your data up online. Too many require too much complexity on the part of The Author. If a backup solution is going to work, it needs to be simple and straight-forward.

I see two viable ways of doing online backup right now: Bingo! and Amazon S3.

Bingo!

A solution using Bingo! would look like the following:

  1. Mount a network drive
  2. Use a backup program (Windows, Mac) to do an automatic backup of one or more directories on a nightly basis
  3. Sleep easier

This is a reasonably good solution; it costs a flat rate per year, and is certainly more robust than anything you can do yourself, in your house. That is, Bingo is using the Sun Fire X4500 series of data servers; this is basically a big NAS device, but it costs far more than you can afford; put another way, with 24 TB (where 1 TB = 1000 GB) of storage, an X4500 costs somewhere between $30,000 and $48,000. And to think, it doesn’t even have a 0-60 MPH figure you can quote to your friends at the bar…

So, the Bingo folks have spent big cash on good hardware, and are selling disk space. You can rent that space from them, and although it isn’t duplicated in more than one place around the world, it is far more reliable than any hard drive you can purchase for your home. And, the cost to you is cheap: $50/year for 25 GB of storage. This is probably more space than Tom needs to backup his critical documents and notes, but I could be wrong.

Using the space does look simple: you would right-click on “My Computer”, say “Map Network Drive,” and then enter the details you get from the Bingo folks. Likewise, on the Mac, you would do “Apple-K” from the Finder, and then enter the information. Then, if you use an automatic backup program, it can copy things to that networked drive just like there was a second drive sitting on your desktop. That’s the point, of course: using the networked drive should be that easy.

Both backup and recovery are slow compared to a local hard drive; it must copy all your data over the Internet. Depending on your connection speed, this could take days the first time you do a backup. However, once you’ve done an initial backup, most backup programs will only copy things that change. This kind of incremental backup is quick, and often you’ll have only changed a handful of backed up documents in a given day, meaning that the end-of-day backup will only take a few minutes, even over the network.
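To make "only copy things that change" concrete, here is a minimal sketch of an incremental copy. It assumes the network drive is mounted as an ordinary folder, and it decides a file is unchanged if the backup copy has the same size and is at least as new. Real backup programs are more careful (this one never removes files you have deleted, for instance), so treat it as an illustration rather than something to trust your book to.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, destination: Path) -> list:
    """Copy new or changed files from source to destination.

    Returns the relative paths of the files that were actually copied.
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = destination / rel
        if dst.exists():
            s, d = src.stat(), dst.stat()
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue  # unchanged since the last backup; skip it
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(rel)
    return copied
```

The first run copies everything; later runs return only the handful of files touched that day, which is why the nightly job finishes in minutes.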

Bingo!

Criterion   Assessment
Automatic   Yes
Off-site    Yes
Redundant   Yes*
Recovery    Manual, straight-forward
Cost        $0.50/GB (min $50/year)

(Cost based on bingodisk.com, 20070215)

* Although your data isn’t stored redundantly around the world, it is stored on more than one HD in a professional data center, which is far better than the NAS solution you might buy and put under your desk.

Amazon S3 + Jungle Disk

Another way to do online backup is with Jungle Disk. I’ve written about this previously. You can follow those instructions to get Amazon S3 setup. And, you can use it with automatic backup software just like Bingo!. The difference is that you only pay for what you use with Amazon S3, whereas you pay for a big chunk of space on Bingo, and get charged for it whether there is data in it or not.

For example, let’s say you only have 5GB worth of data; that means a 25GB space from Bingo! is 5x bigger than you need (at the moment, anyway). This will cost you $50/year, flat rate. Another way to do this is to use Amazon S3, which costs $0.15 per GB per month. Or, put another way, your 5GB of data will cost $0.75 per month, or $9.00 over the course of a year. However, unlike Bingo!, Amazon charges you when you copy the data to or from their servers at a rate of $0.20 per GB. Assuming you copy your data once (to their server), it will cost you a total of $10.00 over the course of a year to store 5 gigabytes of data on Amazon’s servers. Of course, if I copy all of that data back to my computer on a regular basis, it will cost me $1.00 each time. (Really, the transfer costs for a typical backup scenario are absolutely negligible.)
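Working the figures through from the quoted rates ($0.15 per GB per month for storage, $0.20 per GB transferred) looks like this. The sketch assumes the data sits at a constant size for the full twelve months, and the function name is just for illustration.

```python
STORAGE_PER_GB_MONTH = 0.15  # US dollars, S3 storage rate quoted above
TRANSFER_PER_GB = 0.20       # US dollars, S3 transfer rate quoted above
BINGO_FLAT_PER_YEAR = 50.00  # US dollars, the 25GB Bingo! plan


def s3_first_year_cost(data_gb: float, uploads: int = 1, downloads: int = 0) -> float:
    """Twelve months of storage plus the given number of full transfers."""
    storage = data_gb * STORAGE_PER_GB_MONTH * 12
    transfer = data_gb * TRANSFER_PER_GB * (uploads + downloads)
    return storage + transfer


print(f"S3, 5GB, one upload: ${s3_first_year_cost(5):.2f}")
print(f"Each full restore:   ${5 * TRANSFER_PER_GB:.2f}")
print(f"Bingo!, flat rate:   ${BINGO_FLAT_PER_YEAR:.2f}")
```

On these rates, S3 stays cheaper than the flat $50 until you are storing somewhere around 25GB, which is exactly the size of the Bingo! plan.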

Using JungleDisk, you can get a mountable drive (like Bingo!), and copy things to it. You can even use automatic backup software to synchronize data to the servers, meaning you can do incremental backups to Amazon’s servers. This cuts down on your costs—you don’t want to constantly copy all of your data to their servers every day. But what I really like is that Amazon’s solution copies your data to multiple hard drives in multiple places around the planet. At least, they say they do, and I suspect they’re not lying.

This is true redundancy.

And, the important thing for me is that JungleDisk now has a little backup utility built into it, so I can specify a few folders, and say “Backup Now”. It will automatically synchronize my folders to my Amazon S3 account—and will happily pick up in the middle of a backup if I quit, or shut down, or whatever.

Experience Report

I’ve committed now to using Amazon’s S3 service with JungleDisk, especially since they announced a roadmap and added the backup features. Also, with a likely $20 price tag, I’m happy to buy the program when it goes “1.0.” The software does what I want, and I feel better knowing that my iPhoto Library, email, and critical documents are all backed up “somewhere else”. I think my mother is even using Amazon S3 now (I set my parents up with it), and that is a Good Thing. I sleep even better knowing that my parents’ machine is backed up, since it is probably my fault (somehow) if it crashes and burns.

Another neat feature of JungleDisk is the encryption. JungleDisk will, upon request, encrypt all of my data before sending it over the net and storing it on Amazon’s servers. This way, my data isn’t (casually) accessible to anyone who manages to intercept it on its way to Amazon, or to Amazon themselves… however, I suspect that they have better things to do than look at my photo archive.

Jungle Disk + Amazon S3
Criterion: Result
Automatic: Yes
Off-site: Yes
Redundant: Yes
Recovery: Manual, straightforward
Cost: $0.15 per GB per month, plus $0.20 per GB transferred (roughly $0.17 per GB per month with one upload spread over a year)

(Cost based on Amazon S3, 20070215)

(The cost model is a bit wacky, but simply put it is a one-time transfer cost and a monthly storage cost.)

Conclusions

A little organization on the desktop makes it much easier to back your data up to a second location; at least being consistent in saving work in one (or possibly two) places is a good start. Even if you aren’t that consistent, it is still possible to back up many different directories automatically, on a regular basis, using software.

Online backup is now cheap enough that it doesn’t make sense to do anything else if you really, really care about your data. If you just want to go spend $100 for an external drive, this is still a good start, but you can pay less, and get better reliability in the long run, by using an online service. Even if it costs you $200/year to back everything up somewhere else, that is worth it if your house burns down or someone breaks in and steals everything. And besides, storing it somewhere else makes it someone else’s problem to manage the maintenance of the hardware, not yours.

Because, after all, you’re a writer, not a system administrator, dammit!

Looking Forward

What next? I might write a little bit about how to manage the explosion of data that comes from the research process. That’s a shorter post, I suspect. We’ll get to version control shortly, though. Or, perhaps Tom or someone else will have a comment or question on the article that will lead to the next post in this “thread”.

Update 20070215, 17:00: I discovered JetS3t, an open-source set of tools for browsing and synchronizing to the Amazon S3 filestore. I’m personally going to switch to these, despite being pleased with JungleDisk at the moment. My primary reason for this is that I have the complete source to the tools that are managing my backups if I use JetS3t. Therefore, I can (confidently) use the encryption features and know that the algorithm by which my data was encrypted is completely known to me… in a language I know, and that runs on all of my machines. (The JungleDisk equivalent is written in C#, and therefore only works under Windows at the moment, as far as I can tell.)

I’ll have to spend more time experimenting with these tools to give a conclusive report on which I think are the way to go. Certainly, I’m pleased with what I’ve seen of the JetS3t tools so far. Once I’ve committed and really know how they all work, I’ll post something on how I chose to do my backups. I’m pretty sure it will involve the Java tools, as they have command-line versions. However, JungleDisk will probably work fine for most uses.

This post is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 License.

(I’m reasonably confident that an automated weblog that harvests these posts solely for the purpose of generating advertising revenue could be called commercial purposes. My post yesterday was “harvested” and copied for just this purpose. Grr. And it was copied without attribution. Grr.)


Feb 12 2007

Version control and other tools for writers

Published by matt under Uncategorized

I previously wrote about Amazon S3 and its use for people who just need a way to cheaply and safely back up and archive content off-site. This came up at a CE-L dinner I was attending with my wife some time ago; people were discussing the relative costs of backup solutions for their work. I’ll be updating this, as JungleDisk has recently been updated with some new features that I think make it a complete no-brainer for use as a backup solution for Normal People Like Us.

In putting up a note about tools for collaborative scientific writing, I piqued the interest of Tom Colvin over at Becoming A Writer Seriously. I’ve offered to handle more questions in this vein, as I think there is an important space here: how can you set up a writing environment that allows you to:

  1. Sleep at night knowing that your working copy isn’t only on your computer,
  2. Wake up in the morning and be able to revert back to a previous version,
  3. Collaborate with others, anywhere, without having to worry about keeping track of which version is attached to which email, and
  4. Sleep even better knowing that it is all backed up somewhere.

Tom (or others) might actually have additional questions, and those will evolve through discussion on this post and others. When we’re done, hopefully Tom will have answers to his questions about using version control and straight-forward backup solutions to keep his work safer, and others will perhaps benefit from the dialogue as well.

I’ll be categorizing these posts under ‘writing’ as well as any other categories that seem relevant. That, and I’ll make a point to see that the posts are Creative Commons licensed so that others can do with them as they please.

This post is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 License.


Feb 11 2007

Still getting organized

Published by matt under Uncategorized

I think I want a dedicated printer for generating 3×5 cards and A5 inserts. And, while I’m at it, I might as well get one that does rudimentary scanning as well.

We’ll see. 40GBP isn’t cheap, but it isn’t killer. I’ll think about it, anyway. The Canon Pixma MP160 (Amazon UK, review) looks reasonable for what I want to do with it. Maybe.

Update: And, as long as I’m keeping track of things…

I think I’ll need to order some index cards and boxes. Perhaps. I do have a whoop-ass index card file on my desk right now, which might serve just fine. I think it could take a hit from a 10MT nuclear warhead and keep my 3×5s safe…

