On Samizdata, the comment was made that the UEA has confessed to having “lost” all the original data. This has been known for some time, as a result of the FOI responses that have been received. It’s not quite that straightforward, I suspect. I commented there, but this issue is not yet at the forefront of the coverage, so I thought it was worth repeating here.

When they say "lost", they mean lost in a special way. Imagine you are collecting a database of temperature statistics. You input the data, and dispose of the no-longer-needed input media. You tweak and correct the figures as you go. Now you run a program to collect a summary, and publish a paper. And then you carry on adding (and tweaking) statistics to the database, adjusting and improving your programs, and so on.

Ten years later, somebody asks for the data you used to generate the original paper. Problem is, you’ve got no idea of what the state of the database was at the time. That was thousands of additions and hundreds of tweaks ago. You have no regular backups, no archives, no records of what you’ve done. Because after all, why should you keep a record of the old data when you know the new version is so much better? It makes no sense, and would be a waste of valuable storage space.
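For the programmers reading, here is a minimal sketch of the record-keeping that was missing in this imagined scenario: freezing a copy of the data, with a checksum, at the moment a paper is published. Everything in it (the `snapshot` and `restore` helpers, the station names, the readings) is made up for illustration and has nothing to do with any actual CRU code or data.

```python
import hashlib
import json
from datetime import datetime, timezone


def snapshot(records, label):
    """Freeze the current state of the records under a label.

    Returns a dict holding a serialised copy of the data plus a SHA-256
    digest, so the exact inputs behind a result can be recovered later.
    """
    # Serialise deterministically so the same data always gives the same digest.
    payload = json.dumps(records, sort_keys=True)
    return {
        "label": label,
        "taken_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
        "payload": payload,
    }


def restore(snap):
    """Return the archived records, refusing if the payload has been altered."""
    digest = hashlib.sha256(snap["payload"].encode("utf-8")).hexdigest()
    if digest != snap["sha256"]:
        raise ValueError("snapshot payload does not match its digest")
    return json.loads(snap["payload"])


# Hypothetical station readings at the time the paper is written.
database = {"station_a": [10.1, 10.4], "station_b": [7.9, 8.2]}
paper_snapshot = snapshot(database, "paper-v1")

# Years of later additions and tweaks change the live database...
database["station_a"][1] = 10.6
database["station_c"] = [12.0, 12.3]

# ...but the state behind the paper is still recoverable.
original = restore(paper_snapshot)
```

In practice a version control system or dated archive files would do the same job; the point is only that the state behind each published result gets its own frozen, checkable copy.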

And so you tell the enquirers to get knotted. You know what you’re doing, why should these amateurs be allowed to come in and rip 25 years of work to shreds?

But then some fool politician passes the FOIA, and you’re suddenly told you have to share your data. "Oh God!" What are you going to do?! You don’t have it any more, and it’s going to be a nightmare to reconstruct.

This, I believe, is the desperate genesis of the HARRY_READ_ME.txt file. Some poor sod (called Ian "Harry" Harris, it is generally suspected) was given the miserable job of going back through the stinking tangled mass of code and intermediate results to try to reconstruct what was done. We gather that several years of late nights and working weekends were involved. Most of the software people who have looked at the file have expressed deep sympathy for the poor fellow. Many of us have been there.

You are utterly trapped! On the one hand, reconstructing the data is impossible; on the other, admitting what you’ve done is unthinkable. Increasing legal pressure is getting ever closer to forcing a choice, and the politics in the run-up to Copenhagen are doubling and redoubling the stakes. And yet at no point in the whole process did you do anything you consider to be wrong.

I wonder, did the release of the emails maybe even come as something of a relief? Yes, you are now going to be pilloried before the entire world, but at least the agonising wait is over. And in a few months’ time, it might even pass. And you can relax in your enforced retirement in that little cottage with roses over the door, instead of living under that sword of Damocles. Looking at it on a human level, wouldn’t it be nice if that was so?


  1. Pa Annoyed says:

    Yes, Rob. Exactly.

    Have you seen HARRY_READ_ME.TXT? I think everybody should. A lot has been made of the emails, but I suspect that it is the code and the comments that will prove the real killer. You don’t have to be an expert to read HARRY_READ_ME – there’s a lot of geek speak and technical bits, but there is a commentary threaded through it that anyone could understand.

  2. Philip Painter says:

    I’ve read the whole thing and as a retired programmer have every sympathy for Harry.

    I suspect the root problem that CRU had was that they got scientists to write the code and debug it, manage the databases, do the backups [if they did], write the software documentation [if they did] etc. They may be ace climate scientists but they probably didn’t even realize they weren’t doing a good software engineering job until Harry discovered the problems he did.

    As a technical detail, much has been made elsewhere of the “adjustments” of +/- 0.5 degree, on the assumption that these are temperatures, for which a 0.5C change over a few years would be a big number. Well, I think that these 0.5 degree changes are to do with location – latitude and longitude. The data [outside Europe] is divided into boxes of lat/lon [0.5 degrees I think]. Location adjustments of up to 0.5 degrees then make sense and are not the big fiddle others have claimed.

    In the coming inquiry I hope they strongly suggest that the software be re-developed and managed by software professionals – maybe from the UEA computer department.
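The lat/lon box idea in the comment above can be sketched in a few lines. This only illustrates the commenter's hypothesis, assuming a 0.5 degree box size and boxes identified by their south-west corner; the function names and the station location are invented, and nothing here is taken from the CRU code.

```python
import math

CELL = 0.5  # assumed box size in degrees, per the comment above


def grid_box(lat, lon, cell=CELL):
    """Return the south-west corner of the box containing (lat, lon)."""
    return (math.floor(lat / cell) * cell, math.floor(lon / cell) * cell)


def offset_to_box(lat, lon, cell=CELL):
    """Return how far the point sits from its box corner, in degrees."""
    box_lat, box_lon = grid_box(lat, lon, cell)
    return (lat - box_lat, lon - box_lon)


# Hypothetical station location (not taken from the CRU files).
lat, lon = 52.62, 1.24
corner = grid_box(lat, lon)
d_lat, d_lon = offset_to_box(lat, lon)
```

Under these assumptions, moving a station's coordinates onto its box corner never shifts either coordinate by as much as 0.5 degrees, which is the size of adjustment the comment describes.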

