Aside

Just a note, I’ve done some serious cleanup on this domain. Mostly anything that gone should redirect here after about 5 seconds. So, if you are missing something, tough luck – I didn’t bother making backups since it was all junk anyway. :)

A year on Ubuntu

As I was closing tabs in Firefox, I noticed a had my Twitter profile opened. When I saw the page, I was reminded of the fact that I’ve been using Ubuntu for over a year now. I wish I could add “exclusively” to the end of that sentence, but alas, my PC at work has Windows on it.

I’ve gone trough some changes with Ubuntu in the year. I started with Xubuntu 10.04. In time, I moved away from XFCE to KDE then then quickly to Gnome. Somewhere in between I also upgraded to 10.10, but I’m not sure about the chronology. With Ubuntu 11.04 came Unity, which I have a fair chance for about 2 weeks. Not really liking it, I have Gnome 3 / Gnome Shell a try. Anybody who gave Gnome Shell a try knows its not anywhere near as customisable as Gnome 2 was. That’s okay though, ’cause it rocks as-is. :)

The power of streams

A while ago at work, a customer asked that we add an export of a specific table in his database to his website. It wasn’t a difficult export, just get every row in the database and output it in a format MS Excel understands. I was assigned the task, so I went at it.

I started to just get the data out of the database. That meant having something like this:

require 'php/bootstrap.php';
$rResultset = $oDatabase->query("SELECT * FROM mytable");

Simple enough, right? The next thing to do was decide what format the export should be in. All they said was a format MS Excel can understand, so the simples thing I could think of was the CSV format. And since I had recently used fgetcsv to read some stuff from a csv export, I took a wild guess and typed fputcsv in my editor.

So, I had to take the data I already got from the database, create a csv file, and send that to the user with the correct HTTP headers. Easy enough, that means this:

// Put the data in a file
$rHandle = fopen("export.csv", "w+");
while($aResult = $oDatabase->fetch($rResultset)) {
    fputcsv($rHandle, $aResult);
}
 
// Send the headers
header("Content-type: application/csv");
header("Content-Disposition: attachment; filename=mytable-export.csv");
header("Pragma: no-cache");
header("Expires: 0");
 
// Read the file, from the beginning
rewind($rHandle);
 while (($sOutput = fgets($rHandle, 4096)) !== false) {
        echo $sOutput;
 }

Okay, so that should work pretty well. Unless two people run the export at the same time. Better make the filename unique. Oh but then I’ll get huge load of files in my tmp directory, better delete the file afterwards.

Luckily, at this point, I hadn’t actually written any code, it was all just pseudo-code in my head. Writing to a file just so I can output it, and then delete that file… that’s stupid. (Not to mention a waist of IO, which is expensive enough as it is without me throwing it out the door like this.)

This meant I had a mission: I wanted to output the data right away, but using fputcsv since I don’t want to bother with all the csv stuff myself.

Enter fopen(“php://output”, “w”); which allows you to write to stdout (read: the webserver, read: the browser) as if it where a file. With that, my script became as simple as this:

// Send the headers
header("Content-type: application/csv");
header("Content-Disposition: attachment; filename=mytable-export.csv");
header("Pragma: no-cache");
header("Expires: 0");
 
$rHandle = fopen("php://output", "w");
while($aResult = $oDatabase->fetch($rResultset)) {
    fputcsv($rHandle, $aResult);
}

Rather simple if you ask me. And to think I considered messing with files.

After that, the script needed some tweaking: as first row, they wanted the column names. And “America style” csv (where the separator as a comma) doesn’t work here in Europe, since we use the comma for decimal numbers. Neither of which was a problem:

$bHeadersDone = false;
$rHandle = fopen("php://output", "w");
while($aResult = $oDatabase->fetch($rResultset)) {
     if(!$bHeadersDone) {
          $bHeadersDone = true;
          fputcsv($rHandle, array_keys($aResult), ";");
    }
    fputcsv($rHandle, $aResult);
}

Et voila, it does exactly what the customer wants, its just 29 lines long, and it learned me something new. :)

Stupid limitations

Today at work, I was tasked to move a website from an external hosting, to one of our servers. They gave me the FTP username and password, so I started to download the whole website. I then grabbed the database info from the configuration, and made an export of the database as well. Uploaded everything to our server, restored the database, and voila, almost everything worked.

The website in question is a news website. Its content is basically text+images. The text, obviously, was in the database. The images where, obviously, on the filesystem. So logic would suggest that transferring the files and database would do the trick, right? (Ignore the DNS for a moment, okay?)

And sure, normally it would. And all the files where there, except for the images. Taking a closer look, it turned out a part of the images (the first 1998 to be exact) where. in fact, there. Taking an even closer look made me notice that a part of the response text when doing an FTP LIST of the images directory was “TRUNCATING RESULT AT 2000 MATCHES”. Pretty fucking awsome, right, they limited their FTP LIST to 2000 items (‘.’ and ‘..’ being 1 and 2).

I called the support phonenumber, but they couldn’t help me – apparently increasing that 2000 would be bad for server performance. They suggested I grab 2000 files, delete them on the server and then get the next 2000. Nice idea, except that’d mean the website would be broken while I’m getting the files. (The actual switch will be monday/thuesday, so removing all but 2000 images now is not an option.)

Long live FTP access though – I wrote a small PHP script that would list (links to) all the files in the directory, install the firefox plugin ‘download them all’, and about two hours later, I had all the files sitting on my harddrive nicely. :) (And I’m sure the 14.000 HTTP requests where a much bigger hit on their server then allowing me to grab them trough FTP would have ever been.)

So, in the end, this was just a dumb limitation that (in this specific case) did more harm then good.