Transforming a Data Layout

Diane posed me a problem. She had a large spreadsheet filled with data items in three columns, like so:

LekID Year Number
Well 2001 1
Well 2002 2
Well 2003 3
Mill 2001 4
Mill 2002 5
Mill 2003 6
Plain 2001 7
Plain 2002 8
Plain 2003 9

But what she wanted was a table like this:

Year Mill Plain Well
2001 4 7 1
2002 5 8 2
2003 6 9 3

Now think of having about thirty sites and years going back to the 1960s. This is something tedious and error-prone to rearrange by hand.

Enter a Perl script to help the job along.

Step 1 is to export the data in a comma-delimited file format so that Perl can read it easily. Include the header line.
Step 2 is to figure out which column in the CSV file holds row keys, which hold column keys, and which holds data. For the example, the first column has the column keys, the second column has the row keys, and the third column has the data.
Step 3 is to call the Perl script to transform the input data. Output is on STDOUT, so redirect it to a new CSV file. (Don’t redirect to the same file; the OS will wipe the file out in preparing it for output, and the Perl script will see nothing when it tries to open it.) The first parameter is three letters giving the info on which column does what job, and the second is the input CSV filename.

Here’s the command line I used for this:

xform.pl crd leks.csv > leks.xform.csv

The Perl script itself is here.

I tried three different plugins to try and make it show up verbatim, as code should, here in the display. None of them worked to my satisfaction.

Wesley R. Elsberry

Falconer. Interdisciplinary researcher: biology and computer science. Data scientist in real estate and econometrics. Blogger. Speaker. Photographer. Husband. Christian. Activist.

2 thoughts on “Transforming a Data Layout

  • 2006/04/22 at 9:11 pm
    Permalink

    I’m thinking that if you imported Diane’s info into a relational database table (even keeping the original columns of LekID, Year, Number),
    you could get your information formatted the way you want it in a query with a single Select statement.

    Nice to see an example of Perl, though. Some of the syntax seems familiar to that of c/c++. (But that “==” gets us all at least once!)

    -Karen

  • 2006/04/22 at 11:47 pm
    Permalink

    I don’t know SQL well enough to do the query that you are talking about…

    My response to being gotten by the “==” relop is shown in the script:

    if (1 == \$#ARGV) {

    When I have a comparison to a literal, I put the literal on the left-hand side. That way, if I have messed up and left an assignment in there, the compiler or interpreter throws a hissy fit about trying to assign to a literal, and I fix it.

Comments are closed.