In data analysis, we often need to read data stored in tabular format. Perl can do this easily and elegantly. (For more, see "Using Perl for Statistics" by Giovanni Baiocchi.)

Let us take some tabular data stored in the file "longley.reg".

The data looks like the following:

X1 -52.99357014 129.54487 -.409 .6911
X2 .7107319907E-01 .30166400E-01 2.356 .0402
X3 -.4234658557 .41773654 -1.014 .3346
X4 -.5725686684 .27899087 -2.052 .0673
X5 -.4142035888 .32128496 -1.289 .2263
X6 48.41786562 17.689487 2.737 .0209

The first column is a two-character string (the variable name). The second through fifth columns hold floating-point numbers, written in either fixed-point or exponential notation. The precision varies from field to field and is sometimes much higher than needed, so we will round every value to 3 digits after the decimal point.

The Perl code for reading this data and saving it in another file (formatted_table.dat) is the following:
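Before reading the whole file, it may help to see the rounding step on its own. The snippet below is a minimal sketch: the helper name round_field is hypothetical and is not part of the script in this post. It relies on two standard Perl behaviours: a string such as ".7107319907E-01" is converted to a number when used numerically, and the "%f" conversion of sprintf rounds to the requested number of decimals.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not from the original
# script): format one value in a fixed-width field with a fixed number of
# digits after the decimal point.
sub round_field {
    my ( $value, $width, $prec ) = @_;
    return sprintf( "%${width}.${prec}f", $value );
}

# Exponential and fixed-point strings are both treated as numbers.
print round_field( '.7107319907E-01', 8, 3 ), "\n";    # prints "   0.071"
print round_field( '-52.99357014',    8, 3 ), "\n";    # prints " -52.994"
```

The width of 8 leaves room for a sign, the integer digits, the decimal point, and three decimals, which is why the columns line up for the values in this file.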

#!/usr/bin/perl
use strict;
use warnings;

# Open the input table for reading and the output file for writing.
my $infile = 'longley.reg';
open( my $table, '<', $infile ) or die "Could not open file '$infile' $!";
my $filename = 'formatted_table.dat';
open( my $h1, '>', $filename ) or die "Could not open file '$filename' $!";

my $prec  = 3;    # sets number of decimals
my $width = 8;    # sets the width of the field
while (<$table>) {
    chomp;
    my @line = split;               # split the row on whitespace
    printf $h1 "%2s", $line[0];     # prints variable name
    for my $i ( 1 .. $#line ) {
        printf $h1 "%${width}.${prec}f", $line[$i];    # prints all other fields
    }
    print $h1 " \n";
}
close($table);
close($h1);
print "Finished writing!\n";

The formatted output is the following:

X1 -52.994 129.545  -0.409   0.691 
X2   0.071   0.030   2.356   0.040 
X3  -0.423   0.418  -1.014   0.335 
X4  -0.573   0.279  -2.052   0.067 
X5  -0.414   0.321  -1.289   0.226 
X6  48.418  17.689   2.737   0.021

The columns are now aligned and every value has the same precision, so this output is much more organized than the unformatted one.
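One way to check the alignment is to format rows in memory and compare their lengths. The following is a sketch under stated assumptions: format_row is a hypothetical helper that repeats the per-row logic of the script above without the trailing space, and the two sample rows are copied from longley.reg. With a 2-character name and four fields of width 8, every row should come out 34 characters long.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): format one input
# row as a 2-character name followed by fixed-width numeric fields.
sub format_row {
    my ( $row, $width, $prec ) = @_;
    my ( $name, @values ) = split ' ', $row;
    return sprintf( '%2s', $name )
        . join( '', map { sprintf( "%${width}.${prec}f", $_ ) } @values );
}

# Two rows taken from the sample data shown earlier.
my @rows = (
    'X1 -52.99357014 129.54487 -.409 .6911',
    'X2 .7107319907E-01 .30166400E-01 2.356 .0402',
);

for my $row (@rows) {
    my $formatted = format_row( $row, 8, 3 );
    print $formatted, ' (', length($formatted), " characters)\n";
}
```

Keeping the formatting in a small function like this also makes it easy to change the width or precision in one place.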