Find Unique Columns from Text Files

Question

I have a large number of individual files, each containing six columns (the number of rows can vary). A simple example:

1     0     0     0     0     0

0     1     1     1     0     0

 

I am trying to identify how many unique columns I have (two columns count as identical when their numbers match in the same order). In this case the answer is 3. Is there a simple one-liner to do this? I know it is easy to compare one column with another, but how do I find all identical columns?

 

Answer

Besides Awk, you can do this in SPL (Structured Process Language), which is better at handling complicated logic. To count the unique columns in every file under the data directory (F:\files\data in this example), you can use the following one-liner:

     A
1    =directory@p("F:\\files\\data").new(~:file,(a=file(~).import(),a.fno().(a.field(~)).id().count()):count)

A1: Count the unique columns in each file in turn and write the results to a two-dimensional table with a file field and a count field.
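For comparison, the same idea can be sketched in Python. This is not the SPL method above, only a minimal illustration that assumes whitespace-separated values and files small enough to read into memory. The function names `count_unique_columns` and `count_per_file` are hypothetical helpers introduced here.

```python
from pathlib import Path


def count_unique_columns(text):
    """Count distinct columns in whitespace-separated text."""
    # Split each non-empty line into fields; blank lines are skipped.
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # zip(*rows) transposes rows into columns. Each column becomes a tuple,
    # so two columns are equal only if their values match in the same order.
    # Note: zip stops at the shortest row, so ragged rows are truncated.
    return len(set(zip(*rows)))


def count_per_file(directory):
    """Map each regular file name in `directory` to its unique-column count."""
    return {
        path.name: count_unique_columns(path.read_text())
        for path in sorted(Path(directory).iterdir())
        if path.is_file()
    }


# The sample from the question: columns (1,0), (0,1) x3, (0,0) x2.
sample = "1 0 0 0 0 0\n\n0 1 1 1 0 0\n"
print(count_unique_columns(sample))  # 3
```

Calling `count_per_file` on the data directory would give one count per file, which mirrors the file/count table described for A1.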