Channel: Count the maximum character length for all the data fields in a simplified csv file and output to txt - Unix & Linux Stack Exchange

Answer by Kusalananda for Count the maximum character length for all the data fields in a simplified csv file and output to txt

We can use Miller (mlr) to calculate the maximum length of each field's values. The input is read as CSV, and the output is produced in the "xtab" format (one key+value pair on each line):

$ mlr --c2x stats1 -a maxlen --fr . file
These_maxlen                                                       10
are_maxlen                                                         11
the_maxlen                                                         12
column_headings_which_may_be_very_long_but_they_don't_count_maxlen 13

The --fr . argument to the stats1 operation tells it to calculate the maximum length for all fields whose names match the regular expression . (i.e. every field that has a name).

As you can see, Miller retains the field names and adds a _maxlen suffix to each.
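For readers who want to see what this calculation amounts to, here is a minimal sketch of the same idea in plain awk rather than Miller. The function name max_field_lengths is hypothetical, and the sketch assumes a simple CSV file with no quoted fields or embedded commas. It also relies on awk's length(), which may count bytes rather than characters in some awk implementations, so results can differ from Miller's for non-ASCII data.

```shell
# Hypothetical helper (not part of Miller): print "<header>_maxlen <length>"
# for every column of a simple CSV file whose first line holds the headers.
# Assumes plain comma-separated fields with no quoting or embedded commas.
max_field_lengths() {
    awk -F ',' '
        NR == 1 {
            # Remember the header names and how many columns there are.
            ncols = NF
            for (i = 1; i <= NF; i++) name[i] = $i
            next
        }
        {
            # Track the longest value seen so far in each column.
            for (i = 1; i <= NF; i++)
                if (length($i) > max[i]) max[i] = length($i)
        }
        END {
            # One "name_maxlen value" pair per line, in column order.
            for (i = 1; i <= ncols; i++)
                printf "%s_maxlen %d\n", name[i], max[i]
        }
    ' "$1"
}
```

Calling max_field_lengths file prints one pair per line, similar in spirit to the xtab output above but without the column alignment. For real CSV data with quoting, Miller remains the safer choice.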

To instead read the CSV file as if its first line were a record rather than a header line (using -N, which makes Miller name the fields by position: 1, 2, 3, and so on), drop that first line and then do the same maximum-length calculation:

$ mlr --c2x -N filter -x 'NR == 1' then stats1 -a maxlen --fr . file
1_maxlen 10
2_maxlen 11
3_maxlen 12
4_maxlen 13

With an additional rename operation, we can remove the _maxlen suffix from the names of all fields:

$ mlr --c2x -N filter -x 'NR == 1' then stats1 -a maxlen --fr . then rename -r '(.*)_maxlen$,\1' file
1 10
2 11
3 12
4 13
