-----------------------------------------------------------------------------
One liners

Tee-pipe (pipe output into two commands)
    command_source | awk '{print; print | "command1"}' | command2

Uniq (consecutive lines)
    awk 'a != $0; { a = $0 }'

Uniq without sorting - also see "info shell/file.txt" (not Solaris)
    awk '!x[$0]++'

More efficient uniq without sort
    awk '!($0 in a) { a[$0]; print }'

-----------------------------------------------------------------------------
Simple and Common AWK Examples

(data is a biome count in minecraft)

    145174951280 Ocean
    153013367088 Ocean
     48860644080 Desert
     13567315200 Desert
      2053622912 Desert
      4627422256 Forest
     89607678496 Forest
     24159066112 Forest
      6246795152 Ice
     19021276480 Ice
       859479312 Ice
     11721673776 Jungle
       982828512 Jungle
         3917632 Jungle
      4059119632 Jungle
       526812784 Jungle
      4653212096 Mesa
       259682528 Mesa
      1088787328 Mesa
      2516103104 Mesa
       132590528 Mesa
        56755600 Mesa
       305219136 Mushroom
       206617808 Mushroom

# ------------ #
# Math on a single column...

Maximum Value of all values
    awk 'NR==1 || $1 > m { m=$1; i=$2 } END { print m, i }'

Minimum Value of all values
    awk 'NR==1 || $1 < m { m=$1; i=$2 } END { print m, i }'

(Initialising from the first record avoids picking an arbitrary sentinel
value that the real data might exceed.)

Total of column 1
    awk '{t+=$1} END { print t }'

Average of column 1
    awk '{t+=$1} END { print t "/" NR " => " t/NR }'

Median Value
  This is tricky, as you need to sort and store all values so you can
  pick out the 'center' one.
    sort -n | awk '{arr[NR]=$1}
                   END { if (NR%2==1) print arr[(NR+1)/2]
                         else print (arr[NR/2]+arr[NR/2+1])/2 }'

# ------------ #
# Math by unique items (output order not important)

Count Unique Values in Column 2 (without using a slow sort|uniq pipeline)
    awk '{a[$2]++} END { for (i in a) print i, a[i] }'

Minimum of a specific item ????

Maximum of a specific item ????
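The two '????' gaps above can be filled by tracking one extreme per key in an
array indexed by column 2. A sketch (the sample data and the array name "m"
are made up for illustration):

```shell
# Sample data: value in column 1, item name in column 2
data='5 a
3 a
7 b
2 b'

# Minimum of column 1 for each unique item in column 2.
# First sight of a key always stores; later lines only if smaller.
printf '%s\n' "$data" |
awk '!($2 in m) || $1 < m[$2] { m[$2] = $1 }
     END { for (i in m) print m[i], i }'

# Maximum is the same pattern with the comparison reversed.
printf '%s\n' "$data" |
awk '!($2 in m) || $1 > m[$2] { m[$2] = $1 }
     END { for (i in m) print m[i], i }'
```

As with the counting example, the `for (i in m)` output order is undefined,
so pipe through sort if order matters.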
Add up the values of the same item
    awk '{a[$2]+=$1} END { OFS="\t"; for (i in a) print a[i], i }'

Find Largest Added-up Count
    awk '{a[$2]+=$1} END { for (i in a) if (a[i] > a[m]) m=i; print a[m], m }'

Percentage of Total
    awk '{ a[$2] += $1; t += $1 }
         END { OFS="\t"
               for (i in a) print a[i], i, sprintf("%6.2f%%", 100*a[i]/t)
               print "-------------------"
               print t, "Total", "100.00%"
             }'

Sort by Item ???

Sort by Added Count ???

Preserve original order of values ???

# ------------ #
# You can also use GNU "datamash", but it is not a standard install.
# You may have to convert the input to use a proper field separator
# character first.

Minimum value
    datamash -t\  min 1

Comma separated list of unique names
    sed 's/ \+/\t/g' | datamash unique 2

    Desert,Forest,Ice,Jungle,Mesa,Mushroom,Ocean

Get count,min,max,sum values grouped by second column
    sed 's/ \+/\t/g' | datamash -g 2 count 1 min 1 max 1 sum 1 | column -t

    Ocean     2  145174951280  153013367088  298188318368
    Desert    3  2053622912    48860644080   64481582192
    Forest    3  4627422256    89607678496   118394166864
    Ice       3  859479312     19021276480   26127550944
    Jungle    5  3917632       11721673776   17294352336
    Mesa      6  56755600      4653212096    8707131184
    Mushroom  2  206617808     305219136     511836944

-----------------------------------------------------------------------------
Variable Settings and Awk processing two files differently

This could be used to read a config file or a table of data before the
real data is given to awk.

Variable settings take effect when they are encountered on the command
line, so, for example, you can instruct awk to behave differently for
different files using this technique. For example:

    awk 'data==0 { print "process config_data here" }
         data==1 { print "process data_file here" }
        ' config_data  data=1  data_file

Note that some versions of awk will cause variable settings encountered
before any real filenames to take effect before the BEGIN block is
executed, but some won't, so neither behaviour should be relied upon.

Solaris 9: you must use "nawk" for this to work.
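The three '???' entries above are usually answered either by piping the
summed output through sort, or by recording first-seen order inside awk.
A sketch on made-up data (item names and values are illustrative only):

```shell
data='4 b
1 a
3 b
2 a'

# Sort the summed totals by item name (column 2 of the output)
printf '%s\n' "$data" |
awk '{ a[$2] += $1 } END { for (i in a) print a[i], i }' | sort -k2

# Sort by the added-up count (numeric sort on column 1)
printf '%s\n' "$data" |
awk '{ a[$2] += $1 } END { for (i in a) print a[i], i }' | sort -n -k1

# Preserve the original (first-seen) order of the items, no sort needed:
# note the first rule must run before the += creates the array entry.
printf '%s\n' "$data" |
awk '!($2 in a) { order[++n] = $2 }
     { a[$2] += $1 }
     END { for (j = 1; j <= n; j++) print a[order[j]], order[j] }'
```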
-------------------------------------------------------------------------------
Using/modifying Awk Arguments (in BEGIN{})

During the BEGIN{..} code block, ARGV has not been looked at yet, but it
has already been read into an array. As such you can loop through it and
act on, or modify, those arguments.

=======8<--------
awk 'BEGIN { print "data = " data
             for (i = 0; i < ARGC; i++)
               print "ARGV[" i "] = " ARGV[i]
           }' ....
=======8<--------

When a record is separated, Gawk sets RT to the trailing text of the
record that RS matched.  This is most useful when RS is a regular
expression (a Gawk extension), as the matched text can then vary.

Paragraph Records....

An empty string RS="" makes a 'blank line' the record separator.
Leading blank lines in a file are ignored.

If FS is also set to a single character, the newline character will also
act as a field separator even if not specified.  Typical usage is
RS=""; FS=" ";.  If you want to avoid this, convert the single FS
character into a regex.

For multi-blank-line paragraphs use RS="\n\n+".  However leading blank
lines at the top of the file are then not ignored, so they may need
special handling.

-----------------------------------------------------------------------------
Changing the record separator inside an awk script.

Awk reads and record-separates its input before the script proper
starts, so you need to get it to re-evaluate that input.  For example...

    awk 'BEGIN { RS=";" }
         { $1=$1; printf "%s%1s\n", $0, ";" }
        ' infile

The $1=$1 is required to get awk to re-compute $0 for the first record.
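A quick demonstration of the RS=";" re-separation above, using inline
input instead of 'infile' (the sample text is made up; note that $1=$1
also collapses any runs of whitespace when $0 is rebuilt):

```shell
# Three semicolon-separated records, no trailing ';'
printf 'one  two;three;four' |
awk 'BEGIN { RS=";" }
     { $1=$1; printf "%s%1s\n", $0, ";" }'
# prints:
#   one two;
#   three;
#   four;
```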
-----------------------------------------------------------------------------
Determine column numbers of relevant 'ps' fields

=======8<--------CUT HERE----------axes/crowbars permitted---------------
#!/bin/sh
CMDCOL=`ps -e | awk '
    NR == 1 { for (i = 1; i <= NF; i++)
                if ($i == "COMMAND" || $i == "CMD" || $i == "COMD")
                  cmdcol = i
            }
    END { print cmdcol }
  '`
if [ "$CMDCOL" = "" ]; then
  echo "$0: Unrecognised ps format for COMMAND field"
  exit 1
fi

TTYCOL=`ps -e | awk '
    NR == 1 { for (i = 1; i <= NF; i++)
                if ($i == "TTY") ttycol = i
            }
    END { print ttycol }
  '`
if [ "X$TTYCOL" = "X" ]; then
  echo "$0: Unrecognised ps format for TTY field"
  exit 1
fi

#
# Print list of all terminals running program
#
echo -n "Terminals running vim : "
ps -e | awk ' $'"$CMDCOL"' == "vim" { print $'"$TTYCOL"' }'
=======8<--------CUT HERE----------axes/crowbars permitted---------------
-----------------------------------------------------------------------------
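The header-scanning idea above generalises to any command whose output
has a one-line header; a sketch on synthetic 'ps'-like input, with the
wanted column name passed in via -v (the sample rows are invented):

```shell
# Find the column whose header matches "name", then print that
# column for every following row (next skips the header itself).
printf 'PID TTY CMD\n1 tty1 init\n2 tty2 vim\n' |
awk -v name="TTY" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i; next }
    col     { print $col }'
# prints:
#   tty1
#   tty2
```

Doing the scan and the printing in one awk invocation avoids the
shell-quoting gymnastics of splicing $CMDCOL and $TTYCOL back in.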