CSCI 330 UNIX and Network Programming Unit IIX: awk, Part I

AWK Utility

CSCI 330 UNIX and Network Programming
Unit IIX: awk, Part I
The Bash Shell
Copyright Department of Computer Science, Northern Illinois University, 2005

What is awk?
  • created by: Aho, Weinberger and Kernighan
  • scripting language used for manipulating data and generating reports

  • versions of awk: awk, nawk, mawk, pgawk, ...
  • GNU awk: gawk

CSCI 330 - The UNIX System, The AWK/NAWK Utility. Copyright Department of Computer Science, Northern Illinois University, 2004

What can you do with awk?
  • awk operation:
      • reads a file line by line
      • splits each input line into fields
      • compares input line/fields to pattern
      • performs action(s) on matched lines
  • Useful for:
      • transform data files
      • produce formatted reports
  • Programming constructs:
      • format output lines
      • arithmetic and string operations
      • conditionals and loops

Basic awk invocation

awk 'script' file(s)

awk -f scriptfile file(s)

common option: -F to change the field separator
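
The two invocation forms and the -F option can be tried side by side. The following is a minimal sketch, not part of the original slides: the scratch directory, the file names data.txt and second.awk, and the sample data are all hypothetical.

```shell
# Hypothetical sample data and script file, created in a scratch directory.
workdir=$(mktemp -d)
printf 'alpha:1\nbeta:2\n' > "$workdir/data.txt"
printf '{ print $2 }\n' > "$workdir/second.awk"

# Form 1: the script is given inline, inside single quotes.
# -F: makes the colon the field separator, so $1 is the text before it.
inline_out=$(awk -F: '{ print $1 }' "$workdir/data.txt")

# Form 2: the script is read from a file named after -f.
file_out=$(awk -F: -f "$workdir/second.awk" "$workdir/data.txt")

printf '%s\n' "$inline_out" "$file_out"
```

Single quotes around the inline script keep the shell from expanding $1 and $2 before awk sees them.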

Basic awk script
  • consists of patterns & actions: pattern {action}

  • if pattern is missing, action is applied to all lines
  • if action is missing, the matched line is printed
  • must have either pattern or action

Example:
awk '/for/ { print }' testfile
prints all lines containing the string "for" in testfile
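
The rules about a missing pattern or a missing action can be checked with a small experiment. This is a sketch under assumptions: the contents of testfile are invented here, since the slides do not show them.

```shell
# Hypothetical contents for testfile (the slides do not show the real file).
workdir=$(mktemp -d)
printf 'for i in 1 2 3\ndo\necho hi\ndone\n' > "$workdir/testfile"

# Pattern and action: print lines that contain "for".
both=$(awk '/for/ { print }' "$workdir/testfile")

# Action missing: the matched line is printed by default.
pattern_only=$(awk '/for/' "$workdir/testfile")

# Pattern missing: the action runs on every line.
action_only=$(awk '{ print "> " $0 }' "$workdir/testfile")

printf '%s\n' "$both" "$pattern_only" "$action_only"
```

Here both and pattern_only hold the same single line, while action_only holds all four lines with a "> " prefix.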

awk variables
  • awk reads input line into buffers: record and fields

  • field buffer:
      • one for each field in the current record
      • variable names: $1, $2, ...

  • record buffer:
      • $0 holds the entire record

More awk variables
  NR   Number of the current record
  NF   Number of fields in current record

also:
  FS   Field separator (default=whitespace)
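
To see the record and field variables in action, the sketch below prints NR, NF, the first field and the last field for two sample lines. The scratch directory is an assumption, and the two records are borrowed from the emps listing used later in these slides.

```shell
# Two sample records, written to a hypothetical scratch file.
workdir=$(mktemp -d)
cat > "$workdir/emps" <<'EOF'
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
EOF

# NR is the record number, NF the field count, $NF the last field.
fields_out=$(awk '{ print NR, NF, $1, $NF }' "$workdir/emps")

# With a colon as separator, "Tom Jones" stays together as one field.
colon_out=$(printf 'Tom Jones:4424:5/12/66:543354\n' | awk -F: '{ print NF, $1 }')

printf '%s\n' "$fields_out" "$colon_out"
```

With the default separator each line splits into five fields; with -F: the colon-separated line splits into four.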

Example: Records and Fields
% cat emps
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500

% awk '/Tom/ { print }' emps
Tom Jones 4424 5/12/66 543354

Example: Records and Fields
% cat emps
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500

% awk '{print NR, $0}' emps
1 Tom Jones 4424 5/12/66 543354
2 Mary Adams 5346 11/4/63 28765
3 Sally Chang 1654 7/22/54 650000
4 Billy Black 1683 9/23/44 336500

Example: Space as Field Separator
% cat emps
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500

% awk '{print NR, $1, $2, $5}' emps
1 Tom Jones 543354
2 Mary Adams 28765
3 Sally Chang 650000
4 Billy Black 336500

Example: Colon as Field Separator
% cat emps2
Tom Jones:4424:5/12/66:543354
Mary Adams:5346:11/4/63:28765
Sally Chang:1654:7/22/54:650000
Billy Black:1683:9/23/44:336500

% awk -F: '/Jones/{print $1, $2}' emps2
Tom Jones 4424

Special Patterns
  • BEGIN
      • matches before the first line of input
      • used to create header for report

  • END
      • matches after the last line of input
      • used to create footer for report

example input file
Jan 13 25 15 115
Feb 15 32 24 22
Mar 15 24 34 228
Apr 31 52 63 420
May 16 34 29 208
Jun 31 42 75 492
Jul 24 34 67 436
Aug 15 34 47 316
Sep 13 55 37 277
Oct 29 54 68 525
Nov 20 87 82 577
Dec 17 35 61 401
Jan 21 36 64 620
Feb 26 58 80 652
Mar 24 75 70 495
Apr 21 70 74 514

awk example runs

awk '{print $1}' input

awk '$1 ~ /Feb/ {print $1}' input

awk '{print $1, $2+$3+$4, $5}' input

awk 'NF == 5 {print $1, $2+$3+$4, $5}' input

awk example script
BEGIN {
    print "January Sales Revenue"
}
$1 ~ /Jan/ {
    print $1, $2+$3+$4, $5
}
END {
    print NR, " records processed"
}
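
One way to run the example script is to save it to a file and pass it with -f. The sketch below assumes a hypothetical script name, sales.awk, and uses only three lines of the example input to keep the output short.

```shell
workdir=$(mktemp -d)

# The example script from the slide, saved under a hypothetical name.
cat > "$workdir/sales.awk" <<'EOF'
BEGIN {
    print "January Sales Revenue"
}
$1 ~ /Jan/ {
    print $1, $2+$3+$4, $5
}
END {
    print NR, " records processed"
}
EOF

# A three-line excerpt of the example input file.
cat > "$workdir/input" <<'EOF'
Jan 13 25 15 115
Feb 15 32 24 22
Jan 21 36 64 620
EOF

report=$(awk -f "$workdir/sales.awk" "$workdir/input")
printf '%s\n' "$report"
```

The header comes from BEGIN, one line is printed per January record, and END reports the total record count. The last line contains two spaces after the count, because print inserts the output separator before a string that itself starts with a space.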

Categories of Patterns
  • simple patterns
      • BEGIN, END
      • expression patterns: whole line vs. explicit field match

      • whole line: /regExp/
      • field match: $2 ~ /regExp/
  • range patterns
      • specified as from and to
      • example: /regExp/,/regExp/

awk actions
  • basic expressions
  • output: print, printf
  • decisions: if
  • loops: for, while
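
The range pattern and the action constructs listed above can be combined in a short experiment. This is an illustrative sketch only: it reuses the first five lines of the example input file, and the threshold of 210 is an arbitrary value chosen for the demonstration.

```shell
workdir=$(mktemp -d)
cat > "$workdir/input" <<'EOF'
Jan 13 25 15 115
Feb 15 32 24 22
Mar 15 24 34 228
Apr 31 52 63 420
May 16 34 29 208
EOF

# Range pattern: from the first line matching /Feb/
# through the next line matching /Apr/, inclusive.
range_out=$(awk '/Feb/,/Apr/ { print $1 }' "$workdir/input")

# Field match with a decision (if/else) and formatted output (printf).
flag_out=$(awk '$1 ~ /Ma/ {
    if ($5 > 210) {
        printf "%s high %d\n", $1, $5
    } else {
        printf "%s low %d\n", $1, $5
    }
}' "$workdir/input")

# A for loop that adds up fields 2 through 4 of the January record.
sum_out=$(awk '$1 ~ /Jan/ {
    total = 0
    for (i = 2; i <= 4; i++) total += $i
    print $1, total
}' "$workdir/input")

printf '%s\n' "$range_out" "$flag_out" "$sum_out"
```

The loop result matches the $2+$3+$4 expression used in the earlier example runs.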

awk Expression
  • consists of: operands and operators
  • operands:
      • numeric and string constants
      • variables
      • functions and regular expression
  • operators:
      • assignment: = ++ -- += -= *= /=
      • arithmetic: + - * / % ^
      • logical: && || !
      • relational: > < >=