Perl Tutorial

Although Purdue's courses rarely require any knowledge of Perl, it can be an extremely useful tool for completing class or personal projects.  It is particularly useful for its regular expression (regex) capabilities, which allow it to perform string parsing and manipulation.

If you're like me, you'll often find yourself in a situation where you know Perl would provide a quick solution, but you can't remember all of the details of Perl programming.  This file is intended as a quick syntax refresher in Perl programming for those who have used it before.  If you're interested in learning Perl, there are links to several tutorials at the end of this reference.

Syntax:
End statements with semicolons
If/while/etc. blocks marked by { } braces
Separate array entries or function arguments with commas
# designates a comment until end of line
First line should contain #!/usr/local/bin/perl or appropriate variation
Language is case sensitive.

Variables:
$ marks scalar variables (untyped)
@ marks an array variable
$#array returns index of last element of an array
Define arrays with parentheses, with comma-delimited entries.
Access scalar entries of an array with $ and square brackets []
E.g.:
@color = ("red", "green", "blue");
$color[2] returns "blue"
$size = @color;  - sets size to 3
$size = $#color;  - sets size to 2
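
A minimal runnable sketch of the array rules above.  The variable names ($third, $size, $last) are only illustrative, and "use strict" with "my" declarations is added as a common safety habit, not something this reference requires:

```perl
use strict;
use warnings;

my @color = ("red", "green", "blue");

my $third = $color[2];   # one element is a scalar, so use $ and []
my $size  = @color;      # an array in scalar context gives its element count
my $last  = $#color;     # index of the last element

print "third=$third size=$size last=$last\n";
```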

Language-defined symbols:
$_ - default variable for most operations
@ARGV - arguments array
@_ - argument array to functions (access with $_[0], etc.)
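
A short sketch of how these symbols tend to show up in practice.  The subroutine name first_plus_second and the word list are made up for illustration:

```perl
use strict;
use warnings;

# foreach without a loop variable places each element in $_.
my @short;
foreach ("cat", "horse", "ox") {
    # A bare /regex/ is matched against $_.
    if (/^.{1,3}$/) {
        push(@short, $_);
    }
}

# Hypothetical subroutine: its arguments arrive in @_.
sub first_plus_second {
    return $_[0] + $_[1];
}
my $sum = first_plus_second(3, 4);

# @ARGV in scalar context gives the number of command-line arguments.
my $arg_count = @ARGV;
print "short=@short sum=$sum args=$arg_count\n";
```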

Constants:
Strings enclosed in single quotes are taken literally
Strings enclosed in double quotes interpret variables or control characters ($var, \n, \r, etc.)
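
A small sketch of the difference between the two quote styles.  The variable $name is an assumption for the example:

```perl
use strict;
use warnings;

my $name = "world";

my $literal      = 'Hello, $name\n';   # single quotes: nothing is expanded
my $interpolated = "Hello, $name\n";   # double quotes: $name and \n are expanded

print $literal, "\n";
print $interpolated;
```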

Operators:
Mostly follows C/Java rules:
= is assignment
== is the numeric equality test
+= etc. are defined
. is the string concatenation operator
x copies a string multiple times (e.g. $b x $c returns c copies of b)
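eq is the string equality operator (ne, lt, gt, le, ge are the other string comparisons)
=~ and !~ are the regex match operators (a plain match tests the string without modifying it; s/// and tr/// used with =~ do modify it)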
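
A brief sketch of the operators that differ most from C/Java.  All variable names here are illustrative:

```perl
use strict;
use warnings;

my $count = 5;
$count += 2;                                # compound assignment, as in C

my $greeting = "foo" . "bar";               # string concatenation
my $rule     = "-" x 10;                    # repeat a string ten times

# == compares as numbers, eq compares as strings.
my $num_equal = (10 == 10.0)     ? 1 : 0;   # same number
my $str_equal = ("10" eq "10.0") ? 1 : 0;   # different strings

print "$count $greeting $rule $num_equal $str_equal\n";
```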

Control structures:

if, while, for, and do-while all mimic C, with a few changes:
-braces are not optional
-use "elsif" instead of "else if"
Additional control structures:
until (or do-until) - loop until the condition becomes true
foreach $var (@array) - loop over each element of array - if $var is omitted, $_ is used.
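
A minimal sketch of if/elsif/else, foreach, and until.  The labels and thresholds are arbitrary values chosen for the example:

```perl
use strict;
use warnings;

my @labels;
foreach my $n (1, 2, 3, 4, 5) {
    if ($n < 2) {
        push(@labels, "low");
    } elsif ($n < 4) {          # note: elsif, not "else if"
        push(@labels, "mid");
    } else {
        push(@labels, "high");
    }
}

# until runs the body as long as the condition is false.
my $i = 0;
until ($i >= 3) {
    $i++;
}

print "@labels ($i)\n";
```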

Regex 1: regex syntax
Normal characters are matched as themselves
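. - match any character except a newline
^ - match beginning of the string (or of each line, with the m modifier)
$ - match end of the string (or of each line, with the m modifier)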
\ - escape character.  Use to match an actual . ^ or $, etc.
\d - match any digit (equivalent to [0-9])
\D - match any non-digit (equivalent to [^0-9])
\s - match whitespace
\S - match non-whitespace
\n - newline
\t - tab
\r - carriage return

Grouping/etc.:
() - group a series of characters as a single element
[] - create a character class that allows any of the characters
[^] - invert a character class
[a-z] - match any lowercase letter
[^a-z] - match any character other than a lowercase letter
$1-$9 - when a match succeeds, $1-$9 are set to the text matched by each parenthesized group.
\1-\9 - match the same text that the corresponding parenthesized group matched earlier in the pattern.  (e.g. /^(a*)b\1$/ matches lines with the same number of a's before and after a b)

Repetition:
* - match 0 or more copies of the previous element
+ - match 1 or more copies of the previous element
? - match 0 or 1 of the previous element
{n} - match exactly n repetitions of the previous element
{n,} - match at least n repetitions
{n,m} - match at least n but no more than m repetitions
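
A sketch that combines character classes, grouping, repetition counts, and a backreference.  The sample log line is invented for the example:

```perl
use strict;
use warnings;

my $line = "2024-01-15 error: disk full";

# \d{4} etc. match exact digit counts; the parentheses capture into $1-$3.
my ($year, $month, $day);
if ($line =~ /^(\d{4})-(\d{2})-(\d{2})\s+/) {
    ($year, $month, $day) = ($1, $2, $3);
}

# \1 must match the same text that the first group matched.
my $balanced   = ("aabaa" =~ /^(a*)b\1$/) ? 1 : 0;
my $unbalanced = ("aaba"  =~ /^(a*)b\1$/) ? 1 : 0;

print "$year/$month/$day $balanced $unbalanced\n";
```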

Regex 2: using regex in perl

$var =~ /regex/   - returns true if var contains a match for the regex
$var !~ /regex/   - returns true if var does not contain a match for the regex
$var =~ tr/list1/list2/   - translates each character of var found in list1 into the corresponding character of list2.  The lists are characters and ranges, not regexes (e.g. tr/a-z/A-Z/ converts a string to uppercase)
$var =~ s/regex/replacement/   - substitutes the first match of regex in var with the replacement string

Modifiers (closing arguments):
Follow the final slash with:
i - to ignore case (e.g. $var =~ /hello/i matches HELLO or Hello etc.)
g - when substituting, substitutes all occurrences (not just one)
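
A sketch of matching, substituting, and translating with the i and g modifiers.  The "(my $copy = $text) =~ ..." form is one common way to edit a copy and leave the original alone; the variable names are illustrative:

```perl
use strict;
use warnings;

my $text = "Hello hello HELLO";

my $found   = ($text =~ /hello/i)  ? 1 : 0;   # case-insensitive match
my $missing = ($text !~ /goodbye/) ? 1 : 0;   # true when there is no match

# Copy $text, then modify the copy in place.
(my $first_only = $text) =~ s/hello/bye/i;    # first occurrence only
(my $all        = $text) =~ s/hello/bye/gi;   # every occurrence
(my $upper      = $text) =~ tr/a-z/A-Z/;      # character-by-character mapping

print "$first_only | $all | $upper\n";
```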

Files, etc:
open(HANDLE, 'file'); - open file for input
open(HANDLE, '<file'); - open file for input
open(HANDLE, '>file'); - open file for output
open(HANDLE, '>>file'); - open file for append
close(HANDLE); - close file handle
$var = <HANDLE>; - read one line of a file
@array = <HANDLE>; - read all lines of a file into an array
-r FILE  - is file readable?  (FILE may be a filename or an open file handle)
-w FILE  - is file writeable?
-x FILE  - is file executable?
-e FILE  - does file exist?
print 'text';   - print text to stdout
print HANDLE 'text';  - print text to HANDLE
<> - read a line from files specified on command line (stdin if none specified)
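
A sketch of a write, append, and read cycle.  It uses the three-argument form of open with lexical handles and an "or die" check, which is a stylistic choice for safety and differs from the two-argument forms listed above.  The scratch file comes from the core File::Temp module so the example touches no real data:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is removed when the program exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "first\n";
print $out "second\n";
close($out);

open(my $app, '>>', $path) or die "Cannot append to $path: $!";
print $app "third\n";
close($app);

my $readable = (-r $path) ? 1 : 0;   # file test on a filename

open(my $in, '<', $path) or die "Cannot read $path: $!";
my $first_line = <$in>;              # scalar context: one line
my @rest       = <$in>;              # list context: all remaining lines
close($in);

chomp($first_line);
print "$first_line, plus ", scalar(@rest), " more lines\n";
```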

Functions/subroutines:
sub FUNC {BLOCK} - define a subroutine with the specified name.  Parameters arrive in @_
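
A minimal sketch of defining and calling subroutines.  The names max_of and total are hypothetical; copying @_ into named variables is a common convention, not a requirement:

```perl
use strict;
use warnings;

# Hypothetical subroutine: returns the larger of two numbers.
sub max_of {
    my ($x, $y) = @_;            # copy the parameters out of @_
    return $x > $y ? $x : $y;
}

# Hypothetical subroutine: sums any number of arguments.
sub total {
    my $sum = 0;
    foreach my $n (@_) {
        $sum += $n;
    }
    return $sum;
}

my $bigger = max_of(3, 8);
my $sum    = total(1, 2, 3, 4);
print "bigger=$bigger sum=$sum\n";
```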

Misc. functions:
chomp($var) - remove a \n from the end of $var if one exists
chop($var) - removes the last character from $var
split(/pattern/, $string) - splits $string around pieces that match the pattern, and returns an array of the results.
shift(@array) - removes the first element from the array and returns it
unshift(@array,...) - adds the additional arguments to the front of the array
pop(@array) - removes the last element from the array and returns it
push(@array,...) - appends additional arguments to array
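
A sketch that runs the functions above on one line of comma-separated text.  The sample data is invented for the example:

```perl
use strict;
use warnings;

my $line = "alpha,beta,gamma\n";
chomp($line);                          # drop the trailing newline
my @fields = split(/,/, $line);        # alpha, beta, gamma

my $head = shift(@fields);             # takes "alpha" off the front
unshift(@fields, "zero");              # zero, beta, gamma
my $tail = pop(@fields);               # takes "gamma" off the end
push(@fields, "delta", "epsilon");     # zero, beta, delta, epsilon

my $word = "hello!";
chop($word);                           # removes the last character, whatever it is

print "@fields | $head | $tail | $word\n";
```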

Additional Perl references and tutorials:
http://www.comp.leeds.ac.uk/Perl/start.html - simple tutorial
http://www-cgi.cs.cmu.edu/cgi-bin/perl-man - searchable, categorized reference
http://www.squirrel.nl/pub/perlref-5.004.1.pdf - large PDF reference