Introduction to Perl

Kirrily Robert
Training co-ordinator
Netizen Pty Ltd
Copyright © 1999-2000 by Netizen Pty Ltd

Open Publication License

This work (Netizen "Introduction to Perl" training module notes) is licensed under the Open Publication License.

LICENSE

Terms and Conditions for Copying, Distributing, and Modifying

Items other than copying, distributing, and modifying the Content with which this license was distributed (such as using, etc.) are outside the scope of this license.

1. You may copy and distribute exact replicas of the OpenContent (OC) as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the OC a copy of this License along with the OC. You may at your option charge a fee for the media and/or handling involved in creating a unique copy of the OC for use offline, you may at your option offer instructional support for the OC in exchange for a fee, or you may at your option offer warranty in exchange for a fee. You may not charge a fee for the OC itself. You may not charge a fee for the sole service of providing access to and/or use of the OC via a network (e.g. the Internet), whether it be via the world wide web, FTP, or any other method.

2. You may modify your copy or copies of the OpenContent or any portion of it, thus forming works based on the Content, and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:

a) You must cause the modified content to carry prominent notices stating that you changed it, the exact nature and content of the changes, and the date of any change.

b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the OC or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License, unless otherwise permitted under applicable Fair Use law.

These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the OC, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the OC, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Exceptions are made to this requirement to release modified works free of charge under this license only in compliance with Fair Use law where applicable.

3. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to copy, distribute or modify the OC. These actions are prohibited by law if you do not accept this License. Therefore, by distributing or translating the OC, or by deriving works herefrom, you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or translating the OC.

NO WARRANTY

4. BECAUSE THE OPENCONTENT (OC) IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE OC, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE OC "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK OF USE OF THE OC IS WITH YOU. SHOULD THE OC PROVE FAULTY, INACCURATE, OR OTHERWISE UNACCEPTABLE YOU ASSUME THE COST OF ALL NECESSARY REPAIR OR CORRECTION.

5. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MIRROR AND/OR REDISTRIBUTE THE OC AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE OC, EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

Additionally:

6. If you offer training based upon this OpenContent, you must prominently display a notice stating whether or not you are a Netizen Certified Training Organisation on the Open Content itself and on any material advertising or publicising your training.

a) If you are a Netizen Certified Training Organisation, you must state that you are a Netizen Certified Training Organisation and display the Netizen Certified Training Organisation logo. You must also provide a URL for more information, namely http://netizen.com.au/services/training/ncto/

b) If you are not a Netizen Certified Training Organisation, you must state that you are not a Netizen Certified Training Organisation. You may not use the Netizen Certified Training Organisation logo. You must also provide a URL for more information, namely http://netizen.com.au/services/training/ncto/

Netizen and the N logo are trademarks of Netizen Pty Ltd. All rights reserved.


Table of Contents
1. Introduction
Course outline
Assumed knowledge
Module objectives
Platform and version details
The course notes
Other materials
2. What is Perl
In this chapter...
Perl's name
Typical uses of Perl
Text processing
System administration tasks
CGI and web programming
Database interaction
Other Internet programming
Less typical uses of Perl
What is Perl like?
The Perl Philosophy
There's more than one way to do it
A correct Perl program...
Three virtues of a programmer
Three more virtues
Share and enjoy!
Parts of Perl
The Perl interpreter
Manuals
Perl Modules
Chapter summary
3. Creating and running a Perl program
In this chapter...
Logging into your account
Using perldoc
Using the editor
Our first Perl program
Running a Perl program from the command line
The "shebang" line
Comments
Command line options
Chapter summary
4. Perl variables
In this chapter...
What is a variable?
Variable names
Variable scoping and the strict pragma
Arguments in favour of strictness
Arguments against strictness
Using the strict pragma
Scalars
Double and single quotes
Exercises
Arrays
A quick look at context
What's the difference between a list and an array?
Exercises
Advanced exercises
Hashes
Initialising a hash
Reading hash values
Adding new hash elements
Other things about hashes
What's the difference between a hash and an associative array?
Exercises
Special variables
The first special variable, $_
@ARGV - a special array
%ENV - a special hash
Chapter summary
5. Operators and functions
In this chapter...
What are operators and functions?
Operators
Arithmetic operators
String operators
Exercises
File operators
Other operators
Functions
Types of arguments
Return values
More about context
Some easy functions
String manipulation
Numeric functions
Type conversions
Manipulating lists and arrays
Hash processing
Reading and writing files
Time
Exercises
Chapter summary
6. Conditional constructs
In this chapter...
What is a block?
Scope
What is a conditional statement?
What is truth?
Comparison operators
Existence and Defined-ness
Boolean logic operators
Using boolean logic operators as short circuit operators
Types of conditional constructs
if statements
while loops
for and foreach
Exercises
Practical uses of while loops: taking input from STDIN
Named blocks
Breaking out of loops
Chapter summary
7. Subroutines
In this chapter...
Introducing subroutines
Calling a subroutine
Passing arguments to a subroutine
Returning values from a subroutine
Exercises
Chapter summary
8. Regular expressions
In this chapter...
What are regular expressions?
Regular expression operators and functions
m/PATTERN/ - the match operator
s/PATTERN/REPLACEMENT/ - the substitution operator
Binding operators
Metacharacters
Some easy metacharacters
Quantifiers
Greediness
Exercises
Grouping techniques
Character classes
Alternation
The concept of atoms
Exercises
Chapter summary
9. Practical exercises
10. Conclusion
What you've learnt
Where to now?
Further reading
A. Unix cheat sheet
B. Editor cheat sheet
vi
Running
Using
Exiting
Gotchas
Help
pico
Running
Using
Exiting
Gotchas
Help
joe
Running
Using
Exiting
Gotchas
Help
jed
Running
Using
Exiting
Gotchas
Help
C. ASCII Pronunciation Guide

Chapter 1. Introduction

Welcome to Netizen's Introduction to Perl training module. This is a one-day training module in which you will learn how to program in the Perl programming language.


Course outline

  • What is Perl? (30 minutes)

  • Creating and running a Perl program (45 minutes)

  • Morning tea (15 minutes)

  • Variable types (45 minutes)

  • Operators and Functions (60 minutes)

  • Lunch break (60 minutes)

  • Conditional constructs (45 minutes)

  • Subroutines (30 minutes)

  • Afternoon tea (15 minutes)

  • Regular expressions (45 minutes)

  • Practical exercises (until finish)


Assumed knowledge

To gain the most from this course, you should:

  • Be able to use the Unix operating system

    • Move around the file system

    • Create and edit files

    • Run programs

  • Have programmed in at least one other language and

    • Understand variables, including data types and arrays

    • Understand conditional and looping constructs

    • Understand the use of subroutines and/or functions


Module objectives

  • Understand the history and philosophy behind the Perl programming language

  • Know where to find additional information about Perl

  • Write simple Perl scripts and run them from the Unix command line

  • Use Perl's command line options to enable warnings

  • Understand Perl's three main data types and how to use them

  • Use Perl's strict pragma to enforce lexical scoping and better coding

  • Understand Perl's most common operators and functions and how to use them

  • Understand and use Perl's conditional and looping constructs

  • Understand and use subroutines in Perl

  • Understand and use simple regular expressions for matching and substitution


Platform and version details

This module is taught using Unix or a Unix-like operating system. Most of what is learnt will work equally well on Windows NT or other operating systems; your instructor will inform you throughout the course of any areas which differ.

All Netizen's Perl training courses use Perl 5, the most recent major release of the Perl language. Perl 5 differs significantly from previous versions of Perl, so you will need a Perl 5 interpreter to use what you have learnt. However, older Perl programs should work fine under Perl 5.


The course notes

These course notes contain material which will guide you through the topics listed above, as well as appendices containing other useful information.

The following typographical conventions are used in these notes:

System commands appear in this typeface

Literal text which you should type in to the command line or editor appears as monospaced font.

Keystrokes which you should type appear like this: ENTER. Combinations of keys appear like this: CTRL-D

Program listings and other literal listings of what appears on the screen appear in a monospaced font like this.

Parts of commands or other literal text which should be replaced by your own specific values appear like this

Note: Notes and tips appear offset from the text like this.

Advanced: Notes which are marked "Advanced" are for those who are racing ahead or who already have some knowledge of the topic at hand. The information contained in these notes is not essential to your understanding of the topic, but may be of interest to those who want to extend their knowledge.

Readme: Notes marked with "Readme" are pointers to more information which can be found in your textbook or in online documentation such as manual pages or websites.


Other materials

In addition to these notes, you should have a copy of the required text book for this course: Programming Perl (2nd edition) by Larry Wall, Tom Christiansen and Randal L. Schwartz, more commonly referred to as "the Camel book". The Camel book will be used throughout the day, and will be a valuable reference to take home and keep next to your computer.

You will also have received a floppy disk containing these notes in HTML form (with working links to external resources etc) and all the exercises and data used in this course.


Chapter 2. What is Perl

In this chapter...

This section describes Perl and its uses. You will learn about the history of Perl, the main areas in which it is commonly used, and a little about the Perl community and philosophy. Lastly, you will find out how to get Perl and what software comes as part of the Perl distribution.


Perl's name

Perl has been said to stand for "Practical Extraction and Reporting Language" (by its fans) or "Pathologically Eclectic Rubbish Lister" (by its detractors). In fact, Perl is not an acronym; it's a shortened version of the program's original name, "pearl". When you're talking about the language, it's spelt with a capital "P" and lowercase "erl", not all capitals as is sometimes seen (especially in job advertisements posted by contract agencies). When you're talking about the Perl interpreter, it's spelt in all lower case: perl.

Perl has been described as everything from "line noise" to "the Swiss-army chainsaw of programming languages". The latter of these nicknames gives some idea of how programmers see Perl - as a very powerful tool that does just about everything.


Typical uses of Perl

Text processing

Perl's original main use was text processing. It is exceedingly powerful in this regard, and can be used to manipulate textual data, reports, email, news articles, log files, or just about any kind of text, with great ease.


System administration tasks

System administration is made easy with Perl. It's particularly useful for tying together lots of smaller scripts, working with file systems, networking, and so on.


CGI and web programming

Since HTML is just text with built-in formatting, Perl can be used to process and generate HTML. Perl is probably the most popular language around for web development, and there are many tools and scripts available for free.


Database interaction

Perl's DBI module makes interacting with all kinds of databases --- from Oracle down to comma-separated value files --- easy and portable. Perl is increasingly being used to write large database applications, especially those which provide a database backend to a website.


Other Internet programming

Perl modules are available for just about every kind of Internet programming, from Mail and News clients, interfaces to IRC and ICQ, right down to lower level Socket programming.


Less typical uses of Perl

Perl is used in some unusual places as well. The Human Genome Project relies on Perl for DNA sequencing, NASA use Perl for satellite control, PDL (Perl Data Language, pron. "piddle") makes number-crunching easy, and there is even a Perl Object Environment (POE) which is used for event-driven state machines.


What is Perl like?

The following (somewhat paraphrased) article, entitled "What is Perl", comes from The Perl Journal. (Used with permission.)

Perl is a general purpose programming language developed in 1987 by Larry Wall. It has become the language of choice for WWW development, text processing, Internet services, mail filtering, graphical programming, and every other task requiring portable and easily-developed solutions.

Perl is interpreted. This means that as soon as you write your program, you can run it - there's no mandatory compilation phase. The same Perl program can run on Unix, Windows, NT, MacOS, DOS, OS/2, VMS and the Amiga.

Perl is collaborative. The CPAN software archive contains free utilities written by the Perl community, so you save time.

Perl is free. Unlike most other languages, Perl is not proprietary. The source code and compiler are free, and will always be free.

Perl is fast. The Perl interpreter is written in C, and a decade of optimisations have resulted in a fast executable.

Perl is complete. The best support for regular expressions in any language, internal support for hash tables, a built-in debugger, facilities for report generation, networking functions, utilities for CGI scripts, database interfaces, arbitrary-precision arithmetic - are all bundled with Perl.

Perl is secure. Perl can perform "taint checking" to prevent security breaches. You can also run a program in a "safe" compartment to avoid the risks inherent in executing unknown code.

Perl is open for business. Thousands of corporations rely on Perl for their information processing needs.

Perl is simple to learn. Perl makes easy things easy and hard things possible. Perl handles tedious tasks for you, such as memory allocation and garbage collection.

Perl is concise. Many programs that would take hundreds or thousands of lines in other programming languages can be expressed in a pageful of Perl.

Perl is object oriented. Inheritance, polymorphism, and encapsulation are all provided by Perl's object oriented capabilities.

Perl is flexible. The Perl motto is "there's more than one way to do it." The language doesn't force a particular style of programming on you. Write what comes naturally.

Perl is fun. Programming is meant to be fun, not only in the satisfaction of seeing our well-tuned programs do our bidding, but in the literary act of creative writing that yields those programs. With Perl, the journey is as enjoyable as the destination.


The Perl Philosophy

There's more than one way to do it

The Perl motto is "there's more than one way to do it" - often abbreviated TMTOWTDI. What this means is that for any problem, there will be multiple ways to approach it using Perl. Some will be quicker, more elegant, or more readable than others, but that doesn't make them wrong.


A correct Perl program...

"... is one that does the job before your boss fires you." That's in the preface to the Camel book, which is highly recommended reading.

Of course, some Perl programs are more correct than others, but while elegance is a fine thing to strive for, most Perl people realise that sometimes you just have to write a quick and dirty hack that'll keep things running for the meantime. If you get the time to make it beautiful later, so much the better.


Three virtues of a programmer

The Camel book contains the following entries in its glossary:


Laziness

The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful, and document what you wrote so you don't have to answer so many questions about it. Hence, the first great virtue of a programmer.


Impatience

The anger you feel when the computer is being lazy. This makes you write programs that don't just react to your needs, but actually anticipate them. Or at least pretend to. Hence, the second great virtue of a programmer.


Hubris

Excessive pride, the sort of thing Zeus zaps you for. Also the quality that makes you write (and maintain) programs that other people won't want to say bad things about. Hence, the third great virtue of a programmer.


Three more virtues

In his "State of the Onion" keynote speech at The Perl Conference 2.0 in 1998, Larry Wall described another three virtues, which are the virtues of a community of programmers. These are:

  • Diligence

  • Patience

  • Humility

You may notice that these are the opposites of the first three virtues. However, they are equally necessary for Perl programmers who wish to work together, whether on a software project for their company or on an Open Source project with many contributors around the world.


Share and enjoy!

Perl is Open Source software, and most of the modules and extensions for Perl are also released under Open Source licenses of various kinds (Perl itself is released under dual licenses, the GNU General Public License and the Artistic License, copies of which are distributed with the software).

The culture of Perl is fairly open and sharing, and thousands of volunteers worldwide have contributed to the current wealth of software and knowledge available to us. If you have time, you should try and give back some of what you've received from the Perl community. Contribute a module to CPAN, help a new Perl programmer to debug her programs, or write about Perl and how it's helped you. Even buying books written by the Perl gurus (like many of the O'Reilly Perl books) helps give them the financial means to keep supporting Perl.


Parts of Perl

The Perl interpreter

The main part of Perl is the interpreter. The interpreter is available for Unix, Windows, and many other platforms. The current version of Perl is 5.005, which is available from the Perl website or any of a number of mirror sites (the Windows version is available from ActiveState). The next release of Perl will be version 5.6; the jump in version numbers is because it was felt that the number of additional features between releases warranted a larger difference between version numbers.


Manuals

Along with the interpreter come the manuals for Perl. These are accessed via the perldoc command or, on Unix systems, also via the man command. More than 30 manual pages come with the current version of Perl. These can be found by typing man perl (or perldoc perl on non-Unix systems). The Perl FAQs (Frequently Asked Questions files) are available in perldoc format, and can be accessed by typing perldoc perlfaq.

Watch while this is demonstrated; you'll get a chance to try it soon.


Perl Modules

Perl also comes with a collection of modules. These are Perl programs which carry out certain common tasks, and can be included as common libraries in any Perl script. Less commonly used modules aren't included with the distribution, but can be downloaded from the Comprehensive Perl Archive Network (CPAN) and installed separately.


Chapter summary

  • Common uses of Perl include

    • text processing

    • system administration

    • CGI and web programming

    • other Internet programming

  • Perl is a general purpose programming language, distributed for free via the Perl website and mirror sites

  • Perl includes excellent support for regular expressions, object oriented programming, and other features

  • Perl allows a great degree of programmer flexibility - "There's more than one way to do it".

  • The three virtues of a programmer are Laziness, Impatience and Hubris. Perl will help you foster these virtues

  • The three virtues of a programmer in a group environment are Diligence, Patience, and Humility.

  • Perl is a collaborative language - everyone is free to contribute to the Perl software and the Perl community

  • Parts of Perl include:

    • the Perl interpreter

    • documentation in several formats

    • library modules


Chapter 3. Creating and running a Perl program

In this chapter...

In this chapter we will be creating a very simple "Hello, world" program in Perl and exploring some of the basic syntax of the Perl programming language.


Logging into your account

Your username and password will have been given to you with these course notes.

Table 3-1. Details required to connect to the Netizen training server

Hostname or IP address:
Your username:
Your password:

  • Open the telnet program

  • Connect to the training server at the hostname or IP address given above

  • Log in using the username and password you were given

  • You will find yourself at a Unix shell prompt. Hopefully (if you met the pre-requisites of this course) you will now be able to see that your account has a subdirectory called exercises/ which contains the example scripts and exercises given in these course notes. If you're not quite up to speed with Unix, there's a cheat-sheet in Appendix A of these notes.


    Using perldoc

    On the command line, type perldoc perl. You will find yourself in the Perl documentation pages. Here's how to get around inside the documentation:

    Table 3-2. Getting around in perldoc

    Action      Keystroke
    Page down   SPACE
    Page up     b
    Quit        q


    Using the editor

    A Perl script is just a normal text file, which means that you can edit it using a normal text editor.

    The system you are using has several editors available for your use, including vi, pico, and others. Those who are not already familiar with vi should probably use pico, as it has a simpler interface. If you're an emacs user, sorry, our server doesn't have the resources to run it, but we do have other editors which have an emacs emulation mode.

    To edit a file using pico, type:

    % pico filename

    (Note that the percent sign is your Unix command line prompt - you don't have to type it.)

    To edit a file using vi, type:

    % vi filename

    For other editors, just type the name of the editor followed by the name of the file you wish to edit.

    A summary of editor commands appears in Appendix B in the back of these course notes, just in case you need them.

    Incidentally, Appendix C contains a guide to pronouncing ASCII characters, especially punctuation. This will help you translate Perl into spoken language, for ease of communication with other programmers.


    Our first Perl program

    We're about to create our first, simple Perl script: a "hello world" program. There are a couple of things you should know in advance:

    • Perl programs (or scripts --- the words are interchangeable) consist of a series of statements

    • When you run the program, each statement is executed in turn, from the top of your script to the bottom. (There are two special cases where this doesn't occur, one of which --- subroutine declarations --- we'll be looking at later today.)

    • Each statement ends in a semi-colon

    • Statements can flow over several lines

    • Whitespace (spaces, tabs and newlines) is ignored in most places in a Perl script.

    Now, just for practice, open a file called hello.pl in your text editor. Type in the following one-line Perl program:

    print "Hello, world!\n";

    This one-line program calls the print function with a single parameter, the string literal "Hello, world!" followed by a newline character.

    Save it and exit.


    Running a Perl program from the command line

    We can run the program from the command line by typing in:

    perl hello.pl

    You should see this output:

    Hello, world!

    This program should, of course, be entirely self-explanatory. The only thing you really need to note is the \n ("backslash N") which denotes a new line.


    The "shebang" line

    So what if we want to run our program from the command line without having to type in the name of the Perl interpreter first?

    You can make a file executable by typing:

    % chmod +x hello.pl

    at the command line. (For more information about the chmod command, type man chmod).

    In order to let the shell know what to do with our program when we try to run it with ./hello.pl from the command line, we put the following line at the top of our program:

    #!/usr/bin/perl

    That's what we call a "shebang" line (because the # is a "hash" sign, and the ! is referred to as a "bang", hence "hashbang" or "shebang"). It tells the system what to use to interpret our script. Of course, if the Perl interpreter were somewhere else on our system, we'd have to change the shebang line to reflect that.


    Comments

    Incidentally, comments in Perl start with a hash sign (#), either on a line on their own or after a statement. Anything after a hash on that line is a comment.

    # This is a hello world program
    print "Hello, world!\n"; # print the message


    Command line options

    Perl has a number of command line options, which you can specify on the command line by typing perl options hello.pl or which you can include in the shebang line. Let's say you want to use the -w command line option to turn on warnings:

    #!/usr/bin/perl -w

    (Incidentally, it's always a good idea to turn on warnings while you're developing something.)

    Advanced: Setting the special variable $^W to a false value will disable warnings. Doing so with local $^W inside a block confines the change to that block.
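For those racing ahead, here is a minimal sketch of how $^W behaves. It uses features not yet covered in these notes (%SIG and an anonymous subroutine) purely so that the warnings can be counted rather than printed; the variable names are illustrative and not part of the course exercises.

```perl
#!/usr/bin/perl -w
use strict;

# Turn on run-time warnings explicitly, in case the -w above is not seen.
$^W = 1;

# Collect warnings in an array instead of printing them, so that we can
# count them afterwards. (@warnings is an illustrative name.)
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my $undefined;    # declared, but never given a value

{
    local $^W = 0;                        # warnings off inside this block only
    my $quiet = "value: " . $undefined;   # no warning is raised here
}

# Outside the block $^W is true again, so this line raises a warning.
my $noisy = "value: " . $undefined;

print "Warnings seen: ", scalar(@warnings), "\n";   # prints "Warnings seen: 1"
```

Only the statement outside the block is reported, because local restores the previous value of $^W as soon as the block ends.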

    Readme: A full explanation of command line options can be found in the Camel book on pages 330 to 337, or by typing perldoc perlrun.


    Chapter summary

    Here's what you know about Perl's operation and syntax so far:

    • Perl programs typically start with a "shebang" line

    • statements (generally) end in semicolons

    • statements may span multiple lines; it's only the semicolon that ends a statement

    • comments are indicated by a hash (#) sign. Anything after a hash sign on a line is a comment.

    • \n is used to indicate a newline

    • whitespace is ignored almost everywhere

    • command line arguments to Perl can be indicated on the shebang line

    • the -w command line argument turns on warnings


    Chapter 4. Perl variables

    In this chapter...

    In this section we will explore Perl's three main variable types --- scalars, arrays, and hashes --- and learn to assign values to them, retrieve the values stored in them, and manipulate them in certain ways.


    What is a variable?

    A variable is a place where we can store data. Think of it like a pigeonhole with a name on it indicating what data is stored in it.

    The Perl language is very much like human languages in many ways, so you can think of variables as being the "nouns" of Perl. For instance, you might have a variable called "total" or "employee".


    Variable names

    Variable names in Perl may contain alphanumeric characters in upper or lower case, and underscores. A variable name may not start with a number, though - that means something special, which we'll encounter later. Likewise, variables that start with anything non-alphanumeric are also special, and we'll discuss that later, too.

    It's standard Perl style to name variables in lower case, with underscores separating words in the name. For instance, employee_number. Upper case is usually used for constants, for instance LIGHT_SPEED or PI. Following these conventions will help make your Perl more maintainable and more easily understood by others.
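A minimal sketch of these naming conventions follows. The use constant pragma is one common way to define constants and is an assumption here, not something these notes prescribe. The my keyword, which declares a variable, is covered later in this chapter, and the values are made up for illustration.

```perl
#!/usr/bin/perl -w
use strict;

# Constants: upper case. ("use constant" is an assumed choice here.)
use constant LIGHT_SPEED => 299_792_458;   # metres per second
use constant PI          => 3.14159;

# Ordinary variables: lower case, with underscores between words.
my $employee_number = 1024;
my $circle_radius   = 2;

my $circle_area = PI * $circle_radius * $circle_radius;

print "Employee number: $employee_number\n";
print "Light speed: ", LIGHT_SPEED, " m/s\n";
print "Circle area: $circle_area\n";
```

Constants defined this way are not interpolated inside double-quoted strings, which is why LIGHT_SPEED is passed to print as a separate item.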

    Lastly, variable names all start with a punctuation sign depending onwhat sort of variable they are:

    Table 4-1. Variable punctuation

    Variable type   Starts with   Pronounced
    Scalar          $             dollar
    Array           @             at
    Hash            %             percent

    (Don't worry if those variable type names don't mean anything to you. We're about to cover them.)
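As a quick preview, here is a sketch of one variable of each type. The names and values are made up for illustration, and the my keyword used to declare them is explained in the next section.

```perl
#!/usr/bin/perl -w
use strict;

# A scalar ($) holds a single value.
my $employee_name = "Arthur";

# An array (@) holds an ordered list of values.
my @lucky_numbers = (7, 13, 42);

# A hash (%) holds values which are looked up by name.
my %phone_extension = (arthur => 1234, ford => 5678);

print "Scalar: $employee_name\n";
print "Array has ", scalar(@lucky_numbers), " items\n";
print "Hash value: $phone_extension{arthur}\n";
```

Note that a single element of a hash is itself a scalar, so it is written with a dollar sign: $phone_extension{arthur}.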


    Variable scoping and the strict pragma

    Many programming languages require you to "pre-declare" variables --- that is, say that you're going to use them before you use them. Variables can either be declared as global (that is, they can be used anywhere in the program) or local (they can only be used in the same part of the program in which they were declared).

    In Perl, it is not necessary to declare your variables before you begin. You can summon a variable into existence simply by using it, and it will be globally available to any routine in your program. If you're used to programming in C or any of a number of other languages, this may seem odd and even dangerous to you. This is, in fact, the case.


    Arguments in favour of strictness

    • avoids accidental creation of unwanted variables when you make a typing error

    • avoids scoping problems, for instance when a subroutine uses a variable with the same name as a global variable

    • allows for warnings if values are assigned to variables and never used


    Arguments against strictness

    • takes a while to get used to, and may slow down development until it becomes instinctual

    • enforces a nasty, fascist style of coding which isn't nearly as much fun

    Sometimes a little bit of fascism is a good thing, like when you want the trains to run on time. Because of this, Perl lets you turn strictness on if you want it, using something called the strict pragma. A pragma, in Perl-speak, is a set of rules for how your code is to be dealt with.

    Readme: Other effects of the strict pragma are discussed on page 500 of the Camel.


    Using the strict pragma

    In the interests of bug-free code and teaching better Perl style, we're going to use the strict pragma throughout this training course. Here's how it's invoked:

    #!/usr/bin/perl -w
    use strict;

    That typically goes at the top of your program, just under your shebang line and introductory comments.

    Once we use the strict pragma, we have to explicitly declare new variables using my. You'll see this in use below, and it will be discussed again later when we talk about blocks and subroutines.

    Try running the program exercises/strictfail.pl and see what happens. What needs to be done to fix it? Try it and see if it works. By the way, get used to this error message - it's one of the most common Perl programming mistakes, though it's easily fixed.

    Readme: There's more about use of my on page 189 of the Camel.
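
    To see the kind of mistake that strict catches, here is a small sketch. It is not one of the course exercise files; the variable names are made up, and it uses a string eval (not covered in this module) purely so that the script can survive the error and show it to you.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical snippet containing a typing error:
# $employee_nubmer was never declared with my.
my $snippet = q{
    use strict;
    my $employee_number = 42;
    $employee_nubmer = 43;    # typo!
    1;
};

# The string eval compiles the snippet separately, so this script keeps
# running and we can look at the complaint that strict produces.
my $ok    = eval $snippet;
my $error = $@;

print "strict complained: $error" unless $ok;
```

    Without use strict, the misspelt name would quietly become a brand new global variable and the real one would never be updated.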


    Scalars

    The simplest form of variable in Perl is the scalar. A scalar is a single item of data such as:

    • Arthur

    • Just Another Perl Hacker

    • 42

    • 0.000001

    • 3.27e17

    Here's how we assign values to scalar variables:

    my $name = "Arthur";
    my $whoami = 'Just Another Perl Hacker';
    my $meaning_of_life = 42;
    my $number_less_than_1 = 0.000001;
    my $very_large_number = 3.27e17;    # 3.27 by 10 to the power of 17

    Advanced: There are other ways to assign things apart from the = operator, too. They're covered on pages 92-93 of the Camel.

    As you can see, a scalar can be text of any length, and numbers of any precision (machine dependent, of course). Perl magically converts between them when it needs to. For instance, it's quite legal to say:

    # adding an integer to a floating point number
    my $sum = $meaning_of_life + $number_less_than_1;

    # here we're putting the int in the middle of a string we
    # want to print
    print "$name says, 'The meaning of life is $meaning_of_life.'\n";

    This may seem extraordinarily alien to those used to strictly typed languages, but believe it or not, the ability to transparently convert between variable types is one of the great strengths of Perl. Some people say that it's also one of the great weaknesses.
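
    A minimal sketch of that conversion at work follows. The variable names are invented for illustration, and it borrows the . (concatenation) operator, which is only introduced in the next chapter.

```perl
#!/usr/bin/perl -w
use strict;

my $text_number = "10";    # a string that happens to look like a number
my $real_number = 5;

my $sum    = $text_number + $real_number;    # + treats both as numbers
my $joined = $text_number . $real_number;    # . treats both as strings

print "sum: $sum\n";          # prints "sum: 15"
print "joined: $joined\n";    # prints "joined: 105"
```

    The same two scalars behave as numbers or as strings depending on which operator is applied to them.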

    Advanced: You can explicitly cast scalars to various specific data types. Look up int() on page 180 of the Camel, for instance.


    Double and single quotes

    While we're here, let's look at the assignments above. You'll see that some have double quotes, some have single quotes, and some have no quotes at all.

    In Perl, quotes are required to distinguish strings from the language's reserved words or other expressions. Either type of quote can be used, but there is one important difference: double quotes can include other variable names inside them, and those variables will then be interpolated - as in the last example above - while single quotes do not interpolate.

    # single quotes don't interpolate...
    my $price = '$9.95';

    # double quotes interpolate...
    my $invoice_item = "24 widgets at $price each\n";
    print $invoice_item;

    The above example is available in your directory as exercises/interpolate.pl so you can experiment with different kinds of quotes.

    Note that special characters such as the \n newline character are only available within double quotes. Single quotes will fail to expand these special characters just as they fail to expand variable names.

    When using either type of quotes, you must have a matching pair of opening and closing quotes. If you want to include a quote mark in the actual quoted text, you can escape it by preceding it with a backslash:

    print "He said, \"Hello!\"\n";

    You can also use a backslash to escape other special characters such as dollar signs within double quotes:

    print "The price is \$300\n";

    To include a literal backslash in a double-quoted string, use two backslashes: \\

    Readme: There are special quotes for executing a string as a shell command (see "Input operators" on page 52 of the Camel), and also special quoting functions (see "Pick your own quotes" on page 41).


    Exercises

  • Write a script which sets some variables:

      • your name

      • your street number

      • your favourite colour

  • Print out the values of these variables using double quotes for variable interpolation

  • Change the quotes to single quotes. What happens?

  • Write a script which prints out C:\WINDOWS\SYSTEM\ twice -- once using double quotes, once using single quotes. How do you have to escape the backslashes in each case?

  • You'll find answers to the above in exercises/answers/scalars.pl.


    Arrays

    If you think of a scalar as being a singular thing, arrays are the plural form. Just as you have a flock of sheep or a wunch of bankers, you can have an array of scalars.

    An array is a list of (usually related) scalars all kept together. Arrays start with an @ (at sign), and are initialised thus:

    my @fruits = ("apples", "oranges", "guavas", "passionfruit", "grapes");
    my @magic_numbers = (23, 42, 69);
    my @random_scalars = ("mumble", 123.45, "willy the wombat", -300);

    As you can see, arrays can contain any kind of scalars. They can have just about any number of elements, too, without needing to know how many before you start. Really any number - tens or hundreds of thousands, if you've got the memory.

    Readme: Arrays are discussed on page 6 of the Camel or by typing perldoc perldata.

    So if we don't know how many items there are in an array, how can we find out? Well, there are a couple of ways.

    First of all, Perl's arrays are indexed from zero. We can access individual elements of the array like this:

    print $fruits[0];            # prints "apples"
    print $random_scalars[2];    # prints "willy the wombat"

    Wait a minute, why are we using dollar signs in the example above, instead of at signs? The reason is this: we only want a scalar back, so we show that we want a scalar. There's a useful way of thinking of this, which is explained in chapter 1 of the Camel: if scalars are the singular case, then the dollar sign is like the word "the" - "the name", "the meaning of life", etc. The @ sign on an array, or the % sign on a hash, is like saying "those" or "these" - "these fruit", "those magic numbers". However, when we only want one element of the array, we'll be saying things like "the first fruit" or "the last magic number" - hence the scalar-like dollar sign.

    If we wanted what we call an array slice we could say:

    @fruits[1,2,3];          # oranges, guavas, passionfruit
    @magic_numbers[0..1];    # 23, 42

    You just learnt something new, by the way: the .. ("dotdot") range operator (see pages 90-91 of your Camel or perldoc perlop) which creates a temporary list of numbers between the two you specify - in this case 0 and 1, but it could have been 1 and 100 if we'd had an array big enough to use it on. You'll run into this operator again and again, so remember it.

    Another thing you can do with arrays is insert them into a string, the same as for scalars:

    print "My favourite fruits are @fruits\n";     # whole array
    print "Two types of fruit are @fruits[0,2]";   # array slice

    Returning to the point, how do we find the last element in an array? Well, there's a special variable called $#array which is the index of the last element, so you can say:

    @fruits[0..$#fruits];

    and you'll get the whole array. If you print $#fruits you'll find it's 4, which is not the same as the number of elements - 5. Remember that it's the index of the last element and that the index starts at zero, so you have to add one to it to find out how many elements there are in the array.

    But wait! There's More Than One Way To Do It - and an easier way, at that. If you evaluate the array in a scalar context - that is, do something like this:

    my $fruit_count = @fruits;

    ... you'll get the number of elements in the array.

    Advanced: There are more than two ways to do it, as well - scalar(@fruits) and int(@fruits) will also tell us how many elements there are in the array.
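
    Putting the last few points together, here is a short sketch using the fruit array from earlier. The negative index on the final line is an extra that isn't mentioned above: Perl counts negative indexes back from the end of the array.

```perl
#!/usr/bin/perl -w
use strict;

my @fruits = ("apples", "oranges", "guavas", "passionfruit", "grapes");

my $last_index = $#fruits;             # index of the last element: 4
my $count      = @fruits;              # array in scalar context: 5
my $last_fruit = $fruits[$#fruits];    # "grapes"

print "Last index: $last_index\n";
print "Number of fruits: $count\n";
print "Last fruit: $last_fruit\n";
print "Also the last fruit: $fruits[-1]\n";    # counts back from the end
```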


    A quick look at context

    There's a term you've heard used just recently but which hasn't been explained: context.

    All Perl expressions are evaluated in a context. The two main contexts are:

    • scalar context, and

    • list context

    Here's an example of an expression which can be evaluated in either context:

    my $howmany = @array;     # scalar context
    my @newarray = @array;    # list context

    If you look at an array in a scalar context, you'll see how many elements it has; if you look at it in list context, you'll see the contents of the array itself.
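
    One place where context commonly trips people up is the left-hand side of an assignment. The sketch below (with made-up variable names) shows that putting round brackets around the variable being assigned to makes it a list, so the array is evaluated in list context instead of scalar context.

```perl
#!/usr/bin/perl -w
use strict;

my @magic_numbers = (23, 42, 69);

my $howmany = @magic_numbers;    # scalar context: the number of elements
my ($first) = @magic_numbers;    # list context: the first element
my @copy    = @magic_numbers;    # list context: all of the elements

print "There are $howmany magic numbers\n";    # 3
print "The first one is $first\n";             # 23
print "All of them: @copy\n";                  # 23 42 69
```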


    What's the difference between a list and an array?

    Not much, really. A list is just an unnamed array. Here's a demonstration of the difference:

    # printing a list of scalars
    print("Hello", " ", $name, "\n");

    # printing an array
    my @hello = ("Hello", " ", $name, "\n");
    print @hello;

    If you come across something that wants a LIST, you can either give it the elements of a list as in the first example above, or you can pass it an array by name. If you come across something that wants an ARRAY, you have to actually give it the name of an array.

    Readme: List values and Arrays are covered on page 47 of the Camel.


    Exercises

  • Create an array of your friends' names

  • Print out the first element

  • Print out the last element

  • Print out the array from within a double-quoted string using variable interpolation

  • Print out an array slice of the 2nd to 4th items using variable interpolation

  • Answers to the above can be found in exercises/answers/arrays.pl


    Advanced exercises

  • Print the array without putting quotes around its name. What happens?

  • Set the special variable $, to something appropriate and try the previous step again (see page 132 of your Camel for this variable's documentation)

  • What happens if you have a small array and then you assign a value to $array[1000]?

  • Answers to the above can be found in exercises/answers/arrays_advanced.pl


    Hashes

    A hash is a collection of key/value pairs, which you can picture as a two-column table of keys and values. Instead of looking up items in a hash by an array index, you look up values by their keys.

    Readme: Hashes are covered in the Camel on pages 7-8, then in more detail on page 50 or in perldoc perldata.


    Initialising a hash

    Hashes are initialised in exactly the same way as arrays, with a comma separated list of values:

    my %monthdays = ("January", 31, "February", 28, "March", 31, ...);

    Of course, there's more than one way to do it:

    my %monthdays = (
        "January"  => 31,
        "February" => 28,
        "March"    => 31,
        ...
    );

    The spacing in the above example is commonly used to make hash assignments more readable.

    The => operator is syntactically the same as the comma, but is used to distinguish hashes more easily from normal arrays. Also, you don't need to put quotes on the item which comes immediately before the => operator:

    my %monthdays = (
        January  => 31,
        February => 28,
        March    => 31,
        ...
    );


    Reading hash values

    You get at elements in a hash by using the following syntax:

    print $monthdays{"January"}; # prints 31

    Again you'll notice the use of the dollar sign, which you should read as "the monthdays belonging to January".


    Adding new hash elements

    You can also create elements in a hash on the fly:

    my %monthdays = ();
    $monthdays{"January"} = 31;
    $monthdays{"February"} = 28;
    ...


    Other things about hashes

    • Hashes have no internal order

    • There is no equivalent to $#array to get the size of a hash

    • However, there are functions such as each(), keys() and values() which will help you manipulate hash data. We look at these later, when we deal with functions.

    Advanced: You may like to look up the following functions which relate to hashes: keys(), values(), each(), delete(), exists(), and defined().


    What's the difference between a hash and an associative array?

    Back in the days of Perl version 4 (and earlier), hashes were called associative arrays. The name "hash" is now preferred because it's much quicker to type. If you consider all the times that hashes are talked about in the newsgroup comp.lang.perl.misc and other Perl newsgroups, the renaming of associative arrays to hashes has resulted in a major saving of bandwidth.


    Exercises

  • Create a hash of people and something interesting about them

  • Print out a given person's interesting fact

  • Change a person's interesting fact

  • Add a new person to the hash

  • What happens if you try to print an entry for a person who's not in the hash?

  • Answers to these exercises are given in exercises/answers/hash.pl


    Special variables

    Perl has many special variables. These are used to set or retrieve certain values which affect the way your program runs. For instance, you can set a special variable to turn interpreter warnings on and off, or read a special variable to find out the command line arguments passed to your script.

    Special variables can be scalars, arrays, or hashes. We'll look at some of each kind.

    Readme: Special variables are discussed at length in chapter 2 of your Camel book (from page 127 onwards) and in the perlvar manual page. You may also like to look up the English module, which lets you use longer, more English-like names for special variables. You'll find this information in chapter 7, on page 403, or use perldoc English to read the module documentation.


    The first special variable, $_

    The first special variable, and possibly the one you'll encounter most often, is called $_ ("dollar-underscore"), and it represents the current thing that your Perl script's working with - often a line of text or an element of a list or hash. It can be set explicitly, or it can be set implicitly by certain looping constructs (which we'll look at later).

    The special variable $_ is often the default argument for functions in Perl. For instance, the print() function defaults to printing $_:

    $_ = "Hello, world!\n";
    print;

    If you can think of Perl variables as being "nouns", then $_ is the pronoun "it".

    Readme: There's more discussion of using $_ on page 131 of your Camel book.
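
    As a preview of $_ being set implicitly, here is a minimal sketch using a foreach loop. Loops are covered later in the course, and the array and counter names here are just for illustration.

```perl
#!/usr/bin/perl -w
use strict;

my @fruits = ("apples", "oranges", "guavas");
my $count  = 0;

# foreach puts each element of the list into $_ in turn
foreach (@fruits) {
    print;          # no argument given, so print uses $_
    print "\n";
    $count++;
}

print "Printed $count fruits\n";
```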


    Exercises

  • Set $_ to a string like "Hello, world", then print it out by using the print() command's default argument

  • The answers to the above exercise are in exercises/answers/scalars2.pl.


    @ARGV - a special array

    Perl programs accept arbitrary arguments or parameters from the command line, like this:

    perl printargs.pl foo bar baz

    This passes "foo", "bar" and "baz" as arguments into our program, where they end up in an array called @ARGV. Try this script, which you'll find in your directory. It's called exercises/printargs.pl.

    #!/usr/bin/perl -w
    print "@ARGV\n";

    To run the script, type:

    % exercises/printargs.pl foo bar baz

    You should see "foo bar baz" printed out.
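
    Because @ARGV is an ordinary array, everything from the section on arrays applies to it. The following sketch is not one of the exercise files; as an assumption for illustration, it fills in "foo bar baz" itself when no arguments are given, so that it always has something to print.

```perl
#!/usr/bin/perl -w
use strict;

# Assumption for illustration only: pretend the script was called with
# "foo bar baz" if no arguments were actually given.
@ARGV = ("foo", "bar", "baz") unless @ARGV;

my $arg_count = @ARGV;       # array in scalar context
my $first_arg = $ARGV[0];    # arguments are indexed from zero

print "Got $arg_count arguments\n";
print "The first one is $first_arg\n";
print "All of them: @ARGV\n";
```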


    Exercises

  • Modify your earlier array-printing script to print out the script's command line arguments instead of the names of your friends. Call your script by typing ./scriptname.pl firstarg secondarg thirdarg or similar.

  • The answer to the above exercise is in exercises/answers/argv.pl


    %ENV - a special hash

    Just as there are special scalars and arrays, there is a special hash called %ENV. This hash contains the names and values of environment variables. To view these variables under Unix, simply type setenv (C-type shells) or export (sh-type shells) on the command line.


    Exercises

  • A user's home directory is stored in the environment variable HOME. Print out your own home directory.

  • The answer to the above can be found in exercises/answers/env.pl


    Chapter summary

    • Perl variable names typically consist of alphanumeric characters and underscores. Lower case names are used for most variables, and upper case for global constants.

    • The statement use strict; is used to make Perl require variables to be pre-declared and to avoid certain types of programming errors.

    • There are three types of Perl variables: scalars, arrays, and hashes.

    • Scalars are single items of data and are indicated by a dollar sign ($) at the beginning of the variable name.

    • Scalars can contain strings, numbers, etc.

    • Strings must be delimited by quote marks. Using double quote marks will allow you to interpolate other variables and meta-characters such as \n (newline) into a string. Single quotes do not interpolate.

    • Arrays are one-dimensional lists of scalars and are indicated by an at sign (@) at the beginning of the variable name.

    • Arrays are initialised using a comma-separated list of scalars inside round brackets.

    • Arrays are indexed from zero

    • Item n of an array can be accessed by using $arrayname[n]

    • The index of the last item of an array can be accessed by using $#arrayname.

    • The number of elements in an array can be found by interpreting the array in a scalar context, eg my $items = @array;

    • Hashes are collections of key/value pairs (like a two-column table of keys and values), and are indicated by a percent sign (%) at the beginning of the variable name.

    • Hashes are initialised using a comma-separated list of scalars inside round brackets, just like arrays. Whitespace and the => operator (which is syntactically identical to the comma) can be used to make hash assignments look neater.

    • The value of a hash item whose key is foo can be accessed by using $hashname{foo}

    • Hashes have no internal order

    • $_ is a special variable which is the default argument for many Perl functions and operators

    • The special array @ARGV contains all command line parameters passed to the script

    • The special hash %ENV contains information about the user's environment.


    Chapter 5. Operators and functions

    In this chapter...

    In this chapter, we look at some of the operators and functions which can be used to manipulate data in Perl. In particular, we look at operators for arithmetic and string manipulation, and many kinds of functions including functions for scalar and list manipulation, more complex mathematical operations, type conversions, dealing with files, etc.


    What are operators and functions?

    Operators and functions are routines that are built into the Perl language to do stuff.

    The difference between operators and functions in Perl is a very tricky subject. There are a couple of ways to tell the difference:

    • Functions usually have all their parameters on the right hand side

    • Operators can act in much more subtle and complex ways than functions

    • Look in the documentation - if it's in perldoc perlop, it's an operator; if it's in perldoc perlfunc, it's a function. Otherwise, it's probably a subroutine.

    The easiest way to explain operators is to just dive on in, so here we go:


    Operators

    Readme: There are lists of all the available operators, and what they each do, on pages 76-94 of the Camel. Precedence and associativity are also covered there.


    Arithmetic operators

    Arithmetic operators can be used to perform arithmetic operations on variables or constants. The commonly used ones are:

    Table 5-1. Arithmetic operators

    Operator   Example     Description
    +          $a + $b     Addition
    -          $a - $b     Subtraction
    *          $a * $b     Multiplication
    /          $a / $b     Division
    %          $a % $b     Modulus (remainder when $a is divided by $b, eg 11 % 3 = 2)
    **         $a ** $b    Exponentiation ($a to the power of $b)

    Advanced: Just like in C, there are some short cut arithmetic operators:

    $a += 1;     # same as $a = $a + 1
    $a -= 3;     # same as $a = $a - 3
    $a *= 42;    # same as $a = $a * 42

    (In fact, you can extrapolate the above with just about any operator - see page 17 of the Camel for more about this)

    You can also use $a++ and $a-- if you're familiar with such things. ++$a and --$a are also valid, but they do some slightly different things and you won't need them today (but you can read about them on pages 17 to 18 of the Camel if you are sufficiently interested).
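
    For the curious, here is a small sketch of the arithmetic operators and short cuts in action, including the difference between putting ++ after a variable and putting it before. The variable names are made up for the example.

```perl
#!/usr/bin/perl -w
use strict;

my $x = 11;
my $y = 3;

my $remainder = $x % $y;     # 2
my $squared   = $y ** 2;     # 9

my $total = 10;
$total += 5;                 # $total is now 15
$total *= 2;                 # $total is now 30

my $counter = 5;
my $post = $counter++;       # $post gets 5, then $counter becomes 6
my $pre  = ++$counter;       # $counter becomes 7, then $pre gets 7

print "remainder $remainder, squared $squared, total $total\n";
print "post $post, pre $pre, counter $counter\n";
```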


    String operators

    Just as we can add and multiply numbers, we can also do similar things with strings:

    Table 5-2. String operators

    Operator   Example    Description
    .          $a . $b    Concatenation (puts $a and $b together as one string)
    x          $a x $b    Repeat (repeat $a $b times -- eg "foo" x 3 gives us "foofoofoo")

    Readme: There's more about the concatenation operator at the top of page 16 of the Camel.
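
    A quick sketch of both string operators, using invented variable names:

```perl
#!/usr/bin/perl -w
use strict;

my $first  = "foo";
my $second = "bar";

my $joined   = $first . $second;    # "foobar"
my $repeated = $first x 3;          # "foofoofoo"

# The operators can be chained, and mixed with literal strings
my $sentence = $first . " and " . $second . "!";

print "$joined\n";
print "$repeated\n";
print "$sentence\n";    # prints "foo and bar!"
```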


    Exercises

  • Calculate the cost of 18 widgets at $37.00 each and print the answer (Answer: exercises/answers/widgets.pl)

  • Print out a line of dashes without using more than one dash in your code (except for the -w). (Answer: exercises/answers/dashes.pl)

  • Use exercises/operate.pl to practice using arithmetic and string operators.


    File operators

    We can use file test operators to test various attributes of files and directories:

    Table 5-3. File test operators

    Operator   Example   Description
    -e         -e $a     Exists - does the file exist?
    -r         -r $a     Readable - is the file readable?
    -w         -w $a     Writable - is the file writable?
    -d         -d $a     Directory - is it a directory?
    -f         -f $a     File - is it a normal file?
    -T         -T $a     Text - is the file a text file?

    Readme: There are examples of these in use on pages 19-20 of the Camel. There is a complete list of the file operators in the Camel on page 85. There are lots!
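
    Here is a minimal sketch of some file tests being applied to the current directory ("."), chosen only because it should always exist. It uses if and unless to act on the results; conditionals are covered properly in the next chapter.

```perl
#!/usr/bin/perl -w
use strict;

my $path = ".";    # the current directory

my $exists  = -e $path;
my $is_dir  = -d $path;
my $is_file = -f $path;

print "$path exists\n"              if $exists;
print "$path is a directory\n"      if $is_dir;
print "$path is not a plain file\n" unless $is_file;
```

    Each test gives back a true or false value, so the results can be stored in scalars just like any other value.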


    Other operators

    You'll encounter all kinds of other operators in your Perl career, and they're all described in the Camel from page 76 onwards. We'll cover them as they become necessary to us -- you've already seen operators such as the assignment operator (=), the => operator which behaves a bit like the comma operator, and so on.

    Advanced: While we're here, let's just mention what "unary" and "binary" operators are.

    A unary operator is one that only needs something on one side of it, like the file operators or the autoincrement (++) operator.

    A binary operator is one that needs something on either side of it, such as the addition operator.

    A trinary operator also exists, but we don't deal with it in this course. C programmers will probably already know about it, and can use it if they want.


    Functions

    A function is like an operator - and in fact some functions double as operators in certain conditions - but with the following differences:

    • longer names

    • can take any kinds of arguments

    • arguments always come after the function name

    The only real way to tell whether something is a function or an operator is to check the perlop and perlfunc manual pages and see which it appears in.

    Readme: There's an introduction to functions on page 8 of the Camel, labeled 'Verbs'.


    Types of arguments

    Functions typically take the following kind of arguments:

    SCALAR -- Any scalar value or variable - 42, "foo", or $a

    LIST -- Any named or unnamed list (remember that a named list is an array)

    ARRAY -- A named array; usually results in the array being modified

    HASH -- Any named or unnamed hash

    PATTERN -- A pattern to match on - we'll talk more about these later on, in Regular Expressions

    FILEHANDLE -- A filehandle indicating a file that you've opened or one of the pseudo-files that is automatically opened, such as STDIN, STDOUT, and STDERR

    There are other types of arguments, but you're not likely to need to deal with them in this module.

    Readme: In chapter 3 of the Camel (starting on page 141) you'll see how the documentation describes what kind of arguments a function takes.


    Return values

    Just as functions can take arguments of various kinds, they can return various things for you to use - though they don't have to, and you don't have to use them if you don't want.

    If a function returns a scalar, and we want to use it, we can say something like:

    my $age = 29.75;
    my $years = int($age);

    and $years will be assigned the returned value of the int() function when given the argument $age - in this case, 29, since int() truncates instead of rounding.

    If we just wanted to do something to a variable and didn't care what value was returned, we could just say:

    my $input = <STDIN>;
    chomp($input);

    While we're at it, you should also know that the brackets on functions are optional if it's not likely to cause confusion. What's likely to cause confusion varies from one person to the next, but it's a pretty safe bet to use brackets as much as possible when you're starting out, and then drop them off if you see that other people are usually doing it. Seriously. You can learn a lot about Perl style by looking at other people's code, especially code found on CPAN or given as examples in Perl books, newsgroups, etc.


    More about context

    Many different functions and operators behave differently depending on whether they're called in scalar context or list context. Each one will be noted in its documentation, either in the Camel or in the manual pages.

    Here are some Perl operators and functions that care about context:

    Table 5-4. Context-sensitive functions

    reverse()
        Scalar context: Reverses characters in a string
        List context: Reverses the order of the elements in an array

    each()
        Scalar context: Returns the next key in a hash
        List context: Returns a two-element list consisting of the next key and value pair in a hash

    gmtime() and localtime()
        Scalar context: Returns the time as a string in common format
        List context: Returns a list of second, minute, hour, day, etc

    keys()
        Scalar context: Returns the number of keys (and hence the number of elements) in a hash
        List context: Returns a list of all the keys in a hash

    readdir()
        Scalar context: Returns the next filename in a directory, or undef if there are no more
        List context: Returns a list of all the filenames in a directory

    There are many other cases where an operation varies depending on context. Take a look at the notes on context at the start of perldoc perlfunc to see the official guide to this: "anything you want, except consistency".

    You can also use perldoc -f functionname to get the documentation for just a single function.
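
    To make the idea concrete, here is a short sketch calling reverse() and keys() in each context. The data is made up, and scalar() is used to force scalar context where it wouldn't otherwise apply.

```perl
#!/usr/bin/perl -w
use strict;

my @letters   = ("a", "b", "c");
my %monthdays = (January => 31, February => 28, March => 31);

# reverse() in list context reverses the order of the elements...
my @backwards = reverse(@letters);

# ...but in scalar context it reverses the characters of a string
my $word = scalar reverse("hello");

# keys() in list context returns the keys (in no particular order)...
my @months = keys(%monthdays);

# ...but in scalar context it returns how many there are
my $howmany = keys(%monthdays);

print "Backwards list: @backwards\n";      # c b a
print "Backwards word: $word\n";           # olleh
print "There are $howmany months\n";       # 3
```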


    Some easy functions

    Readme: Starting on page 143 of the Camel book, there is a list of every single Perl function, their arguments, and what they do.


    String manipulation


    Finding the length of a string

    The length of a string can be found using the length() function:

    #!/usr/bin/perl -w
    use strict;

    my $string = "This is my string";
    print length($string);


    Case conversion

    You can convert Perl strings from upper case to lower case, or vice versa, using the lc() and uc() functions, respectively.

    #!/usr/bin/perl -w

    print lc("Hello, World!");    # prints "hello, world!"
    print uc("Hello, World!");    # prints "HELLO, WORLD!"

    The lcfirst() and ucfirst() functions can be used to change only the first letter of a string.

    #!/usr/bin/perl -w

    print lcfirst("Hello, World!");        # prints "hello, World!"
    print lcfirst(uc("Hello, World!"));    # prints "hELLO, WORLD!"

    Notice how, in the last line of the example above, the lcfirst() operates on the result of the uc() function.


    chop() and chomp()

    The chop() function removes the last character of a string and returns that character. It needs a variable to work on, since it changes the string in place.

    #!/usr/bin/perl -w
    use strict;

    my $greeting = "Hello";
    my $char = chop($greeting);    # $char is now equal to "o"

    my $string = "Goodbye";
    $char = chop $string;
    print $char . "\n";      # "e"
    print $string . "\n";    # "Goodby"

    The chomp() function works similarly, but only removes the last character if it is a newline (and it returns the number of characters removed, rather than the character itself). This is very handy for removing extraneous newlines from user input.
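
    The difference between the two is easiest to see side by side. In this sketch the strings are stand-ins for user input, which would normally arrive with a newline on the end.

```perl
#!/usr/bin/perl -w
use strict;

my $with_newline    = "Arthur\n";
my $without_newline = "Arthur";
my $chopped         = "Arthur";

chomp($with_newline);       # removes the trailing newline
chomp($without_newline);    # no newline to remove, so nothing changes
chop($chopped);             # removes the last character regardless

print "[$with_newline]\n";       # prints "[Arthur]"
print "[$without_newline]\n";    # prints "[Arthur]"
print "[$chopped]\n";            # prints "[Arthu]"
```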


    String substitutions with substr()

    The substr() function can be used to return a portion of a string, or to change a portion of a string.

    #!/usr/bin/perl -w
    use strict;

    my $string = "Hello, world!";
    print substr($string, 0, 5);    # prints "Hello"

    substr($string, 0, 5) = "Greetings";
    print $string;                  # prints "Greetings, world!"


    Numeric functions

    There are many numeric functions in Perl, including trig functions and functions for dealing with random numbers. These include:

    • abs() (absolute value)

    • cos(), sin(), and atan2()

    • exp() (exponentiation)

    • log() (logarithms)

    • rand() and srand() (random numbers)

    • sqrt() (square root)


    Type conversions

    The following functions can be used to force type conversions (if you really need them):

    • oct()

    • int()

    • hex()

    • chr()

    • ord()

    • scalar()


    Manipulating lists and arrays


    Stacks and queues

    Stacks and queues are special kinds of lists.

    A stack can be thought of like a stack of paper on a desk. Things are put onto the top of it, and taken off the top of it.

    A queue, on the other hand, has things added to the end of it and taken out of the start of it. Queues are also referred to as "FIFO" lists (for "First In, First Out").

    We can simulate stacks and queues in Perl using the following functions:

    • push() -- add items to the end of a list

    • pop() -- remove an item from the end of a list (and return it)

    • shift() -- remove an item from the start of a list (and return it)

    • unshift() -- add items to the start of a list

    A queue can be created by pushing items onto the end of a list and shifting them off the front.

    A stack can be created by pushing items on the end of a list and popping them off.
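
    The two recipes above can be sketched like this, with made-up items and array names:

```perl
#!/usr/bin/perl -w
use strict;

# A stack: last in, first out
my @stack;
push(@stack, "first");
push(@stack, "second");
push(@stack, "third");
my $from_stack = pop(@stack);      # takes "third" off the end

# A queue: first in, first out
my @queue;
push(@queue, "first");
push(@queue, "second");
push(@queue, "third");
my $from_queue = shift(@queue);    # takes "first" off the start

print "Popped $from_stack, leaving @stack\n";
print "Shifted $from_queue, leaving @queue\n";
```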


    Sorting lists

    The sort() function, when used on a list, returns a sorted version of that list. It does not sort the list in place.

    The reverse() function, when used on a list, returns the list in reverse order. It does not reverse the list in place.

    #!/usr/bin/perl -w

    my @list = ("a", "z", "c", "m");
    my @sorted = sort(@list);
    my @reversed = reverse(sort(@list));


    Converting lists to strings, and vice versa

    The join() function can be used to join together the items in a list into one string. Conversely, split() can be used to split a string into elements for a list.
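
    Here is a minimal sketch of both functions. The data is invented, and the /:/ given to split() is a very simple pattern that just matches a colon; patterns are covered properly in the section on regular expressions.

```perl
#!/usr/bin/perl -w
use strict;

my @fruits = ("apples", "oranges", "guavas");

# join() takes the text to put between the items, then the list
my $line = join(", ", @fruits);

# split() takes a pattern to split on, then the string
my @fields = split(/:/, "arthur:42:Earth");

print "$line\n";         # prints "apples, oranges, guavas"
print "$fields[1]\n";    # prints "42"
```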


    Hash processing

    The delete() function deletes an element from a hash.

    The exists() function tells you whether a certain key exists in a hash.

    The keys() and values() functions return lists of the keys or values of a hash, respectively.
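
    A short sketch of these functions working on the month hash from the previous chapter. The keys are passed through sort() here only because hashes have no internal order, so the output would otherwise be unpredictable.

```perl
#!/usr/bin/perl -w
use strict;

my %monthdays = (January => 31, February => 28, March => 31);

my $had_march = exists($monthdays{March});    # true

delete($monthdays{March});

my $has_march = exists($monthdays{March});    # false now
my @months    = sort(keys(%monthdays));       # February, January
my @days      = values(%monthdays);           # 31 and 28, in some order

print "Months left: @months\n";
print "March is gone\n" unless $has_march;
```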


    Reading and writing files

    The open() function can be used to open a file for reading or writing. The close() function closes a file after you're done with it.
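
    The following is a sketch only, under a few assumptions: it uses the three-argument form of open() with a filehandle stored in a scalar, which needs Perl 5.6 or later; the file name is made up; and it uses "or die" to stop with a message if something goes wrong, which isn't covered in this module.

```perl
#!/usr/bin/perl -w
use strict;
use File::Spec;

# Hypothetical file name, placed in the system's temporary directory
my $filename = File::Spec->catfile(File::Spec->tmpdir(), "perl_intro_example.txt");

# ">" opens the file for writing
open(my $out, ">", $filename) or die "Can't write to $filename: $!";
print $out "Hello, file!\n";
close($out) or die "Can't close $filename: $!";

# "<" opens the file for reading
open(my $in, "<", $filename) or die "Can't read $filename: $!";
my $line = <$in>;    # read one line
close($in);

print "Read back: $line";

unlink($filename);   # tidy up after ourselves
```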


    Time

    The time() function returns the current time in Unix format (that is, the number of seconds since 1 Jan 1970).

    The gmtime() and localtime() functions can be used to get a more friendly representation of the time, either in Greenwich Mean Time or the local time zone. Both can be used in either scalar or list context.
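
    As a sketch of the two contexts, the example below feeds gmtime() the time 0 (the very start of Unix time) so that the result is the same every time it is run; in real programs you would more likely pass it time(). Note that the year comes back as the number of years since 1900.

```perl
#!/usr/bin/perl -w
use strict;

my $epoch_string = gmtime(0);    # scalar context: a readable string
my @epoch_parts  = gmtime(0);    # list context: second, minute, hour, etc

my $year = $epoch_parts[5] + 1900;    # element 5 is years since 1900

print "Time zero was $epoch_string\n";    # Thu Jan  1 00:00:00 1970
print "The year was $year\n";             # 1970

my $now = time();
print "Seconds since then: $now\n";
```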


    Exercises

    These exercises range from easy to difficult. Answers are provided in the exercises directory (filenames are given with each exercise).

  • Create a scalar variable containing the phrase "There's more than one way to do it" then print it out in all upper-case (Answer: exercises/answers/tmtowtdi.pl)

  • Print a random number

  • Print a random item from an array (Answer: exercises/answers/quotes.pl)

  • Print out the third character of a word entered by the user as an argument on the command line (There's a starter script in exercises/thirdchar.pl and the answer's in exercises/answers/thirdchar.pl)

  • Print out the date for a week ago (the answer's in exercises/answers/lastweek.pl)

  • Print out a sentence in reverse

      • reverse the whole sentence

      • reverse just the words

      • (Answer: exercises/answers/reverse.pl)


    Chapter summary

    • Perl operators and functions can be used to manipulate data and perform other necessary tasks

    • The difference between operators and functions is blurred; most can behave in either way

    • Chapter 3 of your Camel book, perldoc perlop, perldoc perlfunc, and perldoc -f functionname can be used to find out detailed information about operators and functions.

    • Functions can accept arguments of various kinds

    • Functions may return scalars, lists, etc.

    • Return values may differ depending on whether a function is called in scalar or list context


    Chapter 6. Conditional constructs

    In this chapter...

    In this section, we look at Perl's various conditional constructs and how they can be used to provide flow control to our Perl programs. We also learn about Perl's meaning of Truth and how to test for truth in various ways.


    What is a block?

    The simplest block is a single statement, for instance:

    print "Hello, world!\n";

    Sometimes you'll want several statements to be grouped together logically. That's what we call a block. A block can be executed either in response to some condition being met, or as an independent chunk of code that's given a name.

    Blocks always have curly brackets ( { and } ) around them. In C and Java, curly brackets are optional in some cases - not so in Perl.

    {
        $fruit = "apple";
        $howmany = 32;
        print "I'd like to buy $howmany $fruit" . "s.\n";
    }

    You'll notice that the body of the block is indented from the brackets; this is to improve readability. Make a habit of doing it.

    Readme: The Camel book refers to blocks with curly braces around them as BLOCKs (in capitals). It discusses them on page 97.


    Scope

    Something that needs mentioning again at this point is the concept of variable scoping. You will recall that we use the my function to declare variables when we're using the strict pragma. The my also scopes the variables so that they are local to the current block:

    #!/usr/bin/perl -w
    use strict;
    my $a = "foo";
    {                       # start a new block
        my $a = "bar";
        print "$a\n";       # prints bar
    }
    print $a;               # prints foo

    Now, onto the situations in which we'll encounter blocks.


    What is a conditional statement?

    A conditional statement is one which allows us to test the truth of some condition. For instance, we might say "If the ticket price is less than ten dollars..." or "While there are still tickets left..."

    You've almost certainly seen conditional statements in other programming languages, so we'll just assume that you get the general idea.

    Readme: Perl's conditional statements are listed and explained on pages 95-106 of the Camel.


    What is truth?

    Conditional statements invariably test whether something is true or not. Perl thinks something is true if it doesn't evaluate to zero (0), an empty string (""), or undefined.

    42              # true
    0               # false
    "0"             # false, because perl switches it to a number when it
                    # needs to
    "wibble"        # true
    $new_variable   # false (if we haven't set it to anything, it's
                    # undefined)

    Readme: The Camel discusses Perl's idea of truth on pages 20-21, including some odd cases.
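
    To see these rules in action, including a couple of the odd cases, here is a small sketch. The truth_of subroutine is a made-up helper, not something built into Perl, and it uses a subroutine and an if statement, both of which are covered later in these notes.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in a condition
sub truth_of {
    my ($value) = @_;
    if ($value) {
        return "true";
    }
    return "false";
}

print "42       is ", truth_of(42),       "\n";   # true
print "0        is ", truth_of(0),        "\n";   # false
print "\"0\"      is ", truth_of("0"),      "\n";   # false
print "\"\"       is ", truth_of(""),       "\n";   # false
print "undef    is ", truth_of(undef),    "\n";   # false
print "\"wibble\" is ", truth_of("wibble"), "\n";   # true
print "\"0.0\"    is ", truth_of("0.0"),    "\n";   # true: a non-empty string other than "0"
print "\"00\"     is ", truth_of("00"),     "\n";   # true, for the same reason
```

    The last two lines are the surprising ones: as strings, "0.0" and "00" are neither empty nor exactly "0", so Perl counts them as true even though they are numerically zero.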


    Comparison operators

    We can compare things, and find out whether our comparison statement is true or not. The operators we use for this are:

    Table 6-1. Comparison operators

    Operator  Example    Meaning
    ==        $a == $b   Equality (same as in C and other C-like languages)
    !=        $a != $b   Inequality (again, C-like)
    <         $a < $b    Less than
    >         $a > $b    Greater than
    <=        $a <= $b   Less than or equal to
    >=        $a >= $b   Greater than or equal to

    If we're comparing strings, we use a slightly different set of comparison operators, as follows:

    Table 6-2. String comparison operators

    Operator  Meaning
    eq        Equality
    ne        Inequality
    lt        Less than (in "asciibetical" order)
    gt        Greater than
    le        Less than or equal to
    ge        Greater than or equal to

    Some examples:

    69 > 42                 # true
    "0" == 3 - 3            # true
    'apple' gt 'banana'     # false; apple is alphabetically before
                            # banana
    1 + 2 == "3com"         # true - 3com is evaluated in numeric
                            # context because we used == not eq
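
    Picking the wrong family of operators gives quietly wrong answers rather than an error, so it's worth a quick sketch. The label subroutine is a hypothetical helper that just turns a result into a word.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a comparison result into a word
sub label {
    my ($result) = @_;
    if ($result) {
        return "true";
    }
    return "false";
}

my $numeric_order = (10 > 9);           # numbers: 10 is bigger than 9
my $string_order  = ("10" gt "9");      # strings: "1" sorts before "9"
my $numeric_equal = ("1.0" == 1);       # numbers: both are one
my $string_equal  = ("1.0" eq "1");     # strings: the characters differ

print "10 > 9        is ", label($numeric_order), "\n";   # true
print "\"10\" gt \"9\"   is ", label($string_order),  "\n";   # false
print "\"1.0\" == 1    is ", label($numeric_equal), "\n";   # true
print "\"1.0\" eq \"1\"  is ", label($string_equal),  "\n";   # false
```

    A reasonable rule of thumb is to use the symbol operators when you mean numbers and the letter operators when you mean text.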

    Assigning undef to a variable name undefines it again, as does using the undef function with the variable's name as its argument.


    Existence and Defined-ness

    We can also check whether things are defined (something is defined when it's had a value assigned to it), or whether an element of a hash exists.

    To find out if something is defined, use Perl's defined function. You can't just use the name of the variable because the variable can be defined and still evaluate to false - for example, if you assign it the value 0.

    $skippy = "bush kangaroo";
    if (defined($skippy)) {
        print "Skippy is defined.\n";
    } else {
        print "Skippy is undefined.\n";
    }

    Readme: The defined function is described in the Camel on page 155.

    To find out if an element of a hash exists, use the exists function:

    my %animals = (
        "Skippy"  => "bush kangaroo",
        "Flipper" => "faster than lightning",
    );
    if (exists($animals{"Blinky Bill"})) {
        print "Blinky Bill exists.\n";
    } else {
        print "Blinky Bill doesn't exist.\n";
    }

    Readme: There's a bit of explanation of the difference between a hash key "existing" and being "defined" on page 164 of the Camel.

    One last quick example to clarify existence, definedness and truth:

    my %miscellany = (
        "apple"   => "red",     # exists, defined, true
        "howmany" => 0,         # exists, defined, false
        "koala"   => undef,     # exists, undefined, false
    );
    if (exists($miscellany{"wombat"})) {        # doesn't exist
        print "Wombat exists\n";
    } else {
        print "We have no wombats here.\n";     # this will happen
    }
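
    The three tests can also be applied one after another to every key. This is only a sketch: the describe subroutine is an invented helper, and it uses subroutines and foreach, which are introduced later in these notes.

```perl
use strict;
use warnings;

my %miscellany = (
    "apple"   => "red",
    "howmany" => 0,
    "koala"   => undef,
);

# Hypothetical helper: apply exists, defined and truth in that order
sub describe {
    my ($key) = @_;
    return "doesn't exist"           unless exists($miscellany{$key});
    return "exists, undefined"       unless defined($miscellany{$key});
    return "exists, defined, false"  unless $miscellany{$key};
    return "exists, defined, true";
}

foreach my $key ("apple", "howmany", "koala", "wombat") {
    print "$key: ", describe($key), "\n";
}
```

    The order of the tests matters: checking for truth first would lump "howmany", "koala" and "wombat" together, even though they are false for three different reasons.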


    Boolean logic operators

    Boolean logic operators can be used to combine two or more Perl statements, either in a conditional test or elsewhere.

    The short circuit operators come in two flavours: line noise, and English. Both do similar things but have different precedence. This causes great confusion. There are two ways of avoiding this: use lots of brackets, or read page 89 of the Camel book very, very carefully.

    Advanced: Alright, if you insist: the and and or operators have very low precedence (i.e. they will be evaluated after all the other operators in the condition) whereas && and || have quite high precedence and may require parentheses in the condition to make it clear which parts of the statement are to be evaluated first.
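
    The practical effect of the precedence difference shows up most clearly next to an assignment. The following is a sketch with made-up variable names; it relies only on the fact that = sits between || and or in Perl's precedence table.

```perl
use strict;
use warnings;

my $is_member  = 0;
my $has_ticket = 1;

# || is evaluated before =, so the combined test is what gets stored
my $tight = $is_member || $has_ticket;            # stores 1

# "or" is evaluated after =, so $loose receives $is_member alone; the
# right-hand side only runs because that assignment produced a false value
my $loose;
$loose = $is_member or print "or ran after the assignment\n";

# Parentheses make "or" group the same way as the || line above
my $grouped = ($is_member or $has_ticket);        # stores 1

print "tight=$tight loose=$loose grouped=$grouped\n";
```

    This is why "lots of brackets" is sound advice: with parentheses around the test, both flavours give the same answer.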

    Table 6-3. Boolean logic operators

    English-like  C-style  Example    Result
    and           &&       $a && $b   True if both $a and $b are true; acts on $a then if $a is true, goes on to act on $b.
    or            ||       $a || $b   True if either of $a and $b is true; acts on $a then if $a is false, goes on to act on $b.

    Here's how you can use them to combine conditions in a test:

    $a = 1;
    $b = 2;
    $a == 1 and $b == 2                 # true
    $a == 1 or $b == 5                  # true
    $a == 2 or $b == 5                  # false
    ($a == 1 and $b == 5) or $b == 2    # true (parenthesized expression
                                        # evaluated first)


    Using boolean logic operators as short circuit operators

    These operators aren't just for combining tests in conditional statements --- they can be used to combine other statements as well.

    Here's a real, working example of the || short circuit operator:

    open(INFILE, "input.txt") || die("Can't open input file: $!");

    What is it doing?

    Readme: The open() function can be found on page 191 of the Camel, if you want to look at how it works.

    The && operator is less commonly used outside of conditional tests, but is still very useful. Its meaning is this: If the first operand returns true, the second will also happen. As soon as you get a false value returned, the expression stops evaluating.

    ($day eq 'Friday') && print "Have a good weekend!\n";

    The typing saved by the above example is not necessarily worth the loss in readability, especially as it could also have been written:

    print "Have a good weekend!\n" if $day eq 'Friday';

    if ($day eq 'Friday') {
        print "Have a good weekend!\n";
    }

    ...or any of a dozen other ways. That's right, there's more than one way to do it.

    The most common usage of the short circuit operators, especially || (or or), is to trap errors, such as when opening files or interacting with the operating system.

    Readme: Short circuit operators are covered from page 89 of the Camel book.
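
    Error trapping doesn't have to end the program. As a sketch, the right-hand side of || below records a message instead of calling die. The path is hypothetical and is assumed not to exist on your system.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist
my $file  = "no_such_directory_for_demo/input.txt";
my $error = "";

# If open() fails it returns false, so the right-hand side runs
open(INFILE, $file) || ($error = "Can't open $file: $!");

if ($error) {
    print "$error\n";
} else {
    print "Opened $file\n";
    close(INFILE);
}
```

    Because open() returns true on success, the assignment on the right is skipped entirely when the file can be opened.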


    Types of conditional constructs

    You'll have noticed that we snuck in something new in the last section -- the if construct. It probably didn't surprise you much - you'll have seen something similar in just about every programming language. (Bonus points will not be given for naming programming languages which have no "if" construct.)


    if statements

    The if construct goes like this:

    if (conditional statement) {
        BLOCK
    } elsif (conditional statement) {
        BLOCK
    } else {
        BLOCK
    }

    Both the elsif and else parts of the above are optional, and of course you can have more than one elsif. elsif is also spelt differently to other languages' equivalents - C programmers should take especial note to not use else if.
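
    Here is one way the general form might look when filled in. The price_band subroutine and its price limits are invented for this sketch (subroutines themselves are covered in the next chapter).

```perl
use strict;
use warnings;

# Hypothetical helper: sort a ticket price into a band
sub price_band {
    my ($price) = @_;
    if ($price < 10) {
        return "cheap";
    } elsif ($price < 50) {
        return "moderate";
    } else {
        return "expensive";
    }
}

print "A \$5 ticket is ",   price_band(5),   "\n";   # cheap
print "A \$25 ticket is ",  price_band(25),  "\n";   # moderate
print "A \$120 ticket is ", price_band(120), "\n";   # expensive
```

    The conditions are tested from the top down and only the first block whose condition is true gets run, which is why the second test doesn't need to check that the price is at least 10.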

    If you're testing for something negative, it can sometimes make sense to use the similar-but-opposite construct, unless.

    unless (conditional statement) {
        BLOCK
    }

    There is no such thing as an "elsunless" (thank the gods!), and if you find yourself using an else with unless then you should probably have written it as an if test in the first place.

    There's also a shorthand, and more English-like, way to use if and unless:

    print "We have apples\n" if $apples;
    print "Yes, we have no bananas\n" unless $bananas;


    while loops

    We can repeat a block while a given condition is true:

    while (conditional statement) {
        BLOCK
    }

    while ($hunger) {
        print "Feed me!\n";
        $hunger--;
    }

    The logical opposite of this is the "until" construct:

    until ($full) {
        print "Feed me!\n";
        $full++;
    }


    for and foreach

    Perl has a for construct identical to that of C and Java:

    for ($count = 0; $count <= $enough; $count++) {
        print "Had enough?\n";
    }

    However, since we often want to loop through the elements of an array, we have a special "shortcut" looping construct called foreach, which is similar to the construct available in some Unix shells. Compare the following:

    # using a for loop
    for ($i = 0; $i <= $#array; $i++) {
        print $array[$i] . "\n";
    }

    # using foreach
    foreach (@array) {
        print "$_\n";
    }

    There are some examples of foreach in exercises/foreach.pl

    Tip: foreach (n..m) can be used to loop over an automatically generated list of the whole numbers from n to m.
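
    A short sketch of the tip in use, with illustrative variable names. The first loop relies on $_ as the default loop variable; the second names its own.

```perl
use strict;
use warnings;

# With no loop variable named, each number lands in $_
foreach (1..3) {
    print "Counting: $_\n";
}

# With a named loop variable, collecting the squares of 1 to 5
my @squares;
foreach my $n (1..5) {
    push @squares, $n * $n;
}
print "Squares: @squares\n";    # Squares: 1 4 9 16 25
```

    Both ends of the range are included, so 1..5 produces five numbers.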

    We can loop through hashes easily too, using the keys function to return the keys of a hash as a list that we can use:

    foreach $key (keys %monthdays) {
        print "There are $monthdays{$key} days in $key.\n";
    }

    We'll look at hash functions later.


    Exercises

  • Set a variable to a numeric value, then create an if statement as follows:

      • If the number is less than 3, print "Too small"

      • If the number is greater than 7, print "Too big"

      • Otherwise, print "Just right"

  • Set two variables to your first and last names. Use an if statement to print out whichever of them comes first in the alphabet.

  • Use a while loop to print out a numbered list of the elements in an array

  • Now do it with a foreach loop

  • Now do it with a hash, printing out the keys and values for each item (hint: look up the keys function in your Camel book)

  • Answers are given in exercises/answers/loops.pl


    Practical uses of while loops: taking input from STDIN

    STDIN is the standard input stream for any Unix program. If a program is interactive, it will take input from the user via STDIN. Many Unix programs accept input from STDIN via pipes and redirection. For instance, the Unix cat utility prints out any file it has redirected to its STDIN:

    % cat < hello.pl

    Unix also has STDOUT (the standard output) and STDERR (where errors are printed to).

    We can get a Perl script to take input from STDIN (standard input) and do things with it by using the line input operator, which is a set of angle brackets with the name of a filehandle in between them:

    my $user_input = <STDIN>;

    The above example takes a single line of input from STDIN. The input is terminated by the user hitting Enter. If we want to repeatedly take input from STDIN, we can use the line input operator in a while loop:

    #!/usr/bin/perl -w
    while ($_ = <STDIN>) {
        # do some stuff here, if you want...
        print;    # remember that print takes $_ as its default argument
    }

    Conveniently enough, the while statement can be written more succinctly, because in these circumstances, the line input operator assigns to $_ by default:

    while (<STDIN>) {
        print;
    }

    Better yet, when the angle brackets are empty the line input operator reads from STDIN (or from any files named on the command line, if there are any), so we can shorten the above example yet further:

    while (<>) {
        print;
    }

    As always, there's more than one way to do it.

    The above example script (which is available in your directory as exercises/cat.pl) will basically perform the same function as the Unix cat command; that is, print out whatever's given to it through STDIN.

    Try running the script with no arguments. You'll have to type some stuff in, line by line, and type CTRL-D (a.k.a. ^D) when you're ready to stop. ^D indicates end-of-file (EOF) on most Unix systems.

    Now try giving it a file by using the shell to redirect its own source code to it:

    perl exercises/cat.pl < exercises/cat.pl

    This should make it print out its own source code.


    Named blocks

    Blocks can be given names, thus:

    #!/usr/bin/perl -w
    LINE: while (<STDIN>) {
        ...
    }

    By tradition, the names of blocks are in upper case. The name should also reflect the type of thing you are iterating over -- in this case, a line of text from STDIN.


    Breaking out of loops

    You can change the flow of a loop using next (which skips to the next iteration), last (which leaves the loop altogether) and similar statements.

    #!/usr/bin/perl -w
    LINE: while (<STDIN>) {
        chomp;                          # remove newline
        next LINE if $_ eq '';          # skip blank lines
        last LINE if lc($_) eq 'q';     # quit
    }

    The LINE indicating the block to break out of is optional (it defaults to the current smallest loop), but can be very useful when you wish to break out of a loop higher up the chain:

    #!/usr/bin/perl -w
    LINE: while (<STDIN>) {
        chomp;                          # remove newline
        next LINE if $_ eq '';          # skip blank lines

        # we split the line into words and check all of them
        foreach (split ' ', $_) {
            last LINE if lc($_) eq 'quit';      # quit
        }
    }


    Chapter summary

    • A block in Perl is a series of statements grouped together by curly brackets. Blocks can be used in conditional constructs and subroutines.

    • A conditional construct is one which executes statements based on the truth of a condition

    • Truth in Perl is determined by testing whether something is NOT any of: numeric zero, the null string, or undefined

    • The if - elsif - else conditional construct can be used to perform certain actions based on the truth of a condition

    • The while, for, and foreach constructs can be used to repeat certain statements based on the truth of a condition.

    • A common practical use of the while loop is to read each line of a file.

    • Blocks may be named using the NAME: convention

    • You can break out of blocks using next, last and similar statements


    Chapter 7. Subroutines

    In this chapter...

    In this chapter, we look at subroutines and how they can be used to simplify your code.


    Introducing subroutines

    If you have a long Perl script, you'll probably find that there are parts of the script that you want to break out into subroutines. In particular, if you have a section of code which is repeated more than once, it's best to make it a subroutine to save on maintenance (and, of course, linecount).

    A subroutine is basically a little self-contained mini-program in the form of a block which has a name, and can take arguments and return values:

    # the general case
    sub name {
        BLOCK
    }

    # the specific case
    sub print_headers {
        print "Programming Perl, 2nd ed\n";
        print "by\n";
        print "Larry Wall et al.\n";
    }


    Calling a subroutine

    A subroutine can be called in either of the following ways:

    &print_headers;
    print_headers();

    If (for some reason) you've got a subroutine that clashes with a reserved function or something, you will need to prefix your function name with & (ampersand) to be perfectly clear. You should avoid doing this anyway; overloading built-in functions can cause more confusion than it's worth.

    Advanced: There are other times when you need to use an ampersand on your subroutine name, such as when a function needs a SUBROUTINE type of parameter, or when making an anonymous subroutine reference.
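
    For the curious, one such case might look like the sketch below: \&shout takes a reference to a named subroutine so that it can be handed to another subroutine. Both shout and apply_to_all are invented for this illustration, and references in general are beyond the scope of these notes.

```perl
use strict;
use warnings;

# An ordinary named subroutine
sub shout {
    my ($text) = @_;
    return uc($text) . "!";
}

# Hypothetical helper that expects a subroutine reference as its first argument
sub apply_to_all {
    my ($code, @items) = @_;
    my @results;
    foreach my $item (@items) {
        push @results, &$code($item);    # call the subroutine through its reference
    }
    return @results;
}

# \&shout means "a reference to the subroutine called shout"
my @loud = apply_to_all(\&shout, "apple", "banana");
print "@loud\n";    # APPLE! BANANA!
```

    Without the ampersand, \shout("apple") would call shout first and take a reference to its result instead.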


    Passing arguments to a subroutine

    You can pass arguments to a subroutine by including them in the brackets when you call it. The arguments end up in an array called @_ which is only visible inside the subroutine.

    print_headers("Programming Perl, 2nd ed", "Larry Wall et al");

    # we can also pass variables to a subroutine by name...
    my $fiction_title = "Lord of the Rings";
    my $fiction_author = "J.R.R. Tolkien";
    print_headers($fiction_title, $fiction_author);

    sub print_headers {
        my ($title, $author) = @_;
        print "$title\n";
        print "by\n";
        print "$author\n";
    }

    You can take any number of scalars in as arguments - they'll all end up in @_ in the same order you gave them.

    Readme: You could also use $title = shift; $author = shift; to get the same result. See the entry for shift on page 215 of the Camel book.
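
    The shift version might be sketched like this. The name format_headers is made up, and the subroutine returns its text rather than printing it so that the result is easy to inspect.

```perl
use strict;
use warnings;

# Hypothetical variant of print_headers that uses shift instead of list assignment
sub format_headers {
    my $title  = shift;     # inside a subroutine, shift works on @_ by default
    my $author = shift;     # each shift removes the next argument from the front
    return "$title\nby\n$author\n";
}

my $headers = format_headers("Programming Perl, 2nd ed", "Larry Wall et al");
print $headers;
```

    One difference from the list assignment is that shift removes the arguments from @_ as it goes, so anything left in @_ afterwards is whatever extra arguments were passed.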


    Returning values from a subroutine

    To return a value from a subroutine, simply use the return function.

    sub print_headers {
        my ($title, $author) = @_;
        return "$title\nby\n$author\n\n";
    }

    sub sum {
        my $total = 0;      # start from zero to avoid an "uninitialized" warning
        foreach (@_) {
            $total = $total + $_;
        }
        return $total;
    }

    You can also return lists from your subroutine:

    # subroutine to return the first three arguments passed to it
    sub firstthree {
        return @_[0..2];
    }

    my @three_items = firstthree("x", "y", "z", "a", "b");
    # sets @three_items to ("x", "y", "z");


    Exercises

  • Write a subroutine which prints out its first argument

  • Modify the above subroutine to also print out the last argument

  • Now change it to compare the first and last arguments and return the one which is numerically larger (you'll want to use an if statement for that)


    Chapter summary

    • A subroutine is a named block which can be called from anywhere in your Perl program

    • Subroutines can accept parameters, which are available via the special array @_

    • Subroutines can return scalar or list values.


    Chapter 8. Regular expressions

    In this chapter...

    In this chapter we begin to explore Perl's powerful regular expression capabilities, and use regular expressions to perform matching and substitution operations on text.


    What are regular expressions?

    The easiest way to explain this is by analogy. You will probably be familiar with the concept of matching filenames under DOS and Unix by using wildcards - *.txt or /usr/local/* for instance. When matching filenames, an asterisk can be used to match any number of unknown characters, and a question mark matches any single character. There are also less well-known filename matching characters.

    Regular expressions are similar in that they use special characters to match text. The differences are that any kind of text can be matched, and that the set of special characters is different.

    Regular expressions are also known as REs, regexes, and regexps.

    Tip: If you have a mathematical background, you may like to think of a regexp as a definition of a set of strings. For instance, a regexp may describe the set of all strings which begin with the letter "a".


    Regular expression operators and functions


    m/PATTERN/ - the match operator

    The most basic regular expression operator is the matching operator, m/PATTERN/.

    • Works on $_ by default.

    • In scalar context, returns true (1) if the match succeeds, or false (the empty string) if the match fails.

    • In list context, returns a list of any parts of the pattern which are enclosed in parentheses. If there are no parentheses, a successful match returns the list (1); with the /g modifier it returns every matched string, as if the entire pattern were parenthesized.

    • The m is optional if you use slashes as the pattern delimiters.

    • If you use the m you can use any delimiter you like instead of the slashes. This is very handy for matching on strings which contain slashes, for instance directory names or URLs.

    • Using the /i modifier on the end makes it case insensitive.

    while (<>) {
        print if m/foo/;        # prints if a line contains "foo"
        print if m/foo/i;       # prints if a line contains "foo", "FOO", etc
        print if /foo/i;        # exactly the same; the m is optional
        print if m!http://!;    # using ! as an alternative delimiter
    }
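
    The return values described in the list above can be checked with a sketch like the following. The sample text assigned to $_ is made up for illustration.

```perl
use strict;
use warnings;

$_ = "See http://www.perl.com/ for FooBar";     # hypothetical sample line

# Scalar context: true (1) or false (the empty string)
my $found  = m/foobar/i;                # 1, because /i ignores case
my $missed = m/foobar/;                 # empty string, because case matters here

# List context: the pieces captured by each pair of parentheses
my ($first, $second) = m/(foo)(bar)/i;  # "Foo" and "Bar"
my ($scheme)         = m!(http)://!;    # "http", using ! as the delimiter

# List context with no parentheses in the pattern: just (1) on success
my ($plain) = m/perl/;

print "found='$found' missed='$missed'\n";
print "first='$first' second='$second' scheme='$scheme' plain='$plain'\n";
```

    Note that the captured text keeps the case it had in the original string, even when the match itself was case insensitive.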


    s/PATTERN/REPLACEMENT/ - the substitution operator

    This is the substitution operator, and can be used to find text which matches a pattern and replace it with something else.

    • Works on $_ by default.

    • In scalar context, returns the number of matches found and replaced.

    • In list context, behaves the same as in scalar context and returns the number of matches found and replaced.

    • You can use any delimiter you want, the same as the m// operator.

    • Using /g on the end of it matches globally, otherwise matches (and replaces) only the first instance of the pattern.

    • Using the /i modifier makes it case insensitive.

    # fix some misspelt text
    while (<>) {
        s/freind/friend/g;
        s/teh/the/g;
        s/jsut/just/g;
        print;
    }

    The above example can be found in exercises/spellcheck.pl.
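
    The return value can be put to use for counting corrections. This sketch works on a made-up line of text held in $_ rather than reading from a file.

```perl
use strict;
use warnings;

$_ = "teh freind jsut saw teh cat";     # hypothetical misspelt line

my $fixed_teh     = s/teh/the/g;            # 2: both copies are replaced
my $fixed_freind  = s/freind/friend/g;      # 1
my $fixed_jsut    = s/jsut/just/g;          # 1
my $fixed_recieve = s/recieve/receive/g;    # false (empty string): nothing to fix

print "$_\n";       # the friend just saw the cat
print "Corrections made: ", $fixed_teh + $fixed_freind + $fixed_jsut, "\n";
```

    When nothing matches, the string is left untouched and the operator returns false, which makes it easy to test whether a substitution actually happened.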


    Binding operators

    If we want to use m// or s/// to operate on something other than $_ we need to use binding operators to bind the match to another string.

    Table 8-1. Binding operators

    Operator  Meaning
    =~        True if the pattern matches
    !~        True if the pattern doesn't match

    print "Please enter your homepage URL: ";
    my $url = <STDIN>;
    if ($url =~ /geocities/) {
        print "Ahhh, I see you have a geocities homepage!\n";
    }
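
    Both binding operators, and a substitution bound to a variable, can be sketched as below. The URL is a made-up stand-in for what a user might type, so that the example runs without waiting for input.

```perl
use strict;
use warnings;

# Hypothetical input, as if it had been read from STDIN
my $url = "http://www.geocities.com/~skippy/\n";
chomp($url);                                # remove the trailing newline

my $verdict;
if ($url !~ /http/) {                       # !~ is true when there is NO match
    $verdict = "not a web address";
} elsif ($url =~ /geocities/i) {            # =~ is true when there is a match
    $verdict = "geocities homepage";
} else {
    $verdict = "some other homepage";
}

# s/// can be bound in the same way; it then changes $url instead of $_
my $changes = ($url =~ s/geocities/GeoCities/);

print "$verdict\n";
print "$url ($changes change made)\n";
```

    Remember that binding s/// to a variable modifies that variable in place, so keep a copy first if you still need the original text.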


    Metacharacters

    The special characters we use in regular expressions are called metacharacters, because they are characters that describe other characters.


    Some easy metacharacters

    Table 8-2. Regular expression metacharacters

    Metacharacter(s)  Matches...
    ^                 Start of string
    $                 End of string
    .                 Any single character except \n (though special things can happen in multiline mode)
    \n                Newline (subtly different to $ - when working in multiline mode, there may be newlines embedded in the multiline string you're working with)
    \t                A tab
    \s                Any whitespace character, such as space or tab
    \S                Any non-whitespace character
    \d                Any digit (0 to 9)
    \D                Any non-digit
    \w                Any "word" character - alphanumeric plus underscore (_)
    \W                Any non-word character
    \b                A word break - the zero-length point between a word character (as defined above) and a non-word character.

    Readme: These and other metacharacters are all outlined in chapter 2 of the Camel book and in the perlre manpage - type perldoc perlre to read it.

    Any character that isn't a metacharacter just matches itself. If you want to match a character that's normally a metacharacter, you can escape it by preceding it with a backslash.
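
    To try a few of these out, here is a sketch that runs several patterns against made-up strings and keeps the results. All the variable names and sample strings are illustrative.

```perl
use strict;
use warnings;

my $line = "cat 42\tconcatenate";       # hypothetical sample text with a tab in it

my $starts_with_cat = ($line =~ /^cat/);        # true: "cat" is at the start
my $ends_with_cat   = ($line =~ /cat$/);        # false: the string ends in "ate"
my $whole_word_cat  = ($line =~ /\bcat\b/);     # true: the first word is exactly "cat"
my $has_tab         = ($line =~ /\t/);          # true
my ($number)        = ($line =~ /(\d\d)/);      # "42": the first two digits in a row
my ($any_three)     = ($line =~ /(...)/);       # "cat": . matches any character but newline

# Escaping: an unescaped . matches anything, an escaped \. only a real dot
my $unescaped = ("3x50" =~ /3.50/);             # true
my $escaped   = ("3x50" =~ /3\.50/);            # false
my $price     = ('$3.50' =~ /\$\d\.\d\d/);      # true: literal $, digit, literal dot, two digits

print "number=$number any_three=$any_three\n";
print "An unescaped dot matched 3x50\n" if $unescaped;
print "An escaped dot did not match 3x50\n" unless $escaped;
```

    The single quotes around '$3.50' matter: in double quotes Perl would try to interpolate $3 as a variable.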

    Some quick examples:

    # Perl regular expressions are usually found within slashes - the
    # matching operator/function which we will see soon.

    /cat/               # matches the three characters
                        # c, a, and t in that order.
    /^cat/              # matches c, a, t at start of line
    /\scat\s/           # matches c, a, t with spaces on
                        # either side
    /\bcat\b/           # same as above, but won't
                        # include the spaces in the text
                        # it matches

    # we can interpolate variables just like in strings:
    my $animal = "dog";     # we set up a scalar variable
    /$animal/           # matches d, o, g
    /$animal$/          # matches d, o, g at end of line
    /\$\d\.\d\d/        # matches a dollar sign, then a
                        # digit,