Getting Started with Perl

Perl Background, Features and Running Scripts

This is to enable a Perl newcomer to get started, using either Linux, MacOS, Windows, or a flavour of Unix. It introduces some global Perl concepts, terminology and good programming practices, in a modern context.

Introducing Perl

Perl is a scripting language, which is used every day by many people, for a wide variety of purposes. It is arguably most popular as an alternative to shell scripting, AWK and sed.

How much Perl you can learn in a specific period, depends on two factors: a) your background and experience in other languages, and b) how much time and effort you are willing to dedicate practicing new topics. The core objective should be to obtain as much self-sufficiency as possible in the generic use of Perl for arbitrary applications.

Our focus initially is on the language, terminology, concepts and good coding conventions of Perl, and not specific application areas, even if that may not sound appealing. Once a firm conceptual foundation using correct terminology has been established, you should be able to gain experience in a relatively independent and self-sufficient manner.

Background

The Perl language was created by Larry Wall in 1987. The name “Perl” is not an acronym, although one common explanation suggests it means: “Practical Extraction and Reporting Language”. That is not official. Others suggest: Pathologically Eclectic Rubbish Lister.

Perl runs on many operating systems, most POSIX-like, including WSL (Windows Subsystem for Linux), but also natively on Windows®. The latest version, as of this writing, is 5.30.1 (released 2019). It can be installed from source, if required. For one reason or another, which does not concern us here, Perl has been steadily losing ground in popularity, if the TIOBE Index is any indication. In that sense, Ruby and Python are its main “competitors”.

Presumably due to many cryptic Perl scripts and snippets, Perl is considered difficult to learn, read and maintain. This is in no way due to any deficiencies in Perl, but entirely due to irresponsible scripters showing off their mad Perl ‘skillz’. If Perl is to blame, it is most likely a consequence of a design philosophy that affords programmers freedom to achieve goals in a multitude of ways and paradigms — probably more so than any other scripting language. The approach we take, emphasises good coding conventions from the start.

Perl on Windows

The recommended version to use on Windows is Strawberry Perl — it works the most like Perl on POSIX systems (Unixes, MacOS, Linux). On the Strawberry Perl releases page, you will even find a ‘portable’ version, which means it does not require installation, and will run from any directory (preferably one where the whole path contains no spaces or other characters that would require escaping). You can also find Perl in Cygwin, and MSYS2. The latter is recommended if you also use other features of MSYS2, like its port of the Arch Linux package manager: pacman to install, update and manage packages.

IMPORTANT — Strawberry Perl Portable Environment

Since the portable version of Strawberry Perl does not modify your Windows Registry, nor your Start Menu, you must run the portableshell.bat file in the directory where you extracted it. It will open a new Command Prompt Console, with the PATH set, so that you can run all Perl command-line tools from this session. You can double-click on this batch file from a file browser. More than one session can be opened at the same time.

Strawberry Perl includes C/C++ compilers from MinGW-w64, so that Perl packages from CPAN that contain C/C++ components can be compiled, just like on POSIX systems. This distinguishes it from ActivePerl, which uses its own repository and package manager for modules from ActiveState, compiled on your behalf. ActivePerl does not have a portable version, and it must be installed. If you can live with that, it is a very Windows-friendly Perl environment.

Perl on MacOS & Linux

On MacOS and Linux, you can consider using Perlbrew, which allows you to install any number of Perl versions in your home directory, without affecting the vendor (system) Perl installation. You can use it to switch between ‘active’ Perl versions. It can be configured to support CPAN. Installed modules will only affect the active version, and will also be installed in your home directory, or /usr/local. An even simpler alternative is to download the latest relocatable Perl; it is self-contained and comes with cpanm (CPAN minus) already installed.

Another option to obtain not only Perl, but almost any other free and/or open source package, including GUI applications like MacVim and Dash, and to keep them up to date, is Homebrew — ‘The missing package manager for MacOS’ (their words). We cannot recommend it highly enough for power users. We would not be using MacOS, if not for Homebrew.

NOTE — System Perl

Both MacOS, and your typical Linux distributions, come with Perl installed. It is considered an essential part of any POSIX operating system. The version installed will always be an older one than the current Perl available. This is just something to be aware of, not a serious issue. If you want the latest Perl, you cannot override the system Perl, which is why we recommend Perlbrew as an option. Your job, however, may involve system scripts, using the system's Perl, so you may already have all you require.

Once you have followed the Perlbrew instructions to get it on your system, you can run (for example): ‘perlbrew install perl-5.16.0’ (or whatever version you prefer, but v5.16 is really a minimum). Once installed, you can run: ‘perlbrew switch perl-5.16.0’ to enable your new Perl. We shall assume that you either use Perlbrew, or system Perl, as long as it is at least v5.16. Occasionally, we shall mention features available in later versions.

On Unix variants, you are on your own, since it would be too time-consuming to consider various scenarios for all the Unix flavours. Suffice to say, you should have a system Perl at least. If it is older than v5.16, you will just be missing out on a few features — the concepts, terminology, and application remain the same.

Some Considerations

A fundamental principle of Perl is that it provides more than one way to accomplish almost any task. One consequence of this is that Perl code can be very difficult to read, unless the authors have been responsible, and used all the commendable features of Perl, rather than the obscure and terse syntax options. The other consequence is the huge number of Perl keywords available.

Perl does not provide any REPL interface (Read-Eval-Print-Loop), unlike some other languages, for example, Python. You can install the Devel::REPL module, or alternatively, the Reply from CPAN. Since both the modules require quite a few extra prerequisites, you may want to consider whether you really need a Perl REPL. At minimum, running: ‘perl -de '0'’ on the command line will enter the Perl debugger, and you can then perform some interactive computation. Note that on MacOS the debugger line editor does not recognise arrow keys, and the debugger cannot deal with multi-line blocks with any degree of sophistication (the block must be on one line).

Perl scripts or programs can be executed on the command line with: ‘perl ‹script-name›’. So-called Perl one-liners involve one of several switches (options), and some Perl code enclosed in shell quotes, to be executed directly (no script). This is not so easy on Windows, simply because the Command Prompt is unfortunately somewhat archaic (it does not distinguish between single quotes and double quotes, and escaping is not well-defined).

Perl can also be executed inside several graphical environments, typically called ‘integrated development environments’ (IDEs). We advise against the use of IDEs when learning Perl, since we feel it is initially more of a burden than any tangible benefit (code-completion is a productivity enhancement, especially if you cannot touch type, but does not aid comprehension, nor knowledge retention). Even syntax-highlighting and automatic indentation has dubious pedagogical value; however, we still recommend a good programmer's editor at minimum. We use Vim, but your decision should be based on rational consideration, not the recommendations of others — there is no best editor for everybody.

Your Linux distribution may not have perldoc installed by default, but it is often available as a package (e.g.: you may have to run ‘sudo apt install perl-doc’). We consider perldoc and perltidy essential tools.

Basic Scripts

The following will be used as our ‘baseline’ template for scripts. The version of Perl should be as high as you can manage — this depends on the version of Perl available in your environment, or the requirements for your project.

Modern Perl

Although many tutorials do not discuss it until later, we consider it almost mandatory, right from the start, that you add ‘use strict;’ to your Perl scripts. Using a specific version (v5.16 as minimum) will enable some modern features, and automatically enable strict mode. Otherwise, Perl allows you to write code that, in a modern context, is generally viewed as bad practice.

Starter Template

Below is a commented template you can use for any new script. Unnecessary comments (text starting with a hash # character) can obviously be removed, except for the very first comment, which is a shebang (hash-bang) line. Do not follow advice from older material to change it to #!/usr/bin/perl.

`template.pl` — Example Perl Script Template

#!/usr/bin/env perl
#
# @file    template.pl
# @brief   Example Perl Script Template
# @date    2019-12-12
# @licence MIT (https://opensource.org/licenses/MIT)
#
use v5.16;   #<- or: `use 5.016;` — just one preference, either works.
use utf8;    #<- optional, but cannot hurt, so you may as well add it.
binmode STDOUT, ":encoding(utf8)"; #<- set standard input and error
binmode STDERR, ":encoding(utf8)"; #   handles for UTF-8 encoding.

...          #  the ‘body’ of your script. what the script must ‘do’.

exit 0;      #<-specify an exit code (`0` == OK, non-zero == NOT-OK).

=encoding utf-8
=pod

=head1 NAME

scriptname

=head1 DESCRIPTION

Some high-level description of this script. Look at `perldoc perlpod`
for more information on what is required in good documentation.

=cut
__END__

The first few lines of comments should contain some official information. Examples are dates, author(s) and licensing. A use strict; pragma line is only necessary if you are using a Perl version earlier than v5.12. If you do not know how to set your editor for UTF-8 encoding, you may well omit the use utf8; line, but you should probably file this away for future investigation.

Vim Mode Lines

You can optionally add a Vim modelines, which allows a way to set file-specific settings by having a special line in a language-specific comment in your file. Here is an example for Perl:

Example Vim modeline for Perl

# vim: set ft=perl ff=unix fenc=utf-8 et ts=8 sts=4 sw=4 :

The settings are consistent with the default configuration for the recommended perltidy Perl code formatting utility.

Perl Pragmas

The lines starting with ‘use’ are called pragmas. For example, a pragma called ‘constant’ allows us to effectively create symbolic constants. The identifiers created require no sigil, and by convention, we create them with all-uppercase letters: ‘use constant MAX_LINES => 20;’

Pragmas effectively load Perl modules. The ‘use’ pragma has several variations, as you might see from its documentation. All variations, except one, require a Module name, making constant a module, but in the context of how we use it, and its effect, it is called a pragma. We shall look at use in more detail later.

Exit Codes

The ‘exit 0;’ statement is not necessary, but it is a good convention. If some error occurs earlier, and you cannot continue, you can have: ‘exit 1;’ or some other number instead of 1, to indicate that it is not a “normal” completion. Shell scripts and other programs can check the exit code of a program, and will consider a 0 exit code as “success”.

The exit code of processes can be determined on the command line (POSIX) by inspecting the $? special environment variable. It will contain the exit code of the last process run. The Windows Command Prompt makes the same information available in the %ERRORLEVEL% environment variable.

In PowerShel, the $? environment variable will contain $true for an exit code of 0, and $false for non-zero exit codes. The exact exit code can be found in the $LASTEXITCODE environment variable.

Basic Script Features

From the above template, you may deduce correctly that a hash character (‘#’) starts a comment. This means Perl will ignore all text starting with the ‘#’, until the end of the line. The first line is also a comment, but is special in POSIX shells — it specifies the interpreter to use for this script, if the script has been marked executable: ‘chmod a+x ‹scriptname›’. This is called a hash-bang or shebang line. It is ignored on Windows when running inside the Command Prompt, or even in PowerShell, but you should add it anyway.

Everything is ignored in a Perl script after ‘__END__’, which is optional. So you can put anything there, like lines of notes or data, which will be available when you read from the ‘DATA’ filehandle in the main script. You can use ‘__DATA__’ instead in other modules called from the main script, in which case you would read from ‘‹module-name›::DATA’.

Regardless of the presence of the ‘__END__’ line, the ‘=pod’ ⋯ ‘=cut’ section is treated as a special comment, so it could have appeared anywhere. We just prefer to put documentation at that location, especially in educational code where the focus is on syntax.

Running Perl Scripts

You run Perl scripts either by giving the ‘perl’ program a script file as argument, or setting its mode to executable on POSIX systems, with: ‘chmod a+x’. Any text after the script name, will be treated as arguments to the script (Perl makes arguments available to the script, in the ‘@ARGV’ global array).

In Perl for Windows, you are able to run the ‘pl2bat’ utility to create a ‘.bat’ file from a ‘.pl’ script. Running the batch file will run the embedded script. This is not ideal, since redirection and piping with batch files were not serious considerations for the Command Prompt designers, so only do that if you have no other choice.

There is no equivalent of pl2bat for PowerShell (open-sourced in 2016, running on Linux, MacOS and Windows), which is unfortunate. It is beyond our scope here, but you really should consider using PowerShell instead of Command Prompt on Windows. A PowerShell package does allow you to call PowerShell commands from Perl.

Warnings and Diagnostics

While learning Perl, or even later when looking for program issues or bugs, you can add the -w (useful warnings), or -W (all warnings) switch to the Perl interpreter executable, e.g.: ‘perl -w hello.pl’. More verbose messages can be obtained with: …-Mdiagnostics….

You can gain many useful insights regarding good modern Perl programming practices from the warnings. You generally do not want warnings and diagnostics enabled in production scripts, but for learning and development, both are essential. The benefits must not be underestimated.

Advanced — AutoDie

Perl experts suggest also adding use autodie; as part of your preamble pragmas. The autodie module replaces some functions with ones that succeed or die with lexical scope. Practically, when autodie is in effect, this means you can open a file, without checking its return code to see if it was successful — an exception will occur instead within the current scope (normally an eval block). Here is one example from its documentation:

Advanced use of `eval` and the `autodie` module

use feature qw(switch);
 
eval {
   use autodie;
   open(my $fh, '<', $some_file);
   my @records = <$fh>;
   # Do things with @records...
   close($fh);
};

given ($@) { # ☆
   when (undef)   { say "No error";                    }
   when ('open')  { say "Error from open";             }
   when (':io')   { say "Non-open, IO error.";         }
   when (':all')  { say "All other autodie errors."    }
   default        { say "Not an autodie error at all." }
}

^☆The given switch was introduced in Perl 5.10, so if you already have ‘use v5.16’ or higher, adding ‘use feature qw(switch);’ will be redundant.

As this is a bit advanced, you may ignore autodie for a while, as long as you do not entirely forget about it. You can start using it once you have gained more experience. Using the autodie module without an eval block, still means your script will terminate if a file could not be opened, for example, but that would just be an avoidance technique, not good practice.

Hello World

This represents the simplest professional Perl script — including POD (Plain Old Documentation; see ‘perldoc perlpod’ for more information, in case you are prone to premature curiosity). It does not do much, and exists only to provide a guideline and some structure. You should put all scripts in some working directory, e.g. C:\Course\work, $HOME/Work or in WSL: /mnt/c/Course/Work. The exact name or location is immaterial, but you should give it some thought.

`hello.pl` — Annotated Example “Hello World” Perl Script

#!/usr/bin/env perl
#
# @date    2019-12-12
# @author  Incus Data (Pty) Ltd.
# @license MIT https://opensource.org/licenses/MIT
#
use v5.16; use utf8;

print "Hello, World!\n";             #<- embedded newline (escaped).
say "Goodbye, cruel World";          #<- automatic newline added.

exit 0;         #<-good convention. means “successful completion”.

=encoding utf-8
=pod

=head1 NAME

hello.pl

=head1 DESCRIPTION

Example program, printing the traditional "Hello, World!" message,
and showing the difference between `print` and `say`. Comments and
POD are overkill for such a simple program, but are professional.

=cut
__END__

Notice a few points regarding fundamental Perl language features, and some initial terminology:

Comments start with # (hash characters). Perl ignores text from #, till end of line.
You should always be aware of which version of Perl you are targeting.
Statements starting with use, are called pragmas.
You should use UTF-8 encoding in your editor.
Perl is case sensitive. Never use names that differ only in their case.
Perl ignores whitespace (spaces, tabs, newlines). You should use ‘Unix line endings’.
Perl expects statements to be separated with a semicolon (;); a trailing semicolon is allowed, and treated as a ‘null statement’.
Text between double quotes ("···"), or single quotes ('···') is called a string literal.
Special characters, like newline or backslash, must be escaped in a double-quoted string literal.
Perl is a block-structured language, with blocks delimited by curly-braces. Blocks can appear without a preamble, and provides a lexical scope.
The __END__ in scripts is optional, and Perl ignores any text following it.

The most fundamental and universally portable way to run a Perl script like hello.pl above is:

perl hello.pl

On POSIX systems, you can do: ‘chmod a+x hello.pl’ to mark it as executable. If it is not on your PATH, you can still execute it as:

./hello.pl

The best you can do on Windows (but it is hardly ideal), is to run the Windows-specific pl2bat program, which will “wrap” the script in a batch file. Thus:

pl2bat hello.pl

will result in hello.bat, which you can run as hello. Redirection and piping do not work well with batch files on Windows, so only use this feature if you never expect to use piping or indirection with the program. Arguments passed to the batch file, will be forwarded to the Perl script.

Running ‘perldoc hello.pl’ will display POD documentation, if present. Here is the output:

NAME
    hello.pl

DESCRIPTION
    Example program, printing the traditional "Hello, World!" message, and
    showing the difference between `print` and `say`. Comments and POD are
    overkill for such a simple program, but are professional.

On Linux and MacOS, we suggest you keep your scripts in a ‘source directory’ with scripts having the .pl extension, and create symbolic links without the extension, in some directory in your $PATH.

NOTE — Output Pager Programs

The perldoc program will pipe the output to your pager program, which is generally the less pager, or the much older and limited more pager. Windows Command Prompt only has more, although a native less for Windows is freely available. Like many programs using a pager, perldoc will look at the PAGER environment variable (if it exists) to determine which pager to use. We highly recommend that you install less for Windows, and set your PAGER environment variable. The more pager is really just awful (and that is being polite).

Formatting and Indentation

Blank lines should be used to separate logical units of code from one another, much like spaces between paragraphs in many technical books. Blocks should be indented with a consistent number of spaces — called an indentation level. Nested blocks should be indented one level from the outer block.

The location of braces is a style, and should be consistent. We suggest that the opening brace appear on the same line as the statement it “belongs” to. The closing curly brace may or may not be indented, but should appear on a line all by itself.

The Perltidy script should not be ignored. It can be installed with ‘cpan Perl::Tidy’, manually downloaded and installed, or it may be available as a package in your Linux distribution. Perltidy can reformat your code according to standard conventions, but has many options to fine-tune its output to your specifications. The best option is to experiment with it, and incorporate perltidy in your workflow — especially if you share your work amongst colleagues.

Grouping code in ‘bare blocks’, i.e., curly-brace delimited blocks that are not associated with any statement, is perfectly valid, and a good convention to group related code. One advantage is that variables defined in such a block, are local to the block and will thus not be visible outside the block. This feature of Perl (and C, C#, C++, Java, PHP, JavaScript, etc.) is not used often enough.

Documentation

The documentation on PerlDoc is extensive — it covers the language syntax, built-in functions, built-in variables, regular expressions and documentation for the core modules. You can view a list of core modules using the ‘corelist’ utility. For example, to list all the standard core modules supplied with Perl v5.16.0, you can run this command:

corelist -v v5.16.0

You should probably pipe the output to ‘less’, or ‘more’, or redirect to a file — the list is big.

For the perl executable, you can run perl --help for a list of available options (switches).

PerlDoc

The ‘perldoc’ program, which is supplied with Perl, can display all the documentation found on PerlDoc on the command line (in text format, of course, not HTML). It can also display documentation for your scripts, if you documented them according to the perlpod specification. So, for our example script above, you can run the following command (output is shown afterwards):

perldoc template.pl

  NAME
      scriptname

  DESCRIPTION
      Some high-level description of this script. Look at `perldoc perlpod`
      for more information on what is required in good documentation.

‘perldoc’ pipes its output to ‘less’ or ‘more’, but you can redirect it yourself.

To get a table of contents for all the documentation, run:

perltoc perltoc

It will list all the names you can use with ‘perldoc’ to obtain further information. For example, you will read that perlfunc will provide documentation for all Perl's functions, so you can run:

perldoc perlfunc

For help on a specific function, add the ‘-f’ flag. For example, for documentation specific to the ‘chomp’ function, run:

perldoc -f chomp

For module documentation, including non-core modules you installed additionally, you must specify the full module name (it is case-sensitive):

perldoc Data::Dumper

^☆ ‘Data::Dumper’ is one of the core modules, and is not significant here; it is only used as an example module name.

You can also use: ‘perldoc -q "‹query›"’ to search the Perl FAQ. Unfortunately, it is not a generalised query; you still need to know the FAQ keywords, which is kind of useless for newcomers.

If HTML appeals to you, the Perl-supplied pod2html converter can convert any POD documentation (including all Perl's official documentation) to HTML. For example, you can try the following command line:

perldoc -u pod2html > pod2html.html

This will format the pod2html documentation as HTML, which we redirect to the pod2html.html file. You can open this file in your browser. It does not have CSS styling, so it may look a bit boring. You can open the HTML file from the Windows command line with: start pod2html.html, or on MacOS with: open pod2html.html. On Linux, which command you can use depends on your window manager, but try: firefox pod2html.html from your GUI terminal.

Offline Perl Documentation

You can also download the PerlDoc documentation in HTML format, with an embedded JavaScript search engine. Alternatively, you can install Zeal (or use the portable version), and download the Perl “docset” for an offline documentation viewer. Zeal can be used to view many other “docsets” as well, making it a useful developer tool. On MacOS, use Dash, on which Zeal is based, and whose docsets Zeal uses.

Regarding Zeal:

Remember you can turn off the ads it displays on the starting page, in the preferences.
Be patient when updating or downloading docsets, and do not interrupt it.
Delete the zeal.ini file in the Portable Zeal directory when moving or renaming it, otherwise it will not work afterwards.

Practical Patterns

When you get started with Perl scripting, you may be impatient to get ‘something useful’ accomplished. We understand that, and encourage you to be patient — the better you learn Perl for the sake of learning Perl, the easier it will be for you to accomplish your goals. With that in mind, we present a number of practical patterns to get you started.

Scalar Variables and Values

The simplest type or category of variables, is the scalar variable, indicated with the $ sigil. Variables should be defined before they are used, with the my keyword; it can define several variables at the same time, as long as they are enclosed in parentheses.

Variable definition syntax variations

my $var;                         #<- define single variable.
my ($a, $b, $c);                 #<- define multiple variables.
my $rav = 123;                   #<- define & initialise one variable.
my ($d, $e, $f) = (12, 34, 56);  #<- define & initialise many variables.
my ($g) = (11, 22, 33);          #<- define 1 variable, set it to `11`.
my $h = (11, 22, 33);            #<- define 1 variable, set it to `3`,
                                 #   (number of elements in the list).

List assignment makes it possible to initialise several variables, as the third last statement illustrates. The last two statements illustrate the difference between using, or omitting, the parentheses around a single variable definition. Arrays or lists used in a scalar context, results in the number of elements in the list or array variable.

Variables can store any Perl type, at any time. Basic types are numbers and strings in the form of literals: 123 (integer literal), 1.23 (floating point literal), "ABC" (string literal), 'ABC' (another string literal). Types are converted implicitly as needed, depending on the operation performed on them.

Typical operations include the arithmetic operators, ** for exponentiation or ‘to the power’, % for modulus or ‘remainder’, . for string concatenation. Numbers are compared with: == (equal), != (not equal), < (less than), <= (less than or equal), > (greater than), >= (greater than or equal), and <=> (compare).

String comparisons use different operators: eq, ne, gt, ge, lt, le and cmp. Implicit string operations include string interpolation of variables inside double-quoted string literals.

The ‘compare’ opeators return one of three possible values:

a negative value if the first operand is smaller than the second;
a positive value if the first operand is greater than the second;
or 0 if the operands are equal to each other.

Logical operators && (AND), || (OR) and ! (NOT) are available, as are the word versions: and, or and not (albeit with lower, if not better, precedence). The logical operators exhibits ‘short-circuiting’ behaviour, which means they only evaluate the second (right hand) operand, if the logical result cannot be obtained from the first.

Standard Input and Output

If any program can read from standard input, you can pipe the output of another program to it. In Perl, standard input is represented by the global STDIN file handle variable. As shown in their documentation, many functions expect a ‘FILEHANDLE’ as argument. This means STDIN can be used, although most will use STDIN (or STDOUT) as defaults if not supplied.

Reading from Standard Input

The built-in readline function can read one line of text from a file handle, with STDIN being the default. It returns a scalar string, with a trailing newline ("\n"), which is generally not very useful, so it is common to remove it with the chomp function, leading to some common patterns:

Reading and trimming a line from standard input

my $input = readline;          #<-buffered, one-line input.
chomp($input);                 #<-remove trailing newline if present.

The last two statements can be combined, if the $input variable does not yet exist:

Define variable, read input and trim result pattern

chomp(my $input = readline);   #<-define variable, read input, chomp.

If the user only pressed Enter without typing other text, and you have ‘chomped’ the result, it will be an empty string. If an error or end of file was encountered, the result will be undef (a special Perl value). You can test for these conditions as follows:

Testing for empty input, or end-of-file situations

use v5.16; use utf8;
chomp(my $input = readline);           #<- `chomp` removes trailing newline. 
if (!defined $input) {                 #<- test for EOF or error, and
   say "EOF or error";                 #   must be done first.
   }
elsif ($input eq "") {                 #<- test for blank line; it will
   say "Empty line!";                  #   also be ‘true’ for `undef`.
   }
else {                                 #<- non-zero length input.
   say "Thank you. Input = ", $input;
   }

To simplistically test for numeric input, you can implicitly convert the string to a number by multiplying it with 1.0. If it fails, which we can trap with eval, the input was not in a format that Perl can convert to a number.

Simple numeric validation

use v5.16; use utf8;
chomp(my $radius = readline);
if (!eval {1.0 * $radius}) {
   say "We have a number: ", $radius;
   }

Testing for numeric input is independent of checks for errors or empty lines; you may have to do those first, depending on your program's requirements.

Lines can be read until end of file, using several patterns, but we shall show only one here:

Read lines until end of file pattern

use v5.16; use utf8;
while (defined (my $line = readline)) {
   chomp $line;          #<- optional
   say uc $line;         #<- do what you will with `$line`.
   }

You can now pipe the output of a program to the script, or redirect standard input from a file. Assuming the above code was saved in uccat.pl, you can try the following:

Providing files as standard input

^> perldoc -f sqrt | perl uccat.pl
^> perl uccat.pl < uccat.pl

We are not exactly sure why all Perl tutorials and examples are so enamoured with using the <STDIN> synonym for readline — presumably because it is tradition, and distinguishes Perl from the more ‘boring’ languages. Since you will see it everywhere, we would be remiss if we did not alert you to it. Most Perl programmers will write the relevant line in our last example in this way:

Classic alternative to readline

while (defined (my $line = <STDIN>)) { ⋯

You are absolutely welcome to use either; at least just be consistent. After all, Perl is a language with many options to perform the same task — the philosophy of leaving the choice up to the programmer, who knows best; right?

Writing to Standard Output

You have several options in Perl to write output to file handles. There are two standard output handles you should be aware of: STDOUT and STDERR, although the latter is referred to as the standard error handle. This stream you can redirect on the command line with: …2>…, or combine them with: …2>&1… for further redirection or piping.

Functions like say or print by default write to STDOUT. Error messages should be written to STDERR as a matter of good coding practice. Here are some examples:

Writing to standard output and standard error handles

use v5.16; use utf8;
print STDOUT "Hello";               #<- write to `STDOUT` handle.
print ", world!\n";                 #<- default is `STDOUT`.
print 123, " ABC", " DEF";          #<- arbitrary list of arguments.
print STDERR "Error!";              #<- error messages to `STDERR`.
die "Error!. Terminating."          #<- message to `STDERR` & exit.

Please notice that when a file handle is supplied, it must not be followed by a comma. The say function is equivalent to print in every way, except it automatically appends a newline after all arguments have been printed (and is only enabled with a use v5.10; or higher). Neither will put separators between the listed arguments.

The die function not only writes to STDERR, but also terminates the script with a non-zero exit code.

Final Example Script

With some of the basic Perl features, conventions, and patterns in place, we can now write a relatively robust example script that will prompt the user for a radius, which it will validate, and then use to calculate the area and circumference of a circle with that radius:

`circle.pl` — Simple Example Circle Calculator

#!/usr/bin/env perl
#
# @file    circle.pl
# @brief   Simple Example Circle Calculator
# @date    2019-12-12
# @author  Incus Data (Pty) Ltd.
# @licence MIT (https://opensource.org/licenses/MIT)
#
use v5.16; use utf8;
binmode STDOUT, ":encoding(utf8)"; #<- set UTF-8 encoding for `STDOUT`.
binmode STDERR, ":encoding(utf8)"; #<- set UTF-8 encoding for `STDERR`.

use Math::Trig; #<-for `pi`

print "Radius (positive float)?: ";
my $radius = readline;

# Validation. Terminate if garbage, or negative.
   {
   if (!defined $radius) {
      die "No input.";
      }
   if ($radius eq "\n") {
      die "Empty input.";
      }
   if (!eval { $radius = 1.0 * $radius }) {
      die "Not a number.";
      }
   if ($radius < 0.0) {
      die "Negative radius.";
      }
   }

# Calculate area and circumference, then output.

my $circum = 2.0 * pi * $radius;
my $area = pi * $radius ** 2;

say "Radius  = $radius";
say "Circum. = $circum";
say "Area    = $area";

exit 0;

=encoding utf-8
=pod

=head1 NAME

circle.pl

=head1 DESCRIPTION

Example program, calculating the area and circumference of a circle.
The radius is provided by the user and validated. The program will
terminate on invalid input, with an informational message. It does
not (yet) respond to command-line arguments passed to the script.

=cut

Much can be improved in the program, which we will deal with later; for example, it does not process command-line arguments, and the output values are not formatted to a certain number of decimal places. However, this should allow you to start your Perl journey with a good foundation. You can run it through perltidy, should the formatting impact on your sensibilities.

Portability

A final note on writing portable Perl scripts. As far as is reasonable, you should ensure that your scripts run on all supporting platforms. This is not too difficult as documented, but you can also read it with: ‘perldoc perlport’.

One important point to highlight, is that the ‘$^O’ special variable contains the operating system name. On Linux, this will contain: linux; on MacOS: darwin and on Windows: MSWin32 (regardless of whether you are running 32-bit, or 64-bit, Windows). You can use this information to conditionally load modules, or perform different tasks. There is also the core Config module that will give you the same (and more) information.

Here is an example showing how to portably enable and use VT100/ANSI escape sequences in POSIX shells, the Windows 10 Console, or the newer Windows Terminal (Preview), as well as UTF-8 output.

`vtutf8.pl` — VT100/ANSI Escape Sequences and UTF-8 Output

#!/usr/bin/env perl
#
# @file    vtutf8.pl
# @brief   VT100/ANSI Escape Sequences and UTF-8
# @date    2019-12-18
# @author  Incus Data (Pty) Ltd.
# @licence MIT (https://opensource.org/licenses/MIT)
#
use v5.16;                            #<-minimum version.
use utf8;                             #<-*source* in UTF-8.
use open ':std', ':encoding(UTF-8)';  #<-all std. I/O in UTF-8.

#= Windows-specific code. Will be ignored on other systems. This may
#  become less necessary with newer vesions of the Windows 10 console,
#  and will not be necessary when running scripts in PowerShell, and/or
#  inside the newer Windows Terminal.
#
if ($^O eq 'MSWin32') {
   use if $^O eq 'MSWin32', 'Win32::Console';
   Win32::Console::OutputCP(65001);
   system(''); #<-enable vt escape sequences processing.
   }

#= At this point, most VT100/ANSI escapes sequences will work portably.
#  Optionally, one may use the core `Term::ANSIColor` module. Do *not*
#  use `Win32::Console::ANSI`.

print "\e[2J\e[0;0H"; #<-clear screen; move cursor.

print "\n  \e[36;7m STANDARD COLOURS \e[0m\n";
print " \x{250C}\x{2500}", "\x{2500}" x 17;
print "\x{2510}\n \x{2502} ";
for (my $color = 0; $color < 8; $color++) {
   print "\e[48;5;${color}m  ";
   }
print "\e[0m \x{2502}\n \x{2502} ";
for (my $color = 8; $color < 16; $color++) {
   print "\x1b[48;5;${color}m  ";
   }
print "\e[0m \x{2502}\n \x{2514}","\x{2500}" x 18,"\x{2518}\n";
print "\e[32m  UTF8: \e[33;1m‹«π®©·§○●»›★\e[0m\n\n";

exit 0;

=encoding utf-8
=pod

=head1 NAME

vtutf8.pl - Portable VT100/ANSI Demonstration

=head1 DESCRIPTION

Portably enables VT100/ANSI terminal escape sequences in a Windows 10
Console. This is disabled by default for console applications. Running
a sub shell with C<system('')>, enables it from that point, but is not
very efficient. Alternatively, one may directly use the Windows API.

This example does not use C<Term::ANSIColors>, but you could. Never use
C<Win32::Console::ANSI> on modern, updated, Windows 10 consoles. Your
font (even POSIX terminals) must support the UTF-8 glyphs you output.

The C<use utf8;> only tells Perl about your source code encoding. To
enable it for I/O, C<use open ':std', ':encoding(UTF-8)'> is also needed.
This arrangement is modern, and now supported by Windows 10 Console
APIs; but it will take a while before Perl, Python, and other command
line shells and programs will enable it by default.

TIP: To clear the screen portably, use the: C<\e2J\e[0;0H> sequence.
=cut

__END__

As the POD mentions, ‘use utf8;’ only tells Perl that your source code is UTF-8 encoded. To enable UTF-8 on output and also input, and to avoid “Wide character…” warnings, the file handles must be opened with :encoding(UTF-8). This is most easily accomplished with the ‘use open std…’ pragma, which will set the default encoding for the standard file handles.

2019-12-12: Updates. [brx]
2017-12-19: Edit. [jjc]
2017-12-18: New sections and examples. Re-organise. [brx]
2017-12-17: New sections. [brx]
2017-12-15: Created. [brx]