The Unix Environment

Introduction

Unix is a sophisticated operating system designed to support major application services and to provide a productive environment for professional programmers. Many vendors provide proprietary Unix versions (e.g. MacOS for Apple, HP-UX from Hewlett Packard, Solaris from Sun, AIX from IBM) and the co-operatively produced freeware Linux is increasingly accepted. All these *n*x versions provide a similar set of services; in this module we will only need to use a tiny subset of them.

Windows dominates the desktop workstation market and provides increasingly sophisticated versions for enterprise servers. However, the power and stability of Unix make it the platform of choice for very many large and critical background applications in industry, commerce and science.
We will use the Unix on the Macs in this module for several reasons:

The aim of this tutorial is to introduce those elements of Unix that you will find yourself using most of the time. Since you learn about programming and program environments by doing rather than by reading about them, you are strongly advised to try any examples included here on a machine in a lab. The more you use them, the easier they get.

The Shell interface

All operating systems provide a point & click Graphical User Interface to enable users to carry out basic tasks e.g. starting a common application, locating their files etc. However, most developers use an alternative text-based mechanism, called a shell, to control the system. A shell, like the GUI, is a program which interprets user requests. You usually call up a shell by clicking a GUI icon displaying a terminal. This pops up a window displaying a prompt - usually a dollar sign by default - ready for you to type a command line. After the shell has finished running the command, the prompt is redisplayed ready for you to type the next one. Some of you may be familiar with this from the "Command Prompt" application in Windows.

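A Unix command is entered on a single line and consists of one or more "words" separated by spaces. The first word is the command keyword; any words that follow are switches (which usually begin with -) or parameters.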

For example, the command keyword cal displays a calendar
cal
on its own displays a calendar for the current year
cal   10   2008
displays a calendar for October 2008
cal   -j   2010
displays a calendar in day number form (1 to 365) for 2010 on the Macs
You might like to experiment with a few other possibilities to see what you get. The program works for any year from 1 to 9999.

Different commands allow different numbers and types of switches and parameters. Sometimes the order of these is important and sometimes it isn't. All of the commands and their options are described in the manual. For example:
man   cal
will display the manual entry for the command cal
At first the terseness and flexibility of commands can be a little daunting, but common commands and their options rapidly become familiar.

Typing commands may seem an old-fashioned way of communicating compared to the slickness of modern GUIs but it is surprisingly effective. Consider being dumped in a foreign country; you will probably survive by just pointing and waving your arms but you will be able to do a lot more, a lot faster if you have some grasp of the language. Similarly when you get familiar with the shell, you can issue the precise instruction to make Unix do exactly what you want without clicking on tedious sequences of menus.

The Unix Filesystem - Directories

Most of what you do with Unix will involve manipulating files of one sort or another. The Faculty's network supports a common file store which enables you to access your files from any of the Faculty's computers. How this is physically organised is a fascinating topic in its own right but for now we will simply look at how Unix views this filestore and how you can navigate around it.

The file system in Unix, like other operating systems, is hierarchical i.e. organised as a tree of containers. Windows calls these containers folders and Unix calls them directories. Windows has a separate tree for each logical drive (A, C, D, E etc.) but Unix has a single tree. The root of the tree is the topmost directory beneath which all other files and directories are stored: its name is simply / (a single forward slash).

        / -------                                 the root of the file system
                 +--- bin                         the main Unix commands
                 |
                 +--- dev                         device drivers
                 |
                 +--- etc                         other Unix utilities      
                 |
                 +--- home6 ----                  the staff area
                 |              +----
                 |              .
                 |              .
                 |              +---- iv ---     iv's data
                 |              |           |
                 |              .           +---- public ---
                 |              .           |               +--- csci1401---
                 .              .           +---- Mail      |               +
                 .              .           .               .               .
                 |              .           .               .               .
                 +--- home5 ---                   the student area
                 |              +---
                 |              |
                 .              +--- p08123456    p08123456's data
                 .              .                       
                 .              .

Every item in the file system is identified by its path name, which specifies how you get to it from the root /. The names of each level of the tree that you pass through from the root to the item are separated by a /.
For example, the cal command we saw above lives in the bin directory and so its path name is /bin/cal
The start of student p08123456's data area, called their home directory, is identified by the name /home5/p08123456
The name /home6/drs/include/simpleio.h identifies an item called simpleio.h in the include directory, which is inside drs's data area in the staff area /home6
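
As a small illustration of how a path name is built up, the following sketch takes one of the path names used in this tutorial apart. It relies on two standard commands, dirname and basename, which are not otherwise covered here; they treat the path purely as text, so the directories do not need to exist on your machine.

```shell
# Split a path name into the directory holding the item and the item's own name
path=/home5/p08123456/public/foo

parent=$(dirname "$path")     # everything before the last /
name=$(basename "$path")      # the final component

echo "$parent"                # prints /home5/p08123456/public
echo "$name"                  # prints foo
```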

Creating new directories

Operating systems use hierarchical file systems to make it easier for users to use meaningful names and also to make it quicker to search for a particular item. If this mechanism is to work, you need to organise your files into appropriate directories rather than just dump them all into your home directory.

To create a new directory, you use the MaKe DIRectory command mkdir. For example
mkdir   foo
will create a subdirectory called foo under your current directory
mkdir   /home5/p08123456/public
will create a subdirectory called public under the home directory for user p08123456
mkdir   /home5/p08123456/public/foo
will create a subdirectory called foo under an (already existing) subdirectory called public in the home directory for p08123456
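
The "already existing" condition matters. The sketch below, which assumes the mktemp command is available to make a throwaway directory (it is on Linux and the Macs), shows mkdir refusing to create a directory whose parent is missing. The names missing and bar are made up for the illustration, and the -p switch is an extra not covered above.

```shell
# Work in a throwaway directory so nothing in your real home directory changes
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"

mkdir public                  # create the parent first ...
mkdir public/foo              # ... then the subdirectory inside it

# mkdir refuses to create a directory whose parent does not exist yet
if mkdir missing/bar 2>/dev/null; then
    result="created"
else
    result="refused"
fi
echo "$result"                # prints refused

mkdir -p missing/bar          # -p creates any missing parent directories as well
```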

Navigating around the file system

The shell keeps track of where you are currently in the file structure at all times. It maintains a working directory where it assumes all commands are to be carried out. When you first login, your working directory is your home directory e.g. /home5/p08123456.

You can change your working directory at any time with the Change Directory command cd. This directory will then become your current working directory. For example
cd   foo
will change to the subdirectory foo under your current directory
cd   /home6/drs/csci1401
will change to the subdirectory called csci1401 under drs's area
cd
on its own will always take you back to your home directory
If you try to change to a directory that doesn't exist, you will get an error message and stay where you are.

.. (dotdot) is a shortcut name which means "one level up" and can be used anywhere in a pathname.
Suppose your current directory is /home5/p08123456/public/foo then
cd   ..
would move your current directory to /home5/p08123456/public
cd   ../..
would move your current directory to /home5/p08123456
cd   ../bar
would try to move your current directory to /home5/p08123456/public/bar
cd   ../../..
would move your current directory to /home5 (bad idea)
Note the space between the cd and the .. in each case. This is because .. is a name - the parameter for the command cd

It is easy to get lost moving around the directory hierarchy, which can cause problems since Unix runs all commands in your current working directory and, by default, looks for files there. You can check where you are at any time by asking the shell to Print your Working Directory with the pwd command. For example,
cd   C/lab1
pwd
will print out /home5/p08123456/C/lab1
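
If you want to try the .. shortcut without touching your real files, something like the following sketch should work. It assumes mktemp is available to create a throwaway directory, and it uses mkdir -p (create missing parents too), which is not covered above. The directory names mirror the examples in this section.

```shell
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"
top=$(pwd)                    # remember where the throwaway area is
mkdir -p public/foo public/bar

cd public/foo
cd ..
one_up=$(pwd)                 # now in public

cd foo
cd ../bar
sideways=$(pwd)               # up to public, then down into bar

cd ../..
two_up=$(pwd)                 # back at the top of the throwaway area
```

Printing each saved value with echo after running this shows how every cd moved relative to the directory you were in at the time.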

Listing the contents of a directory

To display the contents (files and subdirectories) of a directory, you use the LiSt command ls. This command has a variety of different flavours depending on the switches you use. For example:
ls
will list the names of all the items in your current working directory
ls   -l
will list the contents of the current directory in long form i.e. with details like size, date last modified and permissions for each item.
ls   -l   foo
If foo is a directory, this will list the contents in long form
If foo is a file, it will list the details for that file.
ls   /home6/drs/csci1401/include
will list the names of all the items in directory /home6/drs/csci1401/include

Because typing long names is tedious, there are a number of other shortcuts, besides the .. we saw above, that can be used within names

~ (tilde) means your home directory e.g. /home5/p08123456
. (dot) means the current working directory
* means ANY character string of ANY length
? means ANY SINGLE character
For example

ls   -l   /home6/drs/csci1401/src/*.c
will list, in long form, all items in directory /home6/drs/csci1401/src that end with ".c"
ls   ~/p*
will list the names of all the items in your home directory that have the form "p" followed by any number of characters.
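
It is the shell, not ls, that turns * and ? into a list of names. One way to see this is to give the pattern to echo, which simply prints its parameters. This is a sketch only: the file names are invented and it assumes mktemp is available to make a throwaway directory.

```shell
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"
touch prog1.c prog2.c program.c notes.txt     # hypothetical file names

# echo shows what the shell turns a pattern into before any command sees it
all_c=$(echo *.c)             # * matches a string of any length
one_char=$(echo prog?.c)      # ? matches exactly one character

echo "$all_c"                 # prints prog1.c prog2.c program.c
echo "$one_char"              # prints prog1.c prog2.c
```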

Deleting a directory

You delete a directory with the ReMove DIRectory command rmdir. For example
rmdir   foo
will try to delete a subdirectory called foo from your current working directory
Note that Unix will not delete a directory unless it is empty i.e. contains no files or subdirectories.
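
A minimal sketch of this rule in action, assuming mktemp is available to make a throwaway directory. It uses the rm command, described under "Deleting files" later in this tutorial, to empty the directory; notes.txt is an invented file name.

```shell
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"
mkdir foo
touch foo/notes.txt           # foo now contains one file

if rmdir foo 2>/dev/null; then first="deleted"; else first="refused"; fi

rm foo/notes.txt              # empty the directory ...
if rmdir foo 2>/dev/null; then second="deleted"; else second="refused"; fi

echo "$first then $second"    # prints refused then deleted
```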

Files

Creating files

Typically you use an application to create your files: a new text file, like a .c source file, is created with a text editor; the gcc compiler creates object and executable files, etc. The recommended editor for the Macs is Smultron but you can use any editor you prefer - there are a lot to choose from. If you want to work on material on a Windows machine at home, jEdit is an easily downloaded, free editor which is much better for a programmer than Notepad. Although you can run an editor from the shell, it is much easier to control it via the GUI; the only thing to take care with is that you save the file with the correct name (e.g. adding the .c extension for a C source file) and that you save it in the correct directory.

Renaming files and directories

What happens if you mis-spell the name when you are saving a file or creating a directory? You don't have to delete it and start again, but instead can just rename it.
The MoVe command mv is used to rename files. It is called "move" because when you rename something, you are effectively moving its place in the tree. The command works for both files and directories, since Unix treats a directory as just a type of file - one whose data is an index to the contents of the directory. For example
mv   somefile.c   otherfile.c
changes the name of a file in your current working directory from somefile.c to otherfile.c. The first parameter is the source and the second the destination.
mv   C/week2  C/lab2
changes the name of the subdirectory week2 in C to lab2
mv   *.c   ~/C/lab2
moves all the files ending .c in the current directory into the directory C/lab2 in your home directory e.g. "prog1.c" becomes "~/C/lab2/prog1.c" etc.

Copying files

When you mv a file, you end up with one file with the same data but with a different name; when you copy a file, you end up with two separate files with different names but both containing identical data.

To copy a file use the CoPy command cp. For example
cp   fromFile   toFile
makes a copy of fromFile called toFile. If a file called toFile already exists, it will be overwritten; Unix won't ask you to confirm that you want to overwrite it.
cp   fromFile   C/lab2
is the same as dragging a file and dropping it in a new folder with a GUI. It makes a copy of the file fromFile in the subdirectory C/lab2. This new file will also be called fromFile.
cp   ~/*.c   .
copies all the files ending ".c" from your home directory into your current working directory
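
The difference between mv and cp can be checked with a short experiment such as the sketch below. It assumes mktemp is available to make a throwaway directory; the name renamed is made up, and the > symbol (covered under "Pipes and Redirection") is used only to give the file some contents.

```shell
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"
echo "original data" > fromFile

cp fromFile toFile            # two files, identical contents
mv fromFile renamed           # still one file, just under a new name

if [ -e fromFile ]; then old="still there"; else old="gone"; fi
echo "$old"                   # prints gone
cat toFile renamed            # prints original data twice
```

If the silent overwriting worries you, cp -i asks for confirmation before replacing an existing file.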

Deleting files

The ReMove command rm deletes a file. For example
rm   foo.c
will delete a file called foo.c from your current working directory
rm   ~/public/bar
will delete a file called bar from a subdirectory called public in your home directory

You can use any shortcuts in the name but be VERY CAREFUL what you type
rm   *.c
will delete all files ending .c from your current working directory BUT
rm   *   .c
will delete ALL files from your current working directory and then complain that it can't find a file called .c. That space makes a lot of difference - every name matches * and Unix assumes you mean what you type
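
One cautious habit, offered here as a suggestion rather than a rule, is to put echo in front of a risky rm command first. The shell expands the shortcuts exactly as it would for rm, but echo only prints the result. The sketch assumes mktemp is available and uses invented file names.

```shell
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"
touch prog1.c prog2.c notes.txt

# Nothing is deleted here: echo just prints what rm would have been given
safe=$(echo rm *.c)
risky=$(echo rm * .c)

echo "$safe"                  # prints rm prog1.c prog2.c
echo "$risky"                 # prints rm notes.txt prog1.c prog2.c .c
```

Alternatively, rm -i asks for confirmation before deleting each file.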

Permissions

There is a common file system on the network but your files are private to you and in general no-one else can access them unless you give them permission.
The long directory listing ( ls   -l ) prints out the permissions associated with a given file or directory in the following format
-rwxr-x---     1  iv  csstf  13539  Sep 16 01:13  a.out

The pattern of ten characters on the left provides information about the file type and its access permissions.
The first character will be either a '-', which indicates a regular file, or a 'd', which indicates a directory.
The remaining nine characters can be grouped into three chunks of three representing the user's (i.e. the owner's) access permissions, the group's access permissions and any other user's access permissions (i.e. everyone else).
Each chunk of three specifies whether that class of user is allowed to read the file (or list the contents of a directory), write to the file (or add and remove items in a directory) and execute the file (or enter a directory and reach the items inside it).
In the above example the owner of a.out is iv, who belongs to the group csstf. If we break the permissions into their parts we get:
-      a.out is a regular file
rwx    iv, the owner, has read, write and execute permissions and so can do anything to the file.
r-x    any user in the group csstf can read and execute the file but cannot change its contents.
---    no other user can do anything at all with the file.

The owner of a file in Windows can change the permissions via the Properties -> Security tab. The corresponding command in Unix is chmod (CHange MODe) which has the syntax
chmod  change-how  change-what

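change-what is a list of one or more names, and change-how is a string of characters consisting of
who       one or more of u (the user i.e. the owner), g (the group) and o (other users)
operator  one of + (add rights), - (remove rights) or = (set exactly these rights)
rights    one or more of r (read), w (write) and x (execute)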

For the example given
-rwxr-x---     1  iv  csstf  13539  Sep 16 01:13  a.out
chmod g-rx a.out
removes read and execute rights from the group csstf resulting in
-rwx------     1  iv  csstf  13539  Sep 16 01:13  a.out
chmod og+r a.out
adds read rights for the group csstf and for all other users resulting in
-rwxr--r--     1  iv  csstf  13539  Sep 16 01:13  a.out
chmod o=wx a.out
sets write and execute (but not read) rights for all users NOT in group csstf resulting in
-rwxr---wx     1  iv  csstf  13539  Sep 16 01:13  a.out
chmod u-rwx a.out
removes all rights from the owner resulting in
----r---wx     1  iv  csstf  13539  Sep 16 01:13  a.out
which is probably a bit silly. Still, you can always put them back with
chmod u=rwx a.out
or
chmod u+rwx a.out
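
You can replay this sequence on a file of your own. The sketch below is one way to do it, assuming mktemp is available to make a throwaway directory. The small helper perms is made up for this illustration; it uses cut, a command not covered in this tutorial, to keep only the first ten characters of the long listing.

```shell
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"
touch a.out
chmod u=rwx,g=rx,o= a.out     # start from -rwxr-x--- as in the example

# Hypothetical helper: print just the ten permission characters
perms() { ls -l "$1" | cut -c1-10; }

chmod g-rx a.out;  step1=$(perms a.out)    # -rwx------
chmod og+r a.out;  step2=$(perms a.out)    # -rwxr--r--
chmod o=wx a.out;  step3=$(perms a.out)    # -rwxr---wx
chmod u-rwx a.out; step4=$(perms a.out)    # ----r---wx
chmod u+rwx a.out; step5=$(perms a.out)    # -rwxr---wx
```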

There is an alternative way of specifying change-how as a number
rights    ---   r--   -w-   --x   rw-   r-x   -wx   rwx
binary    000   100   010   001   110   101   011   111
decimal     0     4     2     1     6     5     3     7

For example, using this mechanism
chmod 700 a.out
gives the owner of the file all rights and gives no rights to anyone else resulting in
-rwx------     1  iv  csstf  13539  Sep 16 01:13  a.out
chmod 753 a.out
gives full rights to the owner, read & execute rights to any user in group csstf and write and execute rights to any other user resulting in
-rwxr-x-wx     1  iv  csstf  13539  Sep 16 01:13  a.out
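
The arithmetic behind each digit is simply r = 4, w = 2 and x = 1 added together. As a worked illustration, the sketch below defines a small helper called digit (a made-up name, not a Unix command) that converts one chunk of three characters into its number.

```shell
# Hypothetical helper: convert a chunk such as r-x into its digit (r=4, w=2, x=1)
digit() {
    total=0
    case $1 in r??) total=$((total + 4)) ;; esac    # first character is r
    case $1 in ?w?) total=$((total + 2)) ;; esac    # second character is w
    case $1 in ??x) total=$((total + 1)) ;; esac    # third character is x
    echo "$total"
}

owner=$(digit rwx)            # 4 + 2 + 1 = 7
group=$(digit r-x)            # 4 + 0 + 1 = 5
other=$(digit -wx)            # 0 + 2 + 1 = 3
mode="$owner$group$other"
echo "$mode"                  # prints 753
```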

Compiling and running C programs

The first step in producing a C program is to write the source code. This is a text file, produced with a text editor, which contains the statements that the program consists of. The name of this file must have the extension ".c"
Suppose the following example of a trivial program is stored in a file called prog1.c in your current directory.
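#include <stdio.h>
#include "/home6/drs/include/simpleio.h"
int main(void)
{
    printf("Hello World!\n");   /* print a greeting followed by a newline */
    return 0;                   /* tell the shell the program succeeded */
}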

The source program must then be compiled with the command gcc to produce an executable.
gcc   prog1.c   /home6/drs/lib/simpleio.o   -o   prog1
tries to compile your program prog1.c, link (combine) it with a code library /home6/drs/lib/simpleio.o (which you didn't write) and make an executable file called prog1 in your current directory.

If you have made any errors in your source code, you will get corresponding error messages and no executable will be produced. You need to correct these errors and then try compiling the program again.

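Once your program has compiled and linked successfully, you can run the program by just typing its name.
prog1
If the shell replies that it cannot find the command, your current directory is not on its search path; type ./prog1 instead.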


You will only be tested on the material up to this point. It covers what you need to know to use the lab sessions effectively. However, for those of you interested in learning more, the following is a brief overview of some additional commands you may find useful.

Some more Unix commands

Processes

Whenever you enter a command or you run a program in Unix you create a process. On your system there will be lots of processes running - some of which are owned (started) by you and others which are started by the system itself.

Listing Processes

To list your processes on a Windows system you use the Task Manager; in Unix you use the Process Status command ps.
ps   -u   $USER
lists all the current processes which you own. The first number on each line is the process identifier and the last entry is the name of the program or command.

Stopping Processes

Sometimes a process gets stuck (e.g. a program you have written goes into an infinite loop). On such occasions you can stop the process running by typing CTRL+c in the shell window.
If a process you started from the GUI gets stuck (e.g. the browser or editor hangs), this mechanism won't usually work. However you can stop the process if you know the process identifier.
kill   1234
will kill the process numbered 1234
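
To practise on a harmless process, you could start one yourself. In this sketch sleep 60 stands in for a stuck program; the & at the end of the line (not covered above) runs it in the background so that the shell carries on, and the shell variable $! then holds its process identifier.

```shell
sleep 60 &                    # start a long-running process in the background
pid=$!                        # $! is the process identifier of that process

kill "$pid"                   # ask the process to terminate
wait "$pid" 2>/dev/null || true    # wait until it has really gone

# kill -0 sends no signal; it only tests whether the process still exists
if kill -0 "$pid" 2>/dev/null; then state="running"; else state="stopped"; fi
echo "$state"                 # prints stopped
```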

This corresponds to the End Task option in the Windows Task Manager.

Don't ever just switch the machine off to solve these problems; the network is configured on the assumption that machines are not switched off.

More file commands

Displaying and printing files

Normally if you want to display a text file, you will open it up in the editor. If you want to print it you will use the print function to do so. However, if you don't want to change the file, it can be quicker to do both of these actions directly from the shell.

The command cat dumps an entire text file to the screen. For example
cat   someFile.c
The trouble with this is that you won't be able to read the output if it is longer than a screen-full. If you wish to view the file a screen at a time use the command less. For example
less   someFile.c

To move around the file with less, press
spacebar   to move forward a page at a time
b          to move backward a page at a time
return     to move forward a line at a time
y          to move backward a line at a time
q          to quit

To print a text file on the printer use the lp command.
lp   -d <printername>   somefile
where <printername> is the name of the printer where you wish the output to arrive e.g. lj583 for the printer in lab GH5.83
Don't try sending non-text files (e.g. program executables) to the printer - it will either do nothing or get very confused and waste lots of paper (and your print credits).

Searching for files

Users of the GUI typically use the Windows Search or Mac Finder function to locate files. Unix has a sophisticated command called find to locate all files matching particular criteria below any point in the directory hierarchy. The following are some simple examples of its use - see the manual page for all the possible options.
find   ~   -name   "foo.c"   -print
starting at your home directory ~ find any files called "foo.c" below this.
find   .   -name   "foo.c"   -print
starting at your current directory . find any files called "foo.c" below this.
find   ~/C   -name   "prog*"   -print
find any files with names starting "prog" in C or any of its subdirectories.
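
The quotes around a pattern given to find matter: without them the shell may expand the * itself, using the files in your current directory, before find ever runs. The sketch below illustrates this with invented file names. It assumes mktemp is available, and uses mkdir -p and sort (neither covered above) to build the directories and to put the output in a predictable order.

```shell
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"
mkdir -p C/lab1 C/lab2
touch C/lab1/prog1.c C/lab2/prog2.c C/lab2/notes.txt prog0.c

# Quoted: find receives the pattern prog* and does the matching itself
found=$(find C -name "prog*" -print | sort)
echo "$found"
# prints
# C/lab1/prog1.c
# C/lab2/prog2.c

# Unquoted: the shell first replaces prog* with prog0.c from the current
# directory, so find looks for that exact name below C and finds nothing
unquoted=$(find C -name prog* -print)
```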

Counting lines, words and characters

The command wc (Word Count) prints the number of characters, words or lines in a file depending on what switch value is supplied to it. For example
wc   -c   foo.c
prints the number of characters in foo.c
wc   -w   foo.c
prints the number of "words" in foo.c
wc   -l   foo.c
prints the number of lines in foo.c
wc   foo.c
prints all 3 counts for foo.c
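
Here is a small worked example with a file whose counts are easy to check by hand. It is a sketch only: it assumes mktemp is available, builds the file with printf and the > symbol, and uses tr (not covered in this tutorial) to strip the padding spaces that some versions of wc print.

```shell
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"
printf 'int main(void)\n{\n    return 0;\n}\n' > foo.c

# Reading the file through < makes wc print only the number, not the file name
lines=$(wc -l < foo.c | tr -d ' ')
words=$(wc -w < foo.c | tr -d ' ')
chars=$(wc -c < foo.c | tr -d ' ')

echo "$lines $words $chars"   # prints 4 6 33
```

Note that a "word" here is anything separated by spaces or line ends, so main(void) counts as one word.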

Print lines in one or more files matching a pattern

The command grep is a sophisticated pattern matcher with many options; we will just look at it in its simplest form here.
Suppose you want to filter out (and look at) lines in a text file which contain a particular pattern, then grep is what you use. Here are some examples:
grep   printf   foo.c
prints lines from file foo.c that contain the string "printf"
grep   -v   printf   foo.c
prints lines from file foo.c that DO NOT contain the string "printf"
grep   -n   printf   foo.c
prints lines from file foo.c that contain the string "printf" and prefix each one with its original line number
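
To see these switches at work on a known file, you could try something like the sketch below. It assumes mktemp is available, and creates foo.c with a "here document" (the lines between <<'EOF' and EOF), a shell feature not covered in this tutorial. The pipe into wc is explained under "Pipes and Redirection"; tr only strips padding spaces.

```shell
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"
cat > foo.c <<'EOF'
#include <stdio.h>
int main(void)
{
    printf("Hello World!\n");
    return 0;
}
EOF

matching=$(grep -n printf foo.c)                     # the one line using printf
others=$(grep -v printf foo.c | wc -l | tr -d ' ')   # how many lines do not

printf '%s\n' "$matching"     # prints 4:    printf("Hello World!\n");
echo "$others"                # prints 5
```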

Pipes and Redirection

When using the shell, the standard input is the keyboard and the standard output is the terminal. Thus you issue commands on standard input and the system prints information on standard output. Most Unix commands are designed to work with standard input and standard output. For example, the results of the ls command are placed on standard output; if you cat a file it is written to standard output. If you don't give grep a file to filter, it will use the standard input and sit there waiting for you to type something.

All of this has a purpose: to enable the commands to be linked together in a pipeline so that the output of one command can be processed as the input to the next.

Here is an example:
ls   -1   |   wc   -l
| (vertical bar) is the pipe symbol. The first command lists the current directory and writes the file names to the standard output stream in a single column. Instead of the output going to the screen, however, it is "piped" to be the input of the command wc. The output from the word count program is therefore the number of lines generated by ls which is, of course, equivalent to the number of visible files in the current directory. The final answer is printed to the screen.
Provided none of the file names contains a space, you could get exactly the same effect with
ls   |   wc   -w

You can put any number of commands together in a pipeline.

Suppose that instead of displaying the output of a program prog1 on the screen we want to store the results in a file called prog1.results in the current directory. We can do this with the output redirection symbol >
prog1   >   prog1.results

If we want to read the input to program prog1 from a file called prog1.data, we can do this with the input redirection symbol <
prog1   <   prog1.data
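
The pipe and redirection symbols can be combined freely. The following sketch, which assumes mktemp is available and uses invented file names, counts the files in a throwaway directory and then builds up a small listing file. It also uses >>, which appends to a file instead of overwriting it, and tr (not covered in this tutorial) to strip padding spaces from the output of wc.

```shell
scratch=$(mktemp -d)          # assumption: mktemp is available
cd "$scratch"
touch a.c b.c c.txt

count=$(ls -1 | wc -l | tr -d ' ')        # three names, one per line
echo "$count"                             # prints 3

ls *.c > listing.txt                      # > creates (or overwrites) listing.txt
echo c.txt >> listing.txt                 # >> adds a line to the end instead
lines=$(wc -l < listing.txt | tr -d ' ')  # < feeds the file to wc as its input
echo "$lines"                             # prints 3
```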

Summary

Files and directories
pwd     Print working directory
cd      Change directory
mkdir   Make a new directory
rmdir   Delete directory
ls      List contents of a directory
cp      Copy files and directories
mv      Move (rename) files and directories
rm      Delete files
chmod   Change access permissions
cat     Dump (Concatenate) files
less    View a file page by page
Miscellaneous commands
gcc     Compile (& link) a C program
man     Display an on-line manual page
wc      Word count
grep    Print lines matching a pattern
find    Search for files in a directory hierarchy
ps      Report process status
kill    Terminate a process
Redirection and pipes
p1   |   p2          Connect output from p1 to input of p2
p   <   filename     Redirect standard input of process p to filename
p   >   filename     Redirect standard output of process p to filename
p   >>   filename    Append standard output of process p to filename