In this lab, you will learn how a shell is built. You will also improve (or reinforce) your shell-using skills. You will also gain experience with C programming (you will interact with critical constructs, such as pointers and data structures, as well as standard idioms like linked lists). Along the way, you will use the fork() system call, a lightweight version of which you implemented in the previous lab! There is not much code to write, but there is a lot to absorb; we recommend beginning this lab over the weekend.
This lab adapts a shell lab due to Eddie Kohler. The high-level design, much of the code, and some of the description below is borrowed from him.
A shell parses a command line, and then runs (or executes) that
command line (one can also think of GUIs as shells, in which case the
"command line" is expressed by the user's mouse clicks, for example).
We've given you the skeleton, and nearly all of the parsing code, for a
simple but surprisingly functional shell. You will complete the parsing
code. You will also fill in the logic for executing the command line:
you will implement support for executing commands, I/O redirection,
pipes, as well as conditional, sequential, and background execution. You
will also implement several internal commands, such as
cd
. (Puzzle: why does cd have to be built
into the shell, when most other commands are executed by exec()ing an
existing program?)
You will electronically hand in code and answers (the mechanics are described at the end of the lab).
Please note the following expectations:
We recommend doing this lab inside the development platform that you set up in the previous lab (the virtual devbox or the CSIL machines). However, this lab uses standard interfaces and tools, and as a consequence, you can do the lab on OS X or even cygwin on Windows; if you decide to go down that path, note the following: (1) the testing scripts expect to be run on a Linux machine, and will incorrectly report errors for some test cases when they are run on platforms other than Linux, and (2) we will be grading on the standard lab platforms.
From within the development platform that you have chosen (which, in the
recommended cases, will be the virtual devbox or the CSIL machines),
change to your lab directory, use
Git to commit changes you've made since handing in lab 1 (if any),
obtain the latest version of the course
repository, and create a local branch called lab2 based on
origin/lab2 (which is our lab2 branch). The commands are as follows, but you
may not need all of them (for example if you are already on the
lab1
branch or if git status
indicates that
there are no changes to commit):
$ cd ~/cs170 $ git checkout lab1 $ git commit -am "here are my leftover changes from lab1" $ git pull $ git checkout -b lab2 origin/lab2 Branch lab2 set up to track remote branch refs/remotes/origin/lab2. Switched to a new branch "lab2" $ cd lab2
The git checkout -b command shown above actually does two things: it first creates a local branch lab2 that is based on the origin/lab2 branch provided by the course staff, and second, it changes the contents of your lab directory to reflect the files stored on the lab2 branch. Git allows switching between existing branches using git checkout branch-name, though you should commit any outstanding changes on one branch before switching to a different one.
Recall that Git allows you to keep track of changes that you make to your code. For example, if you are finished with one of the exercises, and want to checkpoint your progress, you can commit your changes by running:
$ git commit -am 'my solution for lab2 exercise1' Created commit 60d2135: my solution for lab2 exercise1 1 files changed, 1 insertions(+), 0 deletions(-) $
You can view changes by using git diff. For example, git diff origin/lab2 will display the changes relative to the initial code supplied for this lab.
It will be much easier to do the coding work in this lab if you have some familiarity with shells in general; this is for two reasons. First, comfort with shells makes you more productive. Second, if you have a good handle on what a shell is supposed to do, the coding work will make more sense. This portion of the lab is intended to provide some of that background (but some of the background you will have to acquire by "playing around" with the shell on your computer). In this part of the lab, we will interact with the installed shell on your system (rather than the source that you retrieved above). We will be assuming the bash shell, which is the default shell on both of the development platforms in this class.
A shell is a program whose main purpose is to run other programs.
Two of the key workhorses in a shell are
fork()
and execve()
(as we saw in class). Here's a simple shell command:
$ echo foo
(The initial $
is not part of the command. That is shorthand for the prompt that the shell prints before reading a command.)
The shell parses this command into two arguments, echo
and foo
. The echo
argument names the binary that should be executed. So the shell forks a child process to execute echo
with those two arguments. The echo
program has a very simple job: it just prints its arguments to the console. (echo foo
will just print foo
.) Meanwhile, the parent waits for the child to finish; when it does, the parent returns to read another command.
You may be interested in a reasonable tutorial for Unix shells. You can find others by searching for, e.g., ''shell tutorial'' on Google. Let us know if you find one you really like.
Each program has standard input, standard output, and
standard error file descriptors, whose numbers are 0, 1, and 2,
respectively. (You may know them in C++ as cin
, cout
and cerr
; in C the files are called stdin
, stdout
, and stderr
.) The echo
program writes its output to the standard output file descriptor. Normally this is the same as the shell's standard output, which is the terminal (your screen). But the shell lets you redirect these file descriptors to point instead to other files. For example:
$ echo foo > output.txt
This command doesn't print anything to the screen. But let's use the cat
program, which reads a file and prints its contents to standard output, to see what's in output.txt
:
$ cat output.txt foo
The > filename
operator redirects standard output, < filename
redirects standard input, and 2> filename
redirects standard error. (The syntax varies from shell to shell; we generally follow the syntax of the Bourne shell or bash
.)
Shells offer many ways to chain commands together. For example, the ;
operator says "do one command, then do another". This shell command prints two lines:
$ echo foo ; echo bar foo bar
The &&
and ||
operators chain commands
together based on their exit status. You can think of the exit
status as a command's "return value".
If a command accomplishes its function successfully, that command
generally exits with status 0, by calling exit(0)
. (This
is also what happens when a program runs off the end of its main
function.) But if there's an error, most commands will exit with status 1. For example, the cat
command will exit with status 0 if it reads its files successfully, and 1 otherwise:
$ cat output.txt foo [[exit status 0]] $ echo $? 0 $ cat doesnotexist.txt cat: doesnotexist.txt: No such file or directory [[exit status 1]] $ echo $? 1(The special variable
$?
in bash contains the exit value of
the previous command. However, the shell we build in this lab does not support
this variable.)
Now, &&
says "execute the command on the right only if the command on the left exited with status 0". And ||
says "execute the command on the right only if the command on the left exited with status NOT equal to 0". For example:
$ cat output.txt && echo "output.txt exists!" foo Output.txt exists! $ cat doesnotexist.txt && echo "doesnotexist.txt exists!" cat: doesnotexist.txt: No such file or directory [[Note: does not run echo!]] $ cat output.txt || echo "output.txt does not exist." foo $ cat doesnotexist.txt || echo "doesnotexist.txt does not exist." cat: doesnotexist.txt: No such file or directory doesnotexist.txt does not exist.
Parentheses ( )
allow you to run a set of commands in a subshell. This, in turn, lets you redirect the standard input, output, and error of a group of commands at once. For example:
$ ( echo foo ; echo bar ) > output.txt $ cat output.txt foo bar
The exit status of a subshell is the same as the exit status of the last command executed.
You can also execute a command in the background with the &
operator. Normally, the shell will not read a new command until the previous command has exited. But the &
operator tells the shell not to wait for the command.
$ echo foo & $ foo
(foo
is printed on top of the next shell prompt.)
Finally, the pipe operator |
sends the output of one command to the input of another. For example:
$ echo foo | rev oof
You may find the following commands particularly useful for testing your shell. Find out what they do by reading their manual pages. Be creative with how you combine these! (Also see the tester, below.)
cat
(print one or more files to standard output)echo
(print arguments to standard output)true
(exit with status 0)false
(exit with status 1)sleep
(wait for N seconds then exit)sort
(sort lines)
In addition to running programs from the file system, shells have
builtin commands that provide functionality that could not be obtained
otherwise. Two builtins commands that our shell will implement are cd
and
exit
.
The cd
command changes the shell's current directory, which is the default directory that the shell uses for files.
So cd dir
changes the current directory to dir
.
(You can think of the current directory as the directory that shows up by default when you use an "Open" or "Save" dialog in a
GUI program.) Of course, files can also be manipulated using
absolute pathnames, which do not depend on the current
directory; /home/studentname/cs170/lab2/cmdparse.c
is an example.
There may also come a time when you would like to leave your shell; the exit
command instructs the shell to exit with a given status. (exit
alone exits with status 0.)
(Why are cd
and exit
part of the shell instead of standalone programs?)
$ cd lab2 # if you are not already in the lab2 directory $ make $ ./cs202shYou can type commands to the shell; after each command you type, it will display the parser output and then attempt to execute the command. Initially, it won't execute anything, but as you complete the lab, more and more kinds of commands will work. You can run the parser on its own like this:
$ ./cs202sh -p
To quit your shell, you can press control-C. After you implement the
exit
command, you will be able to type
"exit
".
Test cases. You can and should interact with your shell by typing commands to it. You should also run a collection of test cases that are meant to stress your implementation:
$ make testThis command invokes the file
tester.pl
; you should read
this file to see what tests your shell is being given, and what the
expected output is.
Please note: passing tester.pl
is necessary but not
sufficient. We
will be using additional tests to grade and test whether your code meets
the specifications. Feel free to add additional tests to
tester.pl
, or devise your own testing script.
At a high level, our shell does two things: parse command lines, and execute those command lines. This piece of the lab is about parsing; the next (and longer) piece is about executing.
In our shell, parsing has two aspects (which are interleaved): converting a typed command line into a sequence of tokens, and converting that sequence of tokens into an internal representation (the execution piece acts on this internal representation). Tokens can be command names, command arguments, or special shell tokens. For instance, the command
ls -l "foo bar" > zotconsists of five tokens: "
ls
",
"-l
", "foo bar
" (the quotes are not part of the token), ">
", and
"zot
".
To motivate the internal representation, notice that command
lines can contain multiple commands, separated by various
operators:
;
, &
, |
,
&&
, ||
, (
, or
)
. Our internal representation is a list
of commands, with information about the operator that separates each command from
the next (along with other information, such as what kinds of
redirection the user specified for each command). The data
structure that implements this representation is (mostly)
a linked list of commands, with each node in the list
containing information about the command (we said "mostly" because,
owing to subshells, the data structure can be thought of as a hybrid of
a list and a tree).
The key structure in the code is
the struct command
(in cmdparse.h
).
A good way to get familiar with a code base is to look at the key data structures and interfaces (which are usually in header files). To this end:
cmdparse.h
.The implementation of parsing is in cmdline.c
, and is
driven from main.c
. You will need to have a solid
understanding of this code in order to do the remaining exercises.
Therefore:
cmdparse.c
and main.c
.
It may help to play around with the parser for the shell that you have been
given:
$ make # if you have not already built the code $ ./cs202sh -p # -p means "parse only" cs202$ ls -l "foo bar" > zotInspect the output. You can and should try typing in some of the commands from the warm-up section.
There is missing functionality in cmdparse.c. It does not handle parentheses properly, and it does not free the memory that it uses. Your job is to fill these gaps.
Exercise 1. In cmdparse.c
,
complete cmd_parse()
so that it handles parentheses and subshells.
If you did this correctly, then make test
should now
pass the first five tests (it will by default pass some of the
successive tests).
Exercise 2. In cmdparse.c
, complete
cmd_free()
.
We do not provide testing code for this. One hint: what does
strdup()
do? Type man strdup
to find out.
Speaking of resources....in this and the next part of the lab, you will need some comfort with C. We have been building this comfort in class, but you may still feel shaky. Here are some potentially helpful resources:
In this part of the lab, you will write the code to execute the parsed
command line. Now would be a good time to take another look at
main.c
to see the high-level loop and logic for the
entire shell.
Although we have given you the necessary skeleton, most of the "work" for executing command lines is unimplemented. Your job is to fill it in:
cmdrun.c
,
implement cmd_line_exec()
and cmd_exec()
.
The comments in cmdrun.c
give specifications and hints.
It may be best to implement
cmd_line_exec()
first, and then
to fill out cmd_exec()
in pieces, testing the functionality
as you go (using make test
and your own testing).
Your functions will need to make heavy use of system calls. To learn
more about the exact use of these system calls, you want to consult the
manual pages. The interface to these pages is a program called
man
. Systems programmers need to type
man
a lot; you should too.
For example,
man waitpid
will give you documentation on the parameters and return values to
waitpid()
. Note that sometimes you will need to specify a
section of the manual. For example, man open
may not give you documentation on the system call open(); you would want
man 2 open
. Here is a list
of some man
commands that will be useful in this lab; you
may need others:
# the numbers may be optional, depending on your system $ man 2 waitpid # note WEXITSTATUS! $ man 2 fork $ man 2 open $ man 2 close $ man 2 dup2 $ man 3 strdup $ man 3 strcmp $ man 2 pipe $ man 2 execve # or "man execvp" or "man 3 exec" $ man 2 chdir $ man 3 getcwd $ man 3 perror $ man 3 strtolInterestingly, you don't need to invoke
read()
and
write()
in this lab. (Why not?)
You can learn more about the man
program by typing
man man
(but avoid possible confusion).
A hint, concerning the exec
variants: note that the argv
array includes the command name
itself as the "zeroth" argument. For example, in the command echo foo
,
the initial element in the array is argv[0]
, which is the
address of memory that holds the string echo
.
And argv[1]
is the address of memory that holds the string
foo
. This is not necessarily intuitive.
A word about testing: if a particular test is giving you trouble, it may be helpful to type it into the standard shell (bash, as covered in the warmup earlier), and see what happens. Then play around with (or modify) the test, to try to understand what bash expects. That may give you insight about what your code is supposed to do.
If you have not done so yet, commit your code.
$ git status # see what files you changed $ git diff # examine your changes $ git commit -am "My solutions to lab2"Then, follow these steps:
This completes the lab.
Last updated: 2020-10-05 15:44:09 -0700 [validate xhtml]