CS170: Project 1 - Shell (20% of project score)
The goals of this project are:
This is an individual project. It is due on Thursday, April 18, 2019, at 23:59:59 PST (no deadline extensions or late turn-ins).
Implement a basic shell
The goal of this project is to implement a basic shell. A shell is a command line interpreter that accepts input from the user and executes the commands that are given. Similar to well-known shells such as bash or tcsh, your shell must be able to execute commands, redirect the standard input or standard output of commands to files, pipe the output of commands to other commands, and put commands in the background.
When the shell is ready to accept commands, it must print the prompt "shell: " (without the quotes). At this point, the user can type commands. Commands are alphanumeric tokens (e.g., ls, ps, cat) that represent programs that should be executed. You should search for these programs in the directories determined by the PATH environment variable that is passed to your shell. Of course, commands can have arguments. Thus, tokens that follow a command (separated by white space) are treated as the arguments to this command (e.g., cat x indicates that the shell should invoke the cat program and pass x as its argument). When the shell has received a line of input, it typically waits until all commands on that line have finished. Only then is a new prompt displayed (however, this behavior can be altered -- see below for details).
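As an illustration of the paragraph above, here is a minimal sketch of running one command and waiting for it. The function name run_command and the choice to print errors on stderr are assumptions, not part of the assignment. It relies on execvp searching the directories listed in PATH when the command name contains no slash.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the NULL-terminated argument
 * vector argv and wait for it to finish.
 * Returns the child's exit status, 127 if the program could not be
 * executed, or -1 if fork/waitpid failed or the child was killed. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        fprintf(stderr, "ERROR: fork failed\n");
        return -1;
    }
    if (pid == 0) {
        /* Child: execvp searches PATH when argv[0] contains no '/'. */
        execvp(argv[0], argv);
        /* Only reached if execvp failed. */
        fprintf(stderr, "ERROR: command not found: %s\n", argv[0]);
        _exit(127);
    }
    /* Parent: wait for this specific child. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the input cat x, the argument vector would be {"cat", "x", NULL}. The exit value 127 for a missing program mirrors what common shells report, but nothing in this handout requires it.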
In addition to commands and their arguments, your shell must understand the following meta-characters: '<', '>', '|', and '&'.
The meta-character '<' must be followed by a token that represents a file name. It indicates that the command before this meta-character reads its input from this file (instead of stdin of the shell). Thus, the meta-character '<' must follow the name of a command. Note that there can be spaces between the command name and '<', and between '<' and the file name. However, this does not have to be the case. That is, both cat < file and cat<file are valid and have the same effect. Note that at most one meta-character '<' (i.e., one input redirection) can be given for a single command. That is, more than one of the meta-character '<' for a single command is an error.
The meta-character '>' must be followed by a token that represents a file name. It indicates that the command before this meta-character writes its output to this file (instead of stdout of the shell). Thus, the meta-character '>' must follow the name of a command. Again, there can be spaces between the command name and '>', and between '>' and the file name. However, this does not have to be the case. That is, both ls > file and ls>file are valid and have the same effect. Note that at most one meta-character '>' (i.e., one output redirection) can be given for a single command. That is, more than one of the meta-character '>' for a single command is an error.
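One common way to implement the two redirections described above is to open the file in the child process and duplicate its descriptor onto stdin or stdout before calling execvp. The sketch below is illustrative only: the names redirect and run_redirected are hypothetical, and truncating an existing output file (O_TRUNC) with mode 0644 is an assumption borrowed from bash, since the handout does not say what happens when the file already exists.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Make target_fd refer to the file at path. Returns 0 on success, -1 on failure. */
static int redirect(const char *path, int flags, int target_fd)
{
    int fd = open(path, flags, 0644);
    if (fd < 0)
        return -1;
    if (dup2(fd, target_fd) < 0) {
        close(fd);
        return -1;
    }
    if (fd != target_fd)
        close(fd);
    return 0;
}

/* Hypothetical helper: run argv with optional input/output files.
 * Pass NULL for infile or outfile to leave that stream untouched. */
int run_redirected(char *const argv[], const char *infile, const char *outfile)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (infile && redirect(infile, O_RDONLY, STDIN_FILENO) < 0) {
            fprintf(stderr, "ERROR: cannot open input file: %s\n", infile);
            _exit(1);
        }
        if (outfile &&
            redirect(outfile, O_WRONLY | O_CREAT | O_TRUNC, STDOUT_FILENO) < 0) {
            fprintf(stderr, "ERROR: cannot open output file: %s\n", outfile);
            _exit(1);
        }
        execvp(argv[0], argv);
        fprintf(stderr, "ERROR: command not found: %s\n", argv[0]);
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Doing the redirection in the child rather than the parent keeps the shell's own stdin and stdout unchanged for the next prompt.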
The meta-character '|' (i.e., pipe sign) allows multiple commands to be connected. When the shell encounters the '|' character, the output of the command before the pipe sign must be connected to the input of the command after the pipe sign. This requires that there is a valid command before and after the pipe. Also, note that there can be multiple pipe signs on the command line. For example, your shell has to be able to process an input such as cat f | sort | wc. With this command, the output of the cat command is redirected to the input of sort, which in turn sends its output to the input of the wc program. With regard to white spaces separating the meta-character from the commands, the same rules as above apply.
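A pipeline such as cat f | sort | wc can be built by creating one pipe(2) per '|' and forking one child per command. The following is a sketch under stated assumptions, not a required design: run_pipeline and MAX_CMDS are hypothetical names, each cmds[i] is assumed to be an already parsed, NULL-terminated argument vector, and the return value is assumed to be the exit status of the last command.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CMDS 256  /* a 512-character line cannot hold more commands */

/* Hypothetical helper: run cmds[0] | cmds[1] | ... | cmds[n-1].
 * Returns the exit status of the last command, or -1 on failure. */
int run_pipeline(char **cmds[], int n)
{
    pid_t pids[MAX_CMDS];
    int started = 0;
    int prev_read = -1;  /* read end of the pipe feeding the next command */

    if (n <= 0 || n > MAX_CMDS)
        return -1;

    for (int i = 0; i < n; i++) {
        int fds[2] = {-1, -1};
        if (i < n - 1 && pipe(fds) < 0)
            break;
        pid_t pid = fork();
        if (pid < 0) {
            if (fds[0] != -1) { close(fds[0]); close(fds[1]); }
            break;
        }
        if (pid == 0) {
            if (prev_read != -1) {           /* not the first command */
                dup2(prev_read, STDIN_FILENO);
                close(prev_read);
            }
            if (fds[1] != -1) {              /* not the last command */
                close(fds[0]);
                dup2(fds[1], STDOUT_FILENO);
                close(fds[1]);
            }
            execvp(cmds[i][0], cmds[i]);
            fprintf(stderr, "ERROR: command not found: %s\n", cmds[i][0]);
            _exit(127);
        }
        pids[started++] = pid;
        /* The parent must close its copies, or readers never see EOF. */
        if (prev_read != -1) close(prev_read);
        if (fds[1] != -1) close(fds[1]);
        prev_read = fds[0];
    }
    if (prev_read != -1)
        close(prev_read);

    int last_status = -1;
    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] && i == n - 1)
            last_status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
    return last_status;
}
```

All children are started before any of them is waited for. Waiting for each command before starting the next can deadlock once a command writes more than the pipe buffer holds.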
The ampersand character '&' indicates that the command (or commands) of the shell input should be executed in the background and the shell immediately displays a prompt to wait for the next line (even though the commands on the previous line might not have exited yet). The '&' token may only appear as the last token of a line.
To simplify things, we only allow one '&' character to appear, and it has to be last on a command line. Also, only the first command on the input line can have its input redirected, and only the last can have its output redirected. Observe, however, that in case of a single command, we can apply both input and output redirection to this command (i.e., cat < x > y is valid, while cat f | cat < g is not).
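The placement rules above can be checked before anything is executed. Below is a minimal sketch of such a check. The name check_syntax is hypothetical, the input is assumed to be already split into tokens with each meta-character as its own token, and the error strings are placeholders. It accepts arguments that appear after a redirection (e.g., cat < x y), which the rules above neither allow nor forbid.

```c
#include <string.h>

/* True if t is exactly one of the four meta-characters. */
static int is_meta(const char *t)
{
    return t[0] != '\0' && t[1] == '\0' && strchr("<>|&", t[0]) != NULL;
}

/* Hypothetical validator: returns NULL if the token sequence obeys the
 * rules, or a short description of the first problem found. */
const char *check_syntax(char *const tok[], int n)
{
    int pipes = 0, cmd_index = 0, words = 0, in_seen = 0, out_seen = 0;

    if (n > 0 && strcmp(tok[n - 1], "&") == 0)
        n--;                              /* a single trailing '&' is fine */
    for (int i = 0; i < n; i++)
        if (strcmp(tok[i], "|") == 0)
            pipes++;

    for (int i = 0; i < n; i++) {
        const char *t = tok[i];
        if (strcmp(t, "&") == 0)
            return "'&' may only appear as the last token";
        if (strcmp(t, "|") == 0) {
            if (words == 0)
                return "missing command before '|'";
            cmd_index++;
            words = 0;
            continue;
        }
        if (strcmp(t, "<") == 0 || strcmp(t, ">") == 0) {
            if (words == 0)
                return "redirection must follow a command name";
            if (i + 1 >= n || is_meta(tok[i + 1]))
                return "missing file name after redirection";
            if (t[0] == '<') {
                if (in_seen) return "more than one '<'";
                if (cmd_index != 0) return "only the first command may use '<'";
                in_seen = 1;
            } else {
                if (out_seen) return "more than one '>'";
                if (cmd_index != pipes) return "only the last command may use '>'";
                out_seen = 1;
            }
            i++;                          /* skip the file name */
            continue;
        }
        words++;                          /* command name or argument */
    }
    return words == 0 ? "missing command" : NULL;
}
```

With this sketch, the tokens of cat < x > y are accepted, while those of cat f | cat < g are rejected because the second command tries to redirect its input.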
In case of errors (e.g., invalid input, command not found, ...) your shell should display an error and wait for the next input. It should never simply die. Whenever you output an error (syntax, parsing, command not found, ...), you must prefix your error message with "ERROR: " (without the quotes), and this message cannot exceed a single line.
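To make the single-line requirement concrete, here is one possible way to build an error message. The helper name format_error is hypothetical, and replacing embedded line breaks with spaces is just one assumed way to guarantee that the message stays on one line.

```c
#include <stddef.h>

/* Hypothetical helper: write "ERROR: <msg>\n" into buf (capacity size),
 * replacing any line break inside msg with a space and truncating if
 * needed, so the result is always exactly one line. Returns buf. */
char *format_error(char *buf, size_t size, const char *msg)
{
    const char *prefix = "ERROR: ";
    size_t pos = 0;

    if (size < 2) {                 /* no room for "\n" plus terminator */
        if (size == 1)
            buf[0] = '\0';
        return buf;
    }
    /* Always keep two bytes free for the final '\n' and '\0'. */
    for (; *prefix != '\0' && pos + 2 < size; prefix++)
        buf[pos++] = *prefix;
    for (; *msg != '\0' && pos + 2 < size; msg++)
        buf[pos++] = (*msg == '\n' || *msg == '\r') ? ' ' : *msg;
    buf[pos++] = '\n';
    buf[pos] = '\0';
    return buf;
}
```

The formatted buffer can then be written with fputs to whichever stream you use for errors.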
To facilitate automated grading, when you start your simple shell program with the argument '-n', then your shell must not output any command prompt (no "shell: "). Just read commands as usual.
To exit the shell, the user must type Ctrl-D (pressing the D button while holding control). This signals the end of input (EOF) to functions that wait for and read the user input.
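The prompt, the '-n' option, and the end-of-input behavior can be combined in one read loop. The sketch below is only one possible arrangement. The names prompt_enabled and read_loop, the callback parameter, and the decision to accept '-n' at any argument position are assumptions. When the user types Ctrl-D at the start of a line, fgets returns NULL and the loop ends.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 512  /* maximum input line length stated in the handout */

/* Returns 0 if "-n" was given on the command line, 1 otherwise. */
int prompt_enabled(int argc, char *argv[])
{
    for (int i = 1; i < argc; i++)
        if (strcmp(argv[i], "-n") == 0)
            return 0;
    return 1;
}

/* Hypothetical main loop: read lines from in until EOF, printing the
 * prompt to out before each read when show_prompt is nonzero.
 * handle_line (may be NULL) is called with each line, newline removed.
 * Returns the number of lines read. */
int read_loop(FILE *in, FILE *out, int show_prompt,
              void (*handle_line)(char *line))
{
    char line[MAX_LINE + 2];        /* line + '\n' + '\0' */
    int count = 0;

    for (;;) {
        if (show_prompt) {
            fputs("shell: ", out);
            fflush(out);            /* the prompt has no newline to flush it */
        }
        if (fgets(line, sizeof line, in) == NULL)
            break;                  /* EOF (Ctrl-D) or read error */
        line[strcspn(line, "\n")] = '\0';
        if (handle_line != NULL)
            handle_line(line);
        count++;
    }
    return count;
}
```

A main function might call read_loop(stdin, stdout, prompt_enabled(argc, argv), process_line), where process_line is your own function that parses and executes one line.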
You may assume that the maximum length of individual tokens never exceeds 32 characters, and that the maximum length of an input line never exceeds 512 characters.
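These limits make fixed-size buffers sufficient for tokenizing. The following sketch splits a line into tokens and treats each meta-character as its own token, so cat<file and cat < file produce the same result. The names tokenize and MAX_TOKEN_LEN are hypothetical. Although you may assume the limits hold, this version still rejects an over-long token rather than overflowing its buffer.

```c
#include <ctype.h>
#include <string.h>

#define MAX_TOKEN_LEN 32  /* maximum token length stated in the handout */

/* Hypothetical tokenizer: split line on white space and on the
 * meta-characters < > | &, storing up to max_tokens tokens.
 * Returns the number of tokens, or -1 if a token is longer than
 * MAX_TOKEN_LEN or there are more than max_tokens tokens. */
int tokenize(const char *line, char tokens[][MAX_TOKEN_LEN + 1], int max_tokens)
{
    int count = 0;
    const char *p = line;

    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {
            p++;
            continue;
        }
        if (count == max_tokens)
            return -1;
        if (strchr("<>|&", *p) != NULL) {
            /* A meta-character is a token by itself. */
            tokens[count][0] = *p++;
            tokens[count][1] = '\0';
            count++;
            continue;
        }
        /* Ordinary token: runs until white space, a meta-character, or end. */
        size_t len = 0;
        while (p[len] != '\0' && !isspace((unsigned char)p[len]) &&
               strchr("<>|&", p[len]) == NULL)
            len++;
        if (len > MAX_TOKEN_LEN)
            return -1;
        memcpy(tokens[count], p, len);
        tokens[count][len] = '\0';
        count++;
        p += len;
    }
    return count;
}
```

For the line cat f | sort|wc & this yields the seven tokens cat, f, |, sort, |, wc, and &.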
Your shell must collect the exit status of every process that it spawns. That is, you are not allowed to leave zombie processes behind for commands that you start.
Your shell should use the fork(2) system call and the execvp(3) library function (or one of its variants, all of which are front ends to the execve(2) system call) to execute commands. It should also use waitpid(2) or wait(2) to wait for a program to complete execution (unless the program is in the background). You might also find the documentation for signals (in particular SIGCHLD) useful for collecting the status of background processes when they exit.
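One possible way to collect background processes is a SIGCHLD handler that reaps every child that has already exited. This is a sketch, not the required approach. The names reap_children and install_sigchld_handler are hypothetical, and the flags SA_RESTART and SA_NOCLDSTOP are a design assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* SIGCHLD handler: reap every child that has already exited.
 * WNOHANG keeps the loop from blocking when children are still running. */
static void reap_children(int signo)
{
    (void)signo;
    int saved_errno = errno;        /* waitpid may change errno */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
    errno = saved_errno;
}

/* Hypothetical setup function. Returns 0 on success, -1 on failure. */
int install_sigchld_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    /* SA_RESTART: restart interrupted reads of the command line.
     * SA_NOCLDSTOP: ignore children that merely stop. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

Because the handler calls waitpid(-1, ...), it can also reap a foreground child before the shell's own waitpid for that child returns. In that case the foreground waitpid fails with ECHILD, which the shell can treat as "already collected". Blocking SIGCHLD with sigprocmask(2) while waiting for foreground commands is another option.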
Your shell must be written in C/C++ and run under Linux. More specifically, it must compile without any warnings or errors and run on any CSIL machine.
Please follow the instructions below exactly!