this page last updated: Fri Jan 12 10:41:59 PST 2018
In this class, which attempts to provide a similar experience as a learning tool, one would nonetheless like to use Linux as a test of one's OS functionality. That is, while the assignments will be graded using tests that are not revealed, it would be helpful to use Linux as an exemplar for testing a submission before it is graded.
To do so, the grading will have to accept correct Linux semantics (at the currently installed version running on the CSIL machines) as being "correct" so that you can "know" that your OS is producing correct behavior. In general, such a litmus test is probably impossible. For example, it is possible to design tests that depend on the exact number of cycles that are assigned to a process as part of its time slice. Even then, the Linux scheduler (which is far more complex than the process scheduler you must write) could perturb the results.
However, if these timing-related tests are eschewed by the tests that are applied to your lab submissions, then there is another issue that we can address, but it requires an understanding of Linux character-level I/O.
Linux does support this property as long as one does not assume that it means that the file descriptor will behave in the same way each time it is used as a character stream. That is, the child can simply call read() and write() but the behavior of these calls with respect to delivering data will be different depending on how the file descriptor was initialized.
Let's take an example. Consider the following code for a program called read-write-80:
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char ch[80];
        int n;

        /* read up to 80 bytes from standard in (file descriptor 0) */
        n = read(0, ch, 80);
        if (n < 0) {
            perror("cat");
            return 1;
        }
        /* echo whatever was read to standard out (file descriptor 1) */
        write(1, &ch[0], n);
        return 0;
    }

If you compile and run this program from your login shell and then type the string

    aaa

where you type carriage-return after the third a, the program will echo the string "aaa\n" and exit to the shell.

    apodictic:test rich$ ./read-write-80
    aaa
    aaa
    apodictic:test rich$

Notice that it is not possible to type in a second string because the newline character after the third 'a' caused the read to complete in the code and the write to then echo the string.
However, now create a file (say, called testfile) and in it put the following three strings

    aaa
    bbb
    ccc

where a newline terminates each line. Now try running the program and using the shell to redirect the file into the program's standard input device (file descriptor zero).

    apodictic:test rich$ ./read-write-80 < testfile
    aaa
    bbb
    ccc
    apodictic:test rich$

If the semantics were the same, you'd see the same output as before. That is, in the previous case where you are connected to the program via your terminal window, the newline character tells the read() call to terminate. However, when you redirect a file into standard in, the newline characters do not trigger the call to read() to complete. Instead, read() keeps filling its buffer until it has the 80 bytes requested or sees the EOF, and the program then sends the entire buffer (including newline characters) to write().
At this point, you have two legitimate questions to ask:
stty -a to see what flags are set) and that is causing the read to complete when the tty is standard in, but when the shell redirects a file, it does so by opening the file and not going through the tty. Thus newline characters just look like ordinary characters when read from the file and it is EOF that triggers the read to complete.
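On Linux, a program can ask which of the two cases it is in. The following is a minimal sketch using the POSIX isatty() call; the helper names fd_is_terminal and report_input_kind are hypothetical and are not part of kos or of the lab code.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd refers to a terminal, 0 otherwise. */
int fd_is_terminal(int fd)
{
    return isatty(fd) == 1;
}

/* Hypothetical helper: prints which read() behavior to expect for fd,
 * assuming the terminal is in its default line-at-a-time mode. */
void report_input_kind(int fd)
{
    if (fd_is_terminal(fd))
        printf("fd %d: terminal, a newline completes read()\n", fd);
    else
        printf("fd %d: not a terminal, newline is an ordinary byte\n", fd);
}
```

Calling report_input_kind(0) in a program run from the keyboard and then again with "< testfile" should report the two different cases described above.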
As to the second question, you have a bit of a problem because kos does not include a tty driver. As a result, you have a choice. One option is to implement the same semantics as Linux does when the input is a tty. In this case, if you were to cross-compile read-write-80 for kos and run it as

    ./kos -a 'test_execs/read-write-80'

and then type

    aaa

your OS would echo 'aaa\n' and then exit (as Linux does when you type the characters 'aaa\n' on the keyboard with the program running in the foreground). However, if you were to then run kos and ask the shell to redirect the input to standard in as

    ./kos -a 'test_execs/read-write-80' < testfile

you get the same output as you did when you typed 'aaa\n' from the keyboard. That is, the string 'aaa\n' is echoed and the program exits (causing kos to halt), which is different from the Linux case where you used a shell redirect to send three strings (separated by newlines) to the program and all three were echoed properly.
Alternatively, if you use the non-tty semantics for standard in, then the redirect works properly, but running

    ./kos -a 'test_execs/read-write-80'

will allow you to input characters from the keyboard but will hang until you type "^D", which is different from the tty semantics of Linux for the same program.
And then there are pipes.
In one of the labs you will be asked to implement Linux pipes. These too can be set up by a program as the standard in and standard out file descriptors for a child that will then simply call read() and write(). However, Linux pipes have slightly different semantics as well. In particular, when reading a pipe, EOF indicates that the last writer of the pipe has closed it and that no more data will be available. However, pipes are intended to allow processes to communicate freely while they are open. Thus, as the implementer of a pipe, you are faced with the following design decision:
If a process has written some data into a pipe, and a reader has called read() with a buffer size that is larger than the data in the pipe, when does read() return to user space?
If the writer and reader processes are not written so that they are coordinated, it is not possible for the reader to "know" how much buffer to use in a read() call to ensure that the writer fills it completely. Put another way, if the reader knew the writer was going to write 10 bytes at a time, the reader could always call read(pd,buf,10) and the OS could just wait for the buffer to become full each time before returning to user space.
However, if the writer just writes some data in an amount the reader cannot anticipate, then either the reader must be able to have data delivered before the buffer is full (i.e. a short read), or the reader can only read a character at a time because the OS will only return to user space when the buffer is full.
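Seen from user space, a reader that cannot predict the writer has to be prepared for short reads. The following is a minimal sketch of the usual Linux idiom, which calls read() in a loop until the buffer is full or EOF arrives; the helper name read_until_full_or_eof is hypothetical and is not part of the lab code.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until `want` bytes have
 * arrived or EOF is seen. Returns the number of bytes read, or -1 if
 * read() reports an error. */
ssize_t read_until_full_or_eof(int fd, char *buf, size_t want)
{
    size_t got = 0;

    while (got < want) {
        ssize_t n = read(fd, buf + got, want - got);
        if (n < 0)
            return -1;      /* read() failed */
        if (n == 0)
            break;          /* EOF: no writer will send more */
        got += (size_t)n;   /* short read: keep going */
    }
    return (ssize_t)got;
}
```

Note that read-write-80 deliberately does not do this. It calls read() once, which is what makes it sensitive to when the OS chooses to return.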
Furthermore, the newline character can't be used in a pipe to trigger the call to read() to return to user space. Pipes are intended to transfer both ASCII and binary data. If binary data is being transferred, then the byte corresponding to a newline character might be a legitimate element of the data stream. If the read() completes in this case, it is completing not because a line has ended but because some random byte matches the end-of-line character.
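To make the binary-data point concrete, here is a small sketch showing that the integer 10, written as raw bytes, contains the same byte value as '\n' (0x0A in ASCII). The helper names are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if any byte in data equals '\n'. */
int contains_newline_byte(const void *data, size_t len)
{
    return memchr(data, '\n', len) != NULL;
}

/* The 32-bit integer 10 has one byte equal to 0x0A regardless of byte
 * order, so a pipe that treated '\n' as "end of line" would cut this
 * value in the middle. */
int integer_ten_has_newline_byte(void)
{
    uint32_t value = 10;
    return contains_newline_byte(&value, sizeof value);
}
```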
The solution for this dilemma is for the implementation of read() to use the following logic. If the read() call begins reading data from the pipe and filling the user-space buffer and then discovers that there is no more data to read (but the pipe has not been closed), the data that is present in the pipe is delivered to user space and the read() call terminates with a "short read."
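The short-read logic can be sketched with a small ring buffer. This is only an illustration under stated assumptions: the structure pipe_buf, the capacity PIPE_CAP, the functions pipe_put and pipe_get, and the use of -1 to mean "a real kernel would block the caller here" are all hypothetical and are not taken from kos.

```c
#include <stddef.h>

#define PIPE_CAP 64            /* hypothetical pipe capacity */

struct pipe_buf {              /* hypothetical pipe state */
    char   data[PIPE_CAP];
    size_t head;               /* index of the next byte to read */
    size_t count;              /* bytes currently buffered */
    int    writers;            /* number of open write ends */
};

/* Append up to len bytes; returns how many were actually stored. */
size_t pipe_put(struct pipe_buf *p, const char *src, size_t len)
{
    size_t n = 0;

    while (n < len && p->count < PIPE_CAP) {
        p->data[(p->head + p->count) % PIPE_CAP] = src[n++];
        p->count++;
    }
    return n;
}

/* Returns >0 for bytes copied (possibly fewer than len: a short read),
 * 0 for EOF (pipe empty and no writers remain), and -1 where a real
 * kernel would block the reader (pipe empty but a writer is open). */
long pipe_get(struct pipe_buf *p, char *dst, size_t len)
{
    size_t n = 0;

    if (p->count == 0)
        return p->writers > 0 ? -1 : 0;

    while (n < len && p->count > 0) {
        dst[n++] = p->data[p->head];
        p->head = (p->head + 1) % PIPE_CAP;
        p->count--;
    }
    return (long)n;            /* deliver what is there, do not wait */
}
```

With three bytes buffered and an 80-byte request, pipe_get returns 3 rather than waiting for 77 more bytes, which is the short read described above.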
Notice that if you only implement these semantics for your implementation of read() and you run the program

    ./kos -a 'test_execs/read-write-80'

and you try to type in

    aaa

the code prints a single 'a' character (and no newline) and then exits, causing kos to halt. That is, it doesn't even print all three 'a' characters. The output is identical when you run

    ./kos -a 'test_execs/read-write-80' < testfile

Only a single 'a' (and no newline) is echoed from the program before the OS halts.
Why?
Because when characters come in from either the tty or a file, they come in slowly, one at a time. The pipe logic sees the first 'a' but no other 'a' characters are waiting (since an interrupt must happen to announce the second 'a' and it will take a long time). Thus the read() call notices that there are no other characters present and it returns to user space, causing the call to complete and the write() call to print only the single character that was delivered.
The second problem you must solve is to differentiate between running kos and typing input from the keyboard and running kos using the shell to redirect input from a file. Here you need a way to tell the invocation of kos whether it should treat the input as a tty or a file.
To enable this latter functionality, kos includes a '-t' flag. Running kos with this flag does two things
To use this feature, you would run

    ./kos -t -a 'test_execs/read-write-80'

when entering data from the keyboard, but

    ./kos -a 'test_execs/read-write-80' < testfile

(leaving off the -t) when using the shell to redirect a file into the OS.
Note that this flag doesn't solve the problem for you. Instead, it gives you a way to determine (the way Linux does) whether you should treat the input as a tty or as a file. Your implementation of read() will need to query the IsTTY variable. When IsTTY is 1, it must detect and filter the end-of-line character (-2) to implement tty semantics. When IsTTY is zero, it must ignore end-of-line entirely and treat -2 as a normal character.
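The decision described above might be sketched as follows. The IsTTY variable and the value -2 come from the text; everything else (the macro KOS_EOL_MARKER, the enum, the function name, and passing IsTTY in as a parameter) is a hypothetical illustration, and EOF handling is deliberately left out.

```c
#define KOS_EOL_MARKER (-2)    /* end-of-line value named in the text */

/* Hypothetical result type: what read() should do with one console value. */
enum char_action {
    STORE_CHAR,                /* copy the value into the user buffer */
    COMPLETE_READ              /* drop the value and finish the read */
};

/* Hypothetical helper: is_tty stands in for the IsTTY variable. */
enum char_action classify_console_char(int is_tty, int c)
{
    if (is_tty && c == KOS_EOL_MARKER)
        return COMPLETE_READ;  /* tty semantics: end of line ends the read */

    /* Non-tty semantics, or an ordinary character: no special meaning. */
    return STORE_CHAR;
}
```

How this check is wired into the rest of read(), and what should happen at EOF, depends on your own design.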
For example, if you run

    ./kos -a 'test_execs/ksh'

and then create a pipe between cat and read-write-80

    ksh: test_execs/cat | test_execs/read-write-80
    aaa
    bbb
    ccc
    ^D

you may only see part of the output. For example, in my OS the program prints

    aaa

and then finishes. Why? Because my scheduler allows cat to write 3 characters before read-write-80 is allowed to get the CPU. When it does, it gets the three characters and, finding no others in the pipe, the read() call made in read-write-80 completes.
Similarly, when I run

    kos -t -a 'test_execs/ksh'

and run the same test, my OS only prints the first 'a' before read-write-80 exits. The reason here is that in the TTY case, cat gets a character at a time and, because it has to wait for the console interrupt, read-write-80 gets the CPU much sooner. It only finds a single character and exits.
The moral here (if there is one) is that you can use Linux as a guide, but you'll need to understand (as always) what it is your OS is doing to understand whether it is doing the same things as Linux does. In this case, I'd need to emulate Linux CPU scheduling to get a precise replication of the Linux output from this test.