Homework 2: FAQ and Implementation Suggestions

FAQ and Implementation Suggestions for Project 2 (Multiprogramming and System Calls)

These suggestions were written by a previous CS170 TA when sample code was not made available. They may be useful to some of you, but you DO NOT have to read them.

FAQ

Part 1 Implementation Suggestions

Part 2 Implementation Suggestions

FAQ

Question: How can I test part one (why doesn't printf work in my test programs)?

Answer: Until you have part 2 done, you will have to test part 1 based on flow of execution. For example, to test whether Fork/Exec works, Fork/Exec a function/executable and then call Halt() in that function/executable. If Nachos halts when you run your test program, your system call is probably working. Once you have the Write system call implemented, you can test part 1 further by putting Write statements wherever you wish to print something out. This will be useful for testing Join.
Question: What does a process control block contain?

Answer: A process control block (pcb) contains the attributes of a process. Some of the major attributes are the thread, the pid (SpaceID), the open files, etc. (Remember that the thread has a pointer to the addrspace, which is therefore an indirect attribute of a pcb.) There should be a global table of process control blocks as well. Remember that you will also want to be able to get a particular process's condition so that it can be waited on in Join (if necessary) and broadcast in Exit. It may not seem like you are using the process control blocks much in part 1 of the project (except for adding and deleting them), but you will use them more in part 2.
Question: How do I add a new file for this project?

Answer: Create your .h and .cc files in userprog (every new file for this project can be added in this directory). Then add the .h, .cc, and .o to the list of files in the top-level Makefile.common in the correct USERPROG section. Do not add your file to the end of the list; place it in the same section as the other userprog files. In each USERPROG section (i.e. _H, _C, and _O), all the files from a given directory should be grouped with the other files from that directory. Once this is done, type "make depend" in the userprog directory, and then you will be able to type "make" to compile as before.
Question: How do I translate the name of the executable when implementing Exec?

Answer: You will want to put a translate function in addrspace that is similar to the translate function in Machine. You will use this function to translate a virtual address into a physical address (one page at a time).
Question: Is there a difference between the parameters of Exec and Fork?

Answer: Yes, there is. Exec takes a string giving the relative path (from where you run Nachos, which is userprog in this case) to the test program you wish to exec, e.g. Exec("../test/myExecedProgram"). Fork takes the function you wish to fork. It is not a string but a function pointer, so it has the form Fork(myFunction), where you have already implemented void myFunction() in your test program. In Exec you translate the string that is passed in; in Fork you use the pointer to myFunction as the PCReg value when you run the new thread. If you try to pass a string to Fork or a bare name (not in string form) to Exec, you will have serious problems.
Question: Does a Forked thread use the same addrspace as the thread who Forked it?

Answer: No, it does not. It uses a COPY of the addrspace of the thread that Forked it. This means you must create a new addrspace whose page table is the same size as the Forking thread's, and then copy the Forking thread's pages in memory into the Forked thread's pages in memory.
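The page-by-page copy can be sketched as follows. This is a minimal illustration, not the Nachos code: `kPageSize`, `PageEntry`, and `copyAddressSpace` are hypothetical names, main memory is simulated with a `std::vector<char>`, and the child's physical pages are assumed to have been allocated already.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

const int kPageSize = 128;  // assumed page size for this sketch

// Simplified page-table entry: virtual page i lives in physicalPage.
struct PageEntry {
    int physicalPage;
};

// Copy every page of the parent into the child's (already allocated) pages.
// Both page tables must have the same number of entries.
void copyAddressSpace(std::vector<char>& mainMemory,
                      const std::vector<PageEntry>& parent,
                      const std::vector<PageEntry>& child) {
    assert(parent.size() == child.size());
    for (std::size_t i = 0; i < parent.size(); i++) {
        const char* src = &mainMemory[parent[i].physicalPage * kPageSize];
        char* dst = &mainMemory[child[i].physicalPage * kPageSize];
        std::memcpy(dst, src, kPageSize);
    }
}
```

The copy goes through physical page numbers on both sides because neither address space is guaranteed to be contiguous in physical memory.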
Question: The project spec says we need to add a function ReadFile to Addrspace. What is this used for?

Answer: This is used for copying the executable's code and data segments into memory (in the constructor for Addrspace that Exec calls). noffH.code.virtualAddr is the logical address of the executable's code segment and noffH.initData.virtualAddr is the logical address of the executable's data segment. You no longer want to zero out (bzero) the memory. You need to copy these sections page by page into main memory. Use your Addrspace::Translate to translate the logical data/code address to the physical address (in memory). Then you can use the C/C++/Unix "bcopy" to copy it over.
Question: Where should we put the global structures/classes?

Answer: You can put these in system.h in the threads directory. Only put the global variables (their declaration) in this file. Make sure you actually instantiate the global structures (like memory manager) in the system.cc file in the initialize function under the correct #ifdef's (for userprog in this case).
Question: Does the Exit system call take a parameter? What is that parameter?

Answer: The Exit system call should take an integer parameter. This parameter is the exit value of the process, the same as exit values in Unix (0 -> good, 1 -> bad). For our purposes you can use any integer here. All that matters is that if another process is "Join"ing on your process, the Join call returns the value that your process exited with (as per the project spec). This means that you need to save your exit value somehow in the Exit system call so that Join can return it if necessary (even if you have already exited when someone calls Join on your process).
Question: What should we do when we can't allocate enough physical pages in Addrspace constructor for a new thread?

Answer: You should let Fork/Exec know that you don't have enough memory so that it can tell the user (by returning a pid of -1) that it didn't Fork/Exec a new process. Check that there are enough free pages in physical memory to hold all the pages of your new process before you start allocating a physical page for each logical page. If you do not, one of your allocations will return -1 partway through, and you will have to deallocate all the pages you just allocated before you can let Fork/Exec know of the error.
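A minimal sketch of the "check first, then allocate" idea, assuming physical memory is tracked with one boolean per frame. `allocatePages` is a hypothetical helper, not part of Nachos; an empty result stands for the error that Fork/Exec would report as -1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the physical pages chosen for a new process, or an empty vector
// when fewer than numPages frames are free. Nothing is allocated on failure,
// so there is never a partial allocation to undo.
std::vector<int> allocatePages(std::vector<bool>& inUse, int numPages) {
    assert(numPages > 0);
    int freeCount = 0;
    for (bool used : inUse) {
        if (!used) freeCount++;
    }
    if (freeCount < numPages) return {};  // caller reports -1 to the user

    std::vector<int> pages;
    for (std::size_t i = 0; i < inUse.size() && (int)pages.size() < numPages; i++) {
        if (!inUse[i]) {
            inUse[i] = true;
            pages.push_back((int)i);
        }
    }
    return pages;
}
```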
Question: Do Exec and Fork call thread->Fork?

Answer: Yes, they do. After you have set up all the other information (as in the project spec), you need to thread->Fork a dummy function (that's in exception.cc). In this dummy function you will want to initialize and restore the registers (for Fork and Exec) and then set up the PC registers and return register address (for Exec only). After this you will call machine->Run() from the dummy function (for both Fork and Exec).

Part 1 Implementation Suggestions

Threads and Processes:

Thread::Fork() spawns a new kernel thread that uses the same AddrSpace as the thread that spawned it. If that address space is duplicated and not shared, it is no longer a kernel thread but a Forked Process. If that AddrSpace instead contains code and data loaded in from a separate file, it is no longer a forked process, but an executed process. The details of Fork() and Exec() are covered in steps 1) and 7) below.

System Calls:

In code/userprog/exception.cc, put function calls for each system call. Just print debug statements in these functions for now, so you can see when system calls get executed. A function may need to take an argument or return a value. Because the argument lies in user space, you will need to transfer it over using machine register reads and writes. A sample stub illustrating this is shown below.
```
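        case SC_Join: {
          // The first system call argument arrives in register 4.
          int result = myJoin(machine->ReadRegister(4));
          // The return value goes back to the user program in register 2.
          machine->WriteRegister(2, result);
          break;
        }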
```
If the system calls Fork(), Yield(), Exec(), Join() and Exit() are implemented in that order, you will not have to worry about the call you are currently working on depending upon unimplemented calls.
Remember that the last thing the ExceptionHandler function needs to do when executing a system call is increment the program counter. Write a helper function to do this. It needs to update PCReg, NextPCReg, and PrevPCReg. They should all increment by 32 bits (4 bytes).
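The program-counter helper can be sketched as below. `FakeMachine` is a stand-in for the Nachos Machine class so that the example compiles on its own, and `advancePC` is a hypothetical name. The register indices are the values believed to be in machine/machine.h; check them against your own tree. The sketch shifts the three registers along, which gives the same result as adding 4 to each whenever they hold consecutive instruction addresses.

```cpp
#include <cassert>

// Register indices, assumed to match machine/machine.h.
const int PCReg = 34;
const int NextPCReg = 35;
const int PrevPCReg = 36;
const int NumTotalRegs = 40;

// Stand-in for the Nachos Machine class: just a register file.
struct FakeMachine {
    int registers[NumTotalRegs] = {0};

    int ReadRegister(int num) const {
        assert(num >= 0 && num < NumTotalRegs);
        return registers[num];
    }
    void WriteRegister(int num, int value) {
        assert(num >= 0 && num < NumTotalRegs);
        registers[num] = value;
    }
};

// Move past the syscall instruction so it is not executed again.
void advancePC(FakeMachine* machine) {
    machine->WriteRegister(PrevPCReg, machine->ReadRegister(PCReg));
    machine->WriteRegister(PCReg, machine->ReadRegister(NextPCReg));
    machine->WriteRegister(NextPCReg, machine->ReadRegister(NextPCReg) + 4);
}
```

In the real handler the same three calls would go through the global `machine` pointer just before ExceptionHandler returns.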

10 Steps to a Multi-Programmed Nachos

Implement Fork(). Fork will create a new kernel thread and set its AddrSpace to be a duplicate of the currentThread's space. It then calls Yield(). The new thread runs a dummy function that will copy back the machine registers, PC and return registers saved before the yield was performed. You did save the PC, return and other machine registers, didn't you?

Duplicating the AddrSpace requires the implementation of a Memory Manager, detailed in steps 2-4. Fork() will not work completely until step 4 is complete. Don't get stuck on step 1; steps 2-4 are much more important.
Write a Memory Manager that will be used to facilitate contiguous virtual memory. The amount of memory available to the user space is the same as the amount of physical memory. It is not until project 3 that you will have to implement swapping virtual memory.

You will need just two methods at first: 1) getPage() allocates the first clear page and 2) clearPage(int i) takes the index of a page and frees it. You can use a bitmap (in code/userprog/bitmap.*) with one bit per page to track allocation, or use your own integer array, whichever you prefer.
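A minimal sketch of such a memory manager, using a `std::vector<bool>` in place of the Nachos BitMap class so that the example is self-contained. The `numFree()` method is an addition that is not required by the text; it is included because it makes the "is there enough memory?" check cheap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class MemoryManager {
public:
    explicit MemoryManager(int numPages)
        : used_(numPages, false), freeCount_(numPages) {}

    // Allocate the first clear page; returns -1 when memory is full.
    int getPage() {
        for (std::size_t i = 0; i < used_.size(); i++) {
            if (!used_[i]) {
                used_[i] = true;
                freeCount_--;
                return (int)i;
            }
        }
        return -1;
    }

    // Free page i, which must currently be allocated.
    void clearPage(int i) {
        assert(i >= 0 && i < (int)used_.size() && used_[i]);
        used_[i] = false;
        freeCount_++;
    }

    int numFree() const { return freeCount_; }

private:
    std::vector<bool> used_;  // one flag per physical page
    int freeCount_;
};
```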

Modify AddrSpace (code/userprog/addrspace.*) to use the memory manager. First, modify the page table constructors to use pages allocated by your memory manager for the physical page index. A later modification will come in step 4.
Write the AddrSpace::Translate function, which converts a virtual address to a physical address. It does so by breaking the virtual address into a page table index and an offset. It then looks up the physical page in the page table entry given by the page table index and obtains the final physical address by combining the physical page address with the page offset. It might help to pass, as a parameter, a pointer to the location where you would like the physical address to be stored. This allows the function to return a boolean TRUE or FALSE depending on whether or not the virtual address was valid. If confused, consult the textbook on memory management and page tables, or Machine::Translate() in machine/translate.cc.
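The translation described above can be sketched as a free function. This is an illustration under simplifying assumptions: `PageEntry` is a cut-down stand-in for the Nachos TranslationEntry, `kPageSize` is assumed to be 128 bytes, and `translate` is a hypothetical name for what would become AddrSpace::Translate.

```cpp
#include <cassert>
#include <vector>

const int kPageSize = 128;  // assumed page size for this sketch

// Cut-down page-table entry: where the page lives and whether it is mapped.
struct PageEntry {
    int physicalPage;
    bool valid;
};

// Translate virtAddr using pageTable. On success, store the result in
// *physAddr and return true; return false for an invalid virtual address.
bool translate(const std::vector<PageEntry>& pageTable,
               int virtAddr, int* physAddr) {
    assert(physAddr != nullptr);
    if (virtAddr < 0) return false;

    int vpn = virtAddr / kPageSize;     // page table index
    int offset = virtAddr % kPageSize;  // offset within the page
    if (vpn >= (int)pageTable.size() || !pageTable[vpn].valid) return false;

    *physAddr = pageTable[vpn].physicalPage * kPageSize + offset;
    return true;
}
```

For example, with virtual page 1 mapped to physical page 2, virtual address 130 translates to 2 * 128 + 2 = 258.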
Write the AddrSpace::ReadFile function, which loads the code and data segments into the translated memory, instead of at position 0 like the code in the AddrSpace constructor already does. This is needed not only for Exec() but for the initial startup of the machine when executing any test program with virtual memory.

You should buffer user file reads in a disk buffer called diskBuffer (defined in system.h). All of your user-level file I/O must go through the diskBuffer. Be sure not to underrun or overrun the buffer during the copy. Also be sure not to write too much of the file into memory. You can use the following prototype for the function.
```
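   int AddrSpace::ReadFile(int virtAddr,    // virtual address to load into
                           OpenFile* file,  // file to read from
                           int size,        // number of bytes to load
                           int fileAddr) {  // starting offset within the file
       // ... your implementation goes here ...
   }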
```
You will also need to use the functions OpenFile::ReadAt(buff, size, addr) and bcopy(src, dst, num), as well as the memory locations at machine->mainMemory[physAddr].
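The buffered copy can be sketched as follows. Everything here is a stand-in so that the example runs outside Nachos: `FakeFile` mimics only the ReadAt call, main memory is a `std::vector<char>`, the page table is reduced to a list of physical page numbers, `kDiskBufferSize` is an arbitrary assumed size, and the return value (the number of bytes loaded) is an assumption since the text does not specify it. `std::memcpy` is used where the text suggests bcopy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

const int kPageSize = 128;       // assumed page size
const int kDiskBufferSize = 64;  // hypothetical diskBuffer size

// Stand-in for OpenFile: copies up to numBytes starting at position and
// returns the number of bytes actually read.
struct FakeFile {
    std::string contents;
    int ReadAt(char* into, int numBytes, int position) const {
        if (position < 0 || position >= (int)contents.size()) return 0;
        int n = std::min(numBytes, (int)contents.size() - position);
        std::memcpy(into, contents.data() + position, n);
        return n;
    }
};

// Load size bytes from file (starting at fileAddr) into the address space at
// virtAddr. pageTable[v] is the physical page holding virtual page v.
int readFile(std::vector<char>& mainMemory, const std::vector<int>& pageTable,
             int virtAddr, const FakeFile& file, int size, int fileAddr) {
    char diskBuffer[kDiskBufferSize];
    int copied = 0;
    while (copied < size) {
        int va = virtAddr + copied;
        int vpn = va / kPageSize;
        if (va < 0 || vpn >= (int)pageTable.size()) break;  // past the address space
        int offset = va % kPageSize;
        // One chunk never crosses a page boundary or overruns the buffer.
        int chunk = std::min({size - copied, kPageSize - offset, kDiskBufferSize});
        int got = file.ReadAt(diskBuffer, chunk, fileAddr + copied);
        if (got <= 0) break;  // end of file
        std::memcpy(&mainMemory[pageTable[vpn] * kPageSize + offset], diskBuffer, got);
        copied += got;
    }
    return copied;
}
```

Limiting each chunk to the smallest of the three quantities is what keeps the copy from writing too much of the file into memory or spilling into a neighboring physical page.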
At this point, test programs should work the same as before. That is, the halt program and other system calls will still operate the way they did before you modified AddrSpace. Also Fork() should be working. Test your implementation appropriately.
Write the PCB and a process manager. Create a PCB class that will store the necessary information about a process. Don't worry about putting everything in it right now, you can always add more as you go along. To start, it should have a PID, parent PID, and Thread*. The process manager should do the same thing as the memory manager - it has getPID and clearPID methods, which return an unused process id and clear a process id respectively. Again, use a bitmap or similar integer array. You'll also need an array of PCB* to store the PCBs. Modify the AddrSpace constructors to include a PCB as an attribute.
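A minimal sketch of the PCB and process manager described above. `Thread` is only forward-declared as a stand-in for the Nachos class, and `addProcess` and `lookup` are hypothetical conveniences beyond the getPID and clearPID methods named in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Thread;  // stand-in for the Nachos Thread class

struct PCB {
    int pid;
    int parentPid;
    Thread* thread;
};

class ProcessManager {
public:
    explicit ProcessManager(int maxProcs)
        : used_(maxProcs, false), pcbs_(maxProcs, nullptr) {}
    ~ProcessManager() { for (PCB* p : pcbs_) delete p; }
    ProcessManager(const ProcessManager&) = delete;
    ProcessManager& operator=(const ProcessManager&) = delete;

    // Reserve and return an unused process id, or -1 if the table is full.
    int getPID() {
        for (std::size_t i = 0; i < used_.size(); i++) {
            if (!used_[i]) { used_[i] = true; return (int)i; }
        }
        return -1;
    }

    // Release a process id and destroy its PCB, if any.
    void clearPID(int pid) {
        assert(pid >= 0 && pid < (int)used_.size() && used_[pid]);
        delete pcbs_[pid];
        pcbs_[pid] = nullptr;
        used_[pid] = false;
    }

    // Create a PCB under a fresh pid; returns nullptr if no pid is available.
    PCB* addProcess(int parentPid, Thread* thread) {
        int pid = getPID();
        if (pid < 0) return nullptr;
        pcbs_[pid] = new PCB{pid, parentPid, thread};
        return pcbs_[pid];
    }

    PCB* lookup(int pid) const {
        if (pid < 0 || pid >= (int)pcbs_.size()) return nullptr;
        return pcbs_[pid];
    }

private:
    std::vector<bool> used_;  // which pids are taken
    std::vector<PCB*> pcbs_;  // PCB for each pid, or nullptr
};
```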
Implement Yield(). Given that forked processes are almost the same as kernel threads, this one should be trivial.
Implement Exec(). Exec creates a new process (kernel thread) with new code and data segments loaded from the OpenFile object constructed from the filename passed in by the user. In order to get that file name you will have to write a function that copies the string over from user space. This function will start copying memory from the physical address pointed to by the virtual address in machine->ReadRegister(4). It should continue until it hits a NULL byte.

Fork the new thread to run a dummy function that sets the machine registers straight and runs the machine. The calling thread should yield to give control to the newly spawned thread. At this point you should be able to call test files from other tests using Exec(someTestFile) from someOtherTestFile.c (in the test/ dir).
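Copying the file name out of user space can be sketched as below. The names are hypothetical and the environment is simulated: main memory is a `std::vector<char>`, the page table is a list of physical page numbers, and the `maxLen` limit is an added safeguard (not mentioned in the text) against a string that is never terminated.

```cpp
#include <cassert>
#include <string>
#include <vector>

const int kPageSize = 128;  // assumed page size for this sketch

// Copy a NUL-terminated string starting at user virtual address virtAddr.
// Each byte is translated separately because the string may cross a page
// boundary. Returns false if the string leaves the address space or is not
// terminated within maxLen bytes.
bool copyStringFromUser(const std::vector<char>& mainMemory,
                        const std::vector<int>& pageTable,
                        int virtAddr, int maxLen, std::string* out) {
    assert(out != nullptr);
    out->clear();
    for (int i = 0; i < maxLen; i++) {
        int va = virtAddr + i;
        int vpn = va / kPageSize;
        if (va < 0 || vpn >= (int)pageTable.size()) return false;
        char c = mainMemory[pageTable[vpn] * kPageSize + va % kPageSize];
        if (c == '\0') return true;  // reached the end of the string
        out->push_back(c);
    }
    return false;  // no terminator found within maxLen bytes
}
```

In the real system call the starting address would come from machine->ReadRegister(4).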
Implement Join(). Join should force the currently running thread to wait for some process to finish. The PCB manager can keep track of who is waiting for whom using a condition variable for each PCB.
Implement Exit(int status). This function should set the status in the PCB being exited. It should also force any threads waiting on the exiting process to wake up.
Test your system calls using your own test programs, then the ones in ~cs170/nachos-projtest/proj2 that don't rely on part 2. If you have time, test using some programs that do crazy or malicious things.

Part 2 Implementation Suggestions

NACHOS Files:

We will be using NACHOS files exclusively in part 2. This will be good preparation for project 3. NACHOS files are similar to Unix files except that they are stored on a virtual disk that is implemented as one big Unix file. The interfaces for Create, Open, Close, WriteAt and ReadAt on those files are defined in filesys/filesys.cc, filesys/openfile.cc and filesys/filehdr.h. You may want to look through those files when getting ready to call these functions for the first time.

10 Steps to an I/O Enabled NACHOS:

Create a SysOpenFile object class that contains a pointer to the file system's OpenFile object as well as the system's (int)FileID and (char *)fileName for that file and the number of user processes currently accessing it. Declare an array of SysOpenFile objects for use by all system calls implemented in part 2.
Create a UserOpenFile object class that contains the (char *)fileName, an index into the global SysOpenFile table and an integer offset representing a process's current position in that file.
Modify the AddrSpace's PCB to contain an array of UserOpenFiles. Limit the number to something reasonable, but greater than 20. Write a method (in PCBManager) that returns a UserOpenFile object given the fileName.
Implement Create(char *fileName). This is a straightforward call that should simply get the fileName from user space and then use fileSystem->Create(fileName,0) to create a new, empty file. Until a user opens the file for I/O it is not necessary to do anything further.
Implement Open(char *fileName). This function will use an OpenFile object created by fileSystem->Open(fileName). Once you have this object, check to see if the file is already open by some other process in the global SysOpenFile table. If so, increment the userOpens count. If not, create a new SysOpenFile and store its pointer in the global table at the next open slot. You can obtain the FileID by looking up the name in your SysOpenFile table.

Then create a new instance of a UserOpenFile object (given a SysOpenFile object) and store it in the currentThread's PCB's UserOpenFile array.

Finally, return the FileID to the user.
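The two tables and the Open bookkeeping can be sketched as below. This is an illustration under stated assumptions: `OpenFile` is only forward-declared, `std::string` replaces the `char *` names for brevity, `SysOpenFileTable` and `makeUserOpenFile` are hypothetical names, and file ids are assumed to start at 2 so that they do not collide with ConsoleInput (0) and ConsoleOutput (1) from syscall.h.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct OpenFile;  // stand-in for the Nachos OpenFile class

const int kFirstFileID = 2;  // assumed: ids 0 and 1 are the console

struct SysOpenFile {
    OpenFile* file = nullptr;
    int fileID = -1;
    std::string fileName;
    int userOpens = 0;  // number of processes that have the file open
};

struct UserOpenFile {
    std::string fileName;
    int sysIndex = -1;  // index into the global SysOpenFile table
    int offset = 0;     // this process's current position in the file
};

class SysOpenFileTable {
public:
    explicit SysOpenFileTable(int maxFiles) : table_(maxFiles) {
        assert(maxFiles > 0);
    }

    // Record one more opener of name; returns its FileID, or -1 if full.
    int open(const std::string& name, OpenFile* file) {
        int freeSlot = -1;
        for (int i = 0; i < (int)table_.size(); i++) {
            if (table_[i].userOpens > 0 && table_[i].fileName == name) {
                table_[i].userOpens++;  // already open: just count the user
                return table_[i].fileID;
            }
            if (table_[i].userOpens == 0 && freeSlot < 0) freeSlot = i;
        }
        if (freeSlot < 0) return -1;
        SysOpenFile& slot = table_[freeSlot];
        slot.file = file;
        slot.fileID = freeSlot + kFirstFileID;
        slot.fileName = name;
        slot.userOpens = 1;
        return slot.fileID;
    }

    const SysOpenFile* lookup(int fileID) const {
        int i = fileID - kFirstFileID;
        if (i < 0 || i >= (int)table_.size() || table_[i].userOpens == 0) return nullptr;
        return &table_[i];
    }

private:
    std::vector<SysOpenFile> table_;
};

// Build the per-process entry for a file that is open system-wide.
UserOpenFile makeUserOpenFile(const SysOpenFile& sys) {
    UserOpenFile u;
    u.fileName = sys.fileName;
    u.sysIndex = sys.fileID - kFirstFileID;
    u.offset = 0;  // every process starts at the beginning of the file
    return u;
}
```

Keeping the offset in the per-process entry rather than in the global table is what lets two processes read the same file at different positions.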
Implement a function to Read/Write between MainMem and a buffer, given a starting virtual address and a size. It should operate in the same way AddrSpace::ReadFile writes into main memory, one diskBuffer at a time.
It may help to move that section of the code in ReadFile into a "helper" function called userReadWrite() that is general enough for ReadFile(), myRead() and myWrite() to all call. It need only be parameterized by the type of operation to be performed (Read or Write).

When called by Write(), it will read from the MainMem addressed by the virtual addresses and write into the given (empty) buffer. Write() will then put that buffer into an OpenFile. When called by Read(), it will write into MainMem the data in the given (full) buffer that Read() read from an OpenFile.
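A minimal sketch of such a helper, parameterized by direction. The environment is simulated (a `std::vector<char>` for main memory and a list of physical page numbers for the page table), the `Direction` enum is a hypothetical way to express "Read or Write", and the copy is done one byte at a time for clarity rather than speed.

```cpp
#include <cassert>
#include <vector>

const int kPageSize = 128;  // assumed page size for this sketch

enum class Direction {
    UserToKernel,  // used by Write(): user memory -> kernel buffer
    KernelToUser   // used by Read():  kernel buffer -> user memory
};

// Copy up to size bytes between buffer and user memory at virtAddr.
// Returns the number of bytes copied, which is less than size if the
// range runs past the end of the address space.
int userReadWrite(std::vector<char>& mainMemory,
                  const std::vector<int>& pageTable,
                  int virtAddr, char* buffer, int size, Direction dir) {
    assert(buffer != nullptr);
    int copied = 0;
    while (copied < size) {
        int va = virtAddr + copied;
        int vpn = va / kPageSize;
        if (va < 0 || vpn >= (int)pageTable.size()) break;
        char& cell = mainMemory[pageTable[vpn] * kPageSize + va % kPageSize];
        if (dir == Direction::UserToKernel) {
            buffer[copied] = cell;
        } else {
            cell = buffer[copied];
        }
        copied++;
    }
    return copied;
}
```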
Implement Write(char *buffer, int size, OpenFileId id). First you will need to get the arguments from the user by reading registers 4-6. If the OpenFileId is ConsoleOutput (syscall.h), call PutChar() in console.cc to print the buffer content. Otherwise, grab a handle to the OpenFile object from the user's open file list, which points into the global file list. Why can't you just go directly to the global file list? Because the user may not have opened that file before trying to write to it. Once you have the OpenFile object, fill up a buffer using your userReadWrite() function. Then simply call OpenFileObject->Write(myOwnBuffer, size).
Implement Read(char *buffer, int size, OpenFileId id). Get the arguments from the user in the same way you did for Write(). If the OpenFileId is ConsoleInput (syscall.h), use a routine (GetChar() in console.cc) to read into a buffer one character at a time. Otherwise, grab a handle to the OpenFile object in the same way you did for Write() and use OpenFileObject->ReadAt(myOwnBuffer,size,pos) to put up to size characters into your buffer. pos is the position stored in the UserOpenFile object, which represents the place in the file you are currently reading from. The number of bytes read is returned from ReadAt().

Now that your buffer holds the data that was read, you must write that buffer into the user's memory using the userReadWrite() function. Finally, return the number of bytes read.
Test your system calls using your own test programs, then the ones in ~cs170/nachos-projtest/proj2. Now that both parts are finished, all test programs should work.
Test some more.
