Lecture 12: OS Concepts, Processes, fork & exec
1. Operating Systems
What is the point of an operating system (OS)? As a user, I want to run programs like the web browser, the compiler, the shell, or Emacs. Technically, I could run each of those programs by itself without a proper OS (this was common on personal computers in the 1980s). Nevertheless, my computer comes with an operating system nowadays. Why?
The operating system makes sure that the resources of my computer (memory, CPU, network, etc.) are shared between the programs I am running in the way I want. It is "a program that facilitates easy, efficient, fair, orderly, and secure use of various software and hardware resources". So, the main goal of an operating system is managing resources, and it relies on hardware support to run at a privileged level from which it controls programs' access to those resources.
- Operating systems manage hardware such as
  - mouse, keyboard, monitor, hard disk, RAM, Network Interface Card (NIC), CPU, printer…
- Operating systems provide an interface for programmers to use in the form of libraries / system calls.
- The OS kernel is the "core" of the OS that manages resources.
  - It exists in kernel space, which is separate from user space, where application memory exists. The kernel runs in privileged mode.
  - Whenever a program needs to do something like asking for more memory, doing some I/O, sending data over the network, etc., it makes a "system call" and switches control to the OS kernel, which processes the request and honors it if it is appropriate.
  - The OS kernel also makes sure that programs don't step on other programs' memory, and that they are run in a fair manner. This is important, e.g., to keep your web browser from accessing your banking app's memory on your phone, and to make sure that when multiple people are using a computer like CSIL, no single user takes up all of the available processing power.
2. A (simplified) view of the Application / OS / Hardware Stack.
- Application
  - vim, emacs, C++ executables, Python scripts, web browser, UNIX shell, …
- Application Programming Interface (API):
  - Contains executable code for OS functionality and language libraries. Note that this definition includes conventional libraries (like the math library) as well as the interface provided by the OS.
  - Language Libraries (C, C++, Java, Python, …)
    - Language libraries can use system calls.
      - For example std::cout (C++), System.out (Java), print (Python) all do something extra and wrap around the write system call on Unix.
- Operating System (OS) Kernel
  - Memory space that contains functionality the OS provides to languages and applications.
  - File Management functionality, CPU scheduling, Process Management, IO, Memory Management (RAM and HD).
- Hardware
  - Physical components of a computer (anything you can physically touch).
  - Interfaces with the OS using device drivers (software operating the hardware device).
3. Processes
- A computer program in execution
- Or an instance of a program being run on the computer
- Contains the memory and state of the program being run
3.1. Threads
- A program unit that is executed independently of other parts of your program.
- A process may create threads
- An OS manages threads similarly to processes
  - But a thread shares the memory space of the process it belongs to.
- See lecture 13 for more details on threads.
3.2. Process Status (ps) command
- You can view the active processes on your terminal with the ps command.
- ps -l provides more details on active processes
[emre@csil-01]~% ps -l
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 S 33786 2980740 2980733 2 99 19 - 58518 - pts/2 00:00:00 zsh
0 R 33786 2980773 2980740 0 99 19 - 56566 - pts/2 00:00:00 ps
- In this example, two processes attached to this terminal are currently running: the shell (zsh) and the ps command itself.
- Each terminal window is running a shell (zsh in this case) that has a unique process ID (PID)
- Each process has a parent process (PPID) that created it. Here, the PPID of ps (2980740) is the PID of zsh.
- Note, the ps command only shows the processes from the current terminal (not the entire OS).
  - You can use ps -e or ps a to look beyond your terminal: ps -e lists every process on the system, and ps a lists the processes of all users that are attached to a terminal.
  - For example, there is another person logged on this CSIL computer, and we can see the programs run by me, that person, and the graphical interface subsystem (the user gdm) using ps au (here, I give the u flag to ps so we can see the user names):
[emre@csil-01]~% ps au
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2088660 0.0 0.0 227936 1156 tty2 Ss+ Nov21 0:00 /sbin/agetty -o -p -- \u --noclear tty2 linux
root 2091551 0.0 0.0 227936 1176 tty3 Ss+ Nov21 0:00 /sbin/agetty -o -p -- \u --noclear tty3 linux
gdm 2091564 0.0 0.0 374164 5224 tty1 Ssl+ Nov21 0:00 /usr/libexec/gdm-wayland-session dbus-run-session -- gnome-session
gdm 2091570 0.0 0.0 7376 456 tty1 S+ Nov21 0:00 dbus-run-session -- gnome-session --autostart /usr/share/gdm/greet
gdm 2091572 0.0 0.0 33944 3164 tty1 S+ Nov21 0:00 dbus-daemon --nofork --print-address 4 --session
gdm 2091573 0.0 0.0 624332 6104 tty1 Sl+ Nov21 0:00 /usr/libexec/gnome-session-binary --autostart /usr/share/gdm/greet
gdm 2091581 0.0 1.2 4584932 94020 tty1 Sl+ Nov21 2:13 /usr/bin/gnome-shell
gdm 2091601 0.0 0.0 308668 5400 tty1 Sl+ Nov21 0:00 /usr/libexec/at-spi-bus-launcher
gdm 2091606 0.0 0.0 33512 2204 tty1 S+ Nov21 0:00 /usr/bin/dbus-daemon --config-file=/usr/share/defaults/at-spi2/acc
gdm 2091616 0.0 0.2 475408 20872 tty1 Sl+ Nov21 0:00 /usr/bin/Xwayland :1024 -rootless -noreset -accessx -core -auth /r
gdm 2091632 0.0 0.0 448104 3672 tty1 Sl+ Nov21 0:00 /usr/libexec/xdg-permission-store
gdm 2091680 0.0 0.1 3062972 9688 tty1 Sl+ Nov21 0:00 /usr/bin/gjs /usr/share/gnome-shell/org.gnome.Shell.Notifications
gdm 2091682 0.0 0.0 161756 4968 tty1 Sl+ Nov21 0:00 /usr/libexec/at-spi2-registryd --use-gnome-session
gdm 2091694 0.0 0.0 677904 6460 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-sharing
gdm 2091696 0.0 0.1 627660 9776 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-wacom
gdm 2091698 0.0 0.1 640880 9700 tty1 Sl+ Nov21 0:02 /usr/libexec/gsd-color
gdm 2091701 0.0 0.1 553360 9740 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-keyboard
gdm 2091703 0.0 0.0 475316 5996 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-print-notifications
gdm 2091705 0.0 0.0 669640 5376 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-rfkill
gdm 2091707 0.0 0.0 608688 4908 tty1 Sl+ Nov21 0:34 /usr/libexec/gsd-smartcard
gdm 2091710 0.0 0.0 557584 6560 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-datetime
gdm 2091712 0.0 0.1 707748 10004 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-media-keys
gdm 2091714 0.0 0.0 448056 5276 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-screensaver-proxy
gdm 2091716 0.0 0.0 532908 5468 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-sound
gdm 2091718 0.0 0.0 522188 5160 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-a11y-settings
gdm 2091720 0.0 0.0 525172 3920 tty1 Sl+ Nov21 0:29 /usr/libexec/gsd-housekeeping
gdm 2091722 0.0 0.1 637772 9780 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-power
gdm 2091866 0.0 0.1 3128508 9832 tty1 Sl+ Nov21 0:00 /usr/bin/gjs /usr/share/gnome-shell/org.gnome.ScreenSaver
gdm 2091878 0.0 0.1 536784 10684 tty1 Sl Nov21 0:08 ibus-daemon --panel disable -r --xim
gdm 2091895 0.0 0.0 449096 5304 tty1 Sl Nov21 0:00 /usr/libexec/ibus-dconf
gdm 2091897 0.0 0.1 834952 14164 tty1 Sl Nov21 0:00 /usr/libexec/ibus-x11 --kill-daemon
gdm 2091899 0.0 0.0 448912 4928 tty1 Sl+ Nov21 0:00 /usr/libexec/ibus-portal
gdm 2091918 0.0 0.0 375256 5256 tty1 Sl Nov21 0:00 /usr/libexec/ibus-engine-simple
gdm 2091931 0.0 0.0 559896 6544 tty1 Sl+ Nov21 0:00 /usr/libexec/gsd-printer
yfwang 2910965 0.0 0.0 233448 6420 pts/0 Ss+ 04:50 0:00 -bash
yfwang 2914431 0.0 0.0 234548 6524 pts/1 Ss+ 05:32 0:01 -bash
emre 2980740 0.0 0.0 234072 5904 pts/2 SNs 19:49 0:00 -zsh
emre 2980927 0.0 0.0 235436 4116 pts/2 RN+ 19:51 0:00 ps au
3.3. top command
- Processes consume CPU and memory resources when they're being executed.
- top shows the resource consumption of the top-most consuming processes. Some systems also have htop, which provides a nicer colorful interface with way more filtering options.
- Example:
top - 19:54:54 up 22 days, 12:32, 3 users, load average: 0.00, 0.00, 0.00
Tasks: 325 total, 1 running, 324 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.5 sy, 0.5 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 7638.6 total, 1985.1 free, 1579.5 used, 4074.0 buff/cache
MiB Swap: 7638.0 total, 5701.6 free, 1936.4 used. 5517.7 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2981282 emre 39 19 236160 5384 4400 R 6.7 0.1 0:00.01 top
1 root 20 0 175172 9224 5928 S 0.0 0.1 0:52.24 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:04.83 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0H-events_highpri
9 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tasks_kthre
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tasks_rude_
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tasks_trace
13 root 20 0 0 0 0 S 0.0 0.0 0:00.93 ksoftirqd/0
14 root 20 0 0 0 0 I 0.0 0.0 12:01.31 rcu_sched
15 root rt 0 0 0 0 S 0.0 0.0 0:03.44 migration/0
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
17 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/1
18 root rt 0 0 0 0 S 0.0 0.0 0:03.55 migration/1
19 root 20 0 0 0 0 S 0.0 0.0 0:00.60 ksoftirqd/1
21 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/1:0H-events_highpri
22 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/2
23 root rt 0 0 0 0 S 0.0 0.0 0:03.65 migration/2
24 root 20 0 0 0 0 S 0.0 0.0 0:00.44 ksoftirqd/2
26 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/2:0H-kblockd
27 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/3
28 root rt 0 0 0 0 S 0.0 0.0 0:03.84 migration/3
29 root 20 0 0 0 0 S 0.0 0.0 0:00.34 ksoftirqd/3
...
3.4. Example of creating our own process
# Makefile
CXX=g++

main: main.o
	${CXX} -o main -std=c++17 main.o

clean:
	rm -f *.o main

-----

// main.cpp
#include <unistd.h>  // sleep()

int main() {
    // Sleep forever so the process stays around and can be inspected.
    while (true) {
        sleep(10000);
    }
}
3.5. Foreground / Background Processes
- You can run a program in the foreground (the shell waits for the program to terminate before the terminal accepts more commands)
  - We've done this all quarter long by executing a program (ex: ./main)
- Executing a process with an & runs it in the background
- The jobs command lists all processes running in the background
- Example
MacBook-Pro-38:lecture Richert$ make main
g++ -c -o main.o main.cpp
g++ -o main -std=C++17 main.o
MacBook-Pro-38:lecture Richert$ ./main
^C
MacBook-Pro-38:lecture Richert$ ./main&
[1] 68713
MacBook-Pro-38:lecture Richert$ ps -l
UID PID PPID F CPU PRI NI SZ RSS WCHAN S ADDR TTY TIME CMD
501 31029 31027 4006 0 31 0 2498932 492 - R+ 0 ttys000 24:33.59 -bash
501 68157 68156 4006 0 31 0 2489716 3428 - S+ 0 ttys001 0:00.14 -bash
501 68585 68584 4006 0 31 0 2489716 3380 - S 0 ttys002 0:00.10 -bash
501 68713 68585 4006 0 31 0 2433800 664 - R 0 ttys002 0:01.72 ./main
MacBook-Pro-38:lecture Richert$ jobs
MacBook-Pro-38:lecture Richert$ ./main&
[1] 68726
MacBook-Pro-38:lecture Richert$ jobs
[1]+ Running ./main &
MacBook-Pro-38:lecture Richert$ ps
PID TTY TIME CMD
31029 ttys000 25:56.74 -bash
68157 ttys001 0:00.14 -bash
68585 ttys002 0:00.12 -bash
68726 ttys002 0:06.81 ./main
MacBook-Pro-38:lecture Richert$ kill 68726
[1]+ Terminated: 15 ./main
3.6. Suspend / Resume processes
- You can suspend a running foreground process (without terminating it) with <C-z> (that is, the Control and z keys together; it is also denoted as ^Z in some texts). The terminal then sends the process a stop signal (SIGTSTP) and the OS stops scheduling it, so the program stays in memory and can be resumed, but it does not run.
- Any job that the shell is managing can be resumed in the foreground or background using the fg or bg commands
- Note that ^C is not the same as ^Z
  - <C-c> terminates the process (the terminal sends it the SIGINT signal, whose default action is to terminate the process).
  - <C-z> stops execution, but the process still exists.
MacBook-Pro-38:lecture Richert$ ./main
^Z
[1]+ Stopped ./main
MacBook-Pro-38:lecture Richert$ jobs
[1]+ Stopped ./main
MacBook-Pro-38:lecture Richert$ bg %1
[1]+ ./main &
MacBook-Pro-38:lecture Richert$ jobs
[1]+ Running ./main &
MacBook-Pro-38:lecture Richert$ ps
PID TTY TIME CMD
31029 ttys000 33:08.34 -bash
68157 ttys001 0:00.16 -bash
68768 ttys001 0:06.58 ./main
MacBook-Pro-38:lecture Richert$ kill 68768
[1]+ Terminated: 15 ./main
4. fork and exec
4.1. Linux Command / Process Management
- fork()
  - Creates a new process that is an almost exact copy of the parent process (with a few exceptions, such as the PID and the current locking / semaphore state).
  - Returns the child's PID to the parent and 0 to the child (or -1 to the parent if no child could be created).
- exec() (a family of functions such as execv and execvp)
  - Replaces the program running in the calling process with another program.
  - Ex: exec ls -l (replaces the bash process with the "ls -l" process)
    - When ls finishes, the original bash process no longer exists, so everything terminates.
  - Unix commands and applications use the fork / exec process.
    - Commands such as ls -l, pwd, …
  - The PPID is usually the shell that invokes the application or command.
- The exec shell command runs a command with exec() in the current process space, so the shell is replaced and the session ends when the command terminates.
- Example
MacBook-Pro-38:lecture Richert$ exec ls -l
total 1736
-rw-r--r--@ 1 Richert staff 92 May 28 18:58 Makefile
-rwxr-xr-x 1 Richert staff 4248 May 28 19:42 main
-rw-r--r--@ 1 Richert staff 108 May 28 19:41 main.cpp
-rw-r--r-- 1 Richert staff 608 May 28 19:42 main.o
-rwxr-xr-x 1 Richert staff 866700 May 17 13:35 my_googletest
drwxr-xr-x 17 Richert staff 578 May 28 18:55 previous_examples

[Process completed]
- The terminal process is terminated and cannot accept more commands.
  - You must open a new terminal process
- Flow of fork / exec running the ls -l command:
  - bash (fork) - A copy of the bash shell process is made with fork()
  - bashcopy (exec ls -l) - The ls -l command / application replaces the bashcopy process
  - ls -l terminates. The OS removes the bashcopy memory space. Control returns to the original bash shell.
4.2. Example of fork with a simple C++ program
Note: the examples here are different from the ones I coded up in the lecture. You should also read the final programs from the lectures, which I shared on the course web page.
forkIt: forkIt.o
	${CXX} -o forkIt -std=c++17 forkIt.o
// forkIt.cpp
#include <unistd.h>  // sleep(), fork(), pid_t (in sys/types.h)
#include <iostream>
#include <string>
using namespace std;

int main() {
    cout << "Before fork, " << __FILE__ << " " << __LINE__ << " " << __FUNCTION__ << endl;
    sleep(10);
    pid_t result = fork(); // child_result == 0, parent_result == PID of child
    cout << "After fork, " << __FILE__ << " " << __LINE__ << " " << __FUNCTION__ << endl;
    sleep(10);
    cout << "After sleep, " << __FILE__ << " " << __LINE__ << " " << __FUNCTION__ << endl;
    return 0;
}
4.3. Example with fork / exec
// hello.cpp
#include <unistd.h>
#include <iostream>
using namespace std;

int main() {
    cout << "Hello World!" << endl;
    sleep(15);
    return 0;
}
g++ -o hello hello.cpp
forkExec: forkExec.o
	${CXX} -o forkExec -std=c++17 forkExec.o
// forkExec.cpp
#include <sys/wait.h>  // waitpid()
#include <unistd.h>    // fork(), execv(), getpid(), getppid()
#include <cerrno>      // errno, ECHILD
#include <cstdio>      // perror()
#include <cstdlib>     // exit()
#include <iostream>
using namespace std;

int main() {
    // path to some executable
    char* const HELLO_EXECUTABLE = (char*) "/Users/richert/Desktop/32lecture/hello";

    cout << "Before fork, " << __FILE__ << ", " << __LINE__ << " "
         << __FUNCTION__ << endl;

    // parent receives child PID, child_result == 0
    pid_t result = fork();

    cout << "After fork, " << __FILE__ << ", " << __LINE__ << " "
         << __FUNCTION__ << endl;
    cout << "RESULT_PID = " << result << endl;
    cout << "PID: " << getpid() << endl;
    cout << "PPID: " << getppid() << endl;

    // Following if block executed ONLY by child process
    if (result == 0) {
        cout << "---" << endl;
        cout << "RESULT_PID = " << result << endl;
        cout << "PID: " << getpid() << endl;
        cout << "PPID: " << getppid() << endl;

        int execvResult;
        // argument list for the new program; execv requires it to end with a null pointer
        char* const path[] = { HELLO_EXECUTABLE, nullptr };
        execvResult = execv(HELLO_EXECUTABLE, path); // if success, run then terminate

        // THIS LINE OF CODE NEVER REACHED IN CHILD
        // (unless execv returned an error)
        perror("execv seems to have failed");
        cerr << "execvResult=" << execvResult << endl;
        exit(1);
    }

    // parent process executes this
    // wait to check if no child process exists anymore
    // https://linux.die.net/man/2/waitpid
    while (waitpid(result, NULL, 0)) {
        if (errno == ECHILD) { // all children of process terminated
            cout << "pid: " << getpid() << " has no children" << endl;
            break;
        }
    }

    cout << "After waiting, " << __FILE__ << ", " << __LINE__ << " "
         << __FUNCTION__ << endl;
}

// Play around with ps -l between sleep to see PPID and PID