Lecture 12: OS Concepts, Processes, fork & exec

1. Operating Systems

What is the point of an operating system (OS)? As a user, I want to run programs like a web browser, a compiler, a shell, or Emacs. Technically, each of those programs could run by itself without a proper OS (this was common on the personal computers of the 1980s). Nevertheless, my computer comes with an operating system nowadays. Why?

The operating system makes sure that the resources of my computer (memory, CPU, network, etc.) are shared between the programs I am running the way I want. It is "a program that facilitates easy, efficient, fair, orderly, and secure use of various software and hardware resources". So, the main goal of an operating system is managing resources, and it relies on hardware support to run at a privileged level from which it controls programs' access to those resources.

  • Operating systems manage hardware such as
    • mouse, keyboard, monitor, hard disk, RAM, Network Interface Card (NIC), CPU, printer…
  • Operating systems provide an interface for programmers to use in the form of libraries / system calls.
    • The OS kernel is the "core" of the OS that manages resources.
    • The kernel exists in kernel space, which is separate from user space where application memory exists. The kernel runs in privileged mode.
    • Whenever a program needs to do something like asking for more memory, doing some I/O, sending data over the network, etc., it makes a "system call", which switches control to the OS kernel. The kernel processes the request and honors it if it is appropriate.
    • The OS kernel also makes sure that programs don't step on other programs' memory, and that they are run in a fair manner. This matters, e.g., so that your web browser cannot access your banking app's memory on your phone, and so that when multiple people use a shared computer like CSIL, no one uses up all of the available processing power.

2. A (simplified) view of the Application / OS / Hardware Stack.

  • Application
    • vim, emacs, c++ executables, Python scripts, web browser, UNIX shell, …
    • Application Programming Interface (API):
      • Contains executable code for OS functionality and language libraries. Note that this definition includes conventional libraries (like the math library) as well as the interface provided by the OS.
      • Language Libraries (C, C++, Java, Python, …)
      • Language libraries can use system calls.
        • For example, std::cout (C++), System.out (Java), and print (Python) all do some extra work (formatting, buffering) and eventually call the write system call on Unix.
  • Operating System (OS) Kernel
    • Memory space that contains functionality the OS provides to languages and applications.
    • File Management functionality, CPU scheduling, Process Management, IO, Memory Management (RAM and HD).
  • Hardware
    • Physical components of a computer (anything you can physically touch).
    • Interfaces with the OS using device drivers (software operating the hardware device).

3. Processes

  • A computer program in execution
  • Or an instance of a program being run on the computer
  • Contains the memory and state of the program being run

3.1. Threads

  • A program unit that is executed independently of other parts of your program.
  • A process may create threads
    • The OS manages (schedules) threads similarly to processes
    • But a thread shares the memory space of the process that created it.
    • See lecture 13 for more details on threads.

3.2. Process Status (ps) command

  • Can view active processes on your terminal with the ps command.
    • ps -l provides more details on active processes
[emre@csil-01]~% ps -l
F S   UID     PID    PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S 33786 2980740 2980733  2  99  19 - 58518 -      pts/2    00:00:00 zsh
0 R 33786 2980773 2980740  0  99  19 - 56566 -      pts/2    00:00:00 ps
  • In this example, two processes are attached to this terminal: the shell (zsh) and the ps command itself.
  • Each process has a unique process ID (PID). Each terminal window runs its own shell (zsh in this case).
  • Each process also has a parent process, identified by its PPID, that created it. Here, the PPID of ps (2980740) is the PID of zsh, because the shell started ps.
  • Note, the ps command by default only shows the processes from the current terminal (not the entire OS).
    • You can use ps -e to view every process on the system, or ps a to view the processes of all users that are attached to a terminal.
    • For example, there is another person logged on this CSIL computer, and we can see the programs run by me, that person, and the graphical interface subsystem (the user gdm) using ps au (here, I give the u flag to ps so we can see the user names):
[emre@csil-01]~% ps au
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     2088660  0.0  0.0 227936  1156 tty2     Ss+  Nov21   0:00 /sbin/agetty -o -p -- \u --noclear tty2 linux
root     2091551  0.0  0.0 227936  1176 tty3     Ss+  Nov21   0:00 /sbin/agetty -o -p -- \u --noclear tty3 linux
gdm      2091564  0.0  0.0 374164  5224 tty1     Ssl+ Nov21   0:00 /usr/libexec/gdm-wayland-session dbus-run-session -- gnome-session
gdm      2091570  0.0  0.0   7376   456 tty1     S+   Nov21   0:00 dbus-run-session -- gnome-session --autostart /usr/share/gdm/greet
gdm      2091572  0.0  0.0  33944  3164 tty1     S+   Nov21   0:00 dbus-daemon --nofork --print-address 4 --session
gdm      2091573  0.0  0.0 624332  6104 tty1     Sl+  Nov21   0:00 /usr/libexec/gnome-session-binary --autostart /usr/share/gdm/greet
gdm      2091581  0.0  1.2 4584932 94020 tty1    Sl+  Nov21   2:13 /usr/bin/gnome-shell
gdm      2091601  0.0  0.0 308668  5400 tty1     Sl+  Nov21   0:00 /usr/libexec/at-spi-bus-launcher
gdm      2091606  0.0  0.0  33512  2204 tty1     S+   Nov21   0:00 /usr/bin/dbus-daemon --config-file=/usr/share/defaults/at-spi2/acc
gdm      2091616  0.0  0.2 475408 20872 tty1     Sl+  Nov21   0:00 /usr/bin/Xwayland :1024 -rootless -noreset -accessx -core -auth /r
gdm      2091632  0.0  0.0 448104  3672 tty1     Sl+  Nov21   0:00 /usr/libexec/xdg-permission-store
gdm      2091680  0.0  0.1 3062972 9688 tty1     Sl+  Nov21   0:00 /usr/bin/gjs /usr/share/gnome-shell/org.gnome.Shell.Notifications
gdm      2091682  0.0  0.0 161756  4968 tty1     Sl+  Nov21   0:00 /usr/libexec/at-spi2-registryd --use-gnome-session
gdm      2091694  0.0  0.0 677904  6460 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-sharing
gdm      2091696  0.0  0.1 627660  9776 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-wacom
gdm      2091698  0.0  0.1 640880  9700 tty1     Sl+  Nov21   0:02 /usr/libexec/gsd-color
gdm      2091701  0.0  0.1 553360  9740 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-keyboard
gdm      2091703  0.0  0.0 475316  5996 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-print-notifications
gdm      2091705  0.0  0.0 669640  5376 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-rfkill
gdm      2091707  0.0  0.0 608688  4908 tty1     Sl+  Nov21   0:34 /usr/libexec/gsd-smartcard
gdm      2091710  0.0  0.0 557584  6560 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-datetime
gdm      2091712  0.0  0.1 707748 10004 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-media-keys
gdm      2091714  0.0  0.0 448056  5276 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-screensaver-proxy
gdm      2091716  0.0  0.0 532908  5468 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-sound
gdm      2091718  0.0  0.0 522188  5160 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-a11y-settings
gdm      2091720  0.0  0.0 525172  3920 tty1     Sl+  Nov21   0:29 /usr/libexec/gsd-housekeeping
gdm      2091722  0.0  0.1 637772  9780 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-power
gdm      2091866  0.0  0.1 3128508 9832 tty1     Sl+  Nov21   0:00 /usr/bin/gjs /usr/share/gnome-shell/org.gnome.ScreenSaver
gdm      2091878  0.0  0.1 536784 10684 tty1     Sl   Nov21   0:08 ibus-daemon --panel disable -r --xim
gdm      2091895  0.0  0.0 449096  5304 tty1     Sl   Nov21   0:00 /usr/libexec/ibus-dconf
gdm      2091897  0.0  0.1 834952 14164 tty1     Sl   Nov21   0:00 /usr/libexec/ibus-x11 --kill-daemon
gdm      2091899  0.0  0.0 448912  4928 tty1     Sl+  Nov21   0:00 /usr/libexec/ibus-portal
gdm      2091918  0.0  0.0 375256  5256 tty1     Sl   Nov21   0:00 /usr/libexec/ibus-engine-simple
gdm      2091931  0.0  0.0 559896  6544 tty1     Sl+  Nov21   0:00 /usr/libexec/gsd-printer
yfwang   2910965  0.0  0.0 233448  6420 pts/0    Ss+  04:50   0:00 -bash
yfwang   2914431  0.0  0.0 234548  6524 pts/1    Ss+  05:32   0:01 -bash
emre     2980740  0.0  0.0 234072  5904 pts/2    SNs  19:49   0:00 -zsh
emre     2980927  0.0  0.0 235436  4116 pts/2    RN+  19:51   0:00 ps au

3.3. top command

  • Processes consume CPU and memory resources when they're being executed.
  • top shows the resource consumption of the top-most consuming processes. Some systems also have htop which provides a nicer colorful interface with way more filtering options.
  • Example:
top - 19:54:54 up 22 days, 12:32,  3 users,  load average: 0.00, 0.00, 0.00
Tasks: 325 total,   1 running, 324 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.5 sy,  0.5 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7638.6 total,   1985.1 free,   1579.5 used,   4074.0 buff/cache
MiB Swap:   7638.0 total,   5701.6 free,   1936.4 used.   5517.7 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
2981282 emre      39  19  236160   5384   4400 R   6.7   0.1   0:00.01 top
      1 root      20   0  175172   9224   5928 S   0.0   0.1   0:52.24 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:04.83 kthreadd
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp
      6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-events_highpri
      9 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq
     10 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tasks_kthre
     11 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tasks_rude_
     12 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tasks_trace
     13 root      20   0       0      0      0 S   0.0   0.0   0:00.93 ksoftirqd/0
     14 root      20   0       0      0      0 I   0.0   0.0  12:01.31 rcu_sched
     15 root      rt   0       0      0      0 S   0.0   0.0   0:03.44 migration/0
     16 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/0
     17 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/1
     18 root      rt   0       0      0      0 S   0.0   0.0   0:03.55 migration/1
     19 root      20   0       0      0      0 S   0.0   0.0   0:00.60 ksoftirqd/1
     21 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/1:0H-events_highpri
     22 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/2
     23 root      rt   0       0      0      0 S   0.0   0.0   0:03.65 migration/2
     24 root      20   0       0      0      0 S   0.0   0.0   0:00.44 ksoftirqd/2
     26 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/2:0H-kblockd
     27 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/3
     28 root      rt   0       0      0      0 S   0.0   0.0   0:03.84 migration/3
     29 root      20   0       0      0      0 S   0.0   0.0   0:00.34 ksoftirqd/3
     ...

3.4. Example of creating our own process

# Makefile
CXX=g++
main: main.o
    ${CXX} -o main -std=c++17 main.o
clean:
    rm -f *.o main
-----
// main.cpp
#include <unistd.h>
using namespace std;
int main() {
    while (true) { sleep(10000); }
}

3.5. Foreground / Background Processes

  • You can run a program in the foreground (the shell waits for the program to terminate before it accepts more commands)
  • We've done this all quarter long by executing a program (ex: ./main)
  • Executing a command with a trailing & runs it in the background
  • The jobs command lists the jobs started from the current shell that are running in the background or stopped
  • Example
MacBook-Pro-38:lecture Richert$ make main
g++    -c -o main.o main.cpp
g++ -o main -std=C++17 main.o
MacBook-Pro-38:lecture Richert$ ./main
^C
MacBook-Pro-38:lecture Richert$ ./main&
[1] 68713
MacBook-Pro-38:lecture Richert$ ps -l
  UID   PID  PPID        F CPU PRI NI       SZ    RSS WCHAN     S             ADDR TTY           TIME CMD
  501 31029 31027     4006   0  31  0  2498932    492 -      R+                  0 ttys000   24:33.59 -bash
  501 68157 68156     4006   0  31  0  2489716   3428 -      S+                  0 ttys001    0:00.14 -bash
  501 68585 68584     4006   0  31  0  2489716   3380 -      S                   0 ttys002    0:00.10 -bash
  501 68713 68585     4006   0  31  0  2433800    664 -      R                   0 ttys002    0:01.72 ./main
MacBook-Pro-38:lecture Richert$ jobs
MacBook-Pro-38:lecture Richert$ ./main&
[1] 68726
MacBook-Pro-38:lecture Richert$ jobs
[1]+  Running                 ./main &
MacBook-Pro-38:lecture Richert$ ps
  PID TTY           TIME CMD
31029 ttys000   25:56.74 -bash
68157 ttys001    0:00.14 -bash
68585 ttys002    0:00.12 -bash
68726 ttys002    0:06.81 ./main
MacBook-Pro-38:lecture Richert$ kill 68726
[1]+  Terminated: 15          ./main

3.6. Suspend / Resume processes

  • You can suspend a running foreground process (not terminate it) with <C-z> (that is, the Control and z keys together; it is also denoted as ^Z in some texts). This sends the process a stop signal (SIGTSTP), and the OS stops scheduling the program (so the program is in memory and can be resumed, but it is not run).
  • Any job of the current shell can be moved to the foreground or resumed in the background using the fg or bg commands
  • Note that ^C is not the same as ^Z
    • <C-c> sends the process an interrupt signal (SIGINT), whose default action is to terminate the process. <C-z> stops execution, but the process still exists.
MacBook-Pro-38:lecture Richert$ ./main
^Z
[1]+  Stopped                 ./main
MacBook-Pro-38:lecture Richert$ jobs
[1]+  Stopped                 ./main
MacBook-Pro-38:lecture Richert$ bg %1
[1]+ ./main &
MacBook-Pro-38:lecture Richert$ jobs
[1]+  Running                 ./main &
MacBook-Pro-38:lecture Richert$ ps
  PID TTY           TIME CMD
31029 ttys000   33:08.34 -bash
68157 ttys001    0:00.16 -bash
68768 ttys001    0:06.58 ./main
MacBook-Pro-38:lecture Richert$ kill 68768
[1]+  Terminated: 15          ./main

4. fork and exec

4.1. Linux Command / Process Management

  • fork()
    • Creates a new (child) process that is a copy of the parent process (with a few exceptions such as current locking / semaphore state).
      • Returns twice: the parent receives the child's PID, and the child receives 0. On failure, it returns -1 to the parent and no child is created.
  • exec()
    • Replaces the program running in the calling process with another program. (exec is a family of functions: execv, execvp, execl, …)
      • Ex: exec ls -l (replaces the bash process image with "ls -l")
      • The process keeps its PID, but the original program is gone: when the new program finishes, the process terminates and nothing of the original program is left to return to.
    • Unix commands and applications are started with the fork / exec pattern.
      • Commands such as ls -l, pwd, …
    • The PPID is usually the shell that invokes the application or command.
  • The exec shell command runs a command with exec() in the current process, replacing the shell; when the command terminates, so does the process.
  • Example
MacBook-Pro-38:lecture Richert$ exec ls -l
total 1736
-rw-r--r--@  1 Richert  staff      92 May 28 18:58 Makefile
-rwxr-xr-x   1 Richert  staff    4248 May 28 19:42 main
-rw-r--r--@  1 Richert  staff     108 May 28 19:41 main.cpp
-rw-r--r--   1 Richert  staff     608 May 28 19:42 main.o
-rwxr-xr-x   1 Richert  staff  866700 May 17 13:35 my_googletest
drwxr-xr-x  17 Richert  staff     578 May 28 18:55 previous_examples

[Process completed]
  • The terminal's shell process has been replaced and has terminated, so we cannot enter more commands.
  • We must open a new terminal.
  • Flow of fork / exec running the ls -l command:
    1. bash (fork) - A copy of the bash shell process is made with fork()
    2. bashcopy (exec ls -l) - the ls -l command / application replaces the program in the bashcopy process
    3. ls -l terminates. The OS removes bashcopy's memory space. The original bash shell, which has been waiting for its child, resumes control.

4.2. Example of fork with a simple C++ program

Note: the examples here are different from the ones I coded up in the lecture. You should also read the final programs from the lectures, which I shared on the course web page.

forkIt: forkIt.o
    ${CXX} -o forkIt -std=c++17 forkIt.o
// forkIt.cpp
#include <unistd.h> // sleep(), fork(), pid_t (in sys/types.h)
#include <iostream>
#include <string>

using namespace std;

int main() {
    cout << "Before fork, " << __FILE__ << " " <<
    __LINE__ << " "  << __FUNCTION__ << endl;
    sleep(10);

    pid_t result = fork(); // the child receives 0; the parent receives the child's PID

    cout << "After fork, " << __FILE__ << " " <<
    __LINE__ << " "  << __FUNCTION__ << endl;

    sleep(10);

    cout << "After sleep, " << __FILE__ << " " <<
    __LINE__ << " "  << __FUNCTION__ << endl;

    return 0;
}

4.3. Example with fork / exec

// hello.cpp
#include <unistd.h>
#include <iostream>

using namespace std;

int main() {
    cout << "Hello World!" << endl;
    sleep(15);
    return 0;
}
g++ -o hello hello.cpp
forkExec: forkExec.o
    ${CXX} -o forkExec -std=c++17 forkExec.o
// forkExec.cpp
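#include <sys/wait.h> // waitpid()
#include <unistd.h>   // fork(), execv(), getpid(), getppid()
#include <cerrno>     // errno, ECHILD
#include <cstdio>     // perror()
#include <cstdlib>    // exit()
#include <iostream>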

using namespace std;

int main() {
    // path to some executable
    char* const HELLO_EXECUTABLE = (char*) "/Users/richert/Desktop/32lecture/hello";

    cout << "Before fork, " << __FILE__ << ", " << __LINE__ << " " \
    << __FUNCTION__ << endl;

    // the parent receives the child's PID; the child receives 0
    pid_t result = fork();

    cout << "After fork, " << __FILE__ << ", " << __LINE__ << " " \
    << __FUNCTION__ << endl;

    cout << "RESULT_PID = " << result << endl;
    cout << "PID: " << getpid() << endl;
    cout << "PPID: " << getppid() << endl;

    // Following if block executed ONLY by child process
    if (result == 0) {
        cout << "---" << endl;
        cout << "RESULT_PID = " << result << endl;
        cout << "PID: " << getpid() << endl;
        cout << "PPID: " << getppid() << endl;

        int execvResult;
        // argv for the new program; the array must end with a null pointer
        char* const path[] = { HELLO_EXECUTABLE, nullptr };
        execvResult = execv(HELLO_EXECUTABLE, path); // on success, never returns

        // THIS LINE OF CODE NEVER REACHED IN CHILD
        // (unless execv returned an error)

        perror("execv seems to have failed");
        cerr << "execvResult=" << execvResult << endl;
        exit(1);
    }

    // parent process executes this
    // waitpid returns the child's PID once the child has been reaped; the next
    // call returns -1 with errno == ECHILD because no child is left to wait for
    //  https://linux.die.net/man/2/waitpid
    while (waitpid(result, NULL, 0) != -1) {
        // keep waiting until no child remains
    }
    if (errno == ECHILD) { // all children of process terminated
        cout << "pid: " << getpid() << " has no children" << endl;
    }
    cout << "After waiting, " << __FILE__ << ", " << __LINE__ << " " \
    << __FUNCTION__ << endl;
} // Play around with ps -l between sleep to see PPID and PID

Author: Mehmet Emre

The material for this class is based on Prof. Richert Wang's material for CS 32