Before we can define an operating system we need to refine our abstract thinking about computers. In particular, a computer is
a collection of devices that can be manipulated by a program to produce a specified calculation.
This definition is pretty broad and requires a supporting definition for the word device. A device, then, is
a machine capable of storing and/or transforming information according to a fixed set of rules.
Now we seem to be getting somewhere. We know that computers are made up of machines, that these machines manipulate information according to rules, and that the set of rules is fixed. While we are at it, let's nail down a definition for program, which is
a list of instructions to a computer that direct it to perform a desired calculation.
And we can stop there with the semantic acrobatics.
Under this admittedly abstract set of definitions, an operating system is
software that implements an abstract machine using the constituent devices within a computer.
At first, it might seem like this definition is too broad or general. In particular, it doesn't explain what an operating system does, nor does it seem to differentiate "operating systems" from "non-operating systems" very well. However, these are features of the definition rather than bugs.
Herein lies the problem that the authors of the textbook could not solve. The definition of an operating system is, in fact, quite general. There are many operating systems, and many software systems act as operating systems regardless of how they are characterized. It is for this reason that the field is important and difficult -- operating systems are general, making understanding and building them a challenge.
This definition may also seem a little controversial given all of the attention "the industry" pays to concerns like security which are not explicitly part of the OS definition. The goal of this class is to help connect modern OS concepts like security, performance, extensibility, and portability to the basic notion of an operating system.
The primary reason that operating systems were invented was to manage complexity.
Computers are complicated machines. They are built from less complicated machines, which, in turn, are built from less complicated machines, etc. down to a "bottom" level of complexity (e.g. the transistor or gate level) where physics and physical chemistry take over. That is, there is a basic unit of machine called a "gate" that performs a basic function (like flipping a switch) composed of physical processes and not other machines.
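To make the composition concrete, here is a small C sketch (the gate functions and the half adder are illustrative, not any real design) in which each level is built only from the level below it, bottoming out at a single primitive gate:

    #include <stdio.h>

    /* Level 0: the primitive -- in real hardware this is transistors
     * and physics, not code. */
    static int nand(int a, int b) { return !(a && b); }

    /* Level 1: familiar gates built only from NAND. */
    static int g_not(int a)        { return nand(a, a); }
    static int g_and(int a, int b) { return nand(nand(a, b), nand(a, b)); }
    static int g_or(int a, int b)  { return nand(g_not(a), g_not(b)); }
    static int g_xor(int a, int b) { return g_or(g_and(a, g_not(b)), g_and(g_not(a), b)); }

    /* Level 2: a one-bit half adder built only from level-1 gates. */
    static void half_adder(int a, int b, int *sum, int *carry)
    {
        *sum = g_xor(a, b);
        *carry = g_and(a, b);
    }

    int main(void)
    {
        int s, c;
        half_adder(1, 1, &s, &c);
        printf("1 + 1 -> sum %d, carry %d\n", s, c); /* sum 0, carry 1 */
        return 0;
    }

Each level can be understood, and checked, in terms of the level immediately below it and nothing else.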
Knowing that computers are ultimately made up of "gates," however, is not quite enough to understand the complexity question. The other piece of the complexity puzzle has to do with a common engineering approach to dealing with complex systems called, colloquially, "divide and conquer."
When faced with the problem of building complicated things, engineers often decompose the complicated functionality into simpler separate functionalities and rules for combining those functionalities. This method of design organizes the architecture as a hierarchy in which components below a certain level are logically separate and the rules at each level define a small number of ways they can interact.
Hierarchical or "modular" design is much older than computer science and is, perhaps, the most successful human technique for understanding and building things in the real world. Thus it should come as little surprise that computers were designed and built this way from the beginning.
Thus computers are composed of layers of abstractions down to some fundamental level beyond which nature provides the concrete. What does this have to do with operating systems?
Adding further layers of abstraction in hardware is expensive. Software, however, is relatively cheap with respect to its ability to implement abstraction. Computer manufacturers had been aware of this situation for some time, but it wasn't until Edsger Dijkstra began to look at the problem of using software to extend the hardware abstractions that the OS became a "true" field of study. Dijkstra proposed rules for defining software hierarchies designed to project successively simpler abstractions "upward" at each level, with the top level, ultimately, being the "user." Other design methods have been proposed for operating systems over the years (a few of them successful), but today operating systems are largely designed and built (roughly) according to the abstraction-layering architecture originally due to Dijkstra.
One of Dijkstra's goals for this layered structure was an operating system whose correctness could be verified. Today, we've given up on the prospect of proving an OS correct. Instead, we use the OS to implement an abstract machine that is simpler than the hardware and "secure." Additionally, the notion of "divide and conquer" inherent in an operating system makes it possible for the OS to multiplex the same set of hardware devices among multiple, potentially competing, usages. In this case, we also expect the operating system to implement "fairness," where this latter term refers to a global priority scheme and not to the notion that each usage gets the same share of the multiplex.
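As a concrete (and heavily simplified) illustration of that last point, here is a C sketch of a global priority scheme; the task structure and pick_next() are hypothetical, but they capture the sense in which "fairness" means the most important runnable usage wins, not that every usage gets an equal share:

    #include <stddef.h>

    struct task {
        int priority;   /* larger value = more important */
        int runnable;   /* nonzero if this usage wants the CPU */
    };

    /* Give the CPU to the highest-priority runnable task. */
    struct task *pick_next(struct task *tasks, size_t n)
    {
        struct task *best = NULL;
        for (size_t i = 0; i < n; i++) {
            if (tasks[i].runnable &&
                (best == NULL || tasks[i].priority > best->priority))
                best = &tasks[i];
        }
        return best;    /* NULL means no usage wants the CPU */
    }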
There are three primary functions that a modern operating system performs: abstraction (implementing a machine that is simpler to program than the hardware), multiplexing (sharing the hardware among competing usages), and protection (keeping those usages from interfering with each other). Why these three? The answer is actually somewhat complicated (which is probably why the book does not boldly stake a claim to a definition of operating system in the first place -- the chickens).
One of the unheralded contributions unique to computer science is its use of abstraction to make things easier. Compare this use of abstraction to that of Mathematics, in which it is used to make things more general, possibly at the expense of considerable effort.
In a computing context, the way that hardware devices work is often driven more by economic concerns than by convenience. Expert economic opinions fluctuate, but device interfaces are usually designed to be economical and fast rather than easy to use. For example, the hardware interface to a disk controller is designed so that the controller can be made cheaply and made fast, but not so that it can be programmed easily. Indeed, today disk controller and disk drive manufacturers rely on the notion that there will be an OS "above" their products to make them usable and, thus, they concentrate on making them cheap, reliable, and fast.
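To see why, here is a sketch of what programming such an interface directly might look like. The controller, its register layout, and its command bits are all invented for illustration; real controllers differ in the details, which is part of the problem:

    #include <stdint.h>

    /* Register layout of an imagined memory-mapped disk controller; a
     * real one would be mapped at a fixed physical address rather than
     * declared as an ordinary struct. */
    struct disk_regs {
        volatile uint32_t sector;   /* which sector to transfer */
        volatile uint32_t count;    /* how many sectors */
        volatile uint32_t cmd;      /* write CMD_READ here to start */
        volatile uint32_t status;   /* BUSY and ERROR bits */
    };

    #define CMD_READ     0x1u
    #define STATUS_BUSY  0x1u
    #define STATUS_ERROR 0x2u

    /* Read one sector: poke the registers in the right order, then
     * spin.  Cheap and fast for the hardware; tedious for you. */
    int disk_read_sector(struct disk_regs *disk, uint32_t sector)
    {
        disk->sector = sector;
        disk->count  = 1;
        disk->cmd    = CMD_READ;                /* go */
        while (disk->status & STATUS_BUSY)
            ;                                   /* busy-wait for completion */
        return (disk->status & STATUS_ERROR) ? -1 : 0;
    }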
The Linux read()/write()/open()/close() interface is an example of these higher-level abstractions: each call ultimately culminates in complicated disk operations that are implemented using the disk hardware interface.
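For instance, a minimal sketch of that interface in use (the file name is only an example) looks like this:

    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(void)
    {
        char buf[128];
        int fd = open("/etc/hostname", O_RDONLY);   /* no sectors, no registers */
        if (fd < 0)
            return 1;
        ssize_t n = read(fd, buf, sizeof(buf) - 1); /* the OS finds the bytes */
        if (n > 0) {
            buf[n] = '\0';
            printf("%s", buf);
        }
        close(fd);
        return 0;
    }

Nothing in the program mentions controllers, sectors, or busy-waiting; the operating system's abstract machine hides all of it.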
What does this have to do with operating systems? The answer is that if you want to implement higher-level abstractions while ensuring fairness and security, you need to implement all or part of them in the OS. Thus the operating system plays a role in making programs easier to write.
"Why isn't the operating system implemented as a library?"
This question, at which you might scoff initially, is actually a reasonable one to ask. In fact, some real-time systems work this way, and there have been wacky scientific computing systems that have considered this organization. For the rest of us, however, the hardware must support a user mode and a supervisor mode, primarily to ensure that the operating system (and not a user program) is ultimately in charge of shared resources.
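To make the boundary visible, here is a sketch of a user program crossing it by hand, assuming x86-64 Linux and GCC-style inline assembly. The syscall instruction is the only way this code can get the kernel, running in supervisor mode, to touch the shared terminal on its behalf:

    /* Invoke the Linux write() system call directly.  On x86-64 Linux
     * the syscall number goes in %rax (write is 1) and the arguments
     * in %rdi, %rsi, and %rdx; syscall clobbers %rcx and %r11. */
    static long raw_write(int fd, const void *buf, unsigned long len)
    {
        long ret;
        __asm__ volatile ("syscall"
                          : "=a"(ret)
                          : "a"(1L),        /* syscall number: write */
                            "D"((long)fd),  /* arg 1 */
                            "S"(buf),       /* arg 2 */
                            "d"(len)        /* arg 3 */
                          : "rcx", "r11", "memory");
        return ret;
    }

    int main(void)
    {
        raw_write(1, "hello from user mode\n", 21);
        return 0;
    }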
The hardware must support some way for different devices to cause OS code to run on their behalf. Most devices have small on-board processors, but the bulk of the work that they need to have done must be done by the machine itself. For device-specific code to be executed, most machines support one or more interrupt vectors that permit each device to signal that its own driver needs to be executed.
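Conceptually, the vector table is just an array of handler pointers indexed by a number the device presents when it interrupts. The sketch below invents the vector numbers and the handlers, but it shows the dispatch idea:

    #include <stdio.h>

    #define NVECTORS 256

    typedef void (*isr_t)(void);            /* interrupt service routine */

    static void disk_isr(void)  { printf("disk driver runs\n"); }
    static void timer_isr(void) { printf("timer driver runs\n"); }

    static isr_t vector_table[NVECTORS];

    /* What the hardware conceptually does when device vec interrupts:
     * save state, switch to supervisor mode, jump through the table. */
    static void dispatch(int vec)
    {
        if (vec >= 0 && vec < NVECTORS && vector_table[vec] != NULL)
            vector_table[vec]();            /* run that device's driver */
    }

    int main(void)
    {
        vector_table[14] = disk_isr;        /* vector numbers invented */
        vector_table[32] = timer_isr;
        dispatch(14);                       /* pretend the disk interrupted */
        return 0;
    }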
Modern hardware also typically supports some form of memory protection that allows the OS to isolate user programs from each other. Base/Limit registers and virtual memory mapping support are two different ways that the hardware can maintain secure and fair access to memory. We'll discuss this particular issue at much greater length in this class.
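As a preview, here is a sketch of the base/limit idea; the addresses and sizes are invented. Every address a user program issues is checked against a limit and offset by a base, so one program cannot even name another program's memory:

    #include <stdint.h>
    #include <stdio.h>

    struct mmu {
        uint32_t base;    /* start of this program's physical region */
        uint32_t limit;   /* size of the region in bytes */
    };

    /* Translate a program-visible address, or return -1 for a fault. */
    static int64_t translate(const struct mmu *m, uint32_t vaddr)
    {
        if (vaddr >= m->limit)
            return -1;                      /* out of bounds: fault */
        return (int64_t)m->base + vaddr;    /* in bounds: relocate */
    }

    int main(void)
    {
        struct mmu m = { 0x40000000u, 0x10000u };  /* 64 KB region */
        printf("%lld\n", (long long)translate(&m, 0x100));    /* ok */
        printf("%lld\n", (long long)translate(&m, 0x20000));  /* fault: -1 */
        return 0;
    }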