call lecture from undergraduate OS class.
this page last updated:
Sun Sep 27 09:54:04 PDT 2020
File System Project
The pedagogical goal for this class project is for you to have the experience
of building a working file system for Linux. Doing so will hopefully acquaint
you with the concepts of system calls, the Linux file abstraction, and device
From an engineering perspective (i.e., what it is you need to do), the project
decomposes into three tasks:
- system call implementation -- implementing the system calls that
can be issued by Linux on files,
- implementing the file abstraction -- building the internal data
structures and procedures necessary to implement files, and
- implementing secondary storage management -- building the parts
of the file system that persist in secondary storage
Implicitly, there is a fourth task which is to integrate these three tasks
into a single, working file system.
File System Calls
For the system call component, you will need to use a software
facility called FUSE. FUSE (Filesystem in Userspace) is available for most
Linux systems. It provides a way to intercept file system calls issued by
Linux programs and to redirect the program flow into a daemon running as a
user-level process. That is, when a Linux program (any Linux program)
makes a file system call, FUSE will invoke a routine in a daemon process that
you have written instead of sending that system call to the Linux file system
implementation in the kernel.
Thus, by using FUSE, you can test your file system as if it were any other
file system. More importantly, you can compare the results that your file
system produces to the results produced by the Linux file system itself.
This "A/B" comparison is how we will evaluate your work at the end of the quarter.
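To make the A/B idea concrete, here is a minimal sketch of a comparison harness. It runs the same small workload in two directories (one would be a directory on the regular Linux file system, the other your FUSE mount point) and checks that every call returned the same result. The function names `run_workload` and `ab_compare` and the file name `ab_test.tmp` are made up for illustration; this is not the grading harness, only one way you might test your own work.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: run one small workload inside `dir` and record the
 * return value of each call in `res`, plus the bytes read back in `buf`.
 * Returns 0 if the workload ran, -1 if the test file could not be created. */
int run_workload(const char *dir, long res[4], char buf[6])
{
    char path[4096];
    snprintf(path, sizeof(path), "%s/ab_test.tmp", dir);

    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0644);
    if (fd < 0)
        return -1;

    memset(buf, 0, 6);
    res[0] = (long)write(fd, "hello world", 11); /* write 11 bytes      */
    res[1] = (long)lseek(fd, 6, SEEK_SET);       /* seek to "world"     */
    res[2] = (long)read(fd, buf, 5);             /* read 5 bytes back   */
    res[3] = (long)close(fd);
    unlink(path);                                /* clean up after test */
    return 0;
}

/* Hypothetical helper: returns 0 if both directories behaved identically,
 * 1 if any result differed, -1 if the workload could not run in one of them. */
int ab_compare(const char *dir_a, const char *dir_b)
{
    long ra[4], rb[4];
    char ba[6], bb[6];

    if (run_workload(dir_a, ra, ba) != 0 || run_workload(dir_b, rb, bb) != 0)
        return -1;
    if (memcmp(ra, rb, sizeof(ra)) != 0)
        return 1;
    if (memcmp(ba, bb, sizeof(ba)) != 0)
        return 1;
    return 0;
}
```

You would call `ab_compare` with a reference directory and your mount point, and then grow the workload to cover whatever operations you have implemented so far.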
File System Abstractions
Here, you have some latitude. As long as you implement the semantics of the
file system calls correctly, the design and implementation of the internals is
up to you. Any internal organization is acceptable; however, it must be in
memory only (i.e., you can't just stick things in a database).
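As one example of what an in-memory organization could look like, the sketch below keeps a fixed pool of blocks, a "used" table for allocation, and a per-file block map that translates a file block index into a pool block. All of the names and sizes (`struct memfs`, `struct inode`, `BLOCK_SIZE`, and so on) are assumptions chosen for illustration; your design may look quite different, and a real one needs directories, indirect blocks, and block freeing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE   512   /* illustrative sizes, not requirements */
#define NUM_BLOCKS   64
#define MAX_FILE_BLK 8

struct memfs {
    uint8_t used[NUM_BLOCKS];             /* 1 = block is allocated */
    uint8_t data[NUM_BLOCKS][BLOCK_SIZE]; /* the block pool         */
};

struct inode {
    uint32_t size;               /* file length in bytes              */
    int32_t  map[MAX_FILE_BLK];  /* file block -> pool block, -1=hole */
};

void memfs_init(struct memfs *fs) { memset(fs, 0, sizeof(*fs)); }

void inode_init(struct inode *ino)
{
    ino->size = 0;
    for (int i = 0; i < MAX_FILE_BLK; i++)
        ino->map[i] = -1;
}

/* First-fit allocation; returns a zeroed pool block or -1 if none is free. */
static int alloc_block(struct memfs *fs)
{
    for (int i = 0; i < NUM_BLOCKS; i++) {
        if (!fs->used[i]) {
            fs->used[i] = 1;
            memset(fs->data[i], 0, BLOCK_SIZE);
            return i;
        }
    }
    return -1;
}

/* Write len bytes at offset off. Returns bytes written, or -1 if nothing
 * could be written because the file or the pool is full. */
long file_write(struct memfs *fs, struct inode *ino,
                const void *buf, uint32_t len, uint32_t off)
{
    const uint8_t *src = buf;
    uint32_t done = 0;

    while (done < len) {
        uint32_t pos = off + done;
        uint32_t fb  = pos / BLOCK_SIZE;  /* which file block       */
        uint32_t in  = pos % BLOCK_SIZE;  /* offset inside the block */
        if (fb >= MAX_FILE_BLK)
            break;
        if (ino->map[fb] < 0) {           /* allocate on first touch */
            int b = alloc_block(fs);
            if (b < 0)
                break;
            ino->map[fb] = b;
        }
        uint32_t n = BLOCK_SIZE - in;
        if (n > len - done)
            n = len - done;
        memcpy(&fs->data[ino->map[fb]][in], src + done, n);
        done += n;
    }
    if (done > 0 && off + done > ino->size)
        ino->size = off + done;
    return (done == 0 && len > 0) ? -1 : (long)done;
}

/* Read up to len bytes at offset off; unmapped blocks read as zeros. */
long file_read(const struct memfs *fs, const struct inode *ino,
               void *buf, uint32_t len, uint32_t off)
{
    uint8_t *dst = buf;
    uint32_t done = 0;

    if (off >= ino->size)
        return 0;                         /* at or past end of file */
    if (len > ino->size - off)
        len = ino->size - off;
    while (done < len) {
        uint32_t pos = off + done;
        uint32_t fb  = pos / BLOCK_SIZE;
        uint32_t in  = pos % BLOCK_SIZE;
        uint32_t n   = BLOCK_SIZE - in;
        if (n > len - done)
            n = len - done;
        if (ino->map[fb] < 0)
            memset(dst + done, 0, n);
        else
            memcpy(dst + done, &fs->data[ino->map[fb]][in], n);
        done += n;
    }
    return (long)done;
}
```

The block map is what lets a write that straddles a block boundary land in two unrelated pool blocks while the file still looks contiguous to the caller.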
The third part of the exercise is to write the secondary storage management
that persists the file system state across machine reboots. The goal is to be
able to shut down your file system (either through an unmount or a machine
reboot) and have all of the files remain intact and in the same state when
the file system is remounted.
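A minimal sketch of one way to persist state is shown below: the whole in-memory image is written to a temporary file, flushed with `fsync`, and then renamed over the old image so that a crash leaves either the old copy or the new one. The structure `struct fs_image`, the magic value, and the function names are hypothetical, and writing the entire image on every save is a simplification; a real design would more likely write individual blocks.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define IMG_MAGIC 0x46534d47u /* arbitrary tag used to recognize an image */
#define IMG_BYTES 4096        /* illustrative size of the saved state     */

struct fs_image {
    uint32_t magic;
    uint32_t generation;       /* incremented on every save attempt */
    uint8_t  state[IMG_BYTES]; /* stand-in for the file system data */
};

/* write() may write fewer bytes than asked, so loop until done. */
static int write_all(int fd, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Save the image to `path` via a temporary file. Returns 0 on success. */
int image_save(const char *path, struct fs_image *img)
{
    char tmp[4096];
    snprintf(tmp, sizeof(tmp), "%s.tmp", path);

    int fd = open(tmp, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0)
        return -1;

    img->magic = IMG_MAGIC;
    img->generation++;

    int rc = write_all(fd, img, sizeof(*img));
    if (rc == 0 && fsync(fd) != 0) /* force the data to stable storage */
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    if (rc != 0) {
        unlink(tmp);
        return -1;
    }
    return rename(tmp, path);      /* replace the old image in one step */
}

/* Load an image from `path`. Returns 0 if a complete, tagged image was read. */
int image_load(const char *path, struct fs_image *img)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    uint8_t *p = (uint8_t *)img;
    size_t left = sizeof(*img);
    while (left > 0) {
        ssize_t n = read(fd, p, left);
        if (n <= 0)
            break;
        p += n;
        left -= (size_t)n;
    }
    close(fd);
    return (left == 0 && img->magic == IMG_MAGIC) ? 0 : -1;
}
```

In this sketch you would call `image_load` when the file system is mounted and `image_save` on unmount. Saving only at unmount does not survive an unexpected reboot, so how often you flush is part of the design.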
To help keep this exercise on schedule, we'll do this project in phases, where
each phase receives a project score (more on scoring below).
- Phase 1 -- Implement the basic open/close/read/write/seek and
directory functions using FUSE. To complete this phase you must
  - implement mkfs to make a file system using a disk block
  - implement the basic file abstractions (block management, block maps,
  - integrate the file system implementation with FUSE
At the end of Phase 1, you should have a basic file system that works for most
programs that use the minimal POSIX file system interface.
- Phase 2 -- Complete as many of the FUSE file operations as you can
and optimize the performance. At the end of Phase 1, you may have a working
file system but it is likely to be quite slow and/or incomplete if you are careful about
persistence. At the end of Phase 2, you should be able to run any regular Linux
commands (e.g. tar, gcc, grep, vi, etc.) in your file system just as on the
Linux file system itself. To do so, you'll need to make sure that you are
handling issues like access times, permissions, etc.
The expectation for Phase 2 is that it will be
faster than Phase 1 and it will work for more utilities,
but no less reliable. That is, the Phase 2 version is a
more complete version of the necessary Linux functionality, that may also
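For the Phase 2 requirement to handle access times and permissions, the following is a minimal sketch of the bookkeeping involved. It checks a request against the owner, group, and other permission bits and updates timestamps on read and write. The structure `struct meta`, the `WANT_*` constants, and the function names are hypothetical, and the sketch deliberately ignores supplementary groups and treats directories like regular files, so take it as a starting point only.

```c
#include <assert.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical request bits, laid out like the rwx bits in a mode. */
#define WANT_R 4
#define WANT_W 2
#define WANT_X 1

/* Hypothetical per-file metadata kept by the file system. */
struct meta {
    mode_t mode;  /* permission bits, e.g. 0644    */
    uid_t  uid;   /* owning user                   */
    gid_t  gid;   /* owning group                  */
    time_t atime; /* last access                   */
    time_t mtime; /* last data modification        */
    time_t ctime; /* last status (metadata) change */
};

/* Returns 1 if the caller (uid, gid) may perform `want`, else 0.
 * Simplifications: no supplementary groups, no directory special cases. */
int may_access(const struct meta *m, uid_t uid, gid_t gid, int want)
{
    mode_t bits;

    if (uid == 0) {
        /* root may read and write anything, but may execute a file only
         * if at least one execute bit is set */
        if ((want & WANT_X) && !(m->mode & 0111))
            return 0;
        return 1;
    }
    if (uid == m->uid)
        bits = (m->mode >> 6) & 7; /* owner class */
    else if (gid == m->gid)
        bits = (m->mode >> 3) & 7; /* group class */
    else
        bits = m->mode & 7;        /* other class */

    return ((mode_t)want & bits) == (mode_t)want;
}

/* A successful read updates the access time. */
void note_read(struct meta *m, time_t now) { m->atime = now; }

/* A successful write updates the modification and status-change times. */
void note_write(struct meta *m, time_t now)
{
    m->mtime = now;
    m->ctime = now;
}
```

Note that only one class is consulted: if the caller is the owner, the owner bits decide the outcome even when the group or other bits would have allowed more.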
Dates and Grading
You will be graded on the results of Phase 1 and Phase 2.
Phase 1 is due November 11, 2020 at 11:59 PM
You will create a tar file containing all of your code and a detailed list of
instructions so that I can build, install, and test your Phase 1 solution.
Optionally, you may include test code for me to use. If you do not, I will
write my own.
The final project (the completion of Phase 2) is due
at the end of the class. I will schedule a time slot to meet with each team
during the class period on one of the following two days:
- December 7, 2020
- December 9, 2020
You will present your file system and demonstrate your final project (this
activity will take place in lieu of a final). The format for this evaluation is
that you will provide me with access to your file system so that I can run a
series of tests on it and ask you questions about its responses.
It is best if you work in a team of either 2 or 3; however, to accommodate the
obvious additional challenges we are now facing, if you wish to work alone or
as a team of 4, you
All members of the team will receive the same grade.
I will assign presentation times randomly to each group or individual for time
slots during those two
Referencing Existing Work
As mentioned, the goal of the project is to provide an opportunity to build a
file system from scratch -- an opportunity that is hopefully as beneficial as
it is rare professionally. Still, there exists a myriad of open source file
systems that have been developed using FUSE and much or all of the
functionality necessary to succeed with this project is likely to be
available as freely accessible code. Further, it is often easiest to
understand how to implement some of these abstractions through code examples.
For this project, it is acceptable to use existing code as reference material;
however, it is not acceptable to incorporate code snippets or routines from
other projects. That is, you may read code you find that helps illuminate a
particular concept just as you may read a paper or other text; however, you may
not copy (either by hand or through electronic means) code for use in your
project. Further, as part of your code base, you must include a README or
LICENSE file that cites whatever references (text or code) you have
chosen to consult.