CS270 -- Rich's Hints for DIY File System Project Phase 1

Rich Wolski --- Fall, 2024

this page last updated: Fri Sep 27 11:08:24 AM PDT 2024

Roadmap

The goal of Phase 1 is to establish a familiarity with the technologies that are necessary to complete Phase 2. These technologies are the Linux package-manager and configuration system, FUSE, and a little bit of Linux systems programming. This document is intended to be a guide but not a tutorial. That is, it expects that you will be using the typical documentation and Internet information outlets to familiarize yourself with the details.

Requirements and Style

When faced with a project that requires an amalgamation of disparate technologies, there are a couple of different stylistic approaches that seem to be effective (at least in my experience). The first is to review all of them to make sure you understand what they do and how they might work together. In this case, the technologies are:

The appropriate Linux package manager for Ubuntu, which is apt. As root (or with sudo privileges), you can install prepackaged and tested versions of the software and software dependencies you need or want to complete your project.
LibFUSE, which is the FUSE file system capability you will use to implement file system "lift-off." Note that LibFUSE is available in prepackaged and tested form for most Debian, Ubuntu, and RedHat (CentOS, Alma, Rocky) Linux distros, so you don't necessarily need to build it from source using the repo.
A Linux mail client and, possibly, sendmail. These will depend on what distro you choose.
A Linux "shell-out" capability. I use the utility popen() but you are free to use whatever makes the most sense for the development environment you are using.

Taken together, this is your "stack" for the project. You can start out by planning how to integrate the various components of your stack and then start to develop according to this plan.

The other approach is to start at the bottom of "the stack" and to incrementally add functionality. I typically use this approach when I haven't used a technology much, so I'll use this style for the remainder of this guide.

The Approach

The incremental approach I used to complete Phase 1 consisted, roughly, of the following steps.

install FUSE and dependencies
get hello world from LibFUSE repo and test
modify hello world to print out a different message (on a write)
install and test mail client
integrate mail client

Install FUSE and Dependencies

The basic FUSE interface and the documentation for LibFUSE are all for the C language. However, there are C++, Python, Go, and Rust bindings if you decide not to use C. My experience with these bindings is sketchy so I used the native C interface.

To install the necessary FUSE dependencies on Ubuntu 20.04, as the root user, I ran

apt update
apt upgrade
apt-get install gcc make gdb fuse3 libfuse3-dev

which gets the basic C development tools as well. Note that the initial ubuntu user has sudo privileges in the instance. You probably want to do a little reading up on how to run these tools either as the root user or with sudo if this is not familiar to you.

Install and Test FUSE Hello World

The LibFUSE repository on GitHub has several helpful examples, including a minimalist "hello world" example. I cloned the repo and followed the build instructions for hello world. You will need to read through the documentation to understand what you have built, how to start it, and how to stop it. You might also read about the Linux mount command. FUSE does something similar, but in a different way. This is a good opportunity to hone your search skills with respect to understanding how to use FUSE. At the end of this step, though, the "hello world" example should work and you should know how to unmount the FUSE file system that it implements.

Modify the Hello World Example

The "hello world" example is a good "template" for building a FUSE daemon (the process FUSE will launch for you) and for implementing your call-backs. Take a look at the code and you will see that it records a file name and a string (which can be passed as command line arguments) in an in-memory data structure (look for "options"). When you run it and you type ls in the top level directory for the FUSE file system, you will see a file listed. If you run the Linux utility cat on that file, you should see the string. It is as if FUSE created a file with that string in it but -- really -- it is just a process responding to file system calls that Linux makes as if there was a file in the file system.
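To make the "fictitious file" idea concrete, here is a minimal sketch of the kind of logic a read call-back needs: given an offset and a size, copy the matching slice of an in-memory string into the caller's buffer. It is not the example's code. The names file_contents and fictitious_read are hypothetical, and the sketch deliberately leaves out the FUSE types so that it compiles on its own. Check the hello world source for how the real call-back is declared and registered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for the string the example keeps in "options". */
static const char *file_contents = "Hello World!\n";

/*
 * Copy up to `size` bytes of the stored string, starting at `offset`,
 * into `buf`. Returns the number of bytes copied; 0 means end of file.
 */
static int fictitious_read(char *buf, size_t size, off_t offset)
{
    assert(buf != NULL && offset >= 0);

    size_t len = strlen(file_contents);

    if ((size_t)offset >= len)
        return 0;                        /* reading at or past the end */
    if (size > len - (size_t)offset)
        size = len - (size_t)offset;     /* clamp to what is left */

    memcpy(buf, file_contents + offset, size);
    return (int)size;
}
```

A utility like cat typically keeps calling read with increasing offsets until it gets 0 back, which is why the end-of-file case matters even for a one-line "file."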

At this point, you might find it instructive to run the hello world example with debugging turned on (read the FUSE documentation to see how to do that). When debugging is enabled, FUSE will print a trace of all of the Linux file system calls that the shell (bash in this case) makes when you access the "fictitious" file that the hello world example is implementing.

Of particular interest is that the example only allows you to read the file. Any attempt to write the file will fail. That is, the example emulates a read-only file with a single string in it. For your Phase 1 assignment, you will need to be able to write the file (I will not test being able to read it).

Thus, the simplest thing to do here is to add to this example the ability to change the string in the file by calling the Linux write() file system call. You will need to consult the LibFUSE documentation to understand how to add a call-back for Linux write(). You will also want to write a test code that opens the file and calls write() on it with a string you specify. My test for Phase 1 will essentially do that.
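As a rough sketch of the test code described above, the function below opens an existing file for writing and calls write() on it with a string. The name write_string is hypothetical, and the sketch assumes the file is opened with O_WRONLY and no other flags; the actual grading test may open the file differently.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Open `path` for writing and write `msg` to it with a single write().
 * Returns 0 on success and -1 on failure (errno is set by the call
 * that failed). A short write is treated as a failure to keep this simple.
 */
static int write_string(const char *path, const char *msg)
{
    assert(path != NULL && msg != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = write(fd, msg, len);
    int ok = (n == (ssize_t)len);

    if (close(fd) < 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling this from a small main() with the path and string taken from argv gives you a test you can rerun while the FUSE daemon is running with debugging turned on. If the call fails, printing strerror(errno) usually tells you which call-back is missing or refusing the request.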

Install and Test a Linux mail Client

The next step is to make sure that (as the root user) you can send mail to a user at UCSB from the command line. Every distro (and usually every version of every distro) has a different preferred command-line mail client. For Ubuntu, the default comes in the mailutils package. You can install this using the command

sudo apt-get install mailutils

Choose the defaults and "Internet site" in the installation menu that comes up.

When installed, you should be able to type

echo "This is a test" | mail -s 'CS270 testing' myemail@cs.ucsb.edu

where you substitute your email ID in CS for myemail and you should receive an email at that address.

Also, PLEASE DO NOT USE MY MAIL ADDRESS DURING TESTING. Substitute your UCSB email address for mine when you are testing. Then, before you turn in your solution, change the email address to be rich@cs.ucsb.edu so that when I run your code, I will get an email.

Integrate the Mail Client

At this point, you have a way to read and write a "fictitious" file and a way to send mail to a UCSB email address from the Linux command line. The next step is to integrate the two so that when a test code opens a file and writes a string into it, you send the string to the email address you are using as a target. To do so, you need to make two changes to your FUSE program.

The first is to allow the user code to specify the name of the file using the Linux open() file system call. The Hello World example doesn't use the argument passed to open() to specify the file name. It sets the "options" structure from a parameter passed when the program is initialized (and includes a default if nothing is specified). Change this to set the file name specified in open(). You will need to understand the contents of the "path" parameter passed by FUSE when it makes the call-back for open().
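One way to handle the "path" parameter is sketched below. It assumes that the path FUSE hands to a call-back is relative to the mount point and starts with "/", so a file opened as /cs270/somefile would arrive as "/somefile". Confirm this with the debugging trace before relying on it. The function name name_from_path is hypothetical, and the sketch only handles a flat file system with no subdirectories.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the file name portion of a path of the form "/name".
 * Returns NULL for the root directory "/", for a path that does not
 * start with '/', and for anything nested such as "/a/b".
 * The returned pointer points into `path`; it is not a copy.
 */
static const char *name_from_path(const char *path)
{
    assert(path != NULL);

    if (path[0] != '/' || path[1] == '\0')
        return NULL;                 /* not absolute, or the root itself */
    if (strchr(path + 1, '/') != NULL)
        return NULL;                 /* nested paths are out of scope here */
    return path + 1;
}
```

Because the result points into the caller's string, copy it (for example with strdup()) if your daemon needs to remember the name after the call-back returns.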

The second change to make is to alter the write() call-back so that instead of changing the string that your FUSE process remembers (in "options"), you send the string to the mail client as the content of a message that will be mailed to the target recipient. This functionality is most easily implemented using a "shell out" which is a facility available in most language runtimes that allows a calling program to send a command line (and some parameters) to a shell and to wait for the command to complete. In C, one (but by no means the only) way to do that is to use the Linux popen() call. You will want to do a little reading to determine what the best way is to invoke the Linux mail client of your choosing with a specific target recipient and to send it a string that it will mail to that recipient.
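A minimal sketch of a popen()-based shell out follows. It is one possible shape, not the required one. The helper names pipe_to_command and build_mail_command are hypothetical, the subject line is borrowed from the command-line test earlier, and the mail client is assumed to be the mail command from mailutils. Two details to keep in mind: the data a write call-back receives comes with an explicit length and should not be assumed to be NUL-terminated, and anything pasted into a shell command line should be checked first, since the daemon runs as root.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Run `cmd` through the shell and feed it `len` bytes from `data` on
 * its standard input. Returns 0 if everything was written and the
 * command exited with status 0, and -1 otherwise.
 */
static int pipe_to_command(const char *cmd, const char *data, size_t len)
{
    assert(cmd != NULL && data != NULL);

    FILE *p = popen(cmd, "w");
    if (p == NULL)
        return -1;

    size_t written = fwrite(data, 1, len, p);
    int status = pclose(p);          /* waits for the command to finish */

    return (written == len && status == 0) ? 0 : -1;
}

/*
 * Build a mail command line for `recipient` in `out`. Only a small set
 * of characters is accepted so that nothing in the address can be
 * interpreted by the shell. Returns 0 on success and -1 otherwise.
 */
static int build_mail_command(char *out, size_t outlen, const char *recipient)
{
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789@._+-";

    assert(out != NULL && recipient != NULL);

    if (recipient[0] == '\0' || recipient[0] == '-' ||
        strspn(recipient, allowed) != strlen(recipient))
        return -1;

    int n = snprintf(out, outlen, "mail -s 'CS270 testing' %s", recipient);
    return (n > 0 && (size_t)n < outlen) ? 0 : -1;
}
```

Inside a write call-back, you would build the command once for your target recipient and then pass the call-back's buffer and size straight to pipe_to_command(). While you are developing, substituting a harmless command such as cat > /tmp/out.txt for the mail command is an easy way to see exactly what would be mailed.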

There is a question about how to specify the recipient. As I mentioned, during testing, it should not be me. The two easiest options are either to hard code it into your FUSE program (and change it to my email address just before you turn in your solution) or to add an option to the "options" structure in the original Hello World FUSE example program (and to document how it is I should set it when I run your solution).

Create /cs270 and mount your file system on it as root

Become the root user using the command

sudo -s

executed as the user ubuntu. Then create the directory and mount your FUSE file system on it.

mkdir /cs270
~ubuntu/your-email-fuse-fs -s /cs270

where the binary of your FUSE daemon is in the ubuntu user's home directory.

Then send me an email with the IP address of your instance that lets me know you are ready for me to grade your Phase 1.