In this simulation we will assume that there are users submitting jobs to printers. In particular, we have nusers users, and nprinters printers. We'll assume that all the printers are identical (e.g. in a machine room) so that when a user wants to print something out, it doesn't matter which printer it comes out on.
Now, in our simulation, every so often, a user will decide to print something. When this happens, the print job will be submitted, and if any printer is available, it will print the job (taking 4 seconds a page). If all the printers are printing something, then the job will be queued until one of the printers is ready. Our print queue will have a fixed size. If the queue is full, then the user must wait until the queue is not full to submit the job.
Obviously, we are going to use threads for this simulation. Each user will have its own thread, and each printer will have its own thread. The threads will communicate through shared memory.
    /*
     * global simulation parameters
     */
    typedef struct {
        int nusers;
        int nprinters;
        int arrtime;
        int maxpages;
        int bufsize;
        int nevents;
        int starttime;
        void *state;            /* your added global state */
    } SimParameters;

    /*
     * each active simulation entity will have its own Agent structure
     */
    typedef struct {
        SimParameters *p;       /* pointer to global sim parameters */
        void *v;                /* pointer to individual state record */
        int id;                 /* integer id for this entity */
    } Agent;

    /*
     * print jobs look like this
     */
    typedef struct {
        int jobsize;
        int userid;
        int jobid;
    } Job;

    extern void initialize_state(SimParameters *);
    extern void submit_job(Agent *, Job *);
    extern Job *get_print_job(Agent *);

    extern pthread_mutex_t print_lock;
    extern int Ego();

This defines some data structures that will be used, plus some subroutine prototypes.
There is also a driver program, in this case printqsim.c. This defines a main() routine which sets up the threads. Together with your definitions of the subroutines, the driver program will solve the problem.
The idea is to write the simulated entities without changing either the header or driver files. Instead, we are to provide a C file that defines the subroutines in the header file, and when this is compiled with the driver program, the resulting program solves the problem.
That is, the code in printqsim.c is the simulation engine, but the engine is missing a few parts. Our job is to supply the missing parts, and the engine will run the simulation. Many simulation systems work this way.
In this case, our job is to define initialize_state(), submit_job() and get_print_job() so that together with printqsim.c, our program performs the user/printer simulation correctly.
Ok, let's look at printqsim. It takes 6 arguments, in this order: nusers, nprinters, arrtime, maxpages, bufsize, and nevents.
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <time.h>
    #include <pthread.h>
    #include "printqsim.h"

    pthread_mutex_t print_lock;
    SimParameters Parameters;

    int Ego()
    {
        int return_val;

        return_val = (int)pthread_self();
        return(return_val);
    }

    void *user_thread(void *arg)
    {
        Agent *s;
        SimParameters *p;
        int i;
        int sleeptime;
        int jobsize;
        Job *j;

        /* get the Agent structure for this entity */
        s = (Agent *) arg;

        /* from the agent struct, get the global simulation parameters */
        p = s->p;

        /*
         * a user submits nevents number of print jobs -- this is the main loop
         *
         * nevents is a global simulation parameter
         */
        for (i = 0; i < p->nevents; i++) {
            /* pick a random sleep time based on the parameter arrtime */
            sleeptime = random()%(p->arrtime*2) + 1;
            printf("%4d: user %2d/%03d: Sleeping for %2d seconds\n",
                   (int)(time(0)-p->starttime), s->id, i, sleeptime);
            fflush(stdout);

            /* put this thread to sleep */
            sleep(sleeptime);

            /* sleep done, let's make a job */
            j = (Job *) malloc(sizeof(Job));

            /* random job size */
            j->jobsize = (random()%p->maxpages) + 1;

            /* the job belongs to this thread (user) */
            j->userid = s->id;

            /* we'll label the job with this iteration number */
            j->jobid = i;

            printf("%4d: user %2d/%03d: Submitting a job with size %d\n",
                   (int)(time(0)-p->starttime), s->id, i, j->jobsize);
            fflush(stdout);

            /*
             * call the submit job routine passing the agent structure and
             * the job as arguments
             *
             * you write this function
             */
            submit_job(s, j);
        }

        /* here, all of the jobs have been submitted -- exit */
        printf("%4d: user %2d/%03d: Done\n",
               (int)(time(0)-p->starttime), s->id, i);
        fflush(stdout);

        return NULL;
    }

    /* Assume 4 seconds a page */
    void *printer_thread(void *arg)
    {
        Agent *s;
        SimParameters *p;
        int jobsize, userid, jobid;
        int i;
        Job *j;

        /* get the agent structure for this agent */
        s = (Agent *)arg;

        /* get the simulation parameters for the entire simulation */
        p = s->p;

        i = 0;

        /* printers run forever */
        while(1) {
            printf("%4d: prnt %2d/%03d: ready to print\n",
                   (int)(time(0)-p->starttime), s->id, i);
            fflush(stdout);

            /*
             * get the next print job
             *
             * you write this function
             */
            j = get_print_job(s);

            /* if there isn't one, we are finished */
            if (j == NULL) {
                printf("%4d: prnt %2d/%03d: Done\n",
                       (int)(time(0)-p->starttime), s->id, i);
                fflush(stdout);
                pthread_exit(NULL);
            }

            /*
             * otherwise, simulate printing a job by sleeping
             * for 4 * the jobsize seconds
             */
            printf("%4d: prnt %2d/%03d:",
                   (int)(time(0)-p->starttime), s->id, i);
            printf(" Printing job %3d from user %2d size %3d\n",
                   j->jobid, j->userid, j->jobsize);
            fflush(stdout);
            sleep(4*j->jobsize);

            /*
             * throw away the job structure since the job
             * has been simulated
             */
            free(j);
            i++;
        }
    }

    /*
     * simulation engine
     */
    int main(int argc, char **argv)
    {
        Agent *s;
        pthread_t *user_tids;
        pthread_t *printer_tids;
        pthread_attr_t attr;
        int i;

        /* first, check to see if there are enough arguments */
        if (argc != 7) {
            fprintf(stderr,
                "usage: printqsim nusers nprinters arrtime maxpages bufsize nevents\n");
            exit(1);
        }

        /* initialize a global mutex lock for the entire simulation */
        pthread_mutex_init(&print_lock, NULL);

        /*
         * parse the arguments and put them in a global simulation
         * structure variable called Parameters. Its purpose is to
         * hold the simulation parameters for the entire simulation
         */
        Parameters.nusers = atoi(argv[1]);
        Parameters.nprinters = atoi(argv[2]);
        Parameters.arrtime = atoi(argv[3]);
        Parameters.maxpages = atoi(argv[4]);
        Parameters.bufsize = atoi(argv[5]);
        Parameters.nevents = atoi(argv[6]);
        Parameters.starttime = time(0);

        /* seed the random number generator */
        srandom(Parameters.starttime);

        /*
         * call the initialize routine
         *
         * you write this function
         */
        initialize_state(&Parameters);

        /*
         * there will be one pthread for each user and one pthread for
         * each printer. Need to make the space to hold the thread
         * identifiers
         */
        user_tids = (pthread_t *) malloc(sizeof(pthread_t)*Parameters.nusers);
        printer_tids = (pthread_t *) malloc(sizeof(pthread_t)*Parameters.nprinters);

        /* create the user threads */
        for (i = 0; i < Parameters.nusers; i++) {
            /* make an agent structure for this user */
            s = (Agent *)malloc(sizeof(Agent));

            /* point this agent at the global simulation parameters */
            s->p = &Parameters;

            /* give this simulation structure its own id */
            s->id = i;

            /* make this thread pre-emptable */
            pthread_attr_init(&attr);
            pthread_attr_setscope(&attr,PTHREAD_SCOPE_SYSTEM);

            /*
             * launch this user thread passing in the simulation structure
             * for this user as an argument
             */
            pthread_create(&(user_tids[i]), &attr, user_thread, (void *) s);
        }

        /* now create the printer threads */
        for (i = 0; i < Parameters.nprinters; i++) {
            /* make an agent structure for each printer */
            s = (Agent *)malloc(sizeof(Agent));

            /* point the printer at the global parameters */
            s->p = &Parameters;

            /* give this printer its own id */
            s->id = i;

            /* make this thread pre-emptable */
            pthread_attr_init(&attr);
            pthread_attr_setscope(&attr,PTHREAD_SCOPE_SYSTEM);

            /*
             * launch this printer thread passing in the
             * simulation structure for this printer as an argument
             */
            pthread_create(&(printer_tids[i]), &attr, printer_thread, (void *) s);
        }

        /*
         * at this point, we have one thread running for each user
         * and one running for each printer. Each thread has a pointer
         * to the simulation parameters and its own simulation id.
         *
         * all that is left for the main thread to do is wait for the
         * printer and user threads to finish, and then exit
         */
        for(i=0; i < Parameters.nprinters; i++) {
            (void)pthread_join(printer_tids[i],NULL);
        }
        for(i=0; i < Parameters.nusers; i++) {
            (void)pthread_join(user_tids[i],NULL);
        }

        pthread_exit(NULL);
    }

Now, the main() routine sets up a SimParameters struct. The purpose of this structure is to hold information that is pertinent to the simulation globally.
Any parameters that all threads will need can be stored here. In addition, you can add any global structures you want in the routine initialize_state() by pointing the state field in the SimParameters structure to your own structure.
Values that are thread-specific are stored in an Agent structure. Each Agent structure also gets a pointer to the SimParameters structure so it can see the global state. In this way, threads need not access global variables directly, but rather get all of their information through an Agent structure that is passed in.
After submitting nevents jobs, the user thread exits. The user thread prints out when it sleeps, and when it submits a job.
Now, look at ps1.c.
    #include <stdio.h>
    #include <pthread.h>
    #include "printqsim.h"

    void initialize_state(SimParameters *p)
    {
        pthread_mutex_lock(&print_lock);
        fprintf(stdout,"thread-%d, initialize_v: dummy version called\n", Ego());
        fflush(stdout);

        /* no dynamically initialized global state in this example */
        p->state = NULL;

        pthread_mutex_unlock(&print_lock);
    }

    void submit_job(Agent *s, Job *j)
    {
        pthread_mutex_lock(&print_lock);
        fprintf(stdout,"thread-%d, submit_job: dummy version called\n", Ego());
        fprintf(stdout,"thread-%d\treturning\n", Ego());
        fflush(stdout);
        pthread_mutex_unlock(&print_lock);
        return;
    }

    Job *get_print_job(Agent *s)
    {
        pthread_mutex_lock(&print_lock);
        fprintf(stdout,"thread-%d, get_print_job: dummy version called\n", Ego());
        fprintf(stdout,"thread-%d\treturning\n", Ego());
        fflush(stdout);
        pthread_mutex_unlock(&print_lock);
        return NULL;
    }

This is one solution to the problem. It's not a working solution, but it is one that will compile, run, and hopefully illustrate a couple of points. What it does is set p->state to NULL, ignore print jobs when they are submitted, and force the printer threads to exit. It also prints out a message in each routine so that you can see your routines being called by the main driver.
Try running it:
    rich@homer:~/public_html/class/cs170/notes/PrinterSim$ !./
    ./ps1 5 3 5 5 5 3
    thread--1208141600, initialize_v: dummy version called
    0: user 0/000: Sleeping for 10 seconds
    0: user 1/000: Sleeping for 10 seconds
    0: user 2/000: Sleeping for 5 seconds
    0: user 3/000: Sleeping for 2 seconds
    0: user 4/000: Sleeping for 10 seconds
    0: prnt 0/000: ready to print
    thread--1260594256, get_print_job: dummy version called
    thread--1260594256 returning
    0: prnt 0/000: Done
    0: prnt 1/000: ready to print
    thread--1271084112, get_print_job: dummy version called
    thread--1271084112 returning
    0: prnt 1/000: Done
    0: prnt 2/000: ready to print
    thread--1281573968, get_print_job: dummy version called
    thread--1281573968 returning
    0: prnt 2/000: Done
    2: user 3/000: Submitting a job with size 1
    thread--1239614544, submit_job: dummy version called
    thread--1239614544 returning
    2: user 3/001: Sleeping for 6 seconds
    5: user 2/000: Submitting a job with size 1
    thread--1229124688, submit_job: dummy version called
    thread--1229124688 returning
    5: user 2/001: Sleeping for 10 seconds
    8: user 3/001: Submitting a job with size 5
    thread--1239614544, submit_job: dummy version called
    thread--1239614544 returning
    8: user 3/002: Sleeping for 1 seconds
    9: user 3/002: Submitting a job with size 5
    thread--1239614544, submit_job: dummy version called
    thread--1239614544 returning
    9: user 3/003: Done
    10: user 0/000: Submitting a job with size 3
    thread--1208144976, submit_job: dummy version called
    thread--1208144976 returning
    10: user 0/001: Sleeping for 9 seconds
    10: user 1/000: Submitting a job with size 3
    thread--1218634832, submit_job: dummy version called
    thread--1218634832 returning
    10: user 1/001: Sleeping for 8 seconds
    10: user 4/000: Submitting a job with size 2
    thread--1250104400, submit_job: dummy version called
    thread--1250104400 returning
    10: user 4/001: Sleeping for 9 seconds
    15: user 2/001: Submitting a job with size 3
    thread--1229124688, submit_job: dummy version called
    thread--1229124688 returning
    15: user 2/002: Sleeping for 10 seconds
    18: user 1/001: Submitting a job with size 4
    thread--1218634832, submit_job: dummy version called
    thread--1218634832 returning
    18: user 1/002: Sleeping for 1 seconds
    19: user 0/001: Submitting a job with size 2
    thread--1208144976, submit_job: dummy version called
    thread--1208144976 returning
    19: user 0/002: Sleeping for 7 seconds
    19: user 4/001: Submitting a job with size 5
    thread--1250104400, submit_job: dummy version called
    thread--1250104400 returning
    19: user 4/002: Sleeping for 4 seconds
    19: user 1/002: Submitting a job with size 3
    thread--1218634832, submit_job: dummy version called
    thread--1218634832 returning
    19: user 1/003: Done
    23: user 4/002: Submitting a job with size 4
    thread--1250104400, submit_job: dummy version called
    thread--1250104400 returning
    23: user 4/003: Done
    25: user 2/002: Submitting a job with size 4
    thread--1229124688, submit_job: dummy version called
    thread--1229124688 returning
    25: user 2/003: Done
    26: user 0/002: Submitting a job with size 4
    thread--1208144976, submit_job: dummy version called
    thread--1208144976 returning
    26: user 0/003: Done

This created a simulation with 5 users, 3 printers, an average of about 5 seconds between print jobs, a maximum job size of 5 pages, a print queue size of 5, and three print jobs per user.
You'll note that the simulation did run, but not correctly. Why? Well, the printers never printed anything, for starters. Moreover, more than 5 print jobs were submitted and ostensibly queued, and the subsequent print jobs were still allowed to be submitted.
Also notice the debugging information. In the dummy routines, the task id of the task calling the routine is printed as well as the routine name and the action that is being taken. If you look at the code, you'll see a mutex lock surrounding the print statements. Why?
This may seem like a boneheaded example, but it illustrates something important -- it is sometimes best to start with something that has the right structure and then add the necessary functionality. The labs in the course all can be developed in this way.
Since you have multiple threads accessing the buffer, you'll need to protect it with a mutex. The above is all done in ps2.c.
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <pthread.h>
    #include "printqsim.h"

    /*
     * describes a buffer used to queue jobs between users and printers
     */
    typedef struct {
        Job **jobs;                 /* array of Job pointers used as a queue */
        int head;                   /* head of the queue */
        int tail;                   /* tail of the queue */
        int njobs;                  /* number of jobs in queue */
        pthread_mutex_t *lock;      /* lock for the queue */
    } Buffer;

    void initialize_state(SimParameters *p)
    {
        Buffer *b;

        b = (Buffer *) malloc(sizeof(Buffer));
        b->jobs = (Job **) malloc(sizeof(Job *)*p->bufsize);
        b->head = 0;
        b->tail = 0;
        b->njobs = 0;
        b->lock = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t));
        pthread_mutex_init(b->lock, NULL);

        /*
         * Okay -- the queue we've just made will be used by all agents
         *
         * point the simulation parameters structure at this state
         */
        p->state = (void *) b;
    }

    void submit_job(Agent *s, Job *j)
    {
        Buffer *b;
        SimParameters *p;

        /* get the global sim parameters from the agent */
        p = s->p;

        /* get the buffer queue from the sim parameters */
        b = (Buffer *) p->state;

        while(1) {
            /* lock this queue */
            pthread_mutex_lock(b->lock);

            /* if there is space to add another job */
            if (b->njobs < p->bufsize) {
                /* add it at the head */
                b->jobs[b->head] = j;
                b->head = (b->head + 1) % p->bufsize;

                /* bump the count */
                b->njobs++;

                /* drop the lock */
                pthread_mutex_unlock(b->lock);
                return;
            } else {
                /* the queue is full -- for now, kill this thread */

                /* drop the lock so we don't die holding it */
                pthread_mutex_unlock(b->lock);
                printf("%4d: user %2d -- the queue is full -- exiting\n",
                       (int)(time(0)-p->starttime), s->id);
                fflush(stdout);

                /* see ya */
                pthread_exit(NULL);
            }
        }
        return;
    }

    Job *get_print_job(Agent *s)
    {
        /* do nothing for now */
        return NULL;
    }

First, it defines a Buffer struct that uses an array as a circular queue, with head, tail, and njobs describing the state of the queue. It also has a mutex.
In initialize_state(), the buffer is allocated, and state is set to be the buffer. However, now submit_job inserts the job into the buffer if there's room. If there's not room, the user thread exits. Also, nothing is done with get_print_job(). Again, this is an example of programming incrementally -- you try one thing and test it to make sure it works before going on. In this case, we don't have a completely working solution yet, but we have a start.
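When we call this with the same arguments as before, we see that the first 5 jobs are accepted, and then each user exits as soon as it tries to submit a job to the full queue. This is what we expect, since nothing removes jobs from the queue yet, so the code is working.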
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <time.h>
    #include <pthread.h>
    #include "printqsim.h"

    /*
     * queue of jobs
     */
    typedef struct {
        Job **jobs;                 /* array of pointers to jobs */
        int head;                   /* head of queue */
        int tail;                   /* tail of queue */
        int njobs;                  /* number of jobs */
        pthread_mutex_t *lock;      /* lock for this queue */
    } Buffer;

    void initialize_state(SimParameters *p)
    {
        Buffer *b;

        /* make space for this queue */
        b = (Buffer *) malloc(sizeof(Buffer));

        /* make space to hold an array of pointers to jobs */
        b->jobs = (Job **) malloc(sizeof(Job *)*p->bufsize);
        b->head = 0;
        b->tail = 0;
        b->njobs = 0;
        b->lock = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t));
        pthread_mutex_init(b->lock, NULL);

        /* point the global sim parameter structure to this queue structure */
        p->state = (void *) b;
    }

    void submit_job(Agent *s, Job *j)
    {
        SimParameters *p;
        Buffer *b;

        /* get the global sim parameters */
        p = s->p;

        /* get the globally defined queue structure */
        b = (Buffer *)p->state;

        while(1) {
            /* lock this queue */
            pthread_mutex_lock(b->lock);

            /* if there is space in the queue for another job */
            if (b->njobs < p->bufsize) {
                /* add it at the head */
                b->jobs[b->head] = j;
                b->head = (b->head + 1) % p->bufsize;
                b->njobs++;

                /* drop the lock and bail */
                pthread_mutex_unlock(b->lock);
                return;
            } else {
                /* otherwise, sleep for a second and try again */

                /* !!! drop the lock so we don't go to sleep holding it */
                pthread_mutex_unlock(b->lock);
                printf("%4d: user %2d sleeping because the queue is full\n",
                       (int)(time(0)-p->starttime), s->id);
                fflush(stdout);
                sleep(1);
            }
        }
    }

    Job *get_print_job(Agent *s)
    {
        SimParameters *p;
        Buffer *b;
        Job *j;

        /* get the sim parameters from the agent */
        p = s->p;

        /* get the job queue buffer from the global sim parameters */
        b = (Buffer *)p->state;

        /* do forever */
        while(1) {
            /* lock the queue */
            pthread_mutex_lock(b->lock);

            /* if there are jobs waiting in the queue */
            if (b->njobs > 0) {
                /* get the job pointer at the tail of the queue */
                j = b->jobs[b->tail];

                /* bump the tail around */
                b->tail = (b->tail + 1) % p->bufsize;

                /* decrement the number of jobs in the queue */
                b->njobs--;

                /* we've got the job safely off the queue, drop the lock */
                pthread_mutex_unlock(b->lock);

                /* return the job */
                return j;
            } else {
                /* otherwise, the queue is empty */

                /* drop the lock so we don't sleep with it */
                pthread_mutex_unlock(b->lock);
                printf("%4d: prnt %2d sleeping because the queue is empty\n",
                       (int)(time(0)-p->starttime), s->id);
                fflush(stdout);

                /*
                 * sleep for a second -- maybe there will be more jobs
                 * when we wake up
                 */
                sleep(1);
            }
            /* loop back and try again */
        }
    }

When submit_job() is called and the queue is full, the mutex is released, and sleep(1) is called. Then the queue is checked again. In this way, if a printer thread calls get_print_job() during that second, it can take a job off the queue, and then the user's job may be submitted. Similarly, when the queue is empty and a printer calls get_print_job(), it sleeps for a second and checks again. Note that it has to release the mutex when it sleeps so that a user thread can actually put a job on the queue.
The code works. Try it out:
This is a workable solution, but it is not a good one. The technique of periodically checking the queue is called polling. It's not really what you want because you'd like for a printer thread to wake up and start printing as soon as a job is inserted into the queue, instead of up to a second afterward. You could cut the second down, or even loop back immediately, but then you'd be inefficient and you'd run the risk of starvation (theoretically). Similarly, you'd like the user to complete submitting a job as soon as a printer thread empties a space in the queue instead of up to a second afterward. Lastly, there is overhead associated with sleeping and waking. Every time a thread re-sleeps (goes through its polling loop without finding new work to do) your program has used system resources without getting any useful work done.
In short, polling is OK, but not great.
A monitor is a data structure which a thread can "enter" and "exit". Only one thread may be in the monitor at a time (hence they can be used to enforce critical sections). This is just like a mutex, and in pthreads, there is no entity called a "monitor". You just use a mutex for the simple variety. Condition variables allow you to do more sophisticated things with monitors. A condition variable must be associated with a specific monitor. There are three procedures that act on condition variables, and whenever you call them, you must have entered the relevant monitor (i.e. you must have locked the relevant mutex):
pthread_cond_wait(cond, mutex): This says to release the mutex and block until another thread unblocks you. This is done atomically. Why? By now, you should understand why this call must be atomic with respect to other executing threads. If you are a little fuzzy at this point, you might do well to review the previous lecture.
When pthread_cond_wait() returns, that means that you have been woken up, and you have reacquired the mutex.
pthread_cond_signal(cond): This chooses one or more threads that have blocked on the condition variable and unblocks them. If there is no thread that has blocked on the condition variable, then pthread_cond_signal() does nothing. There are no guarantees about which thread gets unblocked if more than one is blocked, just that some thread(s) will be unblocked. The pthreads library does not require that you actually own the mutex when you call pthread_cond_signal(). Some threads packages do, and I think that it's a good idea, so whenever you see me use pthread_cond_signal(), I will have locked the relevant mutex.
pthread_cond_broadcast(cond): This unblocks all threads that have blocked on the condition variable.
Now, here is an odd thing -- if you call pthread_cond_signal() or pthread_cond_broadcast(), then you should own the mutex (i.e. you should have locked the mutex). However, the thread that you are unblocking will have locked the mutex when it called pthread_cond_wait(). This at first appears to be a contradiction, but you must remember that the waiting thread unlocks the mutex while it is blocked. When it is unblocked, it must relock the mutex before returning from pthread_cond_wait.
As it turns out there are a few choices that the threads system has in implementing condition variables.
Read the book (chapter 5) for a further discussion of this.
Likewise, we'll call pthread_cond_wait() in get_print_job() when the queue is empty, and pthread_cond_signal() in submit_job() when a user thread inserts a job into an empty queue.
Note that submit_job() and get_print_job() both use while loops because when pthread_cond_wait() returns, the queue may have become full/empty in the time between when the waiting thread unblocked and the time that it acquired the mutex. Therefore, it may have to wait again.
The code is in ps4.c. When you run it, everything seems to work just fine.
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <pthread.h>
    #include "printqsim.h"

    /*
     * job queue structure
     */
    typedef struct {
        Job **jobs;
        int head;
        int tail;
        int njobs;
        pthread_mutex_t *lock;
        pthread_cond_t *full;       /* condition variable controlling fullness */
        pthread_cond_t *empty;      /* condition variable controlling emptiness */
    } Buffer;

    void initialize_state(SimParameters *p)
    {
        Buffer *b;

        b = (Buffer *) malloc(sizeof(Buffer));
        b->jobs = (Job **) malloc(sizeof(Job *)*p->bufsize);
        b->head = 0;
        b->tail = 0;
        b->njobs = 0;
        b->lock = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t));

        /* make space enough to hold condition variables */
        b->full = (pthread_cond_t *) malloc(sizeof(pthread_cond_t));
        b->empty = (pthread_cond_t *) malloc(sizeof(pthread_cond_t));

        /* initialize the mutex and cond vars */
        pthread_mutex_init(b->lock, NULL);

        /*
         * if this is set to SCHED_FIFO the solution works
         * (commented out: no attr is declared in this file)
         *
         * pthread_attr_setschedpolicy(&attr,SCHED_RR);
         */
        pthread_cond_init(b->full, NULL);
        pthread_cond_init(b->empty, NULL);

        p->state = (void *)b;
        return;
    }

    void submit_job(Agent *s, Job *j)
    {
        SimParameters *p;
        Buffer *b;

        p = s->p;
        b = (Buffer *)p->state;

        /* lock this buffer so we can test under lock */
        pthread_mutex_lock(b->lock);
        while(1) {
            /* if a new job will fit */
            if (b->njobs < p->bufsize) {
                /* enqueue it at the head */
                b->jobs[b->head] = j;

                /* bump the head pointer */
                b->head = (b->head + 1) % p->bufsize;
                b->njobs++;

                /*
                 * if the queue was empty, signal a printer
                 * thread waiting for a job to arrive
                 */
                if (b->njobs == 1) {
                    pthread_cond_signal(b->empty);
                }

                /* drop the lock -- we are leaving the critical section */
                pthread_mutex_unlock(b->lock);

                /* job successfully queued, bail out */
                return;
            } else {
                /* the queue is full -- we must wait until it has space */
                printf("%4d: user %2d blocking because the queue is full\n",
                       (int)(time(0)-p->starttime), s->id);
                fflush(stdout);

                /* wait here -- printer thread will signal when there is space */
                pthread_cond_wait(b->full, b->lock);

                /*
                 * when we wake up here, we are holding the lock
                 * and we are the only thread in the critical section
                 */
            }
        }
    }

    Job *get_print_job(Agent *s)
    {
        SimParameters *p;
        Buffer *b;
        Job *j;

        p = s->p;
        b = (Buffer *)p->state;

        /* lock this buffer -- we are going to mess with it */
        pthread_mutex_lock(b->lock);
        while(1) {
            /* if there are jobs in the queue */
            if (b->njobs > 0) {
                /* get the one at the tail */
                j = b->jobs[b->tail];
                b->tail = (b->tail + 1) % p->bufsize;
                b->njobs--;

                /*
                 * if the buffer was full before we took this
                 * job off the queue, we must signal any waiting
                 * user threads
                 */
                if (b->njobs == p->bufsize-1) {
                    pthread_cond_signal(b->full);
                }

                /*
                 * those threads won't run (if they are there)
                 * until we leave the critical section
                 *
                 * drop the lock to get out
                 */
                pthread_mutex_unlock(b->lock);
                return j;
            } else {
                /* there are no jobs in the queue -- wait until there are */
                printf("%4d: prnt %2d blocking because the queue is empty\n",
                       (int)(time(0)-p->starttime), s->id);
                fflush(stdout);

                /*
                 * wait here until a user job signals that the
                 * queue is no longer empty
                 */
                pthread_cond_wait(b->empty, b->lock);

                /*
                 * when we wake up, we know that a user has signaled
                 * us. We are holding the lock and are the only thread
                 * in the critical section
                 */
            }
        }
    }
    UNIX> !ps
    ps4 5 3 5 5 5 3
    0: user 0/000: Sleeping for 4 seconds
    0: user 1/000: Sleeping for 10 seconds
    0: user 2/000: Sleeping for 5 seconds
    0: user 3/000: Sleeping for 2 seconds
    0: user 4/000: Sleeping for 7 seconds
    0: prnt 0/000: ready to print
    0: prnt 0 blocking because the queue is empty
    0: prnt 1/000: ready to print
    0: prnt 1 blocking because the queue is empty
    0: prnt 2/000: ready to print
    0: prnt 2 blocking because the queue is empty
    2: user 3/000: Submitting a job with size 5
    2: user 3/001: Sleeping for 10 seconds
    2: prnt 0/000: Printing job 0 from user 3 size 5
    4: user 0/000: Submitting a job with size 1
    4: user 0/001: Sleeping for 1 seconds
    4: prnt 1/000: Printing job 0 from user 0 size 1
    5: user 2/000: Submitting a job with size 4
    5: user 2/001: Sleeping for 6 seconds
    5: user 0/001: Submitting a job with size 3
    5: user 0/002: Sleeping for 10 seconds
    5: prnt 2/000: Printing job 0 from user 2 size 4
    7: user 4/000: Submitting a job with size 4
    7: user 4/001: Sleeping for 10 seconds
    8: prnt 1/001: ready to print
    8: prnt 1/001: Printing job 1 from user 0 size 3
    10: user 1/000: Submitting a job with size 1
    10: user 1/001: Sleeping for 6 seconds
    11: user 2/001: Submitting a job with size 3
    11: user 2/002: Sleeping for 1 seconds
    12: user 3/001: Submitting a job with size 1
    12: user 3/002: Sleeping for 10 seconds
    12: user 2/002: Submitting a job with size 5
    12: user 2/003: Done
    ...
Look at ps4-bad.txt. This is exactly what happens. There are three user threads and five printer threads. Initially, all of the printer threads block. At the 3 second mark, two user threads submit jobs, but only one printer thread (0) is signalled. Then, more jobs are put onto the print queue, but since njobs is greater than 1, no more printers get awakened. This is a bug.
Fixing this bug is simple (in ps5.c) -- simply remove the if statements around the pthread_cond_signal() calls.
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <pthread.h>
    #include "printqsim.h"

    typedef struct {
        Job **jobs;
        int head;
        int tail;
        int njobs;
        pthread_mutex_t *lock;
        pthread_cond_t *full;
        pthread_cond_t *empty;
    } Buffer;

    void initialize_state(SimParameters *p)
    {
        Buffer *b;

        b = (Buffer *) malloc(sizeof(Buffer));
        b->jobs = (Job **) malloc(sizeof(Job *)*p->bufsize);
        b->head = 0;
        b->tail = 0;
        b->njobs = 0;
        b->lock = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t));
        b->full = (pthread_cond_t *) malloc(sizeof(pthread_cond_t));
        b->empty = (pthread_cond_t *) malloc(sizeof(pthread_cond_t));
        pthread_mutex_init(b->lock, NULL);
        pthread_cond_init(b->full, NULL);
        pthread_cond_init(b->empty, NULL);
        p->state = (void *) b;
    }

    void submit_job(Agent *s, Job *j)
    {
        SimParameters *p;
        Buffer *b;

        /* get the sim parameters from the agent */
        p = s->p;

        /* get the queue from the sim parameters */
        b = (Buffer *) p->state;

        pthread_mutex_lock(b->lock);
        while(1) {
            /* if the job will fit */
            if (b->njobs < p->bufsize) {
                /* insert it at the head */
                b->jobs[b->head] = j;
                b->head = (b->head + 1) % p->bufsize;
                b->njobs++;

                /* signal anyone who is waiting */
                pthread_cond_signal(b->empty);

                /* leave the critical section */
                pthread_mutex_unlock(b->lock);
                return;
            } else {
                /* otherwise, wait until there is space and we are signaled to proceed */
                printf("%4d: user %2d blocking because the queue is full\n",
                       (int)(time(0)-p->starttime), s->id);
                fflush(stdout);
                pthread_cond_wait(b->full, b->lock);

                /*
                 * when I wake up here, I have the lock and I'm
                 * back in the critical section
                 */
            }
        }
        return;
    }

    Job *get_print_job(Agent *s)
    {
        SimParameters *p;
        Buffer *b;
        Job *j;

        /* get the sim parameters */
        p = s->p;

        /* get the buffer from the parameters */
        b = (Buffer *)p->state;

        pthread_mutex_lock(b->lock);
        while(1) {
            /* if there are jobs in the queue */
            if (b->njobs > 0) {
                /* get the one at the tail */
                j = b->jobs[b->tail];
                b->tail = (b->tail + 1) % p->bufsize;
                b->njobs--;

                /* signal any threads waiting because the queue is full */
                pthread_cond_signal(b->full);

                /* leave the critical section */
                pthread_mutex_unlock(b->lock);
                return j;
            } else {
                /* otherwise wait until there are jobs available */
                printf("%4d: prnt %2d blocking because the queue is empty\n",
                       (int)(time(0)-p->starttime), s->id);
                fflush(stdout);
                pthread_cond_wait(b->empty, b->lock);

                /*
                 * when I'm here, I've been signaled because there
                 * are jobs in the queue. Go try and get one
                 */
            }
        }
    }

This means that submit_job() always signals the empty condition variable, and get_print_job() always signals the full condition variable. This works fine: if there are no blocked threads, pthread_cond_signal() does nothing, and if, for example, a user thread is unblocked and there is no room on the queue, it will simply call pthread_cond_wait() again. Try it out. If you look at ps5-good.txt, you'll see the same scenario as in ps4-bad.txt at the 27 second mark, and that it is handled just fine.