Lab 5a: Hash Table Testing

Due: 5/3 11:59pm

1. Goals

Please read the instructions carefully.

This lab must be done solo. Pair programming is not allowed for this lab.

The purpose of this lab is to implement a simple version of a hash table using vectors to chain collisions in an array. Specifically, the hash table counts the occurrences of each unique word.

The hash key is a string and the value is a size_t variable representing the number of times that word has been added to the hash table. The size of the hash table and the hash function are given. Your goal is to complete the functions in WordCount.cpp according to the specifications.

This lab is different from previous labs because no test cases are provided. Your job is to test your code using our tddFuncs (or some other testing framework if you wish). Think of test cases that cover a variety of situations, including edge cases. Gradescope will test your code with hidden test cases. Your Gradescope submission will show whether your code compiles correctly, but the test results and score will stay hidden until after the deadline has passed.
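As a rough illustration of what a hand-written test can look like, here is a minimal sketch. It does not use tddFuncs or the WordCount class: countOccurrences, expectEqual, and runSketchTests are hypothetical names made up for this example, and countOccurrences is only a stand-in for whatever function you are testing. The point is the pattern of comparing an expected value against an actual value across normal and edge cases.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the code under test (NOT the WordCount API):
// counts how many entries of `words` are exactly equal to `target`.
std::size_t countOccurrences(const std::vector<std::string>& words,
                             const std::string& target) {
    std::size_t count = 0;  // initialized explicitly
    for (const std::string& w : words) {
        if (w == target) {
            ++count;
        }
    }
    return count;
}

// Hypothetical helper (NOT part of tddFuncs): reports one check and
// returns true when the actual value matches the expected one.
bool expectEqual(std::size_t expected, std::size_t actual,
                 const std::string& label) {
    const bool ok = (expected == actual);
    std::cout << (ok ? "PASSED: " : "FAILED: ") << label;
    if (!ok) {
        std::cout << " (expected " << expected << ", got " << actual << ")";
    }
    std::cout << '\n';
    return ok;
}

// Runs a few checks, including edge cases, and returns the number of failures.
int runSketchTests() {
    int failures = 0;
    const std::vector<std::string> words = {"cat", "dog", "cat"};
    if (!expectEqual(2, countOccurrences(words, "cat"), "repeated word")) ++failures;
    if (!expectEqual(1, countOccurrences(words, "dog"), "single word")) ++failures;
    if (!expectEqual(0, countOccurrences(words, "bird"), "absent word")) ++failures;
    if (!expectEqual(0, countOccurrences({}, "cat"), "empty input")) ++failures;
    return failures;
}
```

A test program would call runSketchTests() from main and could return its result as the exit status, so that a failing check is visible both in the output and to make.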

2. Getting Started

  • This lab must be done solo. Pair programming is not allowed for this lab.
  • There will be no opportunity for late submissions because the tests become visible after the deadline has passed. The submission window will close at the deadline, and we will not accept any submissions for this part after it.
  • I strongly encourage you to start early and not wait until the deadline is near. By starting early, you can seek assistance and guidance from our TAs, tutors, or instructor during lab sections and our office/open lab hours.

3. Copying some programs from my directory

Visit the following web link (you may want to "right click", or "control-click" on Mac, to open it in a new window or tab):

http://cs.ucsb.edu/~emre/cs32/code/lab5/

You should see a listing of several C++ programs. We are going to copy those into your ~/cs32/lab5a GitHub repo all at once with the following command:

cp ~emre/public_html/cs32/code/lab5/* ~/cs32/lab5a

Note: If you get an error message, refer to the corresponding instructions in previous labs.

After running this command, if you cd into the ~/cs32/lab5a directory and use the ls command, you should see several files: the same ones that you see if you visit the link above.

If you don't see those files, go back through the instructions and make sure you didn't miss a step. If you still have trouble, ask your TA or mentor for assistance.

4. Getting the code to pass the tests

In this week's lab, you have the following files:

  • WordCount.cpp
  • WordCount.h
  • tddFuncs.cpp
  • tddFuncs.h

Your job is to modify WordCount.cpp based on the specifications and thoroughly test your code. Do not modify WordCount.h; you must use the given array-of-vectors structure to implement your hash table. No Makefile or test applications are provided for this lab, so you must create your own. tddFuncs.* are provided for testing your code. Even though Gradescope will not run the tests you create, you should submit your test file(s) to Gradescope. Name this file lab5Test.cpp (if you wrote multiple test files, you can name them lab5Test01.cpp, lab5Test02.cpp, etc.).

Note that you need to write your own Makefile to compile your code and run your tests. You can use the Makefiles from previous labs to help you write your own.

4.1. Some notes about the hash table you will be implementing

  • There are many ways to implement a hash table as we discussed in lecture. For this lab, we will implement a hash table that is an array containing vectors of std::pair<std::string, size_t>, where the string is the hash table key and the size_t value is the number of times the key has been inserted into the table. In the case of collisions, you must insert the new pair at the end of the vector.
  • A simple hash function is given to you. Do not modify this hash function.
  • Hash keys are case insensitive.
    • "KeY" and "kEy" are considered the same key. When inserting into your hash table, you must convert every valid key to all lower case characters before hashing the key and updating the hash table.
  • Be sure to carefully read the comments in WordCount.h. Write your tests to make sure your functionality abides by the specification.
  • Be sure to initialize variables. Your code may pass all of your tests locally yet fail on Gradescope if you forget to initialize a variable: an uninitialized variable may happen to hold a convenient value on your computer but a different value on Gradescope.
  • Be careful about size_t types. Your code may compile, but you'll probably see a warning message in stderr if you mix size_t with int types in your code (for example, comparing an int loop index against a size_t size).
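The notes above can be sketched roughly as follows. This is not the WordCount class and does not follow the interface in WordCount.h: SKETCH_CAPACITY, sketchHash, toLower, and SketchTable are hypothetical names invented for this illustration, and the hash function here is an arbitrary placeholder rather than the one given in the lab. The sketch only shows the general shape of an array of vectors of std::pair<std::string, size_t>, with keys lower-cased before hashing, new pairs appended at the end of a bucket's vector, and size_t used consistently for indices and counts.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical capacity and hash function, for illustration only.
// The lab supplies its own, which must not be modified.
const std::size_t SKETCH_CAPACITY = 7;

std::size_t sketchHash(const std::string& key) {
    std::size_t sum = 0;  // always initialize accumulators
    for (unsigned char c : key) {
        sum += c;
    }
    return sum % SKETCH_CAPACITY;
}

// Returns a lower-case copy of `word`, so "KeY" and "kEy" become the same key.
std::string toLower(const std::string& word) {
    std::string result;
    for (unsigned char c : word) {
        result += static_cast<char>(std::tolower(c));
    }
    return result;
}

struct SketchTable {
    // An array in which each slot is a vector (chain) of (key, count) pairs.
    std::vector<std::pair<std::string, std::size_t>> buckets[SKETCH_CAPACITY];

    // Adds one occurrence of `word` and returns its updated count.
    std::size_t add(const std::string& word) {
        const std::string key = toLower(word);
        auto& chain = buckets[sketchHash(key)];
        // A size_t index matches the type returned by chain.size().
        for (std::size_t i = 0; i < chain.size(); ++i) {
            if (chain[i].first == key) {
                return ++chain[i].second;
            }
        }
        chain.emplace_back(key, 1);  // a new key goes at the end of the chain
        return 1;
    }

    // Returns how many times `word` has been added (0 if never added).
    std::size_t count(const std::string& word) const {
        const std::string key = toLower(word);
        const auto& chain = buckets[sketchHash(key)];
        for (std::size_t i = 0; i < chain.size(); ++i) {
            if (chain[i].first == key) {
                return chain[i].second;
            }
        }
        return 0;
    }
};
```

With this placeholder hash, "ab" and "ba" land in the same bucket, which makes them a convenient pair for checking that a colliding key is appended after the existing one. Your own tests would need colliding keys chosen for the hash function actually given in the lab.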

5. Submitting via Gradescope

The lab assignment "Lab 5" should appear in your Gradescope dashboard in CMPSC 32. If you haven't submitted anything for this assignment yet, Gradescope will prompt you to upload your files.

You will submit your WordCount.cpp implementation along with a lab5Test.cpp file containing your test application(s) (as previously stated, if you wrote multiple test applications, you can name them lab5Test01.cpp, lab5Test02.cpp, etc.). For this lab, you are required to submit your files through your GitHub repo.

As mentioned earlier, you will not know your Gradescope score until after the deadline has passed. If your submission passes the compilation checks, you will see the following:

Checking stdout from make -B -f Makefile.check (1.0/1.0)
Checking stderr from make -B -f Makefile.check (1.0/1.0)

This means your code compiled correctly and the test executables have been generated. If your submission does not pass these checks, your code did not compile correctly and you must fix the issues before resubmitting.

This lab (Lab5a) is worth a total of 50 points. If any tests fail for this lab, you will have an opportunity next week (Lab5b) to fix your code and pass all tests. Lab5b will also be worth 50 points (100 points total for both Lab5 parts). If you pass all tests this week, all you need to do for Lab5b is resubmit your code to get all the points.

Author: Mehmet Emre

The material for this class is based on Prof. Richert Wang's material for CS 32