Lab 8 - Threading Activity with uthash
Goals for this lab
- Gaining a hands-on experience with
uthash
, a header-only library providing hash tables. - Practicing multi-threading in C including starting the threads with arguments, and waiting for their finish.
- Practicing thread synchronization in a parallel processing problem.
- Using the thread sanitizer and detecting race conditions.
Prerequisite Skills for This Lab
- Read about the
uthash
library. - C Skills
- Able to use structs, arrays, pointers
- Able to compile, run, and debug C programs using CMake.
Task 0: Download & Run
- Clone the Lab 8 starter project
- Once you have the files, you are welcome to copy them elsewhere, such as your course-work repo.
- Read
lab8.c::main()
to see what the program does (you don't have to implement any code yet). - Compile and run the program.
Task 1: Sort Table
- Sort the table before printing.
- See the uthash docs for information on sorting.
- Find the code in
main()
which calls the function to print the counts to the screen. - Before calling the function to print, call the uthash function to sort the data.
- Hint: The
sort_func()
function is already written for you! You just need to pass it and theword_map
to the correctuthash
function.
- Output should be:
$ ./main Word Count brown 2 dog 1 fox 2 jumps 1 lazy 1 over 1 quick 1 the 4
Task 2: Implement count_words_parallel(...)
Next, we'll get the program to count the words in parallel on multiple threads.
- Call
count_words_parallel(...)
- In
main()
, it currently callscount_words_seq()
. Comment out this line, and uncomment the line callingcount_words_parallel()
.
- In
- Complete the partially implemented function
count_words_parallel()
:- Initialize the
count_mutex
; it will be shared between all threads to synchronize access to the shared hash-table. - Inside the loop:
- Initialize the
threads_args[]
value for the current thread. Call thepack_args()
function to create the necessary data structure and fill in the values. - Launch the thread by calling
pthread_create(..)
. Pass in:- the thread function
counter_thread_func
- pointer from the current element in
threads_args[]
(passed to the thread function bypthread_create()
).
- the thread function
- Store the thread ID into the correct location in
threads[]
.
- Initialize the
- Wait for each of the threads to finish.
- Hint:
pthread_join()
.
- Hint:
- Cleanup all dynamically allocated memory.
- Hint: Look at
threads_args[]
.
- Hint: Look at
- Initialize the
- Test your program
- Note: It is not yet thread safe; we'll look into that next. Does it still work?
Task 3: Thread Sanitizer
- Enable the thread sanitizer and check the thread-safety of the implementation.
- You'll need to change
CMakeLists.txt
to enable it. - Hint: add the following after the
project
definition in yourCMakeLists.txt
.add_compile_options(-fsanitize=thread -g -O1) add_link_options(-fsanitize=thread -g -O1)
- You'll need to change
- Run the program.
- You should see output similar in format to:
================== WARNING: ThreadSanitizer: data race (pid=10434) Read of size 8 at 0x7ffd2093e8b8 by thread T2: ... Previous write of size 8 at 0x7ffd2093e8b8 by thread T1: .. Location is stack of main thread. Location is global '<null>' at 0x000000000000 ([stack]+0x00000001e8b8) ..
- You should see output similar in format to:
Task 4: Thread-Safe
- Make
add_word_counts_in_chunk(...)
thread-safe.- The function already takes in a mutex: just lock and unlock it as needed.
- Hint: Don't lock/unlock for the whole function, just smaller part.
- Hint: Lock at start of each pass through the loop; unlock at end of each pass through the loop.
4. Optional Challenges
- Optional: Lots of Data
- Change the program to process a lot of data, such as full text of hamlet!
- Optional: Thread Analysis
- Manually time and compare how long it takes to process the data with different numbers of thread: one, two, and three threads.
- How much time does the added complexity of running the operations in parallel save us?
- Optional: Generate data
- Write some code to generate a table of data showing how long it takes to process Hamlet for each number of threads between 1 and about 10. This could be a script or C code.
- Print the result to the screen and copy it into a program like LibreOffice Calc to graph.
- Try running each configuration multiple times and comparing the differences.
Submission
Submit your lab8.c
C code to CourSys; the file name must be an exact match to what CourSys is expecting, otherwise it won't accept it.
Submissions will be marked for completion. It must be valid C code that runs (however we are unlikely to actually compile and run the code). You do not need to complete any optional steps.