Lab 12 - Searching, Timing, Recursion


Directions for Labs


  • Submit your completed lab file online to CourSys (not Canvas). Labs are marked on completion, not correctness, so complete each part to the best of your ability and learn!
  • It is recommended that students attend in-person lab sections for lots of help!
    • If you would like to attend an in-person lab section, you are welcome to come to any (or more than one) lab section.
    • There is no need to attend the section you are enrolled in.
    • There is no need to attend any lab sections: there is no attendance taken.
    • You can complete the lab on your own time or own computer if you like.
  • While completing these labs, you are encouraged to help your classmates and receive as much help as you like. Assignments, however, are individual work and you must not work with another person on assignments.


Overview


We'll create a program that helps the user play the letter/word game Scrabble. We'll read a large list of words from a file and ask the user what letters they have to spell a word. The program will find all possible words they can spell!

Read words


  1. Inside your cmpt120 folder, create a new folder for lab12 and make a lab12.py file.
  2. From the course website, download the randwordlist.txt file into your lab12 folder.
    • Look at the file; what do you notice about the order of the words?
  3. In your Python file, read in all the words into a list.
    • You will want to open the file relative to your current folder. The following code will be helpful:
      import pathlib     # PUT THIS LINE AT THE TOP OF THE FILE
      folder_of_code = pathlib.Path(__file__).parent.resolve()
      file = open(f"{folder_of_code}/randwordlist.txt")
    • Read all the data at once from the file into a list. You can use file.readlines() to return exactly this list!
  4. Print the first 10 elements of the list.
    • Do you see an issue with these words?
  5. Create a well-named function which accepts a list of strings and returns a new list of strings containing each of the original strings, but stripped of any whitespace at the start or end.
    • Notice that you are passing in the list of words as an argument, not from input().
    • Notice that it needs a new list: you are not changing the existing list.
    • Since it's a list, we could have decided to change the original list (pass by reference / alias).
  6. In the main area, call your function to strip the list of words you previously read from the file before printing the first 10 words.
    • Change your main code to call your strip function before the above call to print the first 10 words.
  7. Sample output:
    Lab 12: Searching!
    ['cargos', 'puritanically', 'niter', 'kingliness', 'mainbrace', 'disney', 'chauvinism', 'halophyte', 'vocationalism', 'parrocked']


Get User's Letters & Permutations


In the game of Scrabble, the player has letter tiles to try and spell a word.

  1. Ask the user to type in the letters they have in the game.
    • User can type in any sequence of characters (a string).
    • Duplicate letters are OK.
    • If they type in more than 7 characters, reject it and ask them to try again (loop!).
    • Don't bother checking if they enter characters other than a-z; we're just writing a lab!
  2. Compute a list of all possible permutations (re-orderings) of the user's letters:
    • We'll use a built in permutation function which, given a string or list, returns a list of tuples.
    • This code will give us duplicate strings if there are duplicate letters in the user's input. That's OK for now. See optional challenges below.
    • Here is the code you'll need (assuming letters is the string of 7 or less characters)
      import itertools    # PUT THIS LINE AT THE TOP OF THE FILE
      ...
      letter_permutations = []
      for i in range(1, len(letters) + 1):
          tuples = list(itertools.permutations(letters, i))
          for tup in tuples:
                  letter_permutations.append("".join(tup))
  3. Print a summary of how many permutations were found, such as:
    "There are 15 different permutations of your 3 letters."
    • For debugging, try printing out the list of permutations and test with 1, 2, 3, or 4 letters.
  4. Sample output:
    Lab 12: Searching!
    ['cargos', 'puritanically', 'niter', 'kingliness', 'mainbrace', 'disney', 'chauvinism', 'halophyte', 'vocationalism', 'parrocked']
    Enter your letters: aent
    There are 64 different permutations of your 4 letters.      


Linear Search


  1. Create a linear_search(...) function which is passed the list of words and one word to search for.
    • Return True if the word is in the list; False otherwise.
    • Make the function implement linear search.
    • You may use in for a for loop, but you may not use it to search the list.
  2. Make your main program check which of the letter permutations are actually words:
    • For each permutation, use linear_search(...) to check if the permutation is in the list of words.
    • Create a list words that the letters can make.
  3. Print the list of words in reverse:
    • Use my_list.reverse() to reverse your list of words.
    • Then print out all the words.
  4. Time how long it takes for your main program to run the search over all the words:
    • Compute the time difference between when you start searching and when you are done searching:
      import time      # PUT THIS LINE AT THE TOP OF THE FILE
      ...
      # Place this in main code:
      start = time.time()
      # Your searching code / loop here
      ...
      # Done!
      time_s = time.time() - start
      print(f"It took {time_s}s to find them all.")
  5. As you find words, tell the user of your progress by printing a message every time you have found 20 more words.
    • Hint - Math What mathematical operation allows you to check if the length of a list is divisible by 20?
    • Hint - Math answer Use mod (modulus): if len(my_list) % 20 == 0: ...
  6. [Optional Challenge] Try and find seven letters which make the most words.
    • Try a few options and see what you can come up with!
  7. As you run your program now, it might look like:
    Lab 12: Searching!
    Enter your letters: aent
    There are 64 different permutations of your 4 letters.
    ...
    Using Linear Search
    ...found 20 options so far.
    Found 32 possible words!
    Found tean
    Found tane
    Found neat
    Found etna
    Found ante
    Found ten
    Found tea
    Found tan
    Found tae
    Found net
    Found nat
    Found nae
    Found eta
    Found eat
    Found ean
    Found ate
    Found ant
    Found ane
    Found te
    Found ta
    Found ne
    Found na
    Found et
    Found en
    Found ea
    Found at
    Found an
    Found ae
    Found t
    Found n
    Found e
    Found a


Binary Search


  1. Implement a recursive binary_search(...) function which accepts a list of words and a single word. Using the binary search algorithm, return True if the word is in the list, otherwise return False.
    • You can use the code shown in lecture; however, I encourage you to try implementing your own recursive function using these hints to help you!
    • Hint - 1st line Be sure to get the function's first line correct: def binary_search(data, word):
    • Hint - Recursive plan Plan out what your base case and recursive step will be.
    • Hint - Base case Your base case is when the list is empty, return False.
    • Hint - Recursive step Find the mid point, and check if it's the target word (return True), or if the target word is before the mid-point (recurse on left half of list), or if the target word is after the mid-point (recurse on right half of list).
    • Hint - Mid point code The mid index is: mid_idx = len(data) // 2, and so the middle word is then mid_word = data[mid_idx]
    • Hint - Recurse left if mid_word > word: return binary_search(data[:mid], word)
    • Hint - Recurse right if mid_word < word: return binary_search(data[mid+1:])
  2. Change your code from the above section to use binary_search(...) instead of linear_search(...).
    • You'll need to sort the list of words first; words.sort().
    • Test your code!
  3. Now see what happens when you use binary_search() without first sorting your data:
    • Remove the call to words.sort() and see how many words are found. Is it correct?
    • Put the sort back in.
  4. Record timings:
    • Using binary_search(...) record how long it takes to search for the letters sached
    • Using linear_search(...) record how long it takes to search for the letters sached
    • Note: The overhead of our lists are actually very important to the timing of this code: our code is not very efficient! Normally binary search will be vastly faster as the size of the data increases. For example, see the dictionary optional-challenge below; it performs much better!
  5. Understanding Questions: add comments to your python code to answer:
    • Explaining why binary search needs sorted data, and how it performed on un-sorted data.
    • List how long it took each search algorithm to find all words for sached.


Optional Challenges


  1. In addition to trying linear_search() and binary_search(), try using a Python dictionary.
    • Hint - Populate Instead of using either search algorithm, you'll use in. First, you'll need to populate the dictionary with keys and values.
    • Hint - Key For each word in the word list, add that word as a key in the dictionary.
    • Hint - Value It turns out we don't care what the value is; as long as it's in the dictionary is all we need. I just set each value in the dictionary to be True
    • Repeat the previous test of searching but now using the dictionary and the Python in keyword.
    • Time it; how long does it take to search for the letters sached? How does this compare? WOW!
  2. Allow the user to enter more than 7 characters (but still cap at a specific max).
    • How many permutations are there with 8, 9, 10, ... characters?
    • How long does it take to check them? (You may not want to let it run that long!)
  3. Change the code that generates all the possible permutations to reject any identical duplicates.


Submission


Submit your lab12.py file to CourSys by Sunday midnight. Remember you can submit up to 3 days late with no penalty if needed (but please plan to get it done on time to stay up to date on the course!)

Topics Covered

  • Reading all text from a file
  • Creating functions to process lists of strings.
  • Creating recursive functions
  • Searching:
    • Linear search
    • Binary search
    • Using in
  • List manipulation
    • my_list.sort()
    • my_list.reverse()
    • Using slices