### CS41 : Writing lists and using dictionaries

Two common data structures which are used as the basis of many others are the list and the dictionary. In this topic, we will see examples of both.

We are learning ...
So that we can ...
• Be familiar with the concept of the dictionary as a key : value pair
- Know some applications of dictionaries
- Create, access, add, remove and update data in a dictionary
• Be familiar with the concept of lists and linked lists
- Represent linked lists using pointers and arrays

In this section, we begin by looking at the first of a number of Abstract Data types (ADT) - dictionaries, linear lists and linked lists. These structures are fundamental in the implementation of other ADTs so you need to pay attention!

 Activity 1 Linear Lists (100)

We've met linear lists before when we have used arrays. A linear list stores elements in time order - the first item to be received is the first item in the last and the last item to be received is the last.

We should (hopefully) be experts at representing lists in Python by this stage. For a quick reminder, if you need it, practice creating an empty list in Python (using `myList = []`) and try to implement some of the following list operations ...
• `list.append(x)`
Add an item, `x`, to the end of the list.

• `list.extend(L)`
Extend the list by appending all the items in the given list, `L`.

• `list.insert(i,x)`
Insert an item, `x`, at a given position, `i`. The first argument is the index of the element before which to insert.

• `list.remove(x)`
Remove the first item from the list whose value is `x`.

• `list.pop([i])`
Remove the item at the given position, `i`, in the list, and return it. If no index is specified, `list.pop()` removes and returns the last item in the list. The square brackets indicate that the parameter is optional, not that it's a list.

• `list.clear()`
Remove all items from the list.

• `list.index(x)`
Return the index in the list of the first item whose value is `x`.

• `list.count(x)`
Return the number of times `x` appears in the list.

• `list.sort()`
Sort the items of the list in place.

• `list.reverse()`
Reverse the elements of the list in place.

• `list.copy()`
Return a shallow copy of the list.

The only operation in this list which may seem a little odd is the `list.copy()` method. Surely, making a copy of a list is easy : `listA = listB`! Why implement a `.copy()` method. Try the following task and prepared to be mind blown ...

 Task 1.1 Copying lists This exercise has quite deep implications for the way in which programming languages implement copying operations. You'll have to have your wits about you - think deeply ... Open up the Python programming environment, IDLE For this exercise, for consistency, we are going to use a `copy()` method from the copy module so that syntax is slightly different to the built in `.copy()` shallow copy method. Type the following commands at the prompt, pressing the  ENTER  key after each one. `>>> from copy import copy, deepcopy` `>>> child = ['a','b','c']` `>>> parent = [1,2,3,child]``>>> parentCopy = parent``>>> parentShallow = copy(parent)``>>> parentDeep = deepcopy(parent)` Before you try altering these 4 lists, create a suitable table in your notebooks which should have the following headings and content. Implement each of the operations given in the table. After each one, inspect the contents of the 4 lists you have created and write their contents in the table as well. Underneath the table, explain what has happened and why copying objects in Python is not as easy as you might have thought. You might want to use this website to help you (and probably will). Mind blown? OUTCOME : Explanation of the difference between simply copying an object and performing a shallow and deep copy of the same object.

Multidimensional lists

Linear lists an be 1, 2, 3 or more dimensional, like chickens (although I can't imagine a 4 dimensional chicken).

What about a zero dimensional chicken? Is it really there?

Lists in more than one dimension

This is not as easy as it looks. I've tried to represent as many dimensions as my tiny human brain will cope with as shapes so you can get a feel for 'where' the data lives in the list. This will help you to locate it. Never assume that you have done this right - always check it manually as well.

At the prompt : For each list dimension, create the list in IDLE using the code provided (you can copy and paste this) and then try to locate the items in the list using the syntax structure suggested. The data in the lists is sequential - you should be able to see the pattern in the numbers.

 Zero dimensional list `>>> myList = 1` Access the item in this list using `myList`. Some would represent this as a list with one item in it, but that's technically a one dimensional list with one item - not the same. One dimensional list `>>> myList = [1,2,3]` Access the items in this list using `myList[column]`. Remember that the indices start at zero, not one. Q : Locate the value '2' in this list Two dimensional list `>>> myList = [[1,2,3],``              [4,5,6],``              [7,8,9]]` Access the items in this list using `myList[row][column]`. Remember that the indices start at zero, not one. Q : Locate the item '6' in this listQ : Locate the item '4' in this list Three dimensional list `>>> myList = [[[1,2,3],` `               [4,5,6],` `               [7,8,9]],` `                      [[10,11,12],` `                       [13,14,15],` `                       [16,17,18]],` `                                 [[19,20,21],` `                                  [22,23,24],` `                                  [25,26,27]]]` Access the items in this list using `myList[sheet][row][column]`. Remember that the indices start at zero, not one. Q : Locate the item '10' in this listQ : Locate the item '3' in this listQ : Locate the item '25' in this list

In your notebooks : Make a note of both the structure of and the syntax required to access items from lists in one, two and three dimensions.

OUTCOME : Deeper understanding of and notes about multi-dimensional lists.

Other list operations

Apart from these data structured oriented list operations, there are also a copy of other standard list operations that are common in other programming environments. These are ...
• `head(list)`
Returns the first item in the list (as opposed to Pythons `list.pop()` which returns the last item).

• `tail(list)`
Returns the remainder of the list after the `head()` is removed.

• `empty()`
Returns `True` if the list is empty and `False` if it contains items.

Head, tail and list operations are easy to implement in Python using standard list slicing operations.

 Task 1.3 List operations Download and open the linearListOperations.py file using the Python IDLE programming environment. This contains three functions ... `def head(list): ``  return list[0]` ` def tail(list):``  return list[1:]` ` def empty(list):``  if len(list) == 0 : return True``  else : return False` Create a simple list which contains the following items ... `>>> L = ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Sheffield', 'Leeds']` Use the script and the implemented functions to determine the value of (yes, this looks a lot like recursion) ... `>>> tail(L)` `>>> head(tail(tail(tail(L))))` `>>> tail(tail(tail(tail(L))))` `>>> head(tail(L))` `>>> empty(head(tail(tail(tail(tail(L))))))` OUTCOME : Print out of the console output for 5 compound list operations and the script for your notebooks.

 Activity 2 Dictionaries (75)

When data is stored in a simple list (however many dimensions you use), there is no real connection between the items you have stored. This is where dictionaries come in useful. In Python, a dictionary has this structure ...

`dictionary`` = ``{``key``:``value``,``key``:``value``,``...``}`

... where the `key` is unique within the dictionary and can be any hashable, immutable type such as a number, string, tuple or an object. The `value` is bound directly to the key and is accessed through it rather than through some arbitary index value (which is why dictionaries are better than lists).

In fact, the `value` can itself also be a list of related data. Groups of related data which are tied to one, unique key value are called records and are the building blocks of databases. This gives us the opportunity to store `key``:record` pairs in our dictionary and create a simple, flat file database - nice!

You should have met simple dictionaries before so this first task will be revision.

 Task 2.1 Creating and using dictionaries Open up the Python programming environment, IDLE.For this task, you should create a blank word processed document with a suitable header and footer and use a combination of screenshots and written explanation to document what you have done and what you have learnt. Create a simple dictionary At the prompt : Copy the following dictionary, pressing  Enter  after every line. Press  Enter  twice at the end. `>>> ``definition = {` `        "qualitative":"involving distinctions based on qualities",` `        "calisthenics":"light exercises designed to promote general fitness",` `        "boardwalk":"a walkway made of wooden boards; usually at seaside",` `        "logo":"a company emblem or device",` `        "``backlash``":"``a movement back from an impact``"` `        }` Can see how the key (the word) is linked to its value (the definition)? At the prompt : Look up the definition for a word like you would look up a value in a list, except that this time, you don't use the numerical list index, you use the key - in this case the word. For example ... `>>> definition["boardwalk"]` Look up some of other words to find the definitions and document what you have done At the prompt : Suppose we want to iterate through our dictionary, printing out the word : definition pairs. We can do this a little like we iterate through a normal lists but with some subtle differences. Type the following commands, pressing  Enter  once after each line and twice at the end. `>>> for key, value in definition.items():` `        print('{0:<20} : {1}'.format(key,value))` Notice the `.items()` dictionary method needs to be included to make this work (you do not need this for iteration through a simple list). The first format specifier reserves a 20 character, left aligned space for the word - it just makes it neater. Create simple table of records At the prompt : In this case, the key will be a persons name and the value will be four characteristics of that person stored as a list. Using this example to help, store the following data in a dictionary. `>>> students = {` `        ``"Simon":[14,"Green","Dark Brown","ML"],` `        "Graham":[13,"Hazel","Blonde","FJ"],` `        ``your own record``,` `        ``your own record``,` `        ``your own record` ```        ````}` Did you *really* type '`your own record`' and get an error? If you did, go back and make up three more records to type into your dictionary! Duh! At the prompt : If we wanted to find Simon's eye colour (index 1 in the value list), we could type ... `>>> print(students["Simon"][1])` Look up some of Simon's other attributes and document what you have done. At the prompt: Suppose we wanted to iterate through our dictionary, printing out certain attributes from the record. Type the following commands, pressing  Enter  once after each line and twice at the end. `>>> for key,record in students.items():` `        print('{0:<12} : Age {1}'.format(key,record[0]))` Now try iterating through and printing out each of the persons Tutor group. Document what you have done. Now print out your word processed document for your notebooks. OUTCOME : Word processed document outlining how to create and access items in dictionaries.

I suppose that, in the above examples, the record itself could be a dictionary as well, though it would be more inefficient storing this as you would have the attributes names stored for every value in the record - not good. If you want to learn more about dictionaries, the best (but probably driest) place to start is the Python Data Structures reference.

Dictionary operations

Just like for lists, there are lots of special dictionary operations that you can perform using the standard Python library. Because you probably haven't used dictionaries much, I'm going to ask you to complete the following task formally by evidencing what you have done!

 Task 2.2 Dictionary operationsOpen up the Python programming environment, IDLECreate a blank word processed document with a suitable header and footer and use a combination of screenshots and written explanation to document what you have done and what you have learnt.At the prompt : Create a new dictionary object using the following code snippet (you can copy and paste) ...`>>> definition = {``        "qualitative":"involving distinctions based on qualities",``        "calisthenics":"light exercises designed to promote general fitness",``        "boardwalk":"a walkway made of wooden boards; usually at seaside",``        "logo":"a company emblem or device",``        "backlash":"a movement back from an impact"``        }`At the prompt : Now investigate the following dictionary operations, one at a time, recording what you have done and what you have learnt in your word processed document. For each operation, I have described the job it does and the syntax you are required to use. You have to come up with suitable operations to perform on this dictionary to illustrate the operation.`len(dictionary)`Returns the number of items in the dictionary.`dictionary[key] = value`Set `dictionary[key]` to `value` or create a new entry if `dictionary[key]` does not exist.`del(dictionary[key])`Removes `dictionary[key]` from the dictionary raising a `KeyError` if `key` is not in the dictionary.`key in dictionary`Returns `True` if `key` is in dictionary, `False` otherwise.`dictionary.clear()`Removes all items from the dictionary. Be careful!`dictionary.copy()`Returns a shallow copy of the dictionary (see list operations).`dictionary.get(key[,default])`Returns the value for the `key` if `key` is in the dictionary, else returns `default`. If `default` is not specified, returns `None` so that this method never raises a `KeyError`.`dictionary.items()`Returns a list of tuples containing the dictionary's items as `(key,value)` pairs. Used for iterating over dictionaries. You would usually implement this in a loop.`dictionary.keys()`Returns a list of `keys` from the dictionary. You would normally implement this in a loop.`dictionary.pop(key[,default])`Remove `key` and it's `value` from dictionary and return `value`, else return `default`. If `default` is not specified, a `KeyError` will be raised if the `key` is not in the dictionary.`dictionary.popitem()`Remove and return an arbitary `(key,value)` tuple. Will raise a `KeyError` if the dictionary is empty.`dictionary.setdefault(key[,default])`If `key` is in the dictionary, return it's `value`. If not, insert the `key` with a `value` of `default` and return `default` (`default` defaults to `None`).`dictionary.values()`Returns a list of `values` from a dictionary. You would normally implement this in a loop.Now print out your word processed document and stick it in your notebookOUTCOME : Deeper understanding of dictionaries and evidence for your notebooks.

In a linked list, the structure of the data does not necessarily reflect the way in which the data is stored in the computers memory.  A linked list uses the important concept of pointers, where a pointer is simply a number stored in memory that points to the location where another item of data is to be found.

Consider this simple, fixed length, linear list of names ...

If we wanted this list of names in alphabetical order, we could just perform a conventional sort algorithm on it but, for a large list or one which is updated often, this would be very resource consuming operation. Instead, we can construct a logical list within this physical one by introducing a start pointer which points to the first item in our sorted list and then add a next pointer to each record which points to the next item ...

The funny symbol at the end is called a null pointer and signifies the end of the list. Notice that I have used the zeroth position in the list to store this variable in - it makes sense. Basically, we have created a logical list like this without actually having to sort the existing items ...

Working through the linked items in a list like this is called list traversal

So, we know where to start looking at our list and where to stop looking at it, but where do we put a new item? We need another pointer for this - I call it a next free pointer - which marks the position of the first free cell in our list.

This essentially creates another, separate logical list, starting at position 6 and ending at position 10.

In fact, if we really wanted to, we could create any number of separate logical lists within this one, simply by adding an extra start pointer and next pointers in each record.

 Task 3.1 A linked list challenge Using the simple example given to help you, construct a single, 10 position, physical list containing four logical lists. One should be the free position list but the others should link together the following items in each record ... Name, Age, Salary ... where the Name logical list should be linked in alphabetical order (i.e. starting with earliest letter of the alphabet), the Age logical list should be linked in descending order (i.e. starting with the oldest person) and the Salary logical list should be linked in ascending order (i.e. starting with the lowest salary). You will therefore need four pointers in the zeroth position of the physical list rather than the two shown in the example. Store the following records in your list. Harold, 56, 32600 William, 24, 24500 Steven, 35, 26000 Leanne, 65, 14000 Benedict, 32, 54000 Sally, 45, 52700 Label your list carefully so it's easy to tell which start pointer is which. You may wish to use coloured pens / pencils to help to differentiate them. Also, write out all four logical lists beneath the physical list just to prove that you know what you have done! OUTCOME : Physical list and four logical lists. Unplugged. Boom!

Before we start, a word of warning. These are possibly the most complicated algorithms we've met so far. Don't be deterred however. Take your time and work it out. Before you start, download the linkedList.py file run it in IDLE.

Traverse / print a logical linked list

We use the start pointer to work out where to begin traversal and copy it into a follower variable called `currentNode`. Next, we check to see if `currentNode` is `None`, implying that the list was empty, and print a suitable error. If this is not the case, the list must have items in so we follow the pointers, printing out each node's `data` until the `pointer` is `None` indicating the end of the list!

`currentNode ← start`
`IF currentNode = None THEN`
`  OUTPUT 'List is empty'`
`END IF`
`ELSE`
`  WHILE currentNode <> None`
`    OUTPUT list[currentNode].data`
`    currentNode ← list[currentNode].pointer`
`  END WHILE`

 Task 3.2 Getting to know the algorithm I must presume that, at this stage, you have read the description above and looked carefully at the algorithm describing traversal of a logical linked list. If you haven't, go back and do it before you continue. Next, run the script (linkedList.py) and invoke the `.traverse()` method, using `myList.traverse()` and look carefully at the output. Try to work out what the algorithm has done and describe it to your shoulder partner. Next, listen to this audio script explaining how the algorithm works (headphones please!) Now use yEd (or an equivalent offline or online flowchart editor) to create a flowchart representing the algorithm. Print out your flowchart for your notebooks. OUTCOME : Flowchart representing the 'Traverse / print a logical linked list' algorithm.

Search for item in a logical linked list

The search method is designed mainly as a helper function for the delete method. The delete method requires that two pointers be returned from this search - a `previousNode` and a `currentNode` pointer. The first operation that the search method performs is to grab the value of the `start` pointer and store it in the follower variable `currentNode`.

The while loop searches through the logical linked list until the end of the list is reached, indicated by `currentNode` being `None`. If the search item is found, then we break out of the while loop and return the latest pointer values. Otherwise, we follow the pointers in the list, keeping track of the `previousNode` and `currentNode` values each time.

There are three different patterns for the `previousNode` and `currentNode` pointers depending on whether and where the search item is in the list :

`previousNode = None, currentNode > 0 : Item at start of list`
`previousNode > 0, currentNode > 0    : Item in middle or end of list`
`previousNode > 0, currentNode = None : Item not in list`

Any other method which uses this method will check the values of `previousNode` and `currentNode` and will be able to tell whether and where the item is in the linked list.

`current``Node ← start`
`previousNode`  None` `
`WHILE currentNode <> None THEN`
`  IF list[currentNode].data = item THEN`
`    BREAK`
`  previousNode = currentNode`
`  currentNode = list[currentNode].pointer`
`END WHILE`
`RETURN previousNode, currentNode`

 Task 3.3 Getting to know the algorithm I must presume that, at this stage, you have read the description above and looked carefully at the algorithm. If you haven't, go back and do it before you continue. Next, run the script (linkedList.py) and invoke the `.search()` method, using `myList.search(parameter)` with three different parameters ... `myList.search('Aaron')` `myList.search('Mark')` `myList.search('Harry')` Look carefully at the output. Try to work out what the algorithm has done and describe it to your shoulder partner. Next, listen to this audio script explaining how the algorithm works (headphones please!) Now use yEd (or an equivalent offline or online flowchart editor) to create a flowchart representing the algorithm. Print out your flowchart for your notebooks. OUTCOME : Flowchart representing the 'Search for item in a logical linked list' algorithm.

Insert an item in a logical linked list

Once again, we use the `start` pointer to work out where the beginning of the list is and copy it to a follower variable called `currentNode`. The obvious thing to do next, before we start trying to insert a new data item in the list, is to check whether it is full. We do this be inspecting the `nextFree` pointer to see whether it contains `None`, implying that we have run out of space, where we output a suitable error message. If this is not the case, the list must have some space free, so we insert the new data `item` at the position indicated by the `nextFree` pointer.

There are two cases which follow. First, we would always insert an item at the head of the list if either the list is already empty or the value of the item to be added is less than the data item at the first node. As we have already added the new data item to the list, all that is required is to set the relevant pointer values. In this case, we set the `start` pointer, the new node pointer and the `nextFree` pointer using a temporary variable called `tmp` to help us.

The only alternative remaining is that the new item is either within the current list or at the end of it. As we have already inserted the new item at the position indicated by the `nextFree` pointer, all that remains is to set the relevant pointer values. The `while` loop follows the existing pointers until it finds the correct place for the new data item and , using a temporary variable, sets the `nextFree` pointer and the two relevant node pointers.

`currentNode ← start`
`IF nextFree = None THEN`
`  OUTPUT 'List is full'`
`ELSE`
`  `list[nextFree].dataitem
`  IF (currentNode = None) OR (list[currentNode].data > item) THEN`
`    tmp ← nextFree`
`    nextFree ← list[nextFree].pointer `
`    list[tmp].pointer ← start`
`    start ← tmp`
`  ELSE`
```    WHILE (list[currentNode].pointer <> None) END           (list[list[currentNode].pointer].data < item)```
`      currentNode ← list[currentNode].pointer`
`    END WHILE`
`    tmp ← nextFree`
`    nextFree  ← list[nextFree].pointer `
`    list[tmp].pointer ← list[currentNode].pointer`
`    list[currentNode].pointer ← tmp`
`END IF`

 Task 3.4 Getting to know the algorithm I must presume that, at this stage, you have read the description above and looked carefully at the algorithm. If you haven't, go back and do it before you continue. Run the script (linkedList.py) and invoke the `.insert()` method, using `myList.insert('Harry')`. Print out the raw list using the command `print(myList)` (which actually invokes the `__str__` magic method) followed by `myList.traverse()` which will print out the linked list in alphabetical order by following the pointers. Ensure that 'Harry' is inserted in the correct place in the list. Try to work out what the algorithm has done and describe it to your shoulder partner. Next, listen to this audio script explaining how the algorithm works (headphones please!) Now use yEd (or an equivalent offline or online flowchart editor) to create a flowchart representing the algorithm. Print out your flowchart for your notebooks. OUTCOME : Flowchart representing the 'insert an item in a logical linked list' algorithm.

Deleting an item from a logical linked list

We use the `search()` helper function to find the values of the `previousNode` and the `currentNode` pointers for the search item. As mentioned previously, there are three situations which might occur :

`previousNode = None, currentNode > 0 : Item at start of list`
`previousNode > 0, currentNode > 0    : Item in middle or end of list`
`previousNode > 0, currentNode = None : Item not in list`

The function first checks `currentNode` for `None`, which implies the item is not in the list, and returns `False` in that situation. If this is not the case, the item must be somewhere in the list. Next, we check `previousNode` for `None`, which would imply the the item is at the start of the list and, depending on the result of this, we either set the `start` pointer or the `previousNode`s pointer to skip over the deleted node.

Finally, we clear the `currentNode` data item, set the deleted nodes pointer to point to the `nextFree` node and set the `nextFree` pointer to point to the deleted node, so we can use it again.

previousNode, currentNode = search(item)
IF currentNode = None THEN
RETURN False
ELSE
IF previousNode = None THEN
start = list[currentNode].pointer
ELSE
list[previousNode].pointer = list[currentNode].pointer
list[currentNode].data = ''
list[currentNode].pointer = nextFree
nextFree = currentNode

 Task 3.5 Getting to know the algorithm I must presume that, at this stage, you have read the description above and looked carefully at the algorithm. If you haven't, go back and do it before you continue. Run the script (linkedList.py) and invoke the `.delete()` method, using `myList.delete(parameter)`. with three different parameters ... `myList.delete('Aaron')` `myList.delete('Zak')` `myList.delete('Berty')` Each time you invoke the .delete() method, issue both a print(myList) and myList.traverse() commands to inspect the state of the list in each case. Try to work out what the algorithm has done and discuss it with your shoulder partner. Now that we have got some space in our list, let's try inserting two new values. Invoke the `.insert()` method using `myList.insert(parameter)` with two different parameters ... `myList.insert('Andrew')` `myList.insert('Zeb')` ... which will test the `.insert()` method to ensure that it updates the correct pointers. Each time you invoke the `.insert()` method, issue both the `print(myList)` and `myList.traverse()` commands to inspect the state of the list in each case. Try to work out what the algorithm has done and discuss it with your shoulder partner. Next, listen to this audio script explaining how the algorithm works (headphones please!) Now use yEd (or an equivalent offline or online flowchart editor) to create a flowchart representing the algorithm. Print out your flowchart for your notebooks. OUTCOME : Flowchart representing the 'delete an item in a logical linked list' algorithm.

The Heap

Linked list implementations in real life are not fixed length. They are dynamic data structures rather than fixed length data structures and the only thing that would limit the length of the list was the available memory on the computer system. In practice, the next free position for a list comes from an area of memory called the heap which is maintained by the operating system.

• In practice, memory for linked lists is not allocated in advance of initialisation of the data structure.
• New memory for nodes is allocated dynamically from available memory locations stored on the Heap.
• Special dynamic pointers are used instead of the pointers we have seen which point to memory locations rather than the index in the list.
• When items are deleted, memory is freed. If it is not, memory leakage occurs.

 Task 3.6 Struggling a bit Struggling a bit for an activity on this since a) Python doesn't really do heaps and b) discussions about the heap tend to be a little to advanced for A Level. Therefore, just make some notes. OUTCOME : Notes about the heap.

 Extension Activities

Recursive list printing

Look back at the topic called For recursion, see recursion. You'll find that there is a recursive algorithm called printList.py which implements a recursive print and reverse print algorithm. Use it to print the in order and reverse order values from the following list ...

`norseGods = ['Odin', 'Thor', 'Loki', 'Freyr', 'Vanir']`

 What's next? Before you hand your book in for checking, make sure you have completed all the work required and that your book is tidy and organised. Your book will be checked to make sure it is complete and you will be given a spicy grade for effort.

END OF TOPIC ASSESSMENT