### CS16 : Sorting and Searching

Sometimes, there is no point in reinventing methods of solving problems. There are a number of 'standard algorithms' which can be adapted to suite all variety of different problems. In this section, we look at standard sorting and searching algorithms.

 Learning OutcomesIntroduction to Computer Systems

We are learning ...
• About standard sorting and searching algorithms
So that we can ...
• Describe the principles of operation of standard sorting algorithms
- bubble sort
- merge sort
- insertion sort
• Describe the principles of operation of standard searching algorithms
- linear search
- binary search
• Compare and contrast searching and sorting algorithms
- there is a choice!
- influenced by data structures and data values

Visualization and Comparison of Sorting Algorithms (4:25)
Apologies for the horrible sound track ...

 Activity 1Bubble sort

Click to engage

A bubble sort (or sinking sort) is so called because the 'lighter' items in an unordered list of values 'bubble' to one end of the list whilst the 'heavier' items 'bubble' to the other. As you can see from the animation, the effect of this algorithm is to find the largest item and move it up to the top of the list. After every pass, there is no need to check the largest item again because it's always in the right place.

AlgoRhythmics Bubble Sort (5:16) - Follow one dancer at a time

Task 1.1 Playing card bubble sort
Where you simulate a bubble sort using playing cards

Get a suit of playing cards

Get one suit of 13 playing cards - from 'Ace' to 'King'.

What the ordered list should look like

Shuffle the suit

Shuffle the suit and lay them face down in a line on the desk in front of you.

Eek! The list is unordered - but where are the cards?

Bubble sort!

Follow the algorithm carefully to bubble sort the cards into ascending order. Don't cheat - you are only allowed to look at two cards at a time or else you'll spoil the magic. Your teacher will demonstrate this to you as well.

Bubble Sort Algorithm

There are two ways of doing this - by doing something until a condition is true ...

 Remember 'n' to be the total number of cardsStart at the left hand side of the spreadRepeat until I do not swap cards or I have no more cards to check    For every card in the list from the start up to two less than 'n'        Turn over that card and the one to it's right        If the value on the left hand card is greater than the value on the right hand card            Swap the two cards over            Turn the cards face down            Set a flag to indicate that a swap has occurred    Take one off the value of 'n'

... or doing something while a condition is true ...

 Remember 'n' to be the total number of cardsStart at the left hand side of the spreadPretend that two cards have just been swappedWhile I have just swapped two cards    Forget that two cards have been swapped    For every card in the list from the start up to two less than 'n'        Turn over that card and the one to it's right        If the value on the left hand card is greater than the value on the right hand card            Swap the two cards over            Turn the cards face down            Remember that two cards have been swapped    Take one off the value of 'n'

... it doesn't really matter which you choose - the outcome is the same. Try following either of these algorithms really carefully. You might want to ask your shoulder partner to give you the instructions whilst you carry them out. When you have finished, turn over all the cards and be amazed! They are in ascending order!

Where we learn how to represent a bubble sort using pseudocode

You will (probably) need to remember the pseudocode version for your exam! Nightmare!

In your notebooks / on paper

Copy the pseudocode algorithm below by hand into your notebooks / on paper. In this algorithm ...
• `n` is the length of the sequence
• `i` is the loop index
• `t` is the temporary variable used to enable the swap
Note that the `FOR` loop runs from `0` through `n-2` because the list indices start at `0` and run to `n-1` and we ignore the last item anyway so we have to stop at `n-2`. This always causes confusion!

`SET n ← length(sequence)`
`SET swap ← true`
`WHILE swap = true`
`  SET swap ← false`
`  FOR i ← 0 TO n - 2`
`    IF sequence[i] > sequence[i + 1] THEN`
`      SET temp ← sequence[i]`
`      SET sequence[i] ← sequence[i + 1]`
`      SET sequence[i + 1] ← temp`
`      SET swap ← true`
`    END IF`
`  NEXT i`
`  SET n ← n - 1`
`END WHILE`
You can perform a bubble sort with a `repeat ... until` loop but Python doesn't have one 😢

Flowchart it!

Use your favourite search engine to look for a 'bubble sort flowchart' - look carefully at the ones you have found and compare them to the algorithm above. When you have found one which looks the same or similar, print a copy out for your folders - remember to write your name on it and make sure it's got a suitable title.

Watch the animation

Watch the animation carefully. It shows an example of a bubble sort using the algorithm you have learnt.

Bubble sort step by step (click to advance the animation)

Trace the operation of a bubble sort

On paper, use a bubble sort algorithm to rearrange the following unordered list into ascending order, showing the list of numbers after each "pass". For each pass, state the number of comparisons you have made and the number of swaps you have made. To make this task easier, you could use playing cards or numbers on pieces of paper.

7,2,5,9,3

Consider laying out your work in a trace table like this (I've done the first one for you) ...

In your notebooks or on paper, answer the following questions in full sentences ...
1. After the first pass, one number will definitely be in the correct position and need never be checked again. Which number is it and what position is it in?
2. What happens to the number of comparisons on each pass?
3. There are two situations when the bubble sort algorithm will stop - what are they?
4. Write down a list of five numbers which will only require one pass (four comparisons) and no swaps. This is known as the best case.
5. Write down a list of five numbers which represent the worst case. How many passes, comparisons and swaps will you need in total now?

6. What is the maximum (worst case) number of comparisons and swaps for a list of 10 numbers?
7. What is the mathematical connection between the number of items in the list and the worst case behaviour of the sorting algorithm (passes, comparisons and swaps)?
8. Now that you have worked out the best and worst case, what is the average case for passes, comparisons and swaps?

Where we learn how to implement a bubble sort using Python

Of course, there is little point talking about Bubble sort unless we can implement it in Python! Let's go ...

Open up the Python programming environment and create a new, blank script. Save this in a suitable place in your documents and give it a suitable filename (for instance, `bubbleSort.py`).

Programming challenge

Carefully type in the following *unfinished* code into the script.

`import random`

`# Title`
`print('Bubble sort')`

`# Generate a random sequence of 16 integers`
`sequence = random.sample(range(100),16)`

# Print out unordered sequence
print('Unordered :',sequence)

# Start bubble sort

`# Finish bubble sort`

`# Print out the ordered sequence`
`print('Ordered   :',sequence)`

Use the pseudocode algorithm in Task 2.2 to complete the Python script between the comments! Don't give up if it doesn't work first time - you will probably type in the pseudocode expecting it to run, but it won't, however, the pseudocode is very similar to the Python code ... so ...

When you run the script (by choosing 'Run > Run Module' or pressing 'F5') you should get something like ...

Print the script

Print the script out in colour using Notepad++, write your name on the printout and annotate the script to explain how it works. Hand it into your teacher for assessment.

 Activity 2Insertion sort

Click to engage

An insertion sort looks at each item in the unordered list and works back through the list to insert it in its rightful place. When you find its place, you insert it and start to work with the next item in the unsorted portion of the list until you have worked all the way through the list.

AlgoRhythmics Insertion Sort (4:04)

Task 2.1 Playing card insertion sort
Where we learn how to simulate an insertion sort using playing cards

Get a suit of playing cards

Get one suit of 13 playing cards - from 'Ace' to 'King'.

What the ordered list should look like

Shuffle the suit

Shuffle the suit and lay them face down in a line on the desk in front of you.

What? The list is unordered again - but how do I get them back in order?

Insertion sort!

Follow the algorithm carefully to insertion sort the cards into ascending order. Don't cheat - you are only allowed to look at two cards at a time or else you'll spoil the magic. Your teacher will demonstrate this to you as well.

Insertion Sort Algorithm

 For each card starting at the second card from the left to the last card on the right    Turn over the card and hold this one in your hand over the space    Repeat until I reach the left most card or a card which is smaller than the card I am holding         Turn over the card immediately to it's left        Move that card one position to the right        Move the card I am holding one position to the left    Place the card face down in the space

When you have finished, turn over all the cards and be amazed! They are in ascending order!

Where we learn how to represent an insertion sort using pseudocode

You will need to remember the pseudocode version for your exam! Super nightmare!

In your notebooks / on paper

Copy the pseudocode algorithm below by hand into your notebooks / on paper. In this algorithm ...
• `i` is the loop index
• `item` is the item we are trying to insert
• `current` is the next position to check
• `placed` is a 'sentinel' which indicates whether the item has been placed.
Using the Boolean sentinel is one way of implementing this algorithm. There are simpler ways but this is the safest.

`FOR i ← 1 TO length(sequence)-1`
`  SET `item ← sequence[i]
`  SET `current ← i - 1
`  SET `placed ← false
`  WHILE (current >= 0) AND NOT placed    IF sequence[current] > item:`
`      SET sequence[current + 1] ← sequence[current]`
`      SET current ← current - 1    ELSE      placed = true    `END IF
END WHILE
SET sequence[current + 1] ← item
`NEXT` i

Flowchart it!

Use your favourite search engine to look for a 'insertion sort flowchart' - look carefully at the ones you have found and compare them to the algorithm above. When you have found one which looks the same or similar, print a copy out for your folders - remember to write your name on it and make sure it's got a suitable title.

... in the amazing Flowgorithm ...

Watch the animation

Watch the animation carefully. It shows an example of a insertion sort using the algorithm you have learnt. I have used the same short list of numbers from the bubble sort so you can compare them more easily.

Insertion sort step by step (click to advance the animation)

Trace the operation of the insertion sort

On paper, use an insertion sort algorithm to rearrange the following unordered list into ascending order, showing the list of numbers after each "pass". For each pass, state the number of comparisons you have made and the number of swaps you have made. To make this task easier, you could use playing cards or numbers on pieces of paper.

7,2,5,9,3

Consider laying out your work in a trace table like this (I've done the first one for you) ...

In your notebooks or on paper, answer the following questions in full sentences ...
1. At the start of an insertion sort, which element of the unordered list is actually in order?
2. What happens to the number of comparisons you need to make as you work through the unordered list?
3. Write down a list of five numbers which requires just 4 comparisons and no swaps. This is known as the best case of the algorithm.

4. Write down a list of five numbers which represents the worst case. How many comparisons and swaps would you need to carry out for this situation?
5. Work out the worst case for a list of ten numbers.
6. What is the mathematical connection between the number of items in the unordered sequence and the maximum (worst case) number of passes, comparisons and swaps?
7. What is the average case for passes, comparisons and swaps?

Where we learn how to implement an insertion sort using Python

Again, there is little point knowing the theory if we can't implement the algorithm in Python! Let's go ...

Open up the Python programming environment and create a new, blank script. Save this in a suitable place in your documents and give it a suitable filename (probably '`insertionSort.py`').

Programming challenge

Carefully type in the following *unfinished* code into the script.

`import random`

`# Print title`
`print('Insertion sort')`

`# Generate a random sequence of 16 integers`
`sequence = random.sample(range(100),16)`

`# Print out unordered sequence`
`print('Unordered :',sequence)`

`# Start insertion sort`

`# End insertion sort`

`# Print out the ordered sequence`
`print('Ordered   :',sequence)`

Use the pseudocode algorithm in Task 3.2 to help you to complete the Python code between the comments. Remember that the algorithm is not written in Python - you have to convert it into Python for it to work.

When you run the script (by choosing 'Run > Run Module' or pressing 'F5') you should get something like ...

Use Notepad++ (or equivalent) to print out your script in colour. Make sure you put your name on it and annotate the script to explain how it works. Hand it in to your teacher for assessment.

 Activity 3Merge sort

Click to engage

Merge sort is a funny algorithm to study at this level because it uses a technique called recursion which you wouldn't really learn until Stage 5, so I'm not going to expect you to understand the mechanics of the pseudocode version or certainly to implement a Python version of it.

A merge sort is in a class of algorithms known as divide and conquer because it relies on decomposing a larger problem into smaller ones until each smaller problem is trivial to solve.

It relies on breaking a list of 'n' items into 'n' single-item, ordered (!) sequences and then merging the sequences back together again, choosing the correct value from the available sublists. The animation above may make it clearer or you might want to have a little dance ...

AlgoRhythmics Merge Sort (4:17)

Task 3.1 Playing card merge sort
Where we learn how to simulate a merge sort using playing cards

Get a suit of playing cards

Get one suit of 13 playing cards - from 'Ace' to 'King'.

Ah, yes - an ordered list of cards

Shuffle the suit

Shuffle the suit and lay them face up in a line on the desk in front of you.

At least I can see where the cards are now! Your deck won't be the same as mine though ...

Merge sort!

Follow the algorithm carefully to merge sort the cards into ascending order. It's important that you can see the cards this time - it will make the algorithm easier to follow. Your teacher will demonstrate this to you as well.

Merge Sort Algorithm

 Repeat  Split the unordered deck in half into two subdecksUntil the length of each subdeck is oneWhile there are still unmerged subdecks  Choose a pair of subdecks  For each card in both subdecks    Choose the next lowest card and place it in a new subdeck

When you have finished, turn over all the cards and be amazed! They are in ascending order!

Task 3.2 Tracing the operation of a merge sort
Where we learn how to trace the operation of a merge sort

The merge sort uses a technique called recursion which is well suited to problems like this. Since recursion can be a little tricky to get your head around, we'll not cover this as a programming technique in detail until Stage 5. We will trace the operation of the merge sort in a different way than the bubble or insertion sorts but still we will keep track of the stages and operations like we did before.

In your notebooks / on paper

As the merge sort uses a recursive algorithm, the pseudocode version would require some quite advanced techniques. It's enough, at this stage to recall the 'structured English' version. Copy the following into your notebooks / onto paper by hand.

`REPEAT`
`  Split the unordered list in half into two sublists`
`UNTIL the length of each sublist is one`

`WHILE there are still unmerged sublists`
`  CHOOSE a pair of sublists`
`  FOR each value in both sublists`
`    CHOOSE the next lowest value and place it in a new sublist`

Watch the animation

Watch the animation below carefully. You should clearly see the two stages in the merge sort.

Merge sort step by step (click to advance the animation)

Represent a merge sort on paper

Using a suitable diagram on paper, describe the stages in the merge sort of the following unordered sequences. For each step in each sequence, make a note of the number of operations including splitting the original sequences and comparing and assigning items in the subsequence to the new sequence.

5,8,7,1,2,6,3,4

4,6,3,7,5,1,2

1,5,4,2,6,3

 Activity 4Comparing sorting algorithms

There is no cut and dried answer to 'which sorting algorithm should I use'. I totally depends on both the size and the initial state of the unordered sequence ...
• Reversed : The worst case where the numbers are in completely the wrong order - back to front!
• Random : There is no pattern in the sequence - totally random!
• Nearly sorted : The sequence is 'nearly' sorted - difficult to quantify what this means!

Why would you think that?

Where we complete a worksheet comparing sorting algorithms

Download a copy of the Comparing sorting algorithms worksheet and save this in a suitable place in your documents. You will have to print out the worksheet before you complete it. If you haven't got access to the algorithms for the three sorting methods, you can download Sorting algorithms handout - a handy reference :)

You will need to focus on the basic operations of all sorting algorithms ...
• Comparisons
Where one number is compared with another to check which is biggest (or smallest). A comparison is counted as one operation.

• Swaps
Where two numbers are swapped over. In practice, this is actually three operations because a temporary variable is used to facilitate the swap.

Complete the worksheet

You might need to use some extra paper for working and you might need to use the World Wide Web to find the answers to some of the questions. You could use this Sorting algorithms visualisation website to help you, remembering which algorithms you are studying. You might also need some friends with stopwatches! It might help with your timing efforts if you increase the length of the list of items to 40 or 50 so that each algorithm takes longer to finish.

 Activity 5Linear search

The simplest type of search simply looks through a sequence of values one-at-a-time until either the item is found or the end of the list of reached. As you can imagine, this is very inefficient for large sets of data. However, it will work with unordered data sets which is useful in some situations

CS50 Linear search (2:50)

Task 5.1 Following a linear search algorithm
Where we learn how to follow a linear search algorithm

As we have seen, the linear search algorithm is only really suited to sequences which are unordered.

An unordered sequence of names

Describe the process, verbally

Working with your shoulder partner, describe the algorithm verbally. Get your shoulder partner to write down your explanation whilst you are doing this. Then read through the algorithm - does it make sense? Refine and share.

Inspect the algorithm

Look carefully at the following algorithm for a linear search. It presumes an unordered list of integer values called sequence. Read it out loud to your shoulder partner and compare it to the process described in the video.

`INPUT target`
`SET match ← False`
`SET index ← 0`

`WHILE match = False AND index <= length(sequence)-1`
`  IF sequence[index] = target THEN`
`    SET match ← True`
`  `ELSE
`    SET index ← index + 1`
`  `END IF

`IF match = True THEN`
`  OUTPUT 'Item Found'`
`ELSE`
`  OUTPUT 'Item not found'`
END IF

Copy the pseudocode algorithm by hand into your notebooks / on to paper.

Watch the clever lion ...

Click the lion to load a scratch implementation of a linear search. Lion chooses a random number and then searches for it in an unordered list of numbers. He's not very clever though, so he has to follow the steps in the algorithm ...

Click Lion to watch him perform a linear search

Represent the algorithm as a flowchart

Task 5.2 Linear search in Python
Where we learn how to implement a linear search in Python

So, ready to implement this yourself? Again, I'm not really going to give you much help with this!

Open up the Python programming environment and create a blank script, save it in a suitable place with a suitable filename (like '`linearSearch.py`' for instance).

Programming Challenge

Carefully type the following *unfinished* Python code into the script. Correct, it's not finished ...

`import random`

`# Print title`
`print('Linear search')`

`# Generate random sequence of 20 integers`
`sequence = random.sample(range(100),20)`

`# Print unordered sequence`
`print('Sequence :',sequence)`

`# Start linear search`

`# End linear search`

Use the pseudocode algorithm from Task 6.1 to implement a Python linear search between the comments. Don't just type in the pseudocode commands - they are not Python (but they are not a million miles away!)

When you run the completed script, it should produce output something like this for an item in the sequence ...

15 comparisons are required to find the number 38

... or like this for an item not in the sequence ...

20 comparisons are required to find that the number 14 is NOT in the list!

Use Notepad++ to print out your attempt at this challenge. Remember to write your name on it and annotate the script to explain what it is doing. Hand it in to your teacher for assessment

 Activity 6Binary search

Binary search is an example of a divide and conquer algorithm. The sequence must be sorted in order for this algorithm to work correctly. Watch the video ...

CS50 Binary search (9:31)

Task 6.1 Following a binary search algorithm
Where we learn how to follow the progress of a binary search algorithm.

As you have seen, the binary search algorithm relies on the sequence being ordered.

An ordered sequence of names

Verbalise!

Working with your shoulder partner, explain how a Binary search works. Get your partner to write down your instructions carefully. Read back over what has been written - does it make sense? Rewrite, share!

Inspect the algorithm

Look carefully at the algorithm for a binary search. Read the algorithm aloud to your shoulder partner and compare it to the procedure described in the video and also your response to Step 1 ...

`INPUT target`
`SET start ← 0`
`SET end ← length(sequence)-1`
`SET middle ← 0`
`SET match ← False`

`WHILE (match = False) AND (start <= end)`
`  SET middle  ← (start + end) DIV 2`
`  IF sequence[middle] = target THEN`
`    SET match ← True`
`  ELSE IF sequence[middle] < target THEN`
`    SET start ← middle + 1`
`  ELSE`
`    SET end ← middle - 1`
`  END IF`

`IF match = True THEN`
`  OUTPUT 'Item found'`
`ELSE`
`  OUTPUT 'Item not found'`
`END IF`

Copy the pseudocode algorithm by hand into your notebooks / on paper.

Watch the clever ducky!

Click the ducky to load a scratch implementation of a binary search. Ducky chooses a random number and then searches for it in an ordered list of numbers. He's not very clever though, so he has to follow the steps in the algorithm ...

Click ducky to watch him perform a binary search

Represent the algorithm as a flowchart

Now use a tool like yEd Live (or even paper) to draw out a flowchart for this algorithm.

Task 6.2 Binary search in Python
Where we learn how to implement a binary search in Python

So, ready to implement this yourself? Again, I'm not really going to give you much help with this!

Open up the Python programming environment and create a blank script, save it in a suitable place with a suitable filename (like 'binarySearch.py' for instance).

Programming Challenge

Carefully type the following *unfinished* Python code into the script. Correct, it's not finished ...

`import random`

`# Print title`
`print('Binary search')`

`# Generate random sequence of 20 integers`
`sequence = random.sample(range(100),20)`
`sequence.sort() # WHAT! You can sort it like this!!!`

`# Print unordered sequence`
`print('Sequence :',sequence)`

`# Start binary search`

`# End binary search`

Use the pseudocode algorithm from Task 7.1 to implement a Python binary search between the comments. Don't just type in the pseudocode commands - they are not Python (but they are not a million miles away!)

When you run the completed script, it should produce output something like this for an item in the sequence ...

Only THREE operations are required in order to find the target is in the list

Only FOUR operations are required in order to find the target is NOT in the list

Use Notepad++ to print out your attempt at this challenge. Remember to write your name on it and annotate the script to explain what it is doing. Hand it in to your teacher for assessment

 Assessment Task (Homework)Get yourself a pack of playing cards and use them to practice the algorithms for all the searching and sorting algorithms that you have learnt. A number of you will be asked to demonstrate the algorithms to the rest of the class on the deadline for the assessment task, so you better all be ready!Grading rubricMASTER : You demonstrate the allocated sorting or searching algorithm with flair worthy of a performance on stage at the Royal Variety Performance. You should (and probably could) teach.APPRENTICE : You replicate exactly the demonstration from your teacher or from one of the videos you have watched (though probably not the AlgoRhythmics videos unless you have got a dance troop).NOVICE : Your delivery is awkward, but factually correct. You would not win X-Factor, that's for sure.

Quick win - sorting algorithms

Quick win - searching algorithms

15 Sorting Algorithms in 6 Minutes
Remember, you only need to know Bubble, Insertion and Merge for the exam!

 Hungry for more?

You should be using structured versions of these sorting and searching routines at some point in your programming. When we look at the section on structured programming, you will be asked to come back to this page and download the structured versions.
Look back at the video (with the horrible soundtrack) start of the lesson. Investigate how some of the other sorting algorithms demonstrated work. Remember that the merge sort implementation is too complicated for Stage 4.
Ask your teacher to show you how to use these structured versions.