### CS45 : Which one is best?

Humans are always searching for more efficient ways of solving problems. Enter BigO ...

We are learning ...
• That some problems can be solved by algorithms
• That the efficiency of the algorithm can be measured
So that we can ...
• Understand that we can compare algorithms using their time and space complexity as a function of the size of the problem / input.
• Understand that the same problem can be solved using different algorithms having different complexities
• Use BigO notation to express worst time complexity for algorithms in the following categories :
- Constant (linear search)
- Logarithmic (binary search, merge sort, quick sort) (y = log10x)
- Linear (y = 2x)
- Polynomial (bubble sort) (y = 2x2)
- Exponential (y = 2x)
- Factorial
• Be able to derive the time complexity of an algorithm through analysis
• Show awareness that algorithmic complexity and hardware impose limits on what can be computed

 Activity 1 What is an algorithm? (30)

An algorithm ...
• is a set of unambiguous instructions for solving a problem
• is used to transform inputs into outputs
Consider the following questions ... Q : Does an algorithm need to be efficient? Under what conditions would this be better? Q : Does it matter if an algorithm doesn't produce quite the correct output? Q : How long should we wait for an algorithm to run for before it produces an answer? Q : Should it matter how much memory my computer has got to run the algorithm?  Task 1.1 A 'good' algorithm Web browser First, read and make notes on Donald Knuth's definition of algorithmic 'goodness'. Next, use Google Maps to find the route from your school to central Manchester. Use the textual directions to complete this task, rather than the map. Finally, identify how the algorithm that Google uses to calculate your route demonstrates Don's principles. OUTCOME : Notes on algorithmic goodness and annotated route to Manchester Whether or not an algorithm gives the correct answer, however, is not the only consideration. We need also to consider how quickly the algorithm calculates the answer and how much memory it consumes in the process.

 Activity 2 Different methods, same answer (60)

Problem : Summing first 10 natural numbers

1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55

As a computer can only perform simple addition of two numbers, we need to use a running total to perform this addition. We can break this process down into steps ...

1 + 2 = 3
3 + 3 = 6
6 + 4 = 10
10 + 5 = 15
15 + 6 = 21
21 + 7 = 28
28 + 8 = 36
36 + 9 = 45
45 + 10 = 55

Algorithmically (Algorithm A) ...

`sum ← 0`
`for count`` ←`` 1 to n`
`    sum`` ←`` sum + count`
`end for`
`output sum `

However, mathematicians (clever) know that this sum can also be completed using the formula, sum = n(n+1)/2 ...

Algorithmically (Algorithm B) ...

`sum`` ←`` n(n+1)/2`
`output sum`

... which reassuringly gives the same answer, 55. Task 2.1 Analysis Two algorithms for computing a sum STAGE ONE Open up the Python programming environment, IDLEIn a script : There are scripts available in the resources for the topic which automate both solutions to the sum of natural numbers problem and provide a timer so you can see how long each script takes to run. Download each script and save it to a suitable place in your user area.Now run each script for the following values of n and record how many milliseconds it took each script to complete it's operations. The answers should be about the same.n = 1n = 100n = 10000n = 1000000n = 100000000In your notebooks : Which algorithm is more efficient? Should be quite obvious! Write about what you have done and what you have found out.STAGE TWO In this task, you will manually analyse the two algorithms to determine how many 'operations' each one takes to complete based on different values of n. Create a table like this ... During this, and any future analysis, an 'operation' could be ... An input An assignment A comparison operation A mathematical operation An output For each of the following values of n, work out how many operations each algorithm performs in order to calculate the sum. Record your answers in the table (you'll need to be logical for higher values of n!) n = 1, 2, 3, 4, 5, 10, 100, 1000, 10000 Can you explain why Algorithm B is more efficient than Algorithm A? OUTCOME : Completed table and explanation of why one algorithm is more efficient than another. Activity 3 Complexity Theory (65)

A branch of computer science that investigates the problems related to resources required to run algorithms. There are two aspects to algorithmic complexity - related to time and space. In most algorithmic solutions, there is often a tension between time and space (as in life). Time complexity can often be improved at the expense of space complexity and vice versa.

 Time Complexity Space Complexity A measure of how many atomic operations the algorithm will execute to finish running based on the size of the input. A measure of how much storage space (for variables) is required during execution of the algorithm and the dependance this has on the size of the input. Task 3.1 Thetaman to the rescue! Web browser Comic book drawing skills First, listen to this podcast ... Your job is to create a cartoon strip telling the story of Thetaman. the You can download the script to help you with this task if you need to. OUTCOME : Comic strip entitled "The Adventures of Thetaman" There are three cases we could consider ...

 Best-case complexity The complexity computed when we choose an input that requires the shortest time or the smallest workload to complete. Worst-case complexity The complexity computed when we happen to choose an input which requires the longest time or the greatest workload. Average-case complexity The complexity computed by averaging the times for every possible input.

At this level, we only need concern ourselves with worst case or BigO complexity. The worst case of an algorithm is the particular set of inputs which cause the algorithm to run for it's longest possible time (or use the most space). Task 3.2 Worst cases Resources What is the worst case for ... Searching for a single item item in a given unsorted list of items? Sorting an arbitrary unsorted list of items with any given sorting algorithm? Write your ideas in your notebooks. OUTCOME : Your ideas about worst case scenarios. As we have already seen, measuring complexity is based on the assumption that every operation takes a finite time to execute on the computer. Therefore, there is no definitive solution to the worst case for any particular algorithm - it depends on how many resources we throw at it. If my computer is fast, an inefficient algorithm might not seem so bad.

Let's consider each common complexity band, one at a time ...

 Activity 4 Constant Complexity (15) O(1) An algorithm has constant time complexity if the time taken for its execution does not grow as the size of the input grows. For instance, an algorithm which works out if a number is odd or even will have O(1) complexity - no matter how large n gets, it will still take the same number of operations to find out the answer.

`If n % 2 == 1:`
`  odd = True `
`else `
`  odd = False` Task 4.1 Squaring a number Calculator Use a calculator to work out the value of a number squared.  Now double the number and work out the square of this number.  How many more operations are required the second time? Write about this example in your notebooks. OUTCOME : Description of a constant time complexity algorithm Activity 5 Logarithmic Complexity (25) O(log2n) For an algorithm of Logarithmic complexity, O(log2n), the time for execution grows but not as quickly as the size of the input grows. For instance, an algorithm called a binary search sits in this time complexity band.  Task 5.1 Binary search Brain Use the following lists to perform a binary search for the number 80 (I know it's not in the list!) 12, 30, 75, 83, 92 01, 11, 13, 17, 24, 38, 44, 59, 75, 84 04, 07, 14, 24, 24, 28, 39, 47, 51, 52, 53, 59, 60, 63, 64, 70, 72, 81, 84, 93 How many extra comparisons do you need to perform each time? Write about this example in your notebooks. OUTCOME : Description of a logarithmic time complexity algorithm. [ ADD IN MERGE SORT ]

 Activity 6 Linear Complexity (25) O(n) Algorithms of linear time complexity, O(n), are less time efficient than constant and logarithmic complexity algorithms. As the input n gets bigger they take more time; the amount of time taken grows at the same rate as the increase in n.

For instance ...
• Mowing a lawn - as the area of the lawn increases, the time taken to mow the whole lawn (worst case) increases at the same rate.
• Linear search in an unsorted list - as the size of list increases, the time taken to find an item not in the list (worst case) increases at the same rate.  Task 6.1 Colouring a chess grid Brain Take a piece of graph paper and draw out a 4x4 grid.  Colour alternate squares in black to form a chess board.  How many squares did you colour in?  Now quadruple the size of the grid to 8x8 squares and repeat the exercise.  How many squares did you colour in this time? Write about this example in your notebooks. OUTCOME : Description of a linear complexity algorithm Activity 7 Polynomial Complexity (35) O(nx)
In algorithms with polynomial time complexity, the execution time of the algorithm increase faster than the size of the input does. Often, x in O(nx) is 2 which leads to quadratic time complexity, but it can be 3 or 4 ...

For example, if you perform a complete handshaking exercise with your class.  How does the number of handshakes increase as you add one to the number of people you have to handshake with?  In fact, the number of handshakes is proportional to (n2-n)/2 which approximates to n2 as n gets very big!  Task 7.1 Bubble sort Resources Take a suit of cards from a pack.  Bubble sort 6 cards in their worst case (i.e. in reverse order).  Keep a track of how many comparisons and swaps you perform.  Now repeat this with 12 cards.  How many extra comparisons and swaps did you have to perform this time?  Was it exactly double or more or less than double? Can you work out the relationship between n and the running time? OUTCOME : Description of a quadratic time complexity algorithm. Special focus : The Bubble Sort It works by going through the list swapping items with the items next to them if they are the wrong way round. It may need to go through the list multiple times to sort it. If there are n items in the list then the algorithm will make n-1 passes through the list at most and on each pass it will make n-1 comparisons – giving a maximum of (n-1)x(n-1) comparisons which gives a time complexity of O(n2) in the worst case (where the list is in reverse order). This means that as the list size grows bigger the Bubble Sort becomes increasingly inefficient.  Interestingly, however, the space complexity of the Bubble Sort algorithm is O(1) – it only ever needs one additional memory location (to store one of the two values being swapped) no matter how many items are in the list! Activity 8 Exponential Complexity (30) O(kn) At this point, algorithms start to behave a little differently. They will run OK for small values of n, but as n increases, the time for execution increases much faster to the point where it's not worth waiting for the answer. Usually, k in the proportionality is 2, but it could be 3 or 4 ... Rates of bacterial growth are initially exponential Task 8.1 The Legend of the Chessboard Brain Another interesting example of exponential growth is described in 'The Legend of the Chessboard'. Watch the video and write about it in your notebooks. OUTCOME : Description of an exponential time complexity algorithm.  Activity 9 Factorial Complexity (25) O(n!) Here we reach the point where the running time of the algorithm increases extremely rapidly compared to the size of the inputs. Algorithms in this complexity band complete in reasonable time for small input sizes but quickly become unusable as n increases. We'll discuss this issue more in a future topic. Task 9.1 Travelling salesman Brain There are plenty of animations out there on Youtube about the travelling salesman problem and the chinese postman problem (both closely related).  Have a look on Youtube (use the links) and find a nice video or example which will help you understand the dependence of the number of cities and the number of routes.  The worst case (BigO) is the brute force method where each possible route is calculated and the one with the lowest distance accepted.  Investigate some other strategies which are employed to reduce the time taken to solve this problem. Write about what you have discovered in your notebooks. OUTCOME : Description of a factorial time complexity algorithm. Activity 10 Summary (40)

Remember that time complexity is a measure of how the running time of a particular algorithm increases as a function of the size of the input, n. Consider the following graph ... Task 10.1 Written summary Notebooks Summary of BigO Complexities with examples.docx Make suitable notes on (i.e. copy) the diagram below. Try to add examples of algorithms in each complexity band to your diagram as well. Download and print a copy of Summary of BigO Complexities with examples.docx to help you with this and stick this in your notebooks as well. Click to enlarge OUTCOME : Set of summary notes about algorithmic complexity plus examples of algorithms in those complexity bands. You need to remember the order of the complexity levels for the examination! Extension Activities