CS05 : Character Encoding

How characters are represented in computer systems using ASCII, extended ASCII and Unicode. It is required by all examination boards.

https://docs.google.com/presentation/d/1xVdhF9H10M8IoNRdsid3JN5qcNtLtMcVF-6btgsIcoI/preview?slide=id.g3cb6bc7fc1_0_0
Learning Outcomes
Character Encoding

We are learning ...
  • How characters are encoded using ASCII, extended ASCII and Unicode
  • To describe the purpose of Unicode
  • About character sets
  • That character codes are commonly grouped in sequences.
So that we can ...
  • Describe the use of binary codes to represent characters
  • Understand the connection between the number of bits used and the number of characters that can be represented
  • Explain how character codes are commonly grouped and run in sequence in coding tables


Activity 1
Representing characters using binary


Everything on a computer is a code: numbers, letters, symbols, images and sounds. In order to encode data, we use binary. A system of binary encoding can be used to represent any number of different (and often related) items of data. The number of binary digits used in each code determines the number of different items of data that can be represented.

1 binary digit   : 2 items of data represented by 0 and 1
2 binary digits  : 4 items of data represented by 00, 01, 10 and 11
3 binary digits  : 8 items of data represented by 000, 001, 010, 011, 100, 101, 110 and 111
4 binary digits  : 16 items of data can be represented ...
5 binary digits  : 32 items of data can be represented ...

... etc ...
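
If you'd like to see the doubling pattern for yourself, here is a minimal Python sketch. The function name all_codes is just one I've made up for illustration; it lists every code of a given length so you can count them.

```python
from itertools import product

def all_codes(bits):
    """Return every binary code of the given length, in counting order."""
    return ["".join(digits) for digits in product("01", repeat=bits)]

for bits in range(1, 6):
    codes = all_codes(bits)
    # Each extra bit doubles the number of codes: there are 2 ** bits of them
    print(bits, "bit(s):", len(codes), "codes, from", codes[0], "to", codes[-1])
```

Try changing the range to go beyond 5 bits and watch how quickly the number of codes grows.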

To use binary encoding to represent the directions of a rocket, I could use a 1-bit encoding table ...

To use binary to encode the direction of movement of a game character, I could use a 2-bit encoding table ...

We use as many bits to encode data as we need to ensure that there are at least enough combinations available for encoding. Sometimes, this means that we have some codes left unused, but that's OK.
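
The idea of "just enough bits" can be sketched in Python. The helper names bits_needed and make_encoding_table are hypothetical, and the order of the four moves is my own assumption (any order would work, as long as everyone uses the same table).

```python
def bits_needed(item_count):
    """Smallest number of bits that gives at least item_count codes."""
    bits = 1
    while 2 ** bits < item_count:
        bits += 1
    return bits

def make_encoding_table(items):
    """Pair each item with a fixed-width binary code (hypothetical helper)."""
    width = bits_needed(len(items))
    return {item: format(number, "0{}b".format(width))
            for number, item in enumerate(items)}

moves = ["up", "down", "left", "right"]   # assumed order, chosen for illustration
print(make_encoding_table(moves))

# Five items will not fit in 2 bits, so we move up to 3 bits and leave some codes unused
unused = 2 ** bits_needed(5) - 5
print("5 items need", bits_needed(5), "bits, leaving", unused, "codes unused")
```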

Task 1.1 Encoding systems
Where we learn how you can 'encode' data using binary for different purposes

In your notebook / on paper

Answer the following questions in full sentences in your notebooks.
  1. Design a binary encoding table for all 8 major directions on a compass (using 3 bits).
  2. Design a binary encoding table for each of the 8 legs of a robot octopus (using 3 bits).
  3. Design a binary encoding table for days of the week (using 3 bits).
  4. Design a binary encoding table for each month of the year (using 4 bits).
  5. Design a binary encoding table for the numbers 1 to 9 (using 4 bits).
  6. Design a binary encoding table for EU member countries (using 5 bits).

Demonstrate your learning

Explain to Erik how these encoding systems work. How do you decide how many bits you need to use to encode a certain collection of things and what do you do with the codes left over?

Task 1.2 Character encoding system
Where we make your own character encoding system using coloured cards

There are 26 letters in the English alphabet, so you need at least 5 bits to encode them (4 bits would only give you 16 codes). 5 bits actually give you 32 possible combinations; only 26 are needed, so there are 6 left over unused ...
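
Here is a rough Python sketch of a 5-bit letter code. It assumes the simplest possible table (A = 00000, B = 00001 and so on up to Z = 11001), which may not match the codes on the resource sheet, and the function names are my own.

```python
import string

LETTERS = string.ascii_uppercase          # 'A' to 'Z', 26 letters

def encode_5bit(word):
    """Encode a word as 5-bit codes, assuming A=00000, B=00001, ... Z=11001."""
    return [format(LETTERS.index(letter), "05b") for letter in word.upper()]

def decode_5bit(codes):
    """Turn a list of 5-bit codes back into letters."""
    return "".join(LETTERS[int(code, 2)] for code in codes)

message = encode_5bit("code")
print(message)
print(decode_5bit(message))
print("Unused codes:", 2 ** 5 - len(LETTERS))
```

This sketch only copes with letters; a space or a full stop would need one of the spare codes.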

Download the resource sheet

Download the '5 Bit binary character encoding' document (your teacher may give you a copy) and use this to communicate in code with another student. You will need to ask your teacher for a set of encoding cards.

Demonstrate your learning

Explain to Erik what you have found out by writing a description of the task you have just done in your notebooks or on paper in your folders.

https://docs.google.com/presentation/d/1xVdhF9H10M8IoNRdsid3JN5qcNtLtMcVF-6btgsIcoI/preview?slide=id.g41e978c5f6_0_9
Activity 2


With a 5-bit encoding system, you can represent the letters of the alphabet plus a couple of symbols and control characters. However, we can't represent both upper and lower case characters, or numbers, using this system because we haven't got enough codes available.

Task 2.1 The ASCII encoding table
Where we learn about the ASCII character encoding system

ASCII stands for 'American Standard Code for Information Interchange'. It was designed to standardise the encoding table for characters across the whole world!

Print out a copy of this table - it's dead useful!

Print out the following popup and stick it in your books (or ask your teacher for a copy). There is a lot of information on this sheet - your teacher will guide you through it.


Using wooden beads / string

Find the binary code for your first initial in ASCII. With 8 beads, colour in the '1' beads and leave the '0' beads natural. String them on a 'necklace' for your own geeky jewellery!
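
If you want to check your bead pattern, this little Python sketch works out the 8-bit code for one character. The function name bead_pattern and the example initial 'E' are my own choices; swap in your own initial.

```python
def bead_pattern(letter):
    """Return the 8-bit ASCII code for one character as a string of 0s and 1s."""
    return format(ord(letter), "08b")

initial = "E"                      # example initial; change it to yours
pattern = bead_pattern(initial)
print(initial, "=", ord(initial), "=", pattern)

# One symbol per bead: '#' for a coloured '1' bead, '-' for a natural '0' bead
print("".join("#" if bit == "1" else "-" for bit in pattern))
```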

What is my first initial?

Demonstrate your learning

Now complete the following tasks, writing your answers in your notebooks in full sentences.
  1. Encode your first name in ASCII - write down a comma separated list of decimal ASCII codes from the table. Now encode the name "Isaiah" in ASCII as well.
  2. ASCII is a 7-bit encoding system represented using an 8-bit byte.
    a) How many different codes does a 7-bit encoding system give you?
    b) Why is ASCII represented using an 8-bit byte?
    c) What is the 8th (leftmost) bit sometimes used for? Can you find out?
  3. Look carefully at the ASCII table - write down the range of decimal codes for the characters 0 to 9.
  4. Look carefully at the ASCII table - write down the range of decimal codes for the characters a to z.
  5. Look carefully at the ASCII table - write down the range of decimal codes for the characters A to Z.
  6. Using just 140 characters, write Isaiah a definition of "ASCII".

Task 2.2 Patterns in the codes
Where we look for patterns in ASCII codes - there are lots!

ASCII is a binary encoding system. In the table in Task 2.1, I've also given each character a decimal code to make it easier for you to write down the codes and, as we'll see later, these decimal codes are used in programming as well.

The character ranges in the ASCII table have a special significance which is not entirely obvious looking at the numerical codes. Carry out the following activities, recording what you have done in your notebooks. For each activity, you will need to create a table with the following headings (I've put sample content in to show you what to do) ...

Concerning the numbers 0 to 9

Find the numbers 0 to 9 on the ASCII encoding table you have in your notebooks / folders. Make a table using the format above containing the 8-bit binary ASCII codes for the numbers 0 to 9.

In your notebooks / on paper

Look carefully at the codes - can you see a pattern? (HINT : there is one!) Write about what you see.

Concerning lower case and upper case letters

Make two tables side-by-side, one for lowercase letters and one for uppercase letters.

In your notebook / on paper

Look carefully at the 8-bit binary codes - can you see a pattern? (Hint : there is one!) Write about what you see.
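
Once you have made your tables by hand, you could check a few rows with a sketch like this one. The function name ascii_rows and the column headings are assumptions of mine; use whatever headings are in your own table.

```python
def ascii_rows(characters):
    """Return (character, decimal code, 8-bit binary code) for each character."""
    return [(ch, ord(ch), format(ord(ch), "08b")) for ch in characters]

# Column headings are assumed; match them to your own table
print("Char  Decimal  Binary")
for ch, decimal, binary in ascii_rows("Hi!"):
    print("{:<5} {:<8} {}".format(ch, decimal, binary))
```

Change the string "Hi!" to the characters you want to check, but have a go at spotting the patterns yourself first.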


Task 2.3 ASCII codes in programming
Where we learn how to use ASCII codes in Python programming

Get ready to code!

Open up the Python programming environment.

At the prompt

Type the following instructions at the console, pressing  Enter  after each one ...

>>> print(chr(70))
>>> print(ord('F'))

In your notebook / on paper

Write about what happened in your notebooks and try to explain to Isaiah what the chr() and ord() functions do. What use are these? Try typing the following commands at the prompt ...

>>> print(chr(ord('h')-32))
>>> print(chr(ord('T')+32))
>>> print(ord('4')-48)

Can you explain to Isaiah what these code snippets are actually doing? HINT : Use your ASCII table to help you!
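
If you get stuck, it can help to break a snippet into separate steps. This sketch splits the first snippet into three lines; the variable names code, shifted and digit_value are just ones I've picked for illustration.

```python
# Step 1: ord() looks up the decimal code for a character
code = ord('h')
print(code)             # 104

# Step 2: ordinary arithmetic on that code gives a different code
shifted = code - 32
print(shifted)          # 72

# Step 3: chr() looks up the character for the new code
print(chr(shifted))     # find 72 in your ASCII table before you run this!

# The last snippet stops after step 2, so its result is a number, not a character
digit_value = ord('4') - 48
print(digit_value, type(digit_value))
```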

In a Python Script

Here is a Python program for you called 'ascii_converter.py' which defines two Python functions ...
  • ASCIIToString(sequence) which converts a sequence of ASCII codes into a string
  • StringToASCII(string) which (unsurprisingly) converts a string into a sequence of ASCII codes
Download the script and save it to a suitable place in your documents. Find the script, right click it, choose 'Edit with IDLE' and press F5 to run it in your Python programming environment. The scripting window will appear and the script will run, although it will look as though nothing has happened.

Now type the following commands exactly in the console, pressing  Enter  after each one.

>>> ASCIIToString([72,101,108,108,111,33])
>>> StringToASCII('Goodbye!')

Explain the functions worksheet

Inspect the two functions in the script and try to figure out how each one works. Write your ideas on the 'Explaining the functions' worksheet, print it and give it to your teacher (or Isaiah if you can find him).
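
For comparison, here is one way two functions like these could be written. This is only a sketch under my own assumptions; the functions in the downloaded script may well be written differently, so inspect those first.

```python
def ASCIIToString(sequence):
    """Join the character for each code in the sequence into one string."""
    return "".join(chr(code) for code in sequence)

def StringToASCII(string):
    """Return a list holding the code for each character in the string."""
    return [ord(character) for character in string]

print(ASCIIToString([72, 101, 108, 108, 111, 33]))
print(StringToASCII('Goodbye!'))
```

Notice that each function is really just chr() or ord() applied to every item in turn.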

Faster workers

Can you use the script to help you to write and test another two functions ...
  • toUpper(string) - converts a string into all UPPERCASE letters
  • toLower(string) - converts a string into all lowercase letters
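
To get you started, here is a sketch of one possible toUpper(string). It is not the only way to do it, and it assumes that only the letters a to z should change. I've left toLower(string) for you to work out.

```python
def toUpper(string):
    """Convert lowercase letters a-z to uppercase, leaving everything else alone."""
    result = ""
    for character in string:
        code = ord(character)
        # Only codes in the lowercase range are shifted down by 32
        if ord('a') <= code <= ord('z'):
            code = code - 32
        result = result + chr(code)
    return result

print(toUpper("Hello, World 123!"))
```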

https://docs.google.com/presentation/d/1xVdhF9H10M8IoNRdsid3JN5qcNtLtMcVF-6btgsIcoI/preview?slide=id.g41e978c5f6_0_23
Activity 3
Extended ASCII and Unicode


With pure 7-bit binary encoding, the 8th bit of the byte can be used for an error checking system called 'parity'. However, we could instead use it to encode more characters.
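
Here is a minimal sketch of how a parity check could work. It assumes 'even parity' (the 8th bit is chosen so that the whole byte has an even number of 1s) and only works for plain 7-bit ASCII characters; the function names are my own.

```python
def add_even_parity(character):
    """Return an 8-bit string: a parity bit followed by the 7-bit ASCII code."""
    seven_bits = format(ord(character), "07b")
    ones = seven_bits.count("1")
    # Add a 1 only if that is needed to make the total number of 1s even
    parity_bit = "1" if ones % 2 == 1 else "0"
    return parity_bit + seven_bits

def parity_ok(eight_bits):
    """True if the byte still contains an even number of 1s."""
    return eight_bits.count("1") % 2 == 0

byte = add_even_parity("C")
print(byte, parity_ok(byte))

# Flip the last bit to imitate an error during transmission
damaged = byte[:-1] + ("0" if byte[-1] == "1" else "1")
print(damaged, parity_ok(damaged))
```

A single flipped bit is caught, but two flipped bits would cancel each other out and slip through.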

Task 3.1 Extended ASCII
Where we learn what difference an extra bit makes in a character encoding system

In your notebook

Attempt the following activities. Where necessary, write the answers in your notebook.
  • How many more characters could I encode if I used 8 bits instead of 7?
  • Click on the following popup, print it out and stick it in your notebooks (or ask your teacher for a copy).
  • Why might people want to use Extended ASCII?
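
One thing worth knowing is that there is more than one 'Extended ASCII' table. This sketch decodes the same byte using two of the 8-bit tables that Python knows about, Latin-1 and code page 437 (my choice of examples), and prints the official name of each character it gets.

```python
import unicodedata

byte = bytes([233])                      # 233 is 11101001, so the 8th bit is set
as_latin1 = byte.decode("latin-1")
as_cp437 = byte.decode("cp437")

print("latin-1:", unicodedata.name(as_latin1))
print("cp437  :", unicodedata.name(as_cp437))
print("Same character?", as_latin1 == as_cp437)
```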

Task 3.2 Unicode
Where we learn about the third and most modern character encoding system that has nothing to do with unicorns

Even 8 bits is not enough for some languages! Greedy!

Visit the website shown in the presentation and take a little time to look through some of the Unicode code charts (they are PDF documents and may need downloading). 

In your notebook

Attempt the following exercises, recording your responses in your notebook where possible.
  1. Unicode text is usually stored using 8, 16 or 32-bit code units (UTF-8, UTF-16 and UTF-32), so a single character can take up to 32 bits. Why is it necessary to have this many bits?
  2. The Unicode Consortium was founded in 1991. Can you think why it was founded then? (HINT : A really important, global technology was invented in 1989 ...) 
  3. Challenge : Can you find the emoticons character set? If you can, print out the second page for your folders.
  4. Download and inspect the 'Devanagari' Unicode table - this is the correct table for Reuben to use to encode the Hindi character set on his computer.
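
Python's ord() works for Unicode characters too, so you can explore code points at the prompt. This sketch looks at three characters I've picked as examples (a Latin letter, a Devanagari letter and an emoticon) and counts how many bytes each one needs in three common Unicode encodings.

```python
import unicodedata

# "\u0905" and "\U0001F600" are escape codes for a Devanagari letter and an emoticon
for character in ["A", "\u0905", "\U0001F600"]:
    print(unicodedata.name(character), "is code point", ord(character))
    for encoding in ["utf-8", "utf-16-le", "utf-32-le"]:
        print("   ", encoding, "uses", len(character.encode(encoding)), "byte(s)")
```

Notice how the code points quickly get far too big to fit into a single byte.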

Assessment Task (Homework)

Download the '(Not So) Secret Codes' worksheet, print and complete. Please accept my apologies for the tedious nature of this, but you'll certainly get to know your ASCII table by the end of it!

Grading rubric

MASTER : You converted the whole message from ASCII codes to characters using the ascii_converter.py script and you provided me with some evidence that you did it (because I wouldn't be able to tell otherwise).
APPRENTICE : You persevered and converted the whole message from ASCII to characters by hand. Good on you.
NOVICE : You gave up after converting half the message because you got bored. It's a jolly good job computers don't get bored, isn't it?


Hungry for more?

Baudot code

Technically, 5-bit character encoding began in the early 17th century with the development of Bacon's cipher (devised in 1605), but 5-bit machine encoding was developed in 1870 by a French telegraph engineer called Émile Baudot.

Do some web research into Baudot code and write about what you have found out. If you like Coldplay, take a look at the following website and read about the X&Y album cover. Can you find a 'Baudot tape' which has the same encoding system as the X&Y album cover image?

Websites worth a read

Crack on and learn a bit more!