s5cs08 encoding characters and symbols

I am not a number
image

With some bizarre reference to the 1960s TV series 'The Prisoner', we enter the world of character encoding.

We are learning...

About the ways in which computers can represent characters

So that we can...

Differentiate between the character code representation of a decimal digit and its pure binary representation
Describe ASCII and Unicode systems for coding character data
Explain why Unicode was introduced
Use relational operators with characters
Have practical experience in the use of string manipulation functions in programming

Computers are great at binary. That's it - they are great at binary. They can do binary in their sleep, binary is their 'thing'. Ask a computer to do some binary and it'll say "Yep - I can do that". Ask a computer to do 'characters' or 'letters' and it'll say "Huh?" (but in binary, of course).

It's up to us, the humans, to design encoding systems which allow computers to represent characters and letters in binary. It doesn't mean that the computer actually understands them; it still needs a human to interpret the codes.

Activity 1
Baudot code

Character encoding is not a new thing. Baudot code is a coding system originally developed by Emile Baudot in 1874 for use on teleprinters to send messages along the reasonably recent international telegraph systems. The voice telephone was only invented in the late 1870s and was not widely adopted for communication until the 20th century.

image

Task 1.1 Baudot Code

Research Baudot code on the World Wide Web and find out...

when it was developed;
what it was used for;
who used it;

Do all Baudot representations work the same way? Find two different code representations from the web and compare them. Write your findings in your assessment book and give the website references for what you have found and the date you accessed them.

image
Baudot Code

OUTCOME : Summary of Baudot Code.

Checkpoint

image

Task 1.2 Coldcode

Read the article "Decoding Coldplay's X&Y Album Cover" and then visit http://ditonus.com/ or download and unzip the "coldcode.zip" file and have a play with creating your name in Baudot code. Print out what you have done and put it in your assessment book and explain how it works.

OUTCOME : Your name as a Coldplay Album cover plus an explanation of how it works.

Checkpoint

Activity 2
Night writing

Night writing, or sonography, was a system of code that used symbols of twelve dots arranged as two columns of six dots embossed on a square of paperboard. It was designed in 1808 by Charles Barbier in response to Napoleon's demand for a code that soldiers could use to communicate silently and without light at night. Each grid of dots stands for a character or phoneme. Unfortunately, the system proved too complicated to use in the field.

Undeterred, Barbier intended to adapt the system for civilians and presented it at the Royal Institute for Blind Youths in Paris. The lecture was attended by a 12-year-old boy called Louis Braille who later adapted the system for use by blind people. You might want to read this article.

image

Task 2.1 A pattern in the dots

Look at the following Braille alphabet.

image

1
Can you see any pattern in the dots? Write down your ideas.
2
This alphabet is used to represent 26 characters. How many characters could this system represent in total?
3
Using a piece of paper, blutack and a pen, write a braille message to your friend. How easy is it to read? Stick your effort into your assessment books.

OUTCOME : Answers to question 1 and 2 and a message in Braille (stuck in your book)

Checkpoint

Activity 3
Morse code

Morse code has been in use for naval and civilian communication since the mid 1800s. Samuel Morse helped to develop the system of Morse Code for long distance communication ...

image
“In 1825 New York City had commissioned Morse to paint a portrait of Lafayette, then visiting Washington, DC. While Morse was painting, a horse messenger delivered a letter from his father that read, "Your dear wife is convalescent". The next day he received a letter from his father detailing his wife's sudden death. Morse immediately left Washington for his home at New Haven, leaving the portrait of Lafayette unfinished. By the time he arrived, his wife had already been buried. Heartbroken that for days he was unaware of his wife's failing health and her death, he decided to explore a means of rapid long distance communication.”

http://en.wikipedia.org/wiki/Samuel_Morse#Telegraph

Task 3.1 The Morse Code Activities

Use the following activities to familiarise yourself with Morse Code and write some notes in your notebooks.

Use the following website to practise your Morse Code skills. Write up what you have done in your notebooks.
As an extra bit of fun, visit this website and write your own 'secret' Morse Code message to your friends.
Print out the following Morse code table and stick it in your notebooks.

image

Play the following sound. What does this have to do with Morse Code? Write down your ideas in your notebook.


OUTCOME : Description of Morse Code.

Checkpoint

Activity 4
No standard for us!

Early computers did not communicate with each other. Therefore, there was no need for any standard system to be used.

image

Individual manufacturers used their own encoding systems...

BCD (Binary Coded Decimal) was used in calculators and early computers from around 1959 onwards
EBCDIC (Extended Binary Coded Decimal Interchange Code) was used for IBM Mainframe computers (1963)


Task 4.1 Early encoding systems

Research these two early character encoding systems used by computers. Produce a simple summary of each using mind-mapping software like Freeplane or an online mind-mapping service like Bubbl.us (click 'Start Brainstorming >' to create a mindmap without signing in). Consider...

when it was created,
why it was created,
what it was used for,
the main principles of encoding,
some examples of data encoded using this method.

Print your mindmap out for your notebooks.

OUTCOME : Mindmap showing details of BCD and EBCDIC

Checkpoint

Activity 5
ASCII

A keyboard is an input device which turns keypresses into binary electrical signals. Clearly, each letter, digit or symbol on the keyboard must produce a different binary pattern so that the computer can tell one keypress from another.

Task 5.1 My very own code table!

If we represented all the characters on a computer keyboard using binary, how many different patterns would we need? Consider lower and upper case characters, digits and other symbols. How many bits would you need to use to do this? Which codes would you use for each symbol?

image

Create your own code table using the handout my-very-own-code-table.docx available in the lesson resources. To do this activity properly, you should abandon any previous knowledge you have about ASCII!

image

OUTCOME : Your own character encoding table for the UK keyboard.

Checkpoint

image

A long-established standard for character encoding is ASCII. ASCII stands for American Standard Code for Information Interchange. It was developed in the 1960s as a common standard for character coding. ASCII is a 7-bit code and can therefore be used to represent up to 128 different characters.
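
As a quick check of that arithmetic, the short Python sketch below counts the patterns available with 7 bits and shows the smallest and largest codes. The variable names are made up for this illustration.

```python
# Number of distinct patterns available with 7 bits
BITS = 7
pattern_count = 2 ** BITS

# Smallest and largest codes, shown as 7-digit binary strings
lowest = "{0:07b}".format(0)
highest = "{0:07b}".format(pattern_count - 1)

print(pattern_count)    # 128
print(lowest, highest)  # 0000000 1111111
```

Changing BITS to 8 shows how many characters an 8-bit code could hold.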

Refer to the handout ascii-code-table.pdf and complete the following task.

Task 5.2 Analysing ASCII

Colouring in

On a copy of ascii-code-table.pdf, using 4 different coloured pencils, shade the following sections of the character set...

Uppercase and lowercase letters in ORANGE
Numbers in BLUE
Other characters in GREEN
Commands in RED

Patterns in the code

If you look carefully, you should be able to see patterns in the binary codes. For instance...

0111000 : A code for a picture of a number eight
0001000 : The decimal value for the number 8
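
If you have Python to hand, this small sketch (not part of the handout, variable names are illustrative) prints both patterns so you can compare them side by side.

```python
# The character "8" and the number 8 are stored as different bit patterns
char_code = "{0:07b}".format(ord("8"))  # ASCII code for the character "8"
pure_binary = "{0:07b}".format(8)       # pure binary for the value 8

print(char_code)    # 0111000
print(pure_binary)  # 0001000
```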

Can you see any more patterns in the binary codes? Discuss your ideas with your peers and write down what you have discovered in your notebooks.

History of ASCII

There is a document available called ascii-explained-in-considerable-detail.pdf. Try to find out (and write about) the significance of the 'DEL' symbol and why it was used. This document also contains the original specification for ASCII which does make some interesting reading regarding the origins of the system.

OUTCOME : Coloured ASCII code tables, identification of patterns in the binary codes and a written explanation of the significance of the DEL symbol.

Checkpoint

Activity 6
Character (and string) functions

String handling in programming languages involves two aspects - the handling of ASCII codes and characters (or Unicode, to be precise) and string manipulation. Unicode is an international character encoding system which represents a wide range of character sets using 8, 16 or 32 bits; the most common formats are UTF-8 and UTF-16. In this section, we will look at the handling of ASCII values and characters.

For the following task, create a word processed document with suitable headers and footers. Record what you have done and what you have learnt using screenshots and written explanations. Do not copy and paste code.

Task 6.1 Demonstrating that you get it

image

1
Python!

Open up the Python programming environment, IDLE.

2
Converting to and from ASCII

Investigate the chr() and ord() functions by typing in the following commands at the prompt, pressing the ENTER key after each one.

>>> chr(78)
>>> ord("N")


Can you explain what is happening? Write your ideas in your word processed document.
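
Here is one more pair to try, written as a short script rather than at the prompt. It is only a sketch using different values from the ones above, so you still have to work out the explanation yourself.

```python
# ord() goes from a character to its integer code
code = ord("A")

# chr() goes from an integer code back to a character
letter = chr(97)

# Applying one after the other gets you back where you started
round_trip = chr(ord("A"))

print(code)        # 65
print(letter)      # a
print(round_trip)  # A
```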

3
Converting ASCII code to pure integer (the hard way)

We can specify pure binary numbers using binary literal format where you prefix the binary number with 0b or 0B. We can also convert numbers to binary using the bin() function or use the string format functions to force a binary string in the output. Type the following commands at the prompt, pressing the ENTER key after each one.

>>> 0b11001100
>>> 0B00110011
>>> bin(56)
>>> bin(145)
>>> "{0:07b}".format(49)
>>> "{0:07b}".format(0b1001)


Notice that the bin() function only generates as many binary digits as it needs to represent the denary number whereas the str.format() method generates a suitably padded binary string. So, what has this got to do with ASCII?

Look carefully at the following table ...

Digit    ASCII      Binary
0        0110000    0000000
1        0110001    0000001
2        0110010    0000010
3        0110011    0000011
4        0110100    0000100
5        0110101    0000101
6        0110110    0000110
7        0110111    0000111
8        0111000    0001000
9        0111001    0001001
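
If you would rather generate the table than read it, the following sketch rebuilds it with ord() and str.format(). The list name rows is just for this example.

```python
# Build (digit, ASCII code, pure binary) rows for the digits 0 to 9
rows = []
for digit in "0123456789":
    ascii_bits = "{0:07b}".format(ord(digit))   # code for the character
    binary_bits = "{0:07b}".format(int(digit))  # value of the digit
    rows.append((digit, ascii_bits, binary_bits))

for digit, ascii_bits, binary_bits in rows:
    print(digit, ascii_bits, binary_bits)
```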

image
Can you see that the four least significant bits in the ASCII codes for the numerals 0 - 9 represent the denary values? We can use masks to convert binary ASCII codes into their corresponding integers. You can apply a mask using either an AND (&) operation (to unset bits) or an OR (|) operation (to set bits) depending on what you want to achieve. In this case, we only want to allow the four least significant bits to appear in our masked value by unsetting the most significant three, so our AND binary mask will be...

ASCII code to Denary : code & 0b0001111
Denary to ASCII code : code | 0b0110000

Type the following commands at the prompt, pressing the ENTER key after each one.

>>> "{0:07b}".format(ord("5"))
>>> "{0:07b}".format(ord("5") & 0b0001111)
>>> int(ord("5"))
>>> int(ord("5") & 0b0001111)
>>> int(ord("5") & 15)


Can you explain what is happening? Write your ideas in your word processed document.
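
The two masks can also be wrapped up as a pair of small functions. This is only a sketch: the function and constant names are invented here, and it assumes you only ever pass in a single digit character or an integer from 0 to 9.

```python
DIGIT_MASK = 0b0001111  # keeps only the four least significant bits
DIGIT_BASE = 0b0110000  # the three high bits shared by "0" to "9"

def ascii_digit_to_int(char):
    """Convert one digit character to its integer value with an AND mask."""
    return ord(char) & DIGIT_MASK

def int_to_ascii_digit(value):
    """Convert an integer 0-9 to its digit character with an OR mask."""
    return chr(value | DIGIT_BASE)

print(ascii_digit_to_int("5"))  # 5
print(int_to_ascii_digit(7))    # 7 (as a character)
```

Neither function checks its input, so passing in a letter would give a meaningless result rather than an error.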

image

4
Converting characters into upper and lowercase (the easy and the hard way)

First, we'll look again at the straightforward method using string modifiers. Type the following commands at the prompt, pressing the ENTER key after each one.

>>> message = "HeLlO aNd WeLcOmE"
>>> message.upper()
>>> message.lower()


Explain what you have discovered in your word processed document. Easy. I realise that we have used strings in this example rather than single characters - I just thought that the example would be a bit clearer that way 😄

image
So, how about making it a little harder? After all, this is A Level! If you look closely, the ASCII codes for the capital letters run from 1000001 for 'A' through to 1011010 for 'Z' and the codes for the lower case letters run from 1100001 for 'a' through to 1111010 for 'z'. For any given letter, the only difference between the two codes is the 6th bit (counting from the right, place value 32), which is set in the lowercase letters. By changing this bit, we can convert between cases; we can use an 'OR' mask to convert uppercase into lowercase (by setting the 6th bit) and an 'AND' mask to convert lowercase to uppercase (by unsetting the 6th bit) :

Uppercase to Lowercase : code | 0b0100000
Lowercase to Uppercase : code & 0b1011111

Type the following commands at the prompt, pressing the ENTER key after each one.

>>> "{0:07b}".format(ord("h"))
>>> "{0:07b}".format(ord("h") & 0b1011111)
>>> "{0:07b}".format(ord("H"))
>>> "{0:07b}".format(ord("H") | 0b0100000)
>>> chr(ord("h") & 0b1011111)
>>> chr(ord("h") & 95)
>>> chr(ord("H") | 0b0100000)
>>> chr(ord("H") | 32)


Can you explain what is happening? Write down your ideas in your word processed document.
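
Here is a minimal sketch of the two masks as functions. The third function uses XOR (^), which is an extra not covered above: it flips the 6th bit whichever way it is currently set. All the names are invented for this example and the functions assume a single letter from A-Z or a-z.

```python
CASE_BIT = 0b0100000    # the 6th bit, place value 32
UPPER_MASK = 0b1011111  # every bit except the 6th

def to_lower(char):
    """Set the 6th bit with an OR mask."""
    return chr(ord(char) | CASE_BIT)

def to_upper(char):
    """Unset the 6th bit with an AND mask."""
    return chr(ord(char) & UPPER_MASK)

def swap_case(char):
    """Flip the 6th bit with an XOR mask (an addition to the text above)."""
    return chr(ord(char) ^ CASE_BIT)

print(to_lower("H"))   # h
print(to_upper("h"))   # H
print(swap_case("q"))  # Q
```

Try these on a digit or a punctuation mark to see why the "letters only" assumption matters.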

image

5
Checking the identity of a character

OK - a bit of light relief. These Python functions are easy and will work on either single characters or strings.

Use string.isdigit() to check for digits 0-9
Use string.isalpha() to check for letters
Use string.isalnum() to check for letters and/or digits
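
As a starting point only, this sketch runs all three checks over a handful of sample strings. The samples and the results dictionary are made up for illustration; your own examples should be different.

```python
samples = ["7", "2024", "G", "Hello", "abc123", "3.5", ""]

# Store (isdigit, isalpha, isalnum) for each sample string
results = {}
for text in samples:
    results[text] = (text.isdigit(), text.isalpha(), text.isalnum())

for text in samples:
    print(repr(text), results[text])
```

Look closely at what happens with "3.5" and with the empty string.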

Your job is to come up with some examples to show how these work in practice. Document what you have done and what you have learnt in your word processed document.

Now print out your word processed document and stick it in your notebook.

OUTCOME : Word processed document which explains how character (string) functions operate.

Checkpoint

Activity 7
Character codes are ordinal

Which is 'bigger'? A or Z? Clearly, the letters themselves have no greater or lesser significance so how can they be 'orderable'? Since characters in any character set are represented by binary (integer) codes, you can order the characters based on their character code. For example...

image

...only works because the ASCII code of A is 65 and the ASCII code of Z is 90. See?
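
A short sketch of the idea in Python, with variable names chosen just for this example. Comparing two characters gives the same answer as comparing their codes.

```python
# Comparing characters compares their character codes
a_before_z = "A" < "Z"
codes_agree = ord("A") < ord("Z")

# Lowercase letters have larger codes than uppercase ones
lower_before_upper = "a" < "Z"

print(ord("A"), ord("Z"))  # 65 90
print(a_before_z)          # True
print(codes_agree)         # True
print(lower_before_upper)  # False, because ord("a") is 97
```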

Task 7.1 Orders please!

Decide whether the following comparisons are True or False. Hint : Use the ASCII Code Table!

"G" > "F"
"6" > "9"
"[" > "]"
":" < "{"
"<" <= ">"

Can you come up with three more to test your peers?

image

OUTCOME : Statements of ordinality.

Checkpoint

It is possible to use the ordinal nature of characters to perform calculations. For instance, I have to calculate the progress of my GCSE students based on the number of grades progress they have made. Their target grade represents 3 levels (or grades) progress from their intake score at KS2.

For instance, if a student is targeted a grade B but achieved an A, this represents 4 levels of progress (the 3 levels built into the target plus one extra grade). We can calculate his progress using...

def progress(target, grade):
    # Calculate levels of progress (LoP). Assumes the target represents 3 LoP.
    # Does not handle A* or U!
    return ord(target) - ord(grade) + 3
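
To check the arithmetic, here is the same calculation with a few sample calls. This is a sketch that assumes single-letter grades only; the sample grades are made up.

```python
def progress(target, grade):
    # Same calculation as above: the target is assumed to be 3 LoP
    return ord(target) - ord(grade) + 3

above_target = progress("B", "A")  # one grade better than target
on_target = progress("B", "B")     # exactly on target
below_target = progress("B", "D")  # two grades worse than target

print(above_target)  # 4
print(on_target)     # 3
print(below_target)  # 1
```

It works because 'A' has a smaller character code than 'B', so a better grade makes the subtraction larger.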


Task 7.2 Python programming, innit.

Implement this function into a suitable program which asks for the grade and the target and outputs the levels of progress. Then, if you fancy a challenge ...

1
Alter the function so that you have to specify the target LoP (assumed in this example to be 3)
2
Implement 'A*' and 'U' grade handling

Provide evidence of what you have done as a suitably formatted code listing and screenshots of input and output. There is no need to explain how the code operates as long as you can provide suitable evidence of testing.

OUTCOME : Evidence that you have implemented the progress calculator function, plus extensions if you can.

Checkpoint

Activity 8
Fonts

A character code does not tell us anything about the appearance of the characters. Character codes are used to enable the representation of pictures of letters, numbers, characters and symbols. The pictures can change but the code remains the same. We use fonts to represent the pictures of the characters in different ways.

image

Have you ever been foolish enough to try to send a 'secret' email using Wingdings? Here's the thing – it's not secret! Just because you change the font, doesn't change the character code! Character code 078 is still an 'N' even if it looks like a skull and crossbones!
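
You can convince yourself of this in Python, which knows nothing about fonts at all. The sketch below (variable names are illustrative) shows what is actually stored for the letter 'N'.

```python
text = "N"

# The bytes that would be stored or sent, whatever font is used to draw them
stored = text.encode("ascii")
codes = list(stored)

print(codes)         # [78]
print(chr(codes[0])) # N
```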

image

Task 8.1 Not so secret message!

Decode the following secret message and write the answer in your notebooks. (You might need to use Character Map or a Word document > Insert Symbol to help you!)

image

OUTCOME : Decoded message

Activity 9
Unicode

Since most people don't want to be restricted to the limited character set that ASCII provides, first Extended ASCII and then Unicode (Universal character coding system) were developed. Unicode is now an international standard for consistent encoding, representation and handling of text expressed in most of the world's writing systems.
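
Python strings are Unicode, so ord() works on characters well outside the ASCII range. This sketch looks at four sample characters (chosen just for illustration and written as escape codes so the file itself stays plain ASCII) and records each code plus the number of bytes it needs in UTF-8.

```python
samples = ["A", "\u00e9", "\u20ac", "\u4f60"]  # A, e-acute, euro sign, a Chinese character

# Store (code point, hex code, UTF-8 byte count) for each character
info = {}
for ch in samples:
    info[ch] = (ord(ch), hex(ord(ch)), len(ch.encode("utf-8")))

for ch in samples:
    print(ch, info[ch])
```

Only the first character fits into 7-bit ASCII; the others need two or three bytes each in UTF-8.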

Task 9.1 Unicode charts galore!


Explore the character coding tables at http://unicode.org/charts/ which are all PDF documents containing the full character sets then answer the following questions in your notebooks.

Many Unicode characters in these charts are identified by a 4 hex digit code. How many bits does this represent?
How many characters can be represented in each coding table?
Investigate BabelMap (also available to download from the lesson resources) - software for inspection of Unicode character sets.
Why is Unicode so much better than ASCII?

OUTCOME : Answers to questions about Unicode.

Checkpoint

image

Extension Activities

How about these...

1
ASCII Art

One of the earliest examples of ASCII art was from the 1920s - long before the advent of computers! It was done using a typewriter. You might want to read more about this from this website.

image

2
Programming Challenge : ASCII to String

Write a program to take a sequence of ASCII codes and use them to form a character string, which is then displayed. Below is a sample program run.

image

To help you, the structured English for this problem might look like this ...

get list of ASCII codes and construct a list
loop through each item in the list, convert to a character and append to string
display string


As a reminder, the Python command for converting an ASCII code to a character is chr(code).
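
If you get stuck, the structured English above could be sketched like this. It is one possible approach, not the model answer: the function name is invented, and it assumes the codes arrive as one line of text separated by spaces (here a fixed sample string stands in for what the user would type).

```python
def codes_to_string(codes):
    """Build a string from a list of integer character codes."""
    result = ""
    for code in codes:
        result += chr(code)  # convert each code and append it
    return result

# Hypothetical input: codes typed on one line, separated by spaces
typed = "72 101 108 108 111"
codes = [int(item) for item in typed.split()]

message = codes_to_string(codes)
print(message)  # Hello
```

Swapping the fixed string for input() would let the user type their own codes.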



3
Programming challenge : String to ASCII

Create a new program that will input a short message as a character string then output a sequence of ASCII codes as in the following example.

image

Again, to help you, here is some structured English ...

get string from user
construct a list of ASCII codes from each character in the string
loop through the list and display each code


The Python command for converting a character into an integer ASCII code is ord("character") in case you didn't already know :)
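
Again, only as a hint, the structured English could be sketched as follows. The function name and the sample message are assumptions made for this example; a fixed string stands in for the user's input.

```python
def string_to_codes(message):
    """Build a list of integer character codes from a string."""
    codes = []
    for character in message:
        codes.append(ord(character))
    return codes

# A fixed sample message stands in for what the user would type
codes = string_to_codes("Hi!")

for code in codes:
    print(code)
# 72
# 105
# 33
```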


What's next?
Before you hand your book in for checking, make sure you have completed all the work required and that your book is tidy and organised. Your book will be checked to make sure it is complete and you will be given a spicy grade for effort.

END OF TOPIC ASSESSMENT

Last modified: February 14th, 2024