CS35 : Speaking formally


Computers aren't like us. They (or rather the programmers who instruct them) find it very difficult to interpret free-form human language. Formal language definitions are needed to help them a little.

We are learning ...
  • About the nature of languages
So that we can ...
  • Identify ambiguity in natural language
  • Explain the need for computer languages to have an unambiguous syntax

So, you sit down and start to write your latest, million-selling text-based application ...


Syntax errors like this are annoying to the beginning programmer but are necessary to highlight the fact that you have made a mistake in the structure of the language. In this case, Python required you to match the brackets and you didn't.
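A minimal sketch of the same kind of mistake, assuming a one-line program with its closing bracket left off. The helper name has_valid_syntax is made up for this example; compile() is Python's built-in function for translating source code, and it rejects anything that breaks the syntax rules before a single line is run.

```python
def has_valid_syntax(source):
    """Return True if Python accepts the structure of the source code."""
    try:
        # compile() translates the code without running it,
        # so a failure here is purely a problem with the syntax.
        compile(source, "<example>", "exec")
        return True
    except SyntaxError as error:
        print("Syntax error:", error.msg)
        return False


print(has_valid_syntax('print("Hello, world")'))   # brackets match
print(has_valid_syntax('print("Hello, world"'))    # closing bracket missing
```

The exact wording of the error message depends on the version of Python you are using, but the second line is always rejected.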

Contrast this example with the following ...


There is some evidence to suggest that the human brain is capable of magical things regarding language. Typoglycemia is the ability (that most of us have) to read text even if the middle letters of each word are randomly scrambled.

https://drive.google.com/file/d/0B83yXMOilskaT2ZwSU9tNFV4ZjQ/view?usp=drive_web
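If you want to try this on your own sentences, here is a rough sketch of a scrambler. It assumes the text is plain words separated by spaces (no punctuation), and the function names scramble_word and scramble_text are invented for this example.

```python
import random


def scramble_word(word, rng=random):
    """Shuffle the middle letters, keeping the first and last in place."""
    if len(word) <= 3:
        return word  # at most one middle letter, so nothing to shuffle
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_text(text, rng=random):
    """Scramble every word in a space-separated piece of text."""
    return " ".join(scramble_word(word, rng) for word in text.split())


print(scramble_text("reading scrambled words is surprisingly easy"))
```

The output changes every time because the shuffle is random, but each word always keeps its first and last letter.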

We are also very good at working out the meaning of text even if the grammar is wrong. However, maybe the combination of both is simply too much for us?

Activity 1 Grammar (55)

The language we speak has the same types of grammar rules that Python has but the 'engine' that interprets what is written (the brain) is better at working around the errors than the Python interpreter is. All programming languages that are interpreted by computer have a strict set of syntax rules that governs exactly how the language is to be interpreted by software.


But what about the meaning of what is written? Does that always make sense?

Task 1.1
Generating grammatically correct sentences
Web browser
  • There is a nice website which will allow you to generate random English sentences based on a fixed set of grammar rules. These sentences usually make sense ...
  • Visit this website and generate some random paragraphs of text. Again, they conform to grammar rules but rarely make any sense.
OUTCOME : Nothing really.

In summary, there are two broad types of languages ...
  • Natural languages which are spoken or written by humans to communicate;
  • Formal languages which are interpreted by machines.
Both types of language comprise ...
  • an alphabet which contains legal symbols in the language;
  • words which are elements of the language made from one or more symbols from the alphabet;
  • syntax which defines the structure of the language. We call this grammar;
  • semantics which defines the meaning of the language.
Natural languages are ambiguous in nature - the same words can often be understood in more than one way, and there are lots of ways of saying the same thing. Formal languages must not be ambiguous since they are designed primarily to be interpreted by machine.
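One way to see the difference between syntax and semantics is to give Python a line that follows the grammar perfectly but does not mean anything sensible. This is only a sketch: the line of code being tested is made up for this example, and the "meaning" check is simply Python trying to run it.

```python
source = 'total = "2" + 2'

# Step 1: the structure. The line follows Python's grammar, so this succeeds.
code = compile(source, "<example>", "exec")
print("The syntax is fine")

# Step 2: the meaning. Adding a number to a string makes no sense to Python.
try:
    exec(code)
    meaning_ok = True
except TypeError as error:
    meaning_ok = False
    print("But the meaning is not:", error)
```

This is much like a random sentence that obeys every grammar rule and still makes no sense.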


Task 1.2
Making some notes
Book

Make some notes on the above section, including definitions.

OUTCOME : Some lovely notes about the two 'types' of language and other definitions.


In natural languages, many words have different meanings depending on their context even though they are spelt the same. These are called homonyms and cause nightmares for computers.



Activity 2 Formal languages (40)

In formal languages, we define the alphabet as the set of symbols valid in the language (a finite list) and the language itself as a set of "words over the alphabet". The language can be finite or infinite depending on the rules of combination of the symbols in the alphabet. The grammar of the language is defined using metalanguages like ... 
  • Regular Expressions (RE);
  • (Extended) Backus-Naur Form ((E)BNF);
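  • Reverse Polish Notation (RPN), although strictly this is a notation for writing expressions rather than a way of defining a grammar.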
These grammar definitions say nothing about the meaning or semantics of the language, only how valid words are to be constructed over the alphabet - the syntax if you will.
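As a rough illustration, assume a made-up language over the alphabet {a, b} in which a valid word is one or more a's followed by one or more b's. Neither this language nor the helper name is_word comes from the course materials; the regular expression a+b+ is simply one way of writing the rule, checked here with Python's re module.

```python
import re

ALPHABET = {"a", "b"}            # the legal symbols (assumed for this example)
GRAMMAR = re.compile(r"a+b+")    # the rule for building valid words


def is_word(candidate):
    """Return True if the candidate is a valid word in the language."""
    uses_legal_symbols = set(candidate) <= ALPHABET
    return uses_legal_symbols and GRAMMAR.fullmatch(candidate) is not None


for candidate in ["ab", "aaabb", "ba", "abc", ""]:
    print(repr(candidate), is_word(candidate))
```

Notice that the check says nothing about what "aaabb" means, only whether it has been built correctly.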

The table shows some examples of formal languages.

https://drive.google.com/file/d/0B83yXMOilskaZ2lxUjZ5YnR6Y3M/view?usp=drive_web

The first two examples are infinite languages because there is no limit to the length of a word, and so no limit to the number of different words that can be built from the symbols in the alphabet. The last one, Pencil hardness, is a finite language - only a fixed number of words are valid over the alphabet.
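The difference can be sketched in a few lines of Python. Two things here are assumptions made for illustration: the exact set of pencil grades (real ranges vary between manufacturers) and the choice of binary words (any non-empty string of 0s and 1s) as the infinite language. A finite language can be written out in full as a set, whereas an infinite one can only be described by a rule.

```python
from itertools import product

# Assumed set of pencil grades: 9H down to 2H, then H, F, HB, B, then 2B up to 9B.
PENCIL_GRADES = (
    {f"{n}H" for n in range(2, 10)}
    | {"H", "F", "HB", "B"}
    | {f"{n}B" for n in range(2, 10)}
)


def is_binary_word(word):
    """Rule for the infinite language: one or more symbols, each 0 or 1."""
    return len(word) > 0 and all(symbol in "01" for symbol in word)


def count_binary_words(max_length):
    """Count the valid binary words that are no longer than max_length."""
    count = 0
    for length in range(1, max_length + 1):
        for symbols in product("01", repeat=length):
            if is_binary_word("".join(symbols)):
                count += 1
    return count


print("Pencil grades:", len(PENCIL_GRADES))
for max_length in (2, 4, 8):
    print("Binary words up to length", max_length, ":", count_binary_words(max_length))
```

The number of pencil grades never changes, but the count of binary words keeps growing every time longer words are allowed.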

Task 2.1
Internet research
Web browser
Brain
  • Visit good old Wikipedia and read about formal languages.
  • Visit the Pencilpages website (don't get too excited) and read about pencils.
OUTCOME : Make some notes about formal languages and pencils in your notebooks.


Extension activities

Have a go at the following activities to extend your understanding.
  • Visit the Typoglycemia generator and use it to amaze yourself.
  • Come up with some of your own homonym sentences.

What's next?

Before you hand your book in for checking, make sure you have completed all the work required and that your book is tidy and organised. Your book will be checked to make sure it is complete and you will be given a spicy grade for effort.

END OF TOPIC ASSESSMENT