s5cs38 regular expressions
This page is mainly about s5cs38 regular expressions

We are learning...
How to use regular expressions
So that we can...
State that a language is regular if it can be described using a regular expression
State that regular expressions are a simple way to represent a set
Describe simple languages using regular expressions
Use regular expressions for string matching
Describe the relationship between a regular expression and a finite state automata

Activity 1
What are regular expressions?
A regular language is a formal language that will be...
accepted by a deterministic finite state automaton
accepted by a non-deterministic finite state automaton
represented by a regular expression
A regular expression describes a set of strings to define a formal language. They are mainly
used for pattern matching to find valid strings in a formal language:
words in documents
files in folders
spelling checkers
commands in a programming language
validating email addresses
validating postcodes
syntax highlighting
virus signatures
Regular expressions are often abbreviated to RegEx or RE.

Task 1.1 Regular expression metacharacter reference
Print out and review the metacharacters on the handouts and make sure these are filed for reference. Your teacher will talk you through the contents if necessary.
Outcome: Handouts in your notebooks.


Activity 2
Developing and testing regular expressions
There are lots of different flavours of regular expressions depending on the implementation, so be careful.
There are also lots of different pieces of software and platform implementations of regex that you can use. A nice free one is The Regex Coach.
However, since you have to install it, we will be using an online regular expression engine called Regexr.com ...


Task 2.1 Developing regular expressions
Step one
There is a series of really good videos on YouTube using Regexr which will help you to learn how to use the platform. Watch all of them in order, trying things out on the Regexr website as you watch. Take notes if you need to.
Step two
Next, a few exercises to help you to understand regular expressions.
1
Check out these regular expressions on the Regexr.com website. Can you find two matching strings for each one and state whether the language is finite or infinite? HINT: The presence of particular metacharacters makes a language infinite...

Which of the following matches the regular expression
[a-zA-Z]*[^,]=
a)
Butt=
b)
BotHEr,=
c)
Ample
d)
FIdDiE7h=
e)
Brittle =
Design and test a regular expression which defines the following language.
a) It comprises the alphabet
{#, ,a,b,c,d,e,f,g,h,i,j}
b) Valid words in the language must start with the
#
signc) The hash sign must be followed by 1 or more letters in the range specified
d) Valid words must end in two hash signs and a space to denote the end of the word.
A language, L, is defined by the alphabet
{a,b,c}
together with the regular expression [ab]+bb
a) Explain in English what constitutes a valid string in L.
b) Give five examples of valid strings in L.
c) Explain whether the language is finite or infinite.
Say whether the following strings match the regular expression
be[ea]n
a)
ben
b)
been
c)
bean
d)
beean
Say whether the following strings match the regular expression 10[1]*01[/p]
a)
1001
b)
100101
c)
10101
d)
1011101
The word 'licence' has been spelled in the following ways in a piece of text: licence, license, lisence, license. Write a regular expression that will find all instances of the word 'licence' spelt in these different ways.
Say whether the following strings match the regular expression
\d+[-\s]\d+
a)
01296-433006
b)
01296--433006
c)
01793 234589
d)
01793 234589
Which of the following matches the regular expression
[a-z]+[\.\?!]
If the string does not match, tell me why!
1.
battle!
2.
Hot
3.
green?
4.
swamping.
5.
jump up.
Design a regular expression which matches only the following strings (finite language) ...
x,yz,zx
Write up carefully what you have done in your notebooks. It is quite likely that you will simple copy and paste the regular expression and the strings into Regexr, however, to demonstrate your understanding, you will need to explain the results that you have got rather than stating them.
Outcome : Worked solutions to all 10 regular expression questions


Activity 3
The relationship between regular expression and finite state machines
Regular Expression to FSA
Since a regular language can be described by either a finite state machine or a regular expression, this means that the two are equivalent. You may wish to revise your work on Finite State Machines before you attempt this!
First, watch Barry Browns videos in which he shows you how to convert a regular expression into a Finite State Automata. There are some examples for you to try in the Task which follows.
Converting regular expressions to finite state automata (6:46)
More advanced regular expression to finite state automata examples (8:46)
For instance, consider the following regular expression...
RegEx:
ab*c
reads as "One 'a' followed by 0 or more 'b' followed by one 'c'"Alphabet:
{a,b,c}
Language:
{'ac','abc','abbc','abbbc',...}
- Language is infiniteThis is equivalent to the following finite state automata.

Notice that this representation has a 'dead state', s3, into which the machine falls if there are any unmatched characters in the input string.

Task 3.1 Regular Expression to FSA
Convert the following regular expressions (RE) into their corresponding finite state automata (FSA). Copy the regular expression into your notes and construct the corresponding finite state automata alongside it. Add details of the alphabet, written explanation of the grammar, examples of valid words over the alphabet and whether the machine is finite or infinite.
Look at the example and the videos for guidance if you need to.
(c|ab)
(a|b)a*
b*ac+
Outcome: Regular expressions and their corresponding finite state automata

FSA to Regular Expression
Conversion of a given FSA to a Regular Expression by formal mathematical means is rock hard. Instead, we will learn to approximate the corresponding RE from the given FSM which should give you enough ammo to get a reasonable mark in an examination.
Strategy: Look for as many unique patterns in the input strings as you can find which will lead to the goal state and write them down. If you can, simplify this collection of expressions by looking for common starting and ending strings and then alternate the bits in between. Simples!
See how far you get with these. I'm imagining that ...


Task 3.2 FSA to Regular Expressions
Try to find as many patterns in the strings accepted by these Finite State Automata and write them all down in a list. Look at the list of regular expressions and see whether you can spot any similarities between them. Try to reduce the number of terms as far as you can.
Then use Regexr to test out yor regular expression with strings that you know will and won't be accepted by the Finite State Automata. Adjust your regex as appropriate.
Easy

Medium

Hard

I'll be impressed if you manage to get the same RegEx as me. That's not a problem because there are lots of matching RegEx for each FSA - it depends on how much you simplify it. Try your best. Identify as many 'parts' as you can - the more parts you recognise, the more marks you will get in the exam.
Outcome: An attempt to convert FSA into Regular Expressions.

Extension activities
Isn't it difficult enough?
Isn't it difficult enough?
What's next?
Before you hand your book in for checking, make sure you have completed all the work required and that your book is tidy and organised. Your book will be checked to make sure it is complete and you will be given a spicy grade for effort.
END OF TOPIC ASSESSMENT
Last modified: April 17th, 2024