lesson 7.5 - data validation
Garbage In, Garbage Out! Learn data validation to keep your data clean and useful.

Ever heard the phrase "Garbage In, Garbage Out"? It's the number one rule in the world of data. If you put rubbish, incorrect data into a system for Pedal Power Cycles, you'll get rubbish, useless information out of it. This lesson is all about becoming the digital bouncer for their data. You'll learn how to set up automated rules that check data at the door, rejecting anything that isn't sensible, and keeping their information clean, accurate, and powerful.
Learning Outcomes
The Building Blocks (Factual Knowledge)
The Connections and Theories (Conceptual Knowledge)
The Skills and Methods (Procedural Knowledge)
Recall the five main types of data validation checks (C2.A3.1).
The Connections and Theories (Conceptual Knowledge)
Describe the purpose of data validation in maintaining data integrity.
Explain the 'Garbage In, Garbage Out' (GIGO) principle.
The Skills and Methods (Procedural Knowledge)
Apply a range of data validation rules to a spreadsheet to restrict data entry.
Digital Skill Focus: This lesson, you will focus on applying rigorous data validation rules to a range of cells to restrict user input and prevent erroneous data entry.
The Digital Bouncer: Keeping Bad Data Out
What is Data Validation? (And why do we need it?)
Data validation is the process of using automated checks to ensure that data entered into a system is reasonable, sensible, and accurate. Its main purpose is to prevent bad data from ever getting into your spreadsheet or database in the first place.
This is vital because of a famous rule in computing: Garbage In, Garbage Out (GIGO).
If a manager at Pedal Power Cycles accidentally enters a bike's price as "£5O0" (with a letter 'O' instead of a zero) or a quantity as "-1", any calculations based on that data will be completely wrong. The final profit figures and charts on the dashboard would be useless 'garbage'. Validation acts as a gatekeeper to stop this from happening.
The Five Main Types of Validation Check
Spreadsheet software has built-in tools to perform these checks automatically.
1. Presence Check
The Question: Is the field empty?
The Rule: This check simply ensures that a field has not been left blank.
Pedal Power Cycles Example: A 'Customer Surname' field cannot be left empty on a sales record.
2. Length Check
The Question: Does the data have the correct number of characters?
The Rule: This check ensures the data is a specific length, or within a minimum and maximum length.
Pedal Power Cycles Example: A 'ProductCode' might have to be exactly 5 characters long, like "RD-01".
3. Format Check
The Question: Does the data match the correct pattern?
The Rule: This checks that the data has been entered in the correct format. It's also known as a 'picture check'.
Pedal Power Cycles Example: An email address must contain an '@' symbol and a dot, like "manager@pedalpower.com". A date must be in the DD/MM/YYYY format.
4. Range Check
The Question: Is this number within acceptable limits?
The Rule: This check ensures that numerical data is between a minimum and a maximum value.
Pedal Power Cycles Example: The 'QuantitySold' field must be a number greater than 0. A percentage discount must be between 0 and 50.
5. Lookup Check
The Question: Does this entry exist in a predefined list of accepted values?
The Rule: This check compares the input against a list of valid options. This is often implemented as a drop-down list to prevent spelling mistakes.
Pedal Power Cycles Example: A 'BikeCategory' field must be one of the following: "Mountain", "Road", "Hybrid", or "Kids".
Think like an examiner! Look at the core concepts in these notes and write down two common mistakes or 'traps' you think students might fall into when answering questions on this topic. How will you make sure you avoid them?

Task The Data Gatekeeper Challenge
You are now in charge of data quality at Pedal Power Cycles! You've been given a spreadsheet used to track staff training hours, but it's a disaster waiting to happen. Managers can type anything they want into it. Your job is to lock it down using data validation rules to force users to enter correct and sensible data.
1
Get Organised!
Download the starter file: ppc-training.xlsx.
Save or move this file into your Component_2 -> 03-Dashboard folder.
Open the file. You will see a simple table for recording staff training.
2
Apply a Length Check
The Staff ID must always be exactly 4 characters long (e.g., A101, B204).
Select all the cells in the 'StaffID' column (from A2 downwards).
Find the Data Validation tool in your software (usually under the 'Data' menu).
Set up a rule that only allows a Text length that is equal to 4.
3
Apply a Lookup Check
The 'Job Role' must be chosen from a specific list to avoid spelling mistakes.
Select all the cells in the 'JobRole' column.
In the Data Validation settings, choose to allow a List.
In the source box, type the following exactly: Mechanic,Sales,Manager. This will create a drop-down menu.
4
Apply a Range Check
Staff training hours for a week should never be more than 35.
Select all the cells in the 'TrainingHours' column.
Set up a validation rule that only allows a Whole number between 0 and 35.
5
Apply a Presence Check
Every record must have a contact number. This field cannot be left blank.
Select all the cells in the 'ContactNumber' column.
This one is tricky! A presence check is often done using a Text length rule. Set the length to be greater than 0.
6
Test Your Rules!
This is the most important step. Go back to your table and deliberately try to enter bad data.
Try to enter a StaffID that is 5 characters long.
Try to type "Salesperson" into the Job Role column.
Try to enter -5 or 40 into the Training Hours column.
Try to delete a contact number.
Does the spreadsheet stop you? If so, you have succeeded!
Act as a spreadsheet expert. Explain step-by-step how to create a drop-down list in a cell using data validation. The list should contain the options "Mountain", "Road", and "Hybrid". Limit the response to 120 words for a Key Stage 4 student. NO intro, NO outro, NO deviation from the topic, NO follow-up questions.
7
PROVE IT!
To prove what you have learnt today, create an 'artefact' (a document, image, presentation) which describes what each validation method does, how it's set up in Excel and screenshot evidence of you applying and testing the validation.
Outcome: A spreadsheet where the key columns have been protected with data validation rules, preventing users from entering incorrect data and an artefact explaining what each validation rule does and how to create it.

Hungry for more?
Format Check Wizardry: Research how to use the 'Custom' data validation rule to check for a specific format. Can you create a rule that forces a product code to start with a letter and end with a number? Find examples here.
Live Form Validation: Go to a major website with a sign-up form (like a social media site or online shop). Deliberately try to enter incorrect data (e.g., an email without an '@' symbol, a password that's too short). Watch how the website gives you real-time validation feedback.
Validation in Code: If you're a programmer, investigate how data validation is handled in a language like Python. How do you write code to check if input is a number before you try to do maths with it?
Application to the Component Sample PSA
While the Component 2 PSA does not require you to set data validation rules, understanding them is crucial for Task 1 and Task 2. The data files provided by Pedal Power Cycles will likely contain errors - data that would have been stopped by good validation.
Your knowledge will help you to spot these errors during your initial analysis of the data. For example, you might notice product codes with the wrong length, or sales quantities that are negative numbers. In your report for Task 1, you can comment on the quality of the data and in Task 2, you will need to clean these errors before you can build your dashboard. Understanding validation helps you identify what 'clean' data should look like.
Out of Lesson Learning
⭐ Error Spotter
Look at the small Pedal Power Cycles sales table below. It contains at least three errors that validation checks would have prevented. List the three errors and name the validation check that failed.
Look at the small Pedal Power Cycles sales table below. It contains at least three errors that validation checks would have prevented. List the three errors and name the validation check that failed.
SaleID
CustomerEmail
Quantity
101
a.smith@email.com
3
102
1
103
b.jones@email.com
-2
104
c.davies-email.com
5
⭐⭐ Rule Writer
The Pedal Power Cycles 'staff information' spreadsheet contains fields for 'NationalInsuranceNumber' and 'StartDate'. In plain English, write two clear and specific validation rules you would recommend for these fields. For each rule, state the type of validation check you are proposing (e.g., length, format, range).
The Pedal Power Cycles 'staff information' spreadsheet contains fields for 'NationalInsuranceNumber' and 'StartDate'. In plain English, write two clear and specific validation rules you would recommend for these fields. For each rule, state the type of validation check you are proposing (e.g., length, format, range).
⭐⭐⭐ Error Message Designer
For the two rules you created in the task above, design a user-friendly error message for each. The message should appear if a manager enters incorrect data. A good error message is polite and clearly explains what the user did wrong and how to fix it. For example, instead of just "Error!", a better message would be "Invalid Entry: The start date cannot be in the future."
For the two rules you created in the task above, design a user-friendly error message for each. The message should appear if a manager enters incorrect data. A good error message is polite and clearly explains what the user did wrong and how to fix it. For example, instead of just "Error!", a better message would be "Invalid Entry: The start date cannot be in the future."
Last modified: June 18th, 2026
