Nau mai, haere mai ki te whārangi "Error Control"! This page will explore how digital systems guard against mistakes and changes in data, ensuring that information remains accurate and reliable. This topic aligns with Digital Technologies Achievement Standard 91898: Demonstrate understanding of a computer science concept.
Define what "error control" is and its fundamental purpose.
Understand the various forms and locations where data errors can occur.
Explain why error control is critical for the reliability and usability of digital systems.
Identify the potential consequences of undetected data errors.
Differentiate between error detection and error correction.
Explain the concept of 2-dimensional parity and how it is used for both error detection and correction.
Understand the limitations of simple parity in detecting and correcting multiple errors.
Recognize the use of check digits in real-world numbering systems (e.g., barcodes, credit cards) and how they detect common errors.
Explain the difference between Forward Error Correction (FEC) and Automatic Repeat Request (ARQ).
Describe how QR codes use Reed-Solomon error correction for robustness and the concept of code-rate/overhead.
Understand the historical evolution and future importance of error control in various digital technologies.
To get the most out of learning about Error Control, it's essential to have a clear understanding of:
Digital Devices: Do you know that devices process, store, and transmit information?
Binary: Do you understand that all digital information is stored as 0s and 1s?
Data Transmission: Are you familiar with how data moves between devices?
Computer Security: Do you understand the concept of "Integrity" (ensuring data is accurate and hasn't been tampered with) as part of the CIA Triad?
Encryption: Have you explored how data can be scrambled to keep it secret?
Algorithms: Do you understand what an algorithm is (a set of step-by-step instructions)?
Quick Check: Imagine sending a text message, and a few letters get jumbled up on the way. How might that happen, and what problems could it cause? This is what error control tries to prevent!
Error control is a set of techniques used in digital systems to detect and often correct unintended changes or corruption of data. These changes, known as "errors," can happen for various physical reasons, such as electrical interference, a tiny scratch on a storage device, or even cosmic rays affecting computer memory.
The core purpose of error control codes is to guard against these physical problems, ensuring that (most of the time) digital systems just work reliably without you having to worry about such errors. It's a critical component of ensuring data Integrity within Computer Security.
Error control coding is specifically concerned with detecting when these errors occur and, if possible, correcting the data to what it is supposed to be. Some error control schemes only provide error detection, while others offer both detection and correction.
Nobody wants a computer that is unreliable and won't do what it's supposed to do because of bits being changed! Error control coding helps prevent this by addressing errors that occur in various places:
Sources of Data Errors:
Data errors can occur almost anywhere digital information exists or travels. They can be broadly categorized by their source:
Physical Problems:
Networks can have a lot of "noise" caused by poor quality wiring, electrical interference, or interference from other networks. This noise can jumble bits during Data Transmission.
The bits on disks (like hard drives or solid-state drives in cellphones/cameras) are very small, and imperfections in the surface can eventually cause some of the storage to fail. The surfaces of CDs and DVDs are exposed and can easily be damaged by storage conditions (e.g., heat or humidity) or handling (e.g., scratches or dust). Bits getting changed on permanent storage is sometimes referred to as data rot or bit rot.
Even data in the computer's temporary memory (RAM) or being processed in the CPU can be affected by physical problems, such as a cosmic ray flipping a single bit. This can lead to critical data corruption.
Human Input Errors:
Errors can occur when numbers are manually typed in, such as entering an incorrect bank account number for a payment, or a wrong container number for shipping.
A barcode on a product might be slightly scratched, have a black mark on it, or the package might be bent, making it difficult for a scanner to read properly and leading to an incorrect scan. Sometimes, the scanner won't read the barcode at all, requiring manual entry, which itself is prone to human error.
Consequences of Undetected Errors:
If errors in data are not detected and corrected, the information will just be used with incorrect values, leading to potentially serious problems:
A very poorly written banking system could result in your bank balance being incorrectly changed if just one of the bits in a number was flipped in the computer's memory.
If the barcode on a packet of chips you buy from the shop is scanned incorrectly, you might be charged for shampoo instead. Or, you might buy a book online by entering the ISBN, and the wrong book is sent to you.
A few days after you buy something online, you might get an email saying the credit card number you entered was one digit different from your own and matched another cardholder's number, so the wrong person was charged.
If you transfer a music file from your laptop to your MP3 player and a few of the bits were transferred incorrectly during Data Transmission, the MP3 player might play annoying glitches in the music.
Error control codes are constantly working behind the scenes to prevent these issues, making our digital experiences reliable.
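To see how much damage a single flipped bit can do to a number such as a bank balance, here is a minimal sketch. The balance value and the choice of which bit to flip are made up for illustration; a real system would store numbers in its own format.

```python
def flip_bit(value: int, bit_position: int) -> int:
    """Return value with one bit inverted (position 0 is the rightmost bit)."""
    return value ^ (1 << bit_position)

balance = 1000                       # hypothetical bank balance in dollars
corrupted = flip_bit(balance, 13)    # simulate a single bit being flipped

print(format(balance, "016b"))       # 0000001111101000
print(format(corrupted, "016b"))     # 0010001111101000
print(balance, "->", corrupted)      # 1000 -> 9192
```

Only one of the 16 bits differs, yet the number is more than nine times larger. Without error control, nothing in the data itself reveals that the second value is wrong.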
Error Detection vs. Error Correction
Error control schemes fall into two main categories:
Error Detection: These schemes can identify that an error has occurred, but they cannot fix the data. If an error is detected, the system might ask for the data to be resent (e.g., if a network packet arrives corrupted) or inform the user that there's a problem (e.g., when a credit card number is typed incorrectly).
Error Correction: These schemes can not only detect errors but also automatically fix them, reconstructing the original data. This is more complex and usually involves adding more redundant (extra) information to the data.
Many error control schemes, particularly those for Data Transmission (like sending data from a server overseas to your computer), send data in very small pieces called packets (covered in the Network Communication Protocols page). Each packet has error detection information added to it. If a packet is detected as having an error, it can be re-requested, ensuring the complete message arrives correctly.
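The detect-and-re-request idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular network works: it uses CRC-32 from Python's `zlib` module as the error detection information, and `noisy_channel` is a hypothetical stand-in for a link that damages the first copy of the packet.

```python
import zlib

def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload as error detection information."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def packet_is_ok(packet: bytes) -> bool:
    """Recalculate the CRC-32 and compare it with the one that arrived."""
    payload, received_crc = packet[:-4], packet[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == received_crc

def noisy_channel(packet: bytes, attempt: int) -> bytes:
    """Hypothetical channel: flips one bit on the first attempt only."""
    if attempt == 0:
        damaged = bytearray(packet)
        damaged[0] ^= 0b00000100
        return bytes(damaged)
    return packet

def send_with_retries(payload: bytes, max_attempts: int = 3):
    """Keep re-requesting the packet until a copy passes the check."""
    packet = make_packet(payload)
    for attempt in range(max_attempts):
        received = noisy_channel(packet, attempt)
        if packet_is_ok(received):
            return received[:-4], attempt + 1
    raise RuntimeError("packet could not be delivered")

data, attempts = send_with_retries(b"Kia ora")
print(data, attempts)   # b'Kia ora' 2
```

The receiver never tries to repair the damaged packet. It only notices the mismatch and asks again, which is why this approach needs a sender that is still available to resend.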
The Parity Trick: A Simple Error Control Algorithm (Hamming Code Related)
The parity trick is a classic demonstration of a simple error control coding algorithm called 2-dimensional parity. It uses a grid of black and white cards (which represent bits, with their two states being black/white, like 0/1). The original data (e.g., some text or an image represented as bits) is laid out in a grid. This method is related to Hamming codes, which also use overlapping parity checks to locate an error.
You add extra cards (called parity bits) to the right of each row and at the bottom of each column. Each parity bit is chosen so that every row and every column has an even number of black cards. (If there are 8 cards in a row/column and an even number of black cards, there will also be an even number of white cards). The bottom-right card acts as a parity bit for both its row and column, and its color should correctly maintain even parity for both.
When one card is flipped (simulating a 1-bit error due to dust, a cosmic ray, or interference), the row and the column that card was in will both suddenly have an odd number of black cards. You can easily spot these "odd" rows and columns. This means the algorithm has error detection.
The exact card that was flipped is at the intersection of the row and column that now have an odd number of black cards. Because you can pinpoint the exact location, you can flip that card back, thereby correcting the error. This means the algorithm also has error correction for single-bit errors.
In the demonstration, a 7x7 grid of data cards gains 15 parity cards to become an 8x8 grid. A single extra parity card covering the whole grid would have been enough to detect a 1-bit error. However, all 15 extra cards (the parity row and column) are needed to locate the error and correct it. This shows that error correction generally costs a lot more space (in terms of extra bits) than simply error detection!
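The parity trick can be sketched in code. In this illustration, 1 stands for a black card and 0 for a white card, rows and columns are numbered from 0, and the function names and the small 3x3 data grid are made up for the example.

```python
def add_parity(data):
    """Return a new grid with an even-parity bit on each row, plus a parity row."""
    rows = [row + [sum(row) % 2] for row in data]
    parity_row = [sum(column) % 2 for column in zip(*rows)]
    return rows + [parity_row]

def find_odd_lines(grid):
    """Return the indexes of rows and columns that hold an odd number of 1s."""
    odd_rows = [r for r, row in enumerate(grid) if sum(row) % 2 == 1]
    odd_cols = [c for c, col in enumerate(zip(*grid)) if sum(col) % 2 == 1]
    return odd_rows, odd_cols

def correct_single_error(grid):
    """Flip back the bit where the one odd row meets the one odd column."""
    odd_rows, odd_cols = find_odd_lines(grid)
    if not odd_rows and not odd_cols:
        return "no error detected"
    if len(odd_rows) == 1 and len(odd_cols) == 1:
        r, c = odd_rows[0], odd_cols[0]
        grid[r][c] ^= 1
        return f"corrected bit at row {r}, column {c}"
    return "error detected but not correctable"

data = [[1, 0, 0],
        [0, 1, 1],
        [1, 1, 0]]
grid = add_parity(data)            # 3x3 data becomes a 4x4 grid
grid[1][2] ^= 1                    # a "friend" secretly flips one card
print(correct_single_error(grid))  # corrected bit at row 1, column 2
```

The sketch only repairs the grid when exactly one row and one column are odd, which is the single-bit case the trick handles. Any other pattern is reported rather than guessed at.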
Limitations of Parity: What if More Bits Flip?
While 2-dimensional parity is clever, it has limitations, especially when multiple errors occur:
2-Bit Errors: If two cards (bits) are flipped:
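The system can always detect that an error has occurred: at least two rows or two columns (or both) will have odd parity. If the two flipped cards share a row, that row looks even again, but two columns will be odd.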
However, the system cannot correct a 2-bit error. There will be multiple possible pairs of cards that could have been flipped to cause the observed parity changes, so you cannot uniquely pinpoint the original two flipped cards.
3-Bit Errors: If three cards are flipped, the algorithm will always detect that an error has occurred. However, correction is still not possible.
4-Bit Errors: With four cards flipped, it's possible (though not likely) that the errors can go undetected. For example, if you flip four cards that form the corners of a rectangle in the grid, every affected row and column contains two flipped cards, so all row and column parities return to "even" and the system believes there's no error, even though four bits have changed.
For very simple systems (like a single parity bit for a whole message, not a grid), a single bit error can be detected (the total number of black cards will become odd), but a 2-bit error won't be detected because the number of black cards will be even again. This makes such single-bit parity systems unreliable for multiple errors. The size of the grid (e.g., 6x6, 4x7, or 10x10) doesn't change these fundamental properties.
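These limitations can be checked with a short experiment. The sketch below starts from a hypothetical 4x4 grid of all white cards (0s), which already has even parity everywhere, and reports which rows and columns look odd after different flips. The helper names are made up for this example.

```python
def odd_lines(grid):
    """Return the indexes of rows and columns holding an odd number of 1s."""
    odd_rows = [r for r, row in enumerate(grid) if sum(row) % 2 == 1]
    odd_cols = [c for c, col in enumerate(zip(*grid)) if sum(col) % 2 == 1]
    return odd_rows, odd_cols

def flip(grid, positions):
    """Return a copy of grid with the bits at (row, column) positions flipped."""
    copy = [row[:] for row in grid]
    for r, c in positions:
        copy[r][c] ^= 1
    return copy

def single_parity_ok(bits):
    """Check a message protected by just one even-parity bit."""
    return sum(bits) % 2 == 0

grid = [[0] * 4 for _ in range(4)]   # every row and column is even

# Two flips in different rows and columns: detected, but ambiguous
print(odd_lines(flip(grid, [(0, 0), (2, 3)])))                  # ([0, 2], [0, 3])
# Two flips in the same row: the row looks fine, two columns are odd
print(odd_lines(flip(grid, [(1, 0), (1, 3)])))                  # ([], [0, 3])
# Four flips at the corners of a rectangle: nothing looks wrong
print(odd_lines(flip(grid, [(0, 0), (0, 2), (3, 0), (3, 2)])))  # ([], [])

# A single parity bit for a whole message
message = [1, 0, 1, 1, 0, 1, 0]
coded = message + [sum(message) % 2]
one_error = coded[:]
one_error[2] ^= 1
two_errors = one_error[:]
two_errors[5] ^= 1
print(single_parity_ok(one_error))    # False (error detected)
print(single_parity_ok(two_errors))   # True (error missed)
```

In the first case the odd rows 0 and 2 and odd columns 0 and 3 could have come from flipping (0, 0) and (2, 3), or equally from flipping (0, 3) and (2, 0). That ambiguity is why a 2-bit error can be detected but not corrected.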
Check Digits and Checksums: Guarding Against Human Errors & File Corruption
Many important numbers used in everyday life have error control coding built into them, often in the form of a check digit. The check digit is usually the last digit in the number and is calculated from all the other digits using a special algorithm. If any of the digits are typed or scanned incorrectly, there's a good chance the error will be detected.
Common Examples of Numbers with Check Digits: Barcode numbers (GTIN-13), credit card numbers, bank account numbers, ISBNs (International Standard Book Numbers), national health and social security numbers, and shipping labels (SSCC).
How Check Digits Work:
When you enter a number (e.g., a credit card number into a web form), the system uses the first digits to calculate what the last digit should be.
If the calculated check digit doesn't match the one you entered, the system detects an error and notifies you (e.g., "Credit card number is not valid"). This happens without needing to check a database.
Check digits are designed to catch common typing or scanning mistakes:
Getting one digit wrong (Substitution): If one digit is entered incorrectly (e.g., '9' instead of '8'), well-designed check digit systems such as GTIN-13 and the Luhn algorithm will always detect this. This is because the mathematical calculation used to get the check digit ensures that a single digit change will always result in a different expected check digit.
Swapping two adjacent digits (Transposition): If two numbers next to each other are swapped (e.g., '34' becomes '43'), most check digit systems will detect this. However, it's a bit more challenging to design algorithms that catch all transpositions.
Missing a digit / Adding a digit: These errors are usually detected immediately because the total length of the number will be incorrect (e.g., a GTIN-13 barcode must have exactly 13 digits).
Twin Errors: Where a digit repeated twice is changed to a different digit that's also repeated twice (e.g., '44' becomes '99'). Some check digit algorithms might not detect these if the contribution of the changed digits to the sum remains mathematically the same.
Jump Transposition Errors: Where two digits separated by one or more digits are swapped (e.g., 812 becomes 218). Simple check digit algorithms sometimes don't detect these.
Example: GTIN-13 Barcodes:
Most products have a 13-digit "Global Trade Item Number" (GTIN-13). The first 12 digits identify the product, and the 13th is the check digit.
Calculation Algorithm (Simplified):
Multiply every second digit (starting from the second digit from the left) by 3.
Multiply every other digit (starting from the first digit from the left) by 1.
Add up all these multiplied numbers to get a sum.
The check digit is the number needed to bring this sum up to the nearest multiple of 10 (or 0 if the sum is already a multiple of 10).
Example: For 9300675032247 (Cola bottle), the sum of (9*1)+(3*3)+(0*1)+(0*3)+(6*1)+(7*3)+(5*1)+(0*3)+(3*1)+(2*3)+(2*1)+(4*3) is 73. To reach the next multiple of 10 (80), you need to add 7, which is indeed the check digit.
When a barcode is scanned, the scanner performs this calculation. If the calculated check digit doesn't match the 13th digit on the barcode, an error is detected, and the operator is alerted (e.g., with a warning sound). This means the system knows it's a wrong number without needing to look it up in a database.
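The GTIN-13 steps above can be written as a short function. This is a minimal sketch with made-up function names. It reuses the cola bottle number from the example and then tries some of the typing mistakes described earlier; the altered numbers were constructed for this illustration.

```python
def gtin13_check_digit(first_12_digits: str) -> int:
    """Calculate the check digit from the first 12 digits of a GTIN-13."""
    total = 0
    for index, char in enumerate(first_12_digits):
        weight = 3 if index % 2 == 1 else 1   # 2nd, 4th, 6th... digits count 3 times
        total += int(char) * weight
    return (10 - total % 10) % 10             # distance up to the next multiple of 10

def gtin13_is_valid(number: str) -> bool:
    """Check that a 13-digit number ends with the expected check digit."""
    return (len(number) == 13 and number.isdigit()
            and gtin13_check_digit(number[:12]) == int(number[12]))

print(gtin13_check_digit("930067503224"))   # 7
print(gtin13_is_valid("9300675032247"))     # True  (the original number)
print(gtin13_is_valid("9300675032347"))     # False (one digit wrong: 2 became 3)
print(gtin13_is_valid("9300765032247"))     # False (adjacent 6 and 7 swapped)
print(gtin13_is_valid("9300670532247"))     # True  (adjacent 5 and 0 swapped, missed)
print(gtin13_is_valid("9300675022347"))     # True  (jump swap of 3 and 2, missed)
print(gtin13_is_valid("930067503224"))      # False (a digit is missing)
```

The two missed cases show why the text says "most" rather than "all". With weights of 1 and 3, swapping adjacent digits that differ by exactly 5 changes the sum by 10, and swapping digits two places apart does not change the sum at all because both positions have the same weight.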
Checksums: Larger checksums are also used to check that downloaded files are correct. A checksum is a value calculated from a block of data that can be used to detect changes or errors in the data.
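As a rough illustration of checking a download, the sketch below calculates a checksum for some bytes and compares it with a published value. It assumes SHA-256 from Python's `hashlib` module, which is one common choice for file checksums; the "file" contents here are made up.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

original = b"Pretend this is the contents of a downloaded file."
published = checksum(original)       # the value a website would list

damaged = bytearray(original)
damaged[5] ^= 0b00000001             # one bit changes during the download

print(checksum(original) == published)         # True
print(checksum(bytes(damaged)) == published)   # False
```

A mismatch tells you the file has changed, but not where or how, so this is error detection only. The usual fix is to download the file again.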
Quick Response (QR) codes are 2-dimensional (matrix) barcodes that are excellent for transmitting data from paper to a digital device using a camera. They are used in advertising, payments, information sharing, contact tracing, and more. Because they are often in public places, they may need to be read under poor lighting and are vulnerable to physical damage.
QR codes rely heavily on error correction codes to ensure data can be scanned reliably even when damaged. When a small number of bits in a QR code are changed (e.g., coloring in a square), the scanning software can detect this and usually correct the error, working out the original message. This is Forward Error Correction (FEC) at work.
Eventually, if too many bits are changed, QR codes can't make the correction. What's impressive is that they still perform error detection: they won't read the data incorrectly; instead, they just refuse to give any data at all, rather than giving you corrupted information.
The powerful method that QR codes use to deal with data corruption is called Reed-Solomon error correction. This method adds extra bits to the data so that errors can be corrected, and it is able to deal with a lot more errors than simple parity. Reed-Solomon codes are also widely used for hard disks and optical disks (CDs, DVDs, and Blu-ray) because they cope well with bursts of adjacent errors (like a scratch on a DVD or a splash on a QR code). It's a form of Forward Error Correction because if an error is detected in a file saved long ago, you can't ask for retransmission (ARQ is not practical); the data must be self-correcting.
Reed-Solomon codes are also used by spacecraft like Voyager to reconstruct data sent from the edge of the solar system, where retransmission (ARQ) would be impractical due to the 20+ hour signal travel time.
The code-rate of an error correction method is based on how much of the final coded message contains the original data. If k bits of useful data are represented using n bits in total, the code-rate is k/n. The rest is overhead, which is the extra redundant data added for error control. For example, if a 5x5 grid (25 data bits) uses 11 parity bits, the total is 36 bits. The code-rate is 25/36 (about 69.4%), and the overhead is 30.6%. A higher overhead allows for more error correction. Reed-Solomon codes allow the code-rate to be adjusted depending on how many errors are expected.
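The code-rate arithmetic above can be reproduced with two small helper functions (the names are made up for this sketch). It uses the 5x5 grid from the example and the 7x7 grid from the parity trick.

```python
def code_rate(data_bits: int, total_bits: int) -> float:
    """Fraction of the coded message that is original data (k/n)."""
    return data_bits / total_bits

def overhead(data_bits: int, total_bits: int) -> float:
    """Fraction of the coded message that is redundant error control data."""
    return (total_bits - data_bits) / total_bits

# 5x5 data grid with a parity row and column: 25 data bits in 36 total
print(f"{code_rate(25, 36):.1%}  {overhead(25, 36):.1%}")   # 69.4%  30.6%

# 7x7 data grid with a parity row and column: 49 data bits in 64 total
print(f"{code_rate(49, 64):.1%}  {overhead(49, 64):.1%}")   # 76.6%  23.4%
```

The larger grid has a higher code-rate because its 15 parity bits are spread over more data, but it still corrects only one error, so each data bit is less well protected.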
The QR code layout follows a standard with various sizes (versions) and four error correction levels (Low, Medium, Quartile, High). The highest level can correct up to 30% of corrupted data.
The need for error control has been around for a long time, evolving with technology:
Early computing storage, like reels of tape, stored important information. They weren't always reliable, so simple parity checks were added. These early systems often used an ARQ approach: if a multi-bit error was detected, it was obvious the data was unreliable, and the tape had to be read again.
The last digit of a credit card is a check digit calculated using Luhn's algorithm. This algorithm (multiplying every second digit by 2) is still used today, highlighting how robust simple error detection can be.
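For comparison with the barcode method, here is a minimal sketch of a Luhn check. The text above only mentions doubling every second digit. The remaining details used here (counting from the right, subtracting 9 from any doubled value above 9, and requiring the total to be a multiple of 10) follow the standard description of the algorithm. The numbers tested are widely published test values, not real cards.

```python
def luhn_is_valid(number: str) -> bool:
    """Check a number (spaces allowed) against the Luhn algorithm."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Work from the right: the check digit is position 0 and is not doubled
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9        # same as adding the two digits of the product
        total += digit
    return len(digits) > 1 and total % 10 == 0

print(luhn_is_valid("79927398713"))           # True
print(luhn_is_valid("79927398710"))           # False (last digit changed)
print(luhn_is_valid("4111 1111 1111 1111"))   # True
```

A web form can run a check like this instantly, before contacting any bank, which is how it can say a card number "is not valid" as soon as it is typed.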
The availability of lasers in the 1970s made barcode scanning feasible for commerce. A Universal Product Code (UPC) standard was developed that included a check digit (using multiplication by 3 for alternating digits) to make scanning reliable. This system can always detect single-digit errors and most two-digit errors. This reliability made people trust barcode scanners so much that they often don't even check their receipts!
Two-dimensional barcodes like QR codes appeared, capable of storing much more data. This allowed for the inclusion of powerful Forward Error Correction (FEC) (like Reed-Solomon), making them incredibly resilient to damage and wear and tear. The widespread availability of smartphones with cameras from around 2003 (with built-in QR code scanning software) made them convenient for public use.
Error correction looks to be essential for emerging technologies like RFID tags (which might replace barcodes), for high-speed data transmission, and even for quantum computing to deal with unstable "qubits" (the basic unit of quantum information).
The codes discussed on this page are all widely used, but the most widely used codes for data storage are more sophisticated, like the Reed-Solomon codes and Cyclic Redundancy Check (CRC), because they deal with more complex errors than single bits changing (e.g., many adjacent bits affected by a scratch). For human-readable numbers, check digits (like those calculated with the Luhn algorithm) remain very common.
Activity 1: Spot the Data Glitch:
Task: Imagine a digital error occurring in the following situations.
a) A medical device monitoring a patient's heart rate.
b) A digital recipe for baking a cake.
c) A message sent between two astronauts in space.
Activity: For each scenario, describe:
What kind of error might occur in the data (e.g., a number changes, a word gets jumbled).
What the potential negative consequence of that undetected error would be.
Evidence: Create a Google Doc or Google Slide outlining your scenarios and their consequences.
Activity 2: Human Error Check (Unplugged & Evidence Submission):
Task: This is an unplugged activity! Work with a partner.
Activity:
Person A: Write down a short, simple sequence of numbers (e.g., a phone number, a 4-digit PIN, or a simple math problem like "123 + 456 ="). Now, subtly change one digit without telling your partner (e.g., 123 + 456 = becomes 128 + 456 =).
Person B: Look at the sequence. Without doing the math, can you tell if something looks "off" just by quickly scanning it, or would you likely use the incorrect number?
Now, Person A creates a very simple "error detection rule" (e.g., "The last digit must be even," "The sum of the first two digits must be 7").
Person B: Re-check the original sequence using Person A's rule. Can you now detect the error?
Evidence: In a Google Doc, record the original sequence, the altered sequence, your error detection rule, and a reflection on whether the rule helped detect the error.
Activity 3: Everyday Check Digits:
Task: Find an example of a real-world number that uses error detection (e.g., a barcode on a product, an ISBN from a book, or the number on a credit card - DO NOT use your actual credit card number, just look at the format!).
Activity:
Write down the number.
Explain why a number like this needs error detection (what would be the problem if it was typed or scanned incorrectly?).
(Optional Challenge): Research online how the "check digit" (often the last digit) of that specific number system works to detect errors.
Evidence: In a Google Doc, record your chosen number and your explanation of why error detection is needed.
Activity 4: Parity Trick Practice
Task: This activity can be done either with physical black and white cards (or coins) in a grid, or using an online interactive if available.
Activity:
Create a 5x5 grid of data cards (25 cards).
Add a row and column of parity bits to make it a 6x6 grid, ensuring each row and column has an even number of "black" cards (or whatever parity rule you choose).
Have a friend (or imagine the interactive) secretly flip one card.
Using the parity rules, identify which card was flipped and correct it.
Evidence: In a Google Doc or Google Slide, explain the steps you took to add the parity bits, detect the error, and correct it. You can include a simple diagram of your grid before and after the error.
Activity 5: Parity Limitations
Task: Using your 6x6 grid (from Activity 4, or a new one), explore the limitations of 2-dimensional parity.
Activity:
Have a friend (or imagine it) secretly flip two cards. Can you always detect that an error occurred? Can you always correct it? Explain why.
Try to find a way to flip four cards such that the 2-dimensional parity system fails to detect the error (i.e., all row and column parities return to "even" again). Draw a diagram of these four flipped cards.
Evidence: In a Google Doc, describe your observations for 2-bit errors (detection vs. correction) and provide a diagram for how a 4-bit error can go undetected.
Activity 6: GTIN-13 Check Digit Calculation
Task: Use the GTIN-13 check digit algorithm explained above to verify a barcode number.
Activity: Calculate the check digit for the first 12 digits of the barcode 9300675036009. Show your working (multiplications and sum). Does your calculated check digit match the last digit (9)?
Evidence: In a Google Doc, show your calculations and conclusion.
Activity 7: Detecting Barcode Errors
Task: One of the following product numbers has one incorrect digit. Use the GTIN-13 check digit method (or an online checker) to identify which one is incorrect.
9400550619775
9400559001014
9300617013199
Activity: State which barcode is incorrect and explain how you determined this. Try changing one digit in one of the correct numbers and see if the check digit still works.
Evidence: In a Google Doc, provide your answer and explanation.
Activity 8: QR Code Robustness
Task: Find a QR code (e.g., from a product, a website, or print one out).
Activity:
Scan the original QR code with a camera app on a mobile device to confirm it works.
"Damage" the QR code by coloring in one black square with white-out, or one white square with a black pen (avoid the three large corner squares).
Scan the "damaged" QR code. Does it still scan correctly?
Continue damaging the QR code by changing more squares, one by one. After each change, try scanning it. How many changes (approximate percentage of corrupted squares) can the QR code tolerate before it fails to scan the correct message or refuses to scan at all?
Evidence: In a Google Doc or Google Slide, describe your experiment, including an estimate of the percentage of corruption the QR code could handle, and whether it scanned incorrectly or refused to scan.
Check your understanding of Error Control.
Multiple Choice: What is the primary goal of error control in digital systems?
a) To make data transfer faster.
b) To ensure data remains accurate and reliable despite physical problems.
c) To prevent unauthorized access to data.
d) To compress data for smaller storage.
Short Answer: Name two different locations or situations where data errors might occur (e.g., during storage, processing, or transmission).
Scenario: You are downloading a very important software update to your computer over the internet.
Why is error control particularly important during this Data Transmission?
What might happen if the software update file gets corrupted by errors during the download and there's no error control?
Comparison: What is the main difference between error detection and error correction? Provide a simple example of each (can be from this page or your own idea).
Critical Thinking: How does Error Control directly contribute to the Integrity aspect of Computer Security (from the CIA Triad)?
Short Answer: Explain the role of parity bits in the 2-dimensional parity system. How do they help detect errors?
Scenario: You perform the 2-dimensional parity trick, and after your friend flips a card, you find that Row 3 and Column 5 both have an odd number of black cards.
Which card was flipped?
Did this demonstrate error detection or error correction, or both? Explain.
Critical Thinking: Why is the 2-dimensional parity system good at detecting 2-bit errors but unable to correct them?
Short Answer: What is a check digit, and why is it useful for numbers like credit card numbers or barcodes?
Scenario: A cashier manually enters a 13-digit barcode number, but accidentally swaps two adjacent digits (e.g., '45' becomes '54').
Will a standard GTIN-13 check digit system likely detect this error? Why or why not?
What about if they entered one digit incorrectly (e.g., '4' becomes '9')?
Comparison: What is the main difference between Forward Error Correction (FEC) and Automatic Repeat Request (ARQ)? When might you choose one over the other?
Critical Thinking: QR codes use Reed-Solomon error correction. Why is this type of powerful error correction necessary for QR codes, given where they are often used?
Application: The code-rate of an error correction method is k/n (useful data bits / total bits). If a system has 100 useful data bits and adds 25 error-correction bits, what is its code-rate and overhead (as a percentage)?
Error control techniques detect and correct unintended changes in data.
Errors can occur due to physical problems (noise, storage damage, data rot / bit rot) or human input errors.
They can happen during data storage, processing, or Data Transmission.
Undetected errors can lead to serious consequences, impacting reliability and safety.
Error detection identifies errors, while error correction fixes them.
The 2-dimensional parity algorithm uses parity bits in a grid to detect and correct single-bit errors. It demonstrates that error correction requires more overhead (extra bits) than error detection. This method is related to Hamming codes.
The 2-dimensional parity system has limitations: it can detect 2-bit and 3-bit errors but cannot correct them, and it might even miss some 4-bit errors. A single parity bit on its own can't even detect a 2-bit error.
Check digits are used in real-world numbering systems (e.g., barcodes, credit cards, ISBNs, bank numbers, tax numbers, social security numbers, shipping labels) to detect common human errors like substitution and most transpositions using a specific algorithm (e.g., Luhn algorithm for credit cards). Checksums are common for human-readable numbers and to verify downloaded files.
Forward Error Correction (FEC) adds enough redundant bits so that the receiver can correct errors without retransmission. The Reed-Solomon method is a powerful form of FEC used in QR codes, hard drives, CDs, DVDs, Blu-ray, and deep space communication (e.g., Voyager spacecraft).
Automatic Repeat Request (ARQ) detects errors and requests the sender to retransmit the corrupted data. This is what happens when a web form rejects a mistyped credit card number and asks you to enter it again, or when a corrupted network packet is re-requested. ARQ requires less overhead than FEC but isn't suitable when retransmission is impractical (e.g., QR codes, deep space).
The code-rate of an error correction method is the ratio of useful data bits to total transmitted bits (k/n), while the overhead is the redundant bits. Lower code-rates mean more overhead but higher error correction capability.
Error control is critical for maintaining data Integrity within Computer Security.
Historically, error detection was crucial for early storage (tapes) and physical codes (credit cards, barcodes), evolving into sophisticated FEC for modern devices and deep space. In the future, it will be essential for technologies like RFID tags and quantum computing.
Now that you understand how error control keeps data accurate, you're ready to explore other crucial aspects of data integrity and communication within digital systems:
Network Communication Protocols: Discover the rules that govern how data (including error-controlled data in packets) travels across networks and the internet.
Artificial Intelligence (AI): Explore how AI might use reliable, error-controlled data, and how errors in data can impact AI systems.
Big Data: Consider the challenges of maintaining data integrity and controlling errors when dealing with massive datasets.
Binary: Revisit the fundamental representation of bits and how a single bit flip can cause an error.
Algorithms: Re-examine the concept of algorithms, as error control itself relies on specific algorithms.
Complexity and Tractability: Explore why some mathematical problems related to error correction (especially advanced ones) can be very "hard" or intractable to solve.
Continue your journey by clicking on the links to these exciting topics!