Encoding Process: The encoder iterates through the data stream, maintaining a counter for the current character. When a different character is encountered, the current count and character are written to the output, and the counter resets for the new character.
Decoding Process: The decoder reads the count-value pairs from the compressed file and simply outputs the 'value' character into the data stream 'count' number of times.
Binary Representation: In a typical implementation, the count might be stored as a fixed-size integer (e.g., 4 or 8 bits) and the value as its standard binary code (e.g., ASCII for text or a color index for bitmaps).
Handling Multi-line Data: In image compression, RLE can either treat each row of pixels independently or allow runs to wrap around from the end of one line to the start of the next to maximize compression.
| Feature | Run Length Encoding (RLE) | Huffman Coding | Dictionary (LZW) |
|---|---|---|---|
| Strategy | Counts consecutive repeats | Uses frequency-based bit lengths | Replaces patterns with codes |
| Best For | Simple icons, line drawings | Text with varied char frequency | Repetitive phrases in text |
| Complexity | Very Low | Medium | High |
| Type | Lossless | Lossless | Lossless |
Calculate the Overhead: Always check if the compressed version is actually smaller. If a sequence is 'ABC', RLE might store it as '1A1B1C', which uses more bits than the original.
Check the Order: Examiners often switch between (Count, Value) and (Value, Count) formats; read the question carefully to ensure you are outputting the pairs in the requested sequence.
Bit-Level Calculations: If a question provides bit depths (e.g., 8 bits for count, 8 bits for ASCII), calculate the total bits for the original string () and compare it to the total bits for the RLE pairs ().
Reconstruction Accuracy: When decompressing, ensure the total number of elements in your output matches the sum of the counts in the RLE pairs.
Negative Compression: Students often assume RLE always works. In reality, if the data has no runs (e.g., 'XYZ'), RLE will increase the file size because it adds a count for every single unique character.
Fixed Count Limits: A common error is forgetting that the count field has a maximum value; if a run exceeds this (e.g., a run of 300 with an 8-bit limit of 255), it must be split into two separate RLE pairs.
Data Type Suitability: RLE is poorly suited for high-entropy data like encrypted files or complex photographs where adjacent pixels rarely have identical values.
Bitmap Images: RLE is a standard compression option for BMP and TIFF files, particularly those with large areas of solid color.
Fax Machines: Early fax technology used a variation of RLE to transmit black-and-white documents efficiently by only sending the lengths of white and black streaks.
Hybrid Systems: Modern formats often use RLE as a pre-processing step before applying more sophisticated algorithms like Huffman coding to further shrink the data.