Statistical Redundancy occurs when certain symbols or patterns appear more frequently than others; compression algorithms exploit this by assigning shorter codes to frequent symbols.
Spatial Redundancy refers to the correlation between neighboring elements, such as adjacent pixels in an image having similar colors.
Temporal Redundancy is found in video data where consecutive frames are often nearly identical, allowing the system to store only the changes between frames.
Perceptual Irrelevancy is the basis for lossy compression, where information that the human eye or ear cannot detect is removed to save space.
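The temporal case above can be illustrated with a small sketch. It is only an illustration and not how any real video codec works: frames are assumed to be equal-length lists of pixel values, and the names `frame_delta` and `apply_delta` are hypothetical helpers made up for this example.

```python
def frame_delta(prev, curr):
    """Return {index: new_value} for the pixels that differ between two frames.

    Assumes both frames have the same length (an assumption of this sketch).
    """
    return {i: c for i, (p, c) in enumerate(zip(prev, curr)) if p != c}


def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the stored changes."""
    frame = list(prev)
    for i, value in delta.items():
        frame[i] = value
    return frame


# Two made-up 8-pixel frames: a bright two-pixel object moves one step right.
frame1 = [10, 10, 10, 200, 200, 10, 10, 10]
frame2 = [10, 10, 10, 10, 200, 200, 10, 10]

delta = frame_delta(frame1, frame2)      # only 2 of the 8 pixels changed
restored = apply_delta(frame1, delta)    # lossless reconstruction of frame2
```

Storing `delta` instead of the whole of `frame2` pays off only when few pixels change, which is exactly the situation temporal redundancy describes.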
| Feature | Lossless Compression | Lossy Compression |
|---|---|---|
| Data Integrity | Identical to original | Approximate reconstruction |
| Compression Ratio | Typically low ( to ) | Very high ( to ) |
| Typical Use | Text, code, medical images | Audio, video, photographs |
| Reversibility | Fully reversible | Non-reversible |
Check the Context: If a question involves medical imaging or executable software, always prioritize lossless methods, because any data loss could cause a misdiagnosis or break the program.
Calculate Ratios Carefully: Ensure you are dividing the original size by the compressed size. If the result is less than 1, the file has actually grown (a common occurrence with already-compressed data).
Entropy Limits: Remember that you cannot compress data below its entropy without losing information. If a problem asks for the 'theoretical minimum bits,' it is asking for the entropy calculation.
Identify Redundancy: Before selecting an algorithm, identify the type of redundancy. RLE targets runs of repeated values; Huffman coding targets frequency imbalances; the DCT targets correlated samples in image and audio data.
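Three of the tips above (the ratio, the entropy bound, and RLE) can be tried out in a few lines. This is a minimal sketch under stated assumptions: the function names are hypothetical, the sample string is made up, and the entropy is the simple zero-order estimate that treats each symbol as independent of its neighbors.

```python
import math
from collections import Counter
from itertools import groupby


def compression_ratio(original_size, compressed_size):
    """Original size divided by compressed size; a value below 1 means growth."""
    return original_size / compressed_size


def entropy_bits_per_symbol(data):
    """Zero-order Shannon entropy: H = -sum(p * log2(p)) over symbol frequencies."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def rle_encode(text):
    """Run-length encode a string into (symbol, run_length) pairs."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def rle_decode(pairs):
    """Invert rle_encode."""
    return "".join(symbol * count for symbol, count in pairs)


message = "AAAABBBCCD"                      # made-up sample with skewed frequencies

h = entropy_bits_per_symbol(message)        # about 1.85 bits per symbol
min_bits = h * len(message)                 # lower bound under this simple model

pairs = rle_encode(message)                 # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
assert rle_decode(pairs) == message         # RLE is lossless

grew = compression_ratio(1000, 1100) < 1    # True: the "compressed" file is larger
```

With 8 bits per character the message takes 80 bits, so the entropy estimate of roughly 18.5 bits shows how much room a frequency-based coder such as Huffman has to work with.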
The 'Infinite Compression' Myth: Students often think a file can be compressed repeatedly to reach near-zero size. In reality, once redundancy is removed, further compression attempts usually increase file size due to metadata overhead.
Lossy is 'Bad': A common misconception is that lossy compression is always inferior. In streaming and telecommunications, lossy compression is essential because lossless files would exceed available bandwidth.
Confusing Bitrate with Ratio: Bitrate (bits per second) measures how much data is used or transmitted per unit of time, while compression ratio compares the original size with the compressed size. They are related but distinct concepts.
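The 'infinite compression' point is easy to check experimentally. The sketch below uses Python's standard `zlib` module on made-up, highly repetitive text and compresses the result three times in a row. The word list and seed are arbitrary assumptions, and exact byte counts will vary with the zlib version, so no specific sizes are claimed.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
words = ["data", "compression", "entropy", "redundancy",
         "symbol", "frame", "pixel", "code"]

# Made-up input: 5000 words drawn from a tiny vocabulary, so very redundant.
data = " ".join(rng.choice(words) for _ in range(5000)).encode("ascii")

sizes = [len(data)]
current = data
for _ in range(3):
    current = zlib.compress(current, 9)   # compress the previous pass's output
    sizes.append(len(current))

for n, size in enumerate(sizes[1:], start=1):
    ratio = sizes[0] / size               # original size / compressed size
    print(f"pass {n}: {size} bytes, overall ratio {ratio:.2f}")
```

The first pass should shrink the data substantially. Later passes have almost no redundancy left to remove, so the size stays roughly flat or creeps upward as each pass adds its own header and checksum.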