Information entropy
H(X) = −Σ p log p. Expected bits per sample.
Origin
Claude Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal 1948. Founded information theory and gave compression its theoretical ceiling.
Where it shows up in production
- gzip / zstd compression Both target the Shannon entropy of the source. Real-world ratios near the entropy bound = excellent compressor.
- Decision trees Information gain (entropy reduction) is the standard split criterion. CART, C4.5, XGBoost, scikit-learn.
On Semicolony
Sources & further reading
Found this useful?