Why the tails come apart
LessWrong, August 1, 2014
Abstract
Correlated variables frequently diverge at the tails of their distributions, such that extreme outliers on a predictor variable seldom correspond to the most extreme outliers on the predicted outcome. This phenomenon is a structural property of bivariate distributions with correlations of less than one. Geometrically, such distributions form elliptical probability density envelopes, and the “bulge” of the ellipse ensures that the absolute maximum of one variable typically coincides with a sub-maximal value of the other. Beyond potential trade-offs or biological limits, this divergence emerges primarily from the multifactorial nature of most real-world outcomes. When an outcome is influenced by multiple independent factors, an individual at the extreme tail of a single predictor (e.g., +4 standard deviations) is statistically likely to be near the mean on the other relevant dimensions. Because population density increases rapidly as one moves toward the mean, the much larger pool of individuals at slightly less extreme predictor levels (e.g., +3 standard deviations) is more likely to contain someone with a superior combination of all the auxiliary factors. Consequently, the highest performers in a given domain typically possess very high, but not maximal, levels of any single correlated trait. This principle provides a foundational explanation for regression to the mean and the winner’s curse, indicating that selection processes targeting the extreme right tail of a distribution will encounter inherent predictive instability. – AI-generated abstract.
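The effect described above is easy to see in simulation (a sketch, not part of the original post; the correlation of 0.8 and sample size are arbitrary choices): draw two standard-normal variables correlated at r, and check how often the top performers on one are also the top performers on the other.

```python
import numpy as np

# Sketch: two standard-normal variables correlated at r = 0.8,
# constructed as Y = r*X + sqrt(1 - r^2) * independent noise.
rng = np.random.default_rng(0)
r = 0.8
n = 100_000

x = rng.standard_normal(n)
y = r * x + np.sqrt(1 - r**2) * rng.standard_normal(n)

# The single top individual on X is rarely the top individual on Y.
print(np.argmax(x) == np.argmax(y))  # usually False

# Among the top 1% on X, what fraction are also in the top 1% on Y?
k = n // 100
top_x = set(np.argsort(x)[-k:])
top_y = set(np.argsort(y)[-k:])
overlap = len(top_x & top_y) / k
print(f"overlap of top-1% sets: {overlap:.2f}")  # well below 1.0

# Regression to the mean falls out of the same construction:
# E[Y | X = x] = r * x, so an individual at X = +4 SD is expected
# to sit near +3.2 SD on Y, not +4.
```

Even at a fairly strong correlation of 0.8, the top-1% sets overlap only partially, which is the “tails come apart” effect in miniature: the ellipse’s bulge puts the extreme of each variable at a sub-extreme value of the other.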
