Gaussian Distribution

Linear regression is a more informative metric for evaluating associations between variables than most people typically realize. When it is used to forecast outcomes, it can be converted to a measure of information gain or converted into a point estimate and associated confidence interval. It can also be used to quantify the amount a linear model reduces uncertainty.

Linear regression is closely related to the Central Limit Theorem because both regression and the CLT use probability distributions known as “Gaussians.”

Gaussian Distributions

Diligent data analysts will always include a model of the remaining uncertainty (or noise) associated with their conclusions and recommendations. Any given data analysis will include “signal” and “noise.” Noise is defined as that part of the future that cannot be explained by the present data. Examples include the uncertainty associated with forecasts from a linear regression model, or the uncertainty about a financial instrument’s rate of return. Noise cannot be eliminated completely.

Gaussian probability density functions (also called “Normal” distributions) are by far the most common model for uncertainty (or noise) used in data analysis. They are continuous distributions with many special properties.

Standardization

There are 3 steps to standardizing data:

  1. Calculate mean, variance, and, from it, standard deviation.

X Values = 10, 90, 75, 35, 20, 21, 33, 58, 60

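Mean ($\mu_x$):

$\mu_x = \frac{1}{n}\sum_{i=1}^{n} x_i = 44.67$

Population Variance:

$\sigma_x^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu_x)^2$

Population Standard Deviation ($\sigma_x$, STDEVP in Excel):

$\sigma_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu_x)^2} = 25.79$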

  2. Subtract the mean from each value.
  3. Divide the resulting value by the standard deviation.

X Values    10     90     75     35     20     21     33     58     60
Step 2     -34.7   45.3   30.3   -9.7  -24.7  -23.7  -11.7   13.3   15.3
Step 3     -1.34   1.76   1.18  -0.37  -0.96  -0.92  -0.45   0.52   0.59

The resulting standardized values are denoted $x_{zi}$.

$x_{zi} = \frac{x_i - \mu_x}{\sigma_x}$

Note that this standardization process will always produce values with a mean of 0 and a standard deviation of 1 ($\mu_{xz} = 0$ and $\sigma_{xz} = 1$).
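The three standardization steps can be sketched in Python as follows. This is a minimal illustration using only the standard library; the variable names are illustrative, and `pstdev` is used because it computes the population standard deviation, as STDEVP does in Excel.

```python
from statistics import fmean, pstdev

x_values = [10, 90, 75, 35, 20, 21, 33, 58, 60]

# Step 1: mean and population standard deviation.
mu_x = fmean(x_values)
sigma_x = pstdev(x_values)

# Step 2: subtract the mean from each value.
centered = [x - mu_x for x in x_values]

# Step 3: divide each centered value by the standard deviation.
z_scores = [c / sigma_x for c in centered]

print(round(mu_x, 2), round(sigma_x, 2))
print([round(z, 2) for z in z_scores])

# The standardized values have mean 0 and standard deviation 1
# (up to floating-point rounding).
print(fmean(z_scores), pstdev(z_scores))
```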

Properties of the Standard Normal Distribution

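  • Total area under the curve = 1
  • Mean = 0
  • $\sigma = \sigma^2 = 1$
  • Height of the curve at its peak ($x = 0$) is 0.399, or $\frac{1}{\sqrt{2\pi}}$
  • Generated by the function $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
  • The area under the curve up to a given point is known as the cumulative probability distribution function. Over the entire curve, the area is given by

$\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx$

  • To find the area to the left of a particular z-score, replace the upper limit of the integral with z, as follows:

$\int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx$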

  • In the figure below, the red distribution is a standard normal curve, denoted ϕ(0,1). The green curve would be ϕ(2,0.5). Note that all these curves have area = 1. Larger-variance curves will be lower and flatter; smaller-variance curves will be taller and narrower.
  • Most of the probability is in the middle of the distribution:

    • At z=1, 84.1% of the probability is to the left.
    • At z=2, 97.7%.
    • At z=3, 99.87%.
    • At z=4, 99.9968%.
  • In Excel, NORMSDIST and NORMSINV functions calculate probabilities from z-scores and z-scores from probabilities, respectively. Probabilities are values in between 0 and 1, inclusive.
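Outside Excel, the same lookups can be approximated with Python's standard-library `statistics.NormalDist`. The sketch below assumes that `cdf` plays the role of NORMSDIST and `inv_cdf` the role of NORMSINV; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Area to the left of each z-score (the role NORMSDIST plays in Excel).
left_area = {z: std_normal.cdf(z) for z in (1, 2, 3, 4)}
for z, p in left_area.items():
    print(f"z = {z}: {p:.4%}")

# Peak height of the density at x = 0, which equals 1 / sqrt(2 * pi).
peak = std_normal.pdf(0.0)
print(round(peak, 3))

# Inverse direction (the role NORMSINV plays): z-score for a probability.
z_975 = std_normal.inv_cdf(0.975)
print(round(z_975, 2))
```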

Example 1

As a practical example, assume that the standard deviation of a certain manufacturing process is known, and that the mean of the process needs to be set such that only one part in 10,000 falls below a certain minimum tolerance.

The mean should be placed =NORMSINV(1-(1/10000)), or 3.719, standard deviations above the minimum tolerance.
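As a rough cross-check of Example 1, the same figure can be reproduced in Python. The process standard deviation and minimum tolerance below are hypothetical numbers chosen only to make the sketch concrete; they do not come from the example.

```python
from statistics import NormalDist

defect_rate = 1 / 10_000

# Number of standard deviations the mean must sit above the tolerance,
# equivalent to =NORMSINV(1-(1/10000)).
z_required = NormalDist().inv_cdf(1 - defect_rate)
print(round(z_required, 3))

# Hypothetical process values, for illustration only.
sigma_process = 0.02   # assumed known standard deviation
min_tolerance = 5.00   # assumed minimum tolerance

target_mean = min_tolerance + z_required * sigma_process

# Sanity check: fraction of parts falling below the tolerance.
fraction_below = NormalDist(target_mean, sigma_process).cdf(min_tolerance)
print(fraction_below)
```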

Example 2

As a further example, consider a population of people with an average heart rate of 110 beats per minute and a standard deviation of 15. They are given a medication. Following medication, a 45-person sample is measured to have an average heart rate of 102.

The standard deviation of the sample mean (the standard error) is calculated as $\frac{15}{\sqrt{45}}$, or 2.24. The z-score is calculated as $\frac{102 - 110}{2.24}$, or -3.58. The probability of a decrease that large occurring by chance is 0.017%.
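The arithmetic in Example 2 can be checked with a short Python sketch. It assumes, as the example does, that the population standard deviation of 15 still applies after medication; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

pop_mean, pop_sd = 110.0, 15.0
n, sample_mean = 45, 102.0

# Standard deviation of the mean of a 45-person sample.
std_error = pop_sd / sqrt(n)

# How many of those standard deviations the observed mean sits below 110.
z = (sample_mean - pop_mean) / std_error

# Probability of a sample mean this low or lower arising by chance.
p_chance = NormalDist().cdf(z)

print(round(std_error, 2), round(z, 2), f"{p_chance:.3%}")
```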

Central Limit Theorem

The Central Limit Theorem is an important reason for the Gaussian distribution’s prevalence in nature and in data. The Central Limit Theorem states that, given any probability distribution with mean μ and standard deviation σ, repeatedly taking samples of size n (n ≥ 30) from the distribution has the following results:

  • the mean of the sample means will approach the mean of the original distribution,

$\mu_{\bar{x}} = \mu$

  • the standard deviation of sample means will approach the standard deviation of the original distribution divided by the square root of the size of the samples,

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

  • and the histogram of sample means will form an approximate Gaussian distribution, independent of the shape of the original distribution.

In particular, one formula that occurs frequently in Central Limit problems is the variance of a continuous uniform distribution from $x_{min}$ to $x_{max}$. The variance is given by

$\frac{(x_{max} - x_{min})^2}{12}$
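A small simulation can illustrate these claims. The sketch below draws repeated samples from a uniform distribution, which is far from Gaussian, and compares the sample means against the predictions above. The range, sample size, number of samples, and random seed are all arbitrary choices made for illustration.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

x_min, x_max = 0.0, 12.0   # hypothetical uniform range
n = 30                     # size of each sample
num_samples = 10_000       # how many samples to draw

# Mean and standard deviation of the original uniform distribution.
mu = (x_min + x_max) / 2
sigma = sqrt((x_max - x_min) ** 2 / 12)

sample_means = [
    fmean([random.uniform(x_min, x_max) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = fmean(sample_means)
sd_of_means = pstdev(sample_means)
predicted_sd = sigma / sqrt(n)

# For a Gaussian, roughly 68% of values fall within one standard deviation.
within_one_sd = sum(abs(m - mu) <= predicted_sd for m in sample_means) / num_samples

print(mean_of_means, mu)
print(sd_of_means, predicted_sd)
print(within_one_sd)
```

The simulated values will not match the predictions exactly; they should only get closer as `num_samples` grows.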

Algebra Using Gaussians

Summation of Independent Distributions

Given the following two independent (no dependency, covariance, or correlation) Gaussian distributions (note that $\sigma^2$, the variance, is the square of the standard deviation, $\sigma$):

$\phi_1(\mu_1, \sigma_1^2),\ \phi_2(\mu_2, \sigma_2^2)$

Summing them will result in a distribution described as follows:

$\mu_{1+2} = \mu_1 + \mu_2$

$\sigma_{1+2}^2 = \sigma_1^2 + \sigma_2^2$
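Python's standard-library `statistics.NormalDist` applies this same rule when two instances are added, treating them as independent. The sketch below uses hypothetical parameters; note that `NormalDist` takes the standard deviation, not the variance, as its second argument.

```python
from statistics import NormalDist

# Hypothetical independent Gaussians (sigma is the standard deviation).
phi_1 = NormalDist(mu=10.0, sigma=3.0)   # variance 9
phi_2 = NormalDist(mu=4.0, sigma=4.0)    # variance 16

# Adding two NormalDist instances assumes they are independent.
phi_sum = phi_1 + phi_2

print(phi_sum.mean)       # sum of the means
print(phi_sum.variance)   # sum of the variances
print(phi_sum.stdev)      # square root of the summed variance
```

Means and variances add; standard deviations do not, which is why the combined standard deviation here is 5 rather than 7.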

Multiplying an Independent Distribution by a Constant (β)

Given a Gaussian distribution:

$\phi_1(\mu_1, \sigma_1^2)$

Multiplying each value, z, drawn from this distribution by β produces the following:

$\phi_\beta = \left(\beta\mu_1,\ \beta^2\sigma_1^2\right)$
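The scaling rule can be checked the same way, since `statistics.NormalDist` supports multiplication by a constant. The parameters and the value of β below are hypothetical; a negative β is used to show that the variance scales by β squared regardless of sign.

```python
from statistics import NormalDist

phi_1 = NormalDist(mu=10.0, sigma=3.0)   # hypothetical: variance 9
beta = -2.0                              # hypothetical constant

phi_beta = phi_1 * beta

print(phi_beta.mean)       # beta * mu_1
print(phi_beta.variance)   # beta**2 * sigma_1**2
```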

Dependent Distributions

When the covariance is not 0, the distributions are dependent. With weights $w_1$ and $w_2$, where $w_1 + w_2 = 1$, combining the distributions results in the following variance:

$\sigma_{1+2}^2 = w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2 w_1 w_2\,\mathrm{Cov}_{12}$
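To see how the covariance term changes the result, the formula can be evaluated at a few covariance values. This is a minimal sketch; the function name `combined_variance` and all numbers are hypothetical.

```python
from math import isclose

def combined_variance(w1, sigma1, sigma2, cov12):
    """Variance of the weighted combination, with w2 = 1 - w1."""
    w2 = 1.0 - w1
    return w1**2 * sigma1**2 + w2**2 * sigma2**2 + 2 * w1 * w2 * cov12

sigma1, sigma2, w1 = 3.0, 4.0, 0.25   # hypothetical values

# Zero covariance: only the two weighted variance terms remain.
var_independent = combined_variance(w1, sigma1, sigma2, cov12=0.0)

# Maximum covariance (correlation of 1): the standard deviations add linearly.
var_fully_correlated = combined_variance(w1, sigma1, sigma2, cov12=sigma1 * sigma2)

# Negative covariance reduces the combined variance.
var_negative = combined_variance(w1, sigma1, sigma2, cov12=-0.5 * sigma1 * sigma2)

print(var_negative, var_independent, var_fully_correlated)
```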

Markowitz Portfolio Optimization Example

Markowitz Portfolio Optimization is an excellent example of the application of algebra using Gaussian distributions. The goal of Markowitz Portfolio Optimization is to create a combined portfolio of at least two investments that is optimal in a very specific way. Essentially, the optimization results in the largest possible returns-to-risk ratio.

For this example, the investments are stocks with expected returns $ER_1$ and $ER_2$. The relative weightings of these investments are $w_1$ and $w_2$, where $w_1 + w_2 = 1$.

Then, the return of the overall portfolio, plotted on the Y-Axis of the graph below, is given by:

$ER_p = (w_1)(ER_1) + (w_2)(ER_2)$

We are given that the covariance, Cov, is a function of the correlation, R, as expressed below. The correlation between two securities is more commonly available than the covariance.

$\mathrm{Cov}_{12} = R_{12}\,\sigma_1 \sigma_2$

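Then the volatility of returns for this portfolio, $\sigma_p$, plotted on the X-Axis, is the square root of the portfolio variance, which combines the individual variances and the covariance as follows:

$\sigma_p^2 = w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2 w_1 w_2 R_{12}\,\sigma_1 \sigma_2$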

The data for this example are as follows:

Investment            σ       ER
Risk-Free Rate, rf    0       0.01
Stock 1               0.095   0.12
Stock 2               0.083   0.10
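With these figures, the return and volatility of any single weighting can be computed directly. The sketch below evaluates one illustrative point: an equal 50/50 weighting with an assumed correlation of 0.35. Both choices are made here only for illustration, and the function name `portfolio` is hypothetical.

```python
from math import sqrt

# Data from the table above.
SIGMA_1, ER_1 = 0.095, 0.12   # Stock 1
SIGMA_2, ER_2 = 0.083, 0.10   # Stock 2

def portfolio(w1, r12):
    """Return (expected return, volatility) for weights w1 and w2 = 1 - w1."""
    w2 = 1.0 - w1
    er_p = w1 * ER_1 + w2 * ER_2
    var_p = (
        w1**2 * SIGMA_1**2
        + w2**2 * SIGMA_2**2
        + 2 * w1 * w2 * r12 * SIGMA_1 * SIGMA_2
    )
    return er_p, sqrt(var_p)

# Illustrative inputs: equal weights, assumed correlation of 0.35.
er_p, sigma_p = portfolio(w1=0.5, r12=0.35)
print(round(er_p, 3), round(sigma_p, 3))
```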

Various weighted combinations of the two securities are plotted in yellow, below. The goal of Markowitz Portfolio Optimization is to maximize the Sharpe Ratio. This ratio is represented graphically as the slope of the dashed red line.

The Sharpe Ratio is written formulaically as

$\mathrm{Sharpe} = \frac{ER_p - r_f}{\sigma_p}$

Performing the substitution:

$\mathrm{Sharpe} = \frac{(w_1)(ER_1) + (w_2)(ER_2) - r_f}{\sqrt{w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2 w_1 w_2 R_{12}\,\sigma_1 \sigma_2}}$

With the information given in the problem, the Sharpe ratio becomes a function of two variables, w1 and w2, which are constrained by the relation w1+w2=1. The “Solver” Excel plug-in can then be used to maximize the Sharpe ratio.

Taking this analysis a step further, R12 can also be treated as a parameter of the Sharpe ratio. I consider five possible values for R12, tabulated below. The results are a family of curves and Sharpe ratio lines, shown below. Using Solver, I optimize the weights, w1 and w2, for each of these R values, maximizing the Sharpe Ratio, and tabulate the results below the chart.

R       Sharpe Ratio   w_1   w_2   ER_p    σ_p
-0.70   2.89           47%   53%   0.109   0.034
-0.35   1.97           47%   53%   0.109   0.051
 0.00   1.59           48%   52%   0.110   0.063
 0.35   1.37           50%   50%   0.110   0.073
 0.70   1.22           56%   44%   0.111   0.083
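The tabulated results can be approximated without Solver. The sketch below swaps in a plain grid search over $w_1$ for the Solver optimization, which is a different technique from the one used above, and it assumes no short selling, so $w_1$ stays between 0 and 1. The function names and grid resolution are illustrative.

```python
from math import sqrt

# Data from the example.
RISK_FREE = 0.01
SIGMA_1, ER_1 = 0.095, 0.12   # Stock 1
SIGMA_2, ER_2 = 0.083, 0.10   # Stock 2

def sharpe(w1, r12):
    """Sharpe ratio for weights w1 and w2 = 1 - w1 at correlation r12."""
    w2 = 1.0 - w1
    er_p = w1 * ER_1 + w2 * ER_2
    var_p = (
        w1**2 * SIGMA_1**2
        + w2**2 * SIGMA_2**2
        + 2 * w1 * w2 * r12 * SIGMA_1 * SIGMA_2
    )
    return (er_p - RISK_FREE) / sqrt(var_p)

def best_weight(r12, steps=1000):
    """Grid search over w1 in [0, 1]; a crude stand-in for Excel's Solver."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda w1: sharpe(w1, r12))

results = {}
for r12 in (-0.70, -0.35, 0.00, 0.35, 0.70):
    w1 = best_weight(r12)
    results[r12] = (w1, sharpe(w1, r12))
    print(f"R = {r12:+.2f}  w1 = {w1:.0%}  w2 = {1 - w1:.0%}  Sharpe = {results[r12][1]:.2f}")
```

Because the grid has a fixed resolution, the weights agree with a true optimizer only to within the step size, and the last digit of a printed ratio may differ from the table.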

Note that, as expected, the volatility of returns is highly dependent on the amount of covariance. Interestingly, the optimal weighting for the securities is relatively stable, even given very wide swings in covariance. As a result, the expected returns are likewise relatively stable.

This analysis explains investors' constant search for diversification via alternative assets that are not highly correlated to major investment vehicles like the stock market.