Statistics
Textbooks
Boundless Statistics
Describing, Exploring, and Comparing Data
Further Considerations for Data
Concept Version 6
Created by Boundless

The Gauss Model

The normal (Gaussian) distribution is a widely used continuous distribution that serves as a model for data in many real-life scenarios.

Learning Objective

  • Explain the importance of the Gauss model in terms of the central limit theorem.


Key Points

    • If $\mu = 0$ and $\sigma = 1$, the distribution is called the standard normal distribution or the unit normal distribution, and a random variable with that distribution is a standard normal deviate.
    • It is symmetric around the point $x=\mu$, which is at the same time the mode, the median and the mean of the distribution.
    • The Gaussian distribution is sometimes informally called the bell curve. However, there are many other distributions that are bell-shaped as well.
    • About 68% of values drawn from a normal distribution are within one standard deviation $\sigma$ of the mean; about 95% of the values lie within two standard deviations; and about 99.7% are within three standard deviations. This fact is known as the 68-95-99.7 (empirical) rule, or the 3-sigma rule.

Term

  • central limit theorem

    The theorem that states: The sum (and hence the mean) of a large number of independent, identically distributed random variables with finite variance is approximately normally distributed.


Full Text

The Normal (Gaussian) Distribution

In probability theory, the normal (or Gaussian) distribution is a continuous probability distribution, defined by the formula:

 $\displaystyle f(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$

The parameter $\mu$ in this formula is the mean or expectation of the distribution (and also its median and mode). The parameter $\sigma$ is its standard deviation; its variance is therefore $\sigma^2$. A random variable with a Gaussian distribution is said to be normally distributed and is called a normal deviate.

If $\mu = 0$ and $\sigma = 1$, the distribution is called the standard normal distribution or the unit normal distribution, and a random variable with that distribution is a standard normal deviate.
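
As a minimal sketch of the density formula above, the following Python code evaluates $f(x)$ directly and checks numerically that it integrates to approximately 1. The names `normal_pdf` and `midpoint_integral` are chosen here for illustration and are not part of the original text.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def midpoint_integral(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Peak of the standard normal density: 1 / sqrt(2 * pi).
peak = normal_pdf(0.0)

# The density should integrate to (almost exactly) 1 over a wide interval.
area = midpoint_integral(normal_pdf, -8.0, 8.0)

print(f"peak = {peak:.4f}, area = {area:.6f}")
```

With the default arguments $\mu = 0$ and $\sigma = 1$, the function evaluates the standard normal density.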

Importance of the Normal Distribution

Normal distributions are extremely important in statistics, and are often used in the natural and social sciences for real-valued random variables whose distributions are not known. One reason for their popularity is the central limit theorem, which states that, under mild conditions, the mean of a large number of random variables independently drawn from the same distribution is distributed approximately normally, irrespective of the form of the original distribution. Thus, physical quantities that are expected to be the sum of many independent processes (such as measurement errors) often have a distribution very close to normal. Another reason is that a large number of results and methods (such as propagation of uncertainty and least squares parameter fitting) can be derived analytically, in explicit form, when the relevant variables are normally distributed.
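
The central limit theorem can be illustrated with a small simulation. The sketch below assumes an exponential distribution with rate 1 as the (strongly skewed) original distribution; that choice, the sample sizes, and the function name `sample_means` are illustrative assumptions only.

```python
import random
import statistics

def sample_means(n_per_sample, n_samples, seed=0):
    """Means of n_samples independent samples, each of size n_per_sample,
    drawn from an exponential distribution with rate 1 (mean 1, sd 1)."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n_per_sample))
        for _ in range(n_samples)
    ]

means = sample_means(n_per_sample=50, n_samples=5000)

center = statistics.fmean(means)   # theory: close to 1
spread = statistics.stdev(means)   # theory: close to 1 / sqrt(50), about 0.141

# If the means are roughly normal, about 68% fall within one
# standard deviation of their center.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"center = {center:.3f}, spread = {spread:.3f}, within 1 sd = {within_one_sd:.3f}")
```

Although individual exponential draws are far from symmetric, the sample means cluster around 1 in a roughly bell-shaped way.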

The normal distribution is symmetric about its mean, and is non-zero over the entire real line. As such it may not be a suitable model for variables that are inherently positive or strongly skewed, such as the weight of a person or the price of a share. Such variables may be better described by other distributions, such as the log-normal distribution or the Pareto distribution.

The normal density is also practically zero once the value $x$ lies more than a few standard deviations away from the mean. Therefore, it may not be an appropriate model when one expects a significant fraction of outliers, that is, values that lie many standard deviations away from the mean. For such data, least-squares and other statistical inference methods that are optimal for normally distributed variables often become highly unreliable. In those cases, one should assume a more heavy-tailed distribution and use the appropriate robust statistical inference methods.
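
To give a sense of how quickly the normal tails vanish, the sketch below computes the probability that a normal variable lies more than $k$ standard deviations from its mean, using the standard identity $P(|X-\mu| > k\sigma) = \operatorname{erfc}(k/\sqrt{2})$. The function name `two_sided_tail` is an illustrative choice.

```python
import math

def two_sided_tail(k):
    """P(|X - mu| > k * sigma) for a normally distributed X."""
    return math.erfc(k / math.sqrt(2.0))

# Tail probability and expected number of such values per million draws.
tails = {k: two_sided_tail(k) for k in (3, 4, 5, 6)}
for k, p in tails.items():
    print(f"beyond {k} sd: p = {p:.3e}, about {p * 1e6:.3g} per million")
```

If a data set contains noticeably more extreme values than these probabilities suggest, that is a hint that a heavier-tailed model may fit better.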

The Gaussian distribution is sometimes informally called the bell curve. However, there are many other distributions that are bell-shaped (such as Cauchy's, Student's, and logistic). The terms Gaussian function and Gaussian bell curve are also ambiguous since they sometimes refer to multiples of the normal distribution whose integral is not 1; that is, functions of the form $a\,e^{-(x-b)^{2}/(2c^{2})}$ for arbitrary positive constants $a$, $b$ and $c$.

Properties of the Normal Distribution

The normal density $f(x)$, with any mean $\mu$ and any positive standard deviation $\sigma$, has the following properties:

  • It is symmetric around the point $x = \mu$, which is at the same time the mode, the median and the mean of the distribution.
  • It is unimodal: its first derivative is positive for $x<\mu$, negative for $x>\mu$, and zero only at $x=\mu$.
  • It has two inflection points (where the second derivative of $f$ is zero), located one standard deviation away from the mean, namely at $x = \mu - \sigma$ and $x = \mu + \sigma$.
  • About 68% of values drawn from a normal distribution are within one standard deviation $\sigma$ of the mean; about 95% of the values lie within two standard deviations; and about 99.7% are within three standard deviations. This fact is known as the 68-95-99.7 (empirical) rule, or the 3-sigma rule.
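
Two of the properties above can be checked numerically. The sketch below uses the standard identity $P(|X-\mu| \le k\sigma) = \operatorname{erf}(k/\sqrt{2})$ for the 68-95-99.7 rule, and a finite-difference estimate of the second derivative for the inflection points. The parameters $\mu = 10$, $\sigma = 2$ and the function names are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def prob_within(k):
    """P(|X - mu| <= k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2.0))

def second_derivative(dist, x, h=1e-4):
    """Central finite-difference estimate of the density's second derivative at x."""
    return (dist.pdf(x + h) - 2.0 * dist.pdf(x) + dist.pdf(x - h)) / h ** 2

for k in (1, 2, 3):
    print(f"within {k} sd: {prob_within(k):.4f}")
# within 1 sd: 0.6827
# within 2 sd: 0.9545
# within 3 sd: 0.9973

dist = NormalDist(mu=10.0, sigma=2.0)          # illustrative parameters
at_inflection = second_derivative(dist, 12.0)  # x = mu + sigma, close to 0
inside = second_derivative(dist, 11.0)         # negative: curve bends downward
outside = second_derivative(dist, 13.0)        # positive: curve bends upward
```

The sign change of the second derivative at $x = \mu + \sigma$ is what marks that point as an inflection point; by symmetry the same holds at $x = \mu - \sigma$.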

Notation

The normal distribution is also often denoted by $N(\mu, \sigma^2)$. Thus, when a random variable $X$ is distributed normally with mean $\mu$ and variance $\sigma^2$, we write $X \sim N\left(\mu, \sigma^{2}\right)$.
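
Because the second argument of $N(\mu, \sigma^2)$ is the variance, not the standard deviation, it is easy to mix the two up in software. The sketch below assumes Python's standard-library `statistics.NormalDist`, which is parameterized by the standard deviation; the example $X \sim N(5, 4)$ and the helper `standardize` are illustrative choices.

```python
from statistics import NormalDist

# X ~ N(5, 4): mean 5 and variance 4, hence standard deviation 2.
mu, variance = 5.0, 4.0
X = NormalDist(mu=mu, sigma=variance ** 0.5)

def standardize(x, dist):
    """Convert x to a standard normal deviate z = (x - mu) / sigma."""
    return (x - dist.mean) / dist.stdev

z = standardize(7.0, X)        # 7 is one standard deviation above the mean
p_x = X.cdf(7.0)               # P(X <= 7) under N(5, 4)
p_z = NormalDist().cdf(z)      # P(Z <= z) under the standard normal N(0, 1)

print(f"z = {z}, P(X <= 7) = {p_x:.4f}, P(Z <= z) = {p_z:.4f}")
```

The two probabilities agree, which is why tables and routines for the standard normal distribution suffice for any normal distribution.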


Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.