
Experimental Probabilities

The experimental probability is the ratio of the number of outcomes in which an event occurs to the total number of trials in an experiment.

Learning Objective

  • Calculate the empirical probability of an event based on given information


Key Points

    • In a general sense, experimental (or empirical) probability estimates probabilities from experience and observation.
    • In simple cases, where the result of a trial only determines whether or not the specified event has occurred, modeling using a binomial distribution might be appropriate; then the empirical estimate is the maximum likelihood estimate.
    • If a trial yields more information, the empirical probability can be improved upon by adopting further assumptions in the form of a statistical model. If such a model is fitted, it can be used to derive an estimate of the probability of the specified event.

Terms

  • binomial distribution

    The discrete probability distribution of the number of successes in a sequence of $n$ independent yes/no experiments, each of which yields success with probability $p$.

  • experimental probability

    The probability that a certain outcome will occur, as determined through experiment.

  • discrete

    Separate; distinct; individual; non-continuous.


Full Text

The experimental (or empirical) probability pertains to data taken from a number of trials. It is a probability calculated from experience, not from theory. If a sample of $x$ trials is observed and a given event $e$ occurs $n$ times, the experimental probability of $e$ is the ratio of $n$ to $x$.

$\displaystyle \text{experimental probability of event} = \frac{\text{occurrences of event}}{\text{total number of trials}}$

Experimental probability contrasts with theoretical probability, which is what we would expect to happen. For example, if we flip a coin $10$ times, we might expect it to land on heads $5$ times, or half of the time, yet we know that this is unlikely to happen exactly in practice. If we conduct a greater number of trials, the experimental probability typically gets closer to the theoretical probability. For this reason, large sample sizes (or a greater number of trials) are generally valued.
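
To see this convergence concretely, here is a minimal Python sketch (the helper name `experimental_probability`, the seed, and the trial counts are purely illustrative) that simulates coin flips and compares the experimental probability of heads with the theoretical value of $0.5$:

```python
import random

def experimental_probability(trials, event):
    """Ratio of trials in which the event occurred to the total number of trials."""
    occurrences = sum(1 for outcome in trials if event(outcome))
    return occurrences / len(trials)

random.seed(0)
for n in (10, 100, 10_000):
    flips = [random.choice("HT") for _ in range(n)]
    p_heads = experimental_probability(flips, lambda outcome: outcome == "H")
    print(f"{n:>6} flips: experimental P(heads) = {p_heads:.3f}  (theoretical 0.5)")
```

With only $10$ flips the estimate can be far from $0.5$; with $10{,}000$ flips it is usually very close.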

In statistical terms, the empirical probability is an estimate of a probability. In simple cases, where the result of a trial only determines whether or not the specified event has occurred, modeling using a binomial distribution might be appropriate. A binomial distribution is the discrete probability distribution of the number of successes in a sequence of $n$ independent yes/no experiments. In such cases, the empirical probability is the maximum likelihood estimate.
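
As a rough check of this claim, the sketch below (assuming NumPy and SciPy are available; the grid search and the chosen parameters are only for illustration) simulates one batch of yes/no trials and confirms that the empirical proportion of successes coincides with the value of $p$ that maximizes the binomial likelihood:

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(1)
n_trials = 50
true_p = 0.3
successes = rng.binomial(n_trials, true_p)  # observed number of "yes" outcomes

# The empirical probability is successes / n_trials; verify by grid search that it
# (approximately) maximizes the binomial likelihood.
candidates = np.linspace(0.01, 0.99, 99)
log_likelihood = binom.logpmf(successes, n_trials, candidates)
mle = candidates[np.argmax(log_likelihood)]

print("empirical estimate:", successes / n_trials)
print("grid-search MLE:   ", mle)
```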

If a trial yields more information, the empirical probability can be improved upon by adopting further assumptions in the form of a statistical model: if such a model is fitted, it can be used to estimate the probability of the specified event. For example, one can easily assign a probability to each possible value in many discrete cases: when throwing a fair die, each of the six values $1$ through $6$ has probability $\frac{1}{6}$.
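
For instance, a short Python sketch (the number of rolls and the seed are arbitrary) can roll a fair die many times and report the empirical probability of each face, each of which should come out near the theoretical $\frac{1}{6} \approx 0.167$:

```python
import random
from collections import Counter

random.seed(2)
rolls = [random.randint(1, 6) for _ in range(6000)]
counts = Counter(rolls)

# Empirical probability of each face; each should be close to the theoretical 1/6.
for face in range(1, 7):
    print(f"P({face}) ≈ {counts[face] / len(rolls):.3f}")
```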

Advantages

An advantage of estimating probabilities empirically is that the procedure relies on few assumptions. For example, consider estimating the probability, among a population of men, that a man satisfies both of the following conditions:

  1. They are over six feet in height.
  2. They prefer strawberry jam to raspberry jam.

A direct estimate could be found by counting the number of men who satisfy both conditions to give the empirical probability of the combined condition.

An alternative estimate could be found by multiplying the proportion of men who are over six feet in height with the proportion of men who prefer strawberry jam to raspberry jam, but this estimate relies on the assumption that the two conditions are statistically independent. 
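
The sketch below illustrates the comparison on hypothetical synthetic data (the proportions $0.15$ and $0.60$ and the sample size are made up; the two attributes are generated independently here, so the two estimates should roughly agree, whereas with real data they need not):

```python
import random

random.seed(3)

# Hypothetical synthetic sample: one (is_over_six_feet, prefers_strawberry) pair per man.
# The two attributes are generated independently here, so both estimates should agree.
men = [(random.random() < 0.15, random.random() < 0.60) for _ in range(10_000)]

n = len(men)
p_tall = sum(tall for tall, _ in men) / n
p_strawberry = sum(straw for _, straw in men) / n
p_both_direct = sum(tall and straw for tall, straw in men) / n

print("direct empirical estimate of both conditions:", p_both_direct)
print("product under the independence assumption:   ", p_tall * p_strawberry)
```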

Disadvantages

A disadvantage of using empirical probabilities is that, without a theory to "make sense" of them, it is easy to draw incorrect conclusions. If a six-sided die is rolled one hundred times, it is entirely possible that well over $\frac{1}{6}$ of the rolls will land on $4$. Intuitively, we know that each of the six faces should be equally likely, but experiments, especially those with smaller sample sizes, can suggest otherwise.
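
A quick simulation ($1000$ repetitions of the 100-roll experiment; the seed is arbitrary) shows how widely the observed proportion of fours can swing at this sample size:

```python
import random

random.seed(4)

# Repeat the 100-roll experiment many times and record the proportion of fours in each run.
proportions = []
for _ in range(1000):
    rolls = [random.randint(1, 6) for _ in range(100)]
    proportions.append(rolls.count(4) / 100)

over = sum(p > 1 / 6 for p in proportions)
print(f"proportion of fours ranged from {min(proportions):.2f} to {max(proportions):.2f}")
print(f"{over} of 1000 runs landed on 4 more than 1/6 of the time")
```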

This shortcoming becomes particularly problematic when estimating probabilities that are either very close to zero or very close to one. For example, the probability of drawing any particular number between $1$ and $1000$ is $\frac{1}{1000}$. If $1000$ draws are taken and the first number drawn is a $5$, there are still $999$ draws in which a $5$ could appear again, and a single repeat would produce experimental data showing double the expected likelihood of drawing a $5$.
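
The following sketch ($2000$ repetitions; the target number $5$ and the seed are arbitrary) makes the point numerically: with only $1000$ draws, the empirical estimate of a $\frac{1}{1000}$ probability is very often either zero or at least double the true value:

```python
import random

random.seed(5)
true_p = 1 / 1000

# Estimate the probability of drawing a 5 from 1..1000 using 1000 draws, repeated many times.
# The empirical estimate can only take the values 0/1000, 1/1000, 2/1000, ..., so its
# relative error is frequently 100% or more.
estimates = []
for _ in range(2000):
    hits = sum(random.randint(1, 1000) == 5 for _ in range(1000))
    estimates.append(hits / 1000)

zeros = sum(e == 0 for e in estimates)
doubled = sum(e >= 2 * true_p for e in estimates)
print(f"{zeros} of 2000 runs estimated the probability as 0")
print(f"{doubled} of 2000 runs estimated at least double the true value")
```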

In these cases, very large sample sizes would be needed in order to estimate such probabilities to a good standard of relative accuracy. Here statistical models can help, depending on the context.

For example, consider estimating the probability that the lowest of the maximum daily temperatures at a site in February in any one year is less than zero degrees Celsius. A record of such temperatures in past years could be used to estimate this probability. A model-based alternative would be to select a family of probability distributions and fit it to the data set of values from past years. The fitted distribution would provide an alternative estimate of the desired probability. This alternative method can provide an estimate of the probability even if all values in the record are greater than zero.
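
Here is a minimal sketch of this model-based approach, assuming SciPy is available, using an entirely hypothetical temperature record and a normal family of distributions (the text does not specify which family to fit):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical record: lowest maximum daily temperature in February (°C), one value per
# past year. All values happen to be above zero, so the plain empirical estimate is 0.
temps = np.array([2.1, 3.4, 1.2, 4.0, 2.8, 1.9, 3.1, 0.6, 2.4, 3.7])

empirical = np.mean(temps < 0)        # empirical probability of a sub-zero value
mu, sigma = norm.fit(temps)           # fit a normal distribution to the record
model_based = norm.cdf(0, mu, sigma)  # model-based probability of a sub-zero value

print("empirical estimate:  ", empirical)
print("model-based estimate:", round(model_based, 4))
```

The fitted model assigns a small but nonzero probability to a sub-zero year, whereas the purely empirical estimate is exactly zero.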


Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.