Discrete and Continuous Distributions
Commonly used discrete distributions are listed below:
- Bernoulli distribution
- Binomial distribution
- Poisson distribution
- Hypergeometric distribution
- Negative binomial distribution
A continuous distribution describes the probabilities of the possible values of a continuous random variable. A continuous variable can take any value within an interval, so it has infinitely many possible values, and those values are measured rather than counted.
Suppose X is a continuous random variable. The probability that X falls in a range of values is the area under its probability density function (PDF) curve over that range. Hence the probability at any single point is always zero, and only a range of values can have a probability greater than zero.
Commonly used continuous distributions are listed below:
- Normal probability distribution
- Student's t distribution
- Chi-square distribution
- F distribution
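The area-under-the-curve idea for continuous random variables can be checked numerically. The sketch below is illustrative only: the helper name `area_under_pdf`, the step count, and the choice of a standard normal distribution are assumptions made for the example. It uses `statistics.NormalDist` from the Python standard library and a simple trapezoidal rule.

```python
from statistics import NormalDist


def area_under_pdf(pdf, a, b, steps=10_000):
    """Approximate the integral of pdf over [a, b] with the trapezoidal rule."""
    h = (b - a) / steps  # width of each slice; zero when a == b
    total = 0.5 * (pdf(a) + pdf(b))
    for i in range(1, steps):
        total += pdf(a + i * h)
    return total * h


std_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability over a range: area under the curve between -1 and 1.
p_range = area_under_pdf(std_normal.pdf, -1.0, 1.0)

# Probability at a single point: the interval has zero width, so the area is zero.
p_point = area_under_pdf(std_normal.pdf, 0.5, 0.5)

# Cross-check the range probability against the difference of CDF values.
p_exact = std_normal.cdf(1.0) - std_normal.cdf(-1.0)

print(f"P(-1 <= X <= 1) is about {p_range:.4f} (CDF difference: {p_exact:.4f})")
print(f"P(X = 0.5) = {p_point}")
```

The point probability comes out as exactly zero because a zero-width interval encloses no area, whatever the height of the PDF at that point.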
A binomial experiment is a statistical experiment that has the following properties:
- The experiment consists of n repeated trials
- Each trial can result in just two possible outcomes. We call one of these outcomes a success and the other a failure
- The probability of success, denoted by p, is the same on every trial
- The trials are independent; the outcome of one trial doesn't affect the outcome of other trials
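The properties listed above can be illustrated with a small simulation. This is a minimal sketch, not a definitive implementation: the function name `run_binomial_experiment`, the seed, and the values of n and p are all assumptions chosen for the example.

```python
import random


def run_binomial_experiment(n, p, rng):
    """Run n independent trials, each succeeding with probability p.

    Returns the number of successes observed.
    """
    successes = 0
    for _ in range(n):
        # rng.random() is uniform on [0, 1), so this is True with probability p.
        if rng.random() < p:
            successes += 1
    return successes


rng = random.Random(42)  # arbitrary fixed seed so reruns give the same draws
n_trials, p_success = 10, 0.5

# Repeat the whole experiment a few times; each count lies between 0 and n_trials.
counts = [run_binomial_experiment(n_trials, p_success, rng) for _ in range(5)]
print(counts)
```

Each call draws fresh random numbers, which mirrors the independence of the trials, and the same p is used on every trial.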
Let the count X of successes in a group of n observations, each with success probability p, follow a binomial distribution.
Then the sample proportion $\hat{p}$ is the count of successes X divided by the number of observations n, that is, $\hat{p} = X / n$.
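If X is a random variable following a binomial distribution with parameters n and p, its probability mass function is $P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$ for $k = 0, 1, \dots, n$.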
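As a worked illustration, the standard binomial probability mass function, P(X = k) = C(n, k) p^k (1 - p)^(n - k), can be evaluated directly with `math.comb`. The function name `binomial_pmf` and the values n = 10, p = 0.3 and the observed count of 4 are hypothetical choices for this sketch.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.3

# Probabilities of 0, 1, ..., n successes.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# The probabilities sum to 1 (up to floating-point rounding).
total = sum(pmf)

# The mean of the distribution equals n * p.
mean = sum(k * pk for k, pk in enumerate(pmf))

# Sample proportion for a hypothetical observed count of successes.
x_observed = 4
p_hat = x_observed / n

print(f"sum of PMF = {total:.6f}, mean = {mean:.6f}, p_hat = {p_hat}")
```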
The uniform distribution is also known as the rectangular distribution. It has two parameters, A and B, which are its minimum and maximum values. Its probability density function is given by: $f(x) = \frac{1}{B - A}$ for $A \le x \le B$, and $f(x) = 0$ otherwise.
The cumulative distribution function of the uniform distribution is: $F(x) = 0$ for $x < A$; $F(x) = \frac{x - A}{B - A}$ for $A \le x \le B$; and $F(x) = 1$ for $x > B$.
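The density and cumulative distribution function of a uniform distribution on [A, B] can be sketched as follows. The function names `uniform_pdf` and `uniform_cdf` and the example bounds 2 and 6 are assumptions for illustration; the code writes the parameters in lowercase (`a`, `b`) to follow Python naming conventions.

```python
def uniform_pdf(x, a, b):
    """Density of the uniform distribution on [a, b]: constant 1 / (b - a) inside."""
    if not a < b:
        raise ValueError("a must be smaller than b")
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a, b):
    """P(X <= x) for X uniform on [a, b]."""
    if not a < b:
        raise ValueError("a must be smaller than b")
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


a, b = 2.0, 6.0

# Probability of landing between 3 and 5 is the difference of two CDF values.
p_between = uniform_cdf(5.0, a, b) - uniform_cdf(3.0, a, b)

print(uniform_pdf(3.0, a, b), uniform_cdf(3.0, a, b), p_between)
```

Because the density is flat, the probability of any sub-interval is simply its length divided by B - A.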
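The probability density function of the normal distribution, with parameters mean $\mu$ and standard deviation $\sigma$, is: $f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^{2}}{2\sigma^{2}}\right)$.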
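The standard normal density formula, f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)), can be coded by hand and compared with `statistics.NormalDist.pdf` from the Python standard library. The function name `normal_pdf` and the parameter values (mean 100, standard deviation 15) are hypothetical choices for this sketch.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2.0 * pi))


mu, sigma = 100.0, 15.0
reference = NormalDist(mu, sigma)

# Compare the hand-written formula with the library value at a few points.
for x in (85.0, 100.0, 115.0):
    print(x, normal_pdf(x, mu, sigma), reference.pdf(x))
```

The density peaks at the mean and takes equal values at points that are the same distance on either side of it.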
- Empirical Rule
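A z-score indicates how many standard deviations a score is away from the mean. When the scores are normally distributed, their z-scores follow N(0, 1). The formula is: $z = \frac{x - \mu}{\sigma}$, where x is the score, $\mu$ is the mean and $\sigma$ is the standard deviation.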
- A z-score less than 0 implies that the score is less than the mean
- A z-score greater than 0 implies that the score is greater than the mean
- A z-score equal to 0 implies that the score is equal to the mean
- If the number of elements in the set is large and the data are approximately normal, about 68% of the elements have a z-score between -1 and 1; about 95% have a z-score between -2 and 2; and about 99.7% have a z-score between -3 and 3
- A z-score less than -3 or greater than 3 is commonly taken to indicate that the corresponding score is an outlier
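The z-score properties above can be illustrated with a short sketch. The list of scores is made-up data, the helper name `z_score` is an assumption, and the population standard deviation (`statistics.pstdev`) is used here as one possible choice. The second part computes the share of a standard normal distribution lying within 1, 2 and 3 standard deviations of the mean.

```python
from statistics import NormalDist, mean, pstdev


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


scores = [52.0, 61.0, 70.0, 79.0, 88.0]  # hypothetical data
mu = mean(scores)
sigma = pstdev(scores)  # population standard deviation (an assumption of this sketch)

# Scores below the mean get negative z-scores, scores above get positive ones.
z_scores = [z_score(x, mu, sigma) for x in scores]
print([round(z, 3) for z in z_scores])

# Share of a standard normal distribution with a z-score between -k and k.
std_normal = NormalDist()
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}
for k, share in within.items():
    print(f"|z| <= {k}: {share:.4f}")
```

The three shares correspond to the approximate 68%, 95% and 99.7% figures quoted for normally distributed data.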
- Central Limit Theorem
According to the central limit theorem (CLT), the arithmetic mean of a sufficiently large number of independent random variables, each with a well-defined expected value and a well-defined variance, will be approximately normally distributed, regardless of the underlying distribution.
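The central limit theorem can be explored with a small simulation. The sketch below is illustrative only: the exponential distribution (which is strongly skewed), the sample size of 50, the 2000 repetitions, the seed, and the helper name `sample_mean` are all assumptions chosen for the example.

```python
import random
from statistics import mean, pstdev


def sample_mean(n, rng):
    """Mean of n independent draws from an exponential distribution with rate 1."""
    return mean(rng.expovariate(1.0) for _ in range(n))


rng = random.Random(0)  # arbitrary fixed seed so reruns give the same draws
n = 50

# Collect many sample means; the CLT concerns the distribution of these values.
means = [sample_mean(n, rng) for _ in range(2000)]

center = mean(means)    # should be near the population mean, 1
spread = pstdev(means)  # should be near 1 / sqrt(n), roughly 0.141

# For an approximately normal shape, roughly 68% lie within one spread of the center.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"center = {center:.3f}, spread = {spread:.3f}, within one sd = {within_one_sd:.3f}")
```

Although individual exponential draws are far from symmetric, the sample means cluster around the population mean in a roughly bell-shaped pattern, which is what the theorem predicts.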
For a normal distribution, the histogram of the data is symmetric about the mean. The empirical rule says that about 68% of the data fall within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations. The graphical representation is shown below.