What is the difference between Population and Sample?
A population is the complete set of units about which one wants to draw inferences. Populations are generally large, so inferences are usually drawn from a random sample taken from the population rather than from every unit. For example, if a survey is to be conducted on the average weight of all 45-year-olds living in Singapore, then the population consists of all individuals who are 45 years old and living in Singapore.
A sample is a group of units selected (generally at random) from a population of interest in order to draw valid conclusions about that population. The sample should be representative of the population. For example, the population might be all infants born in Singapore in the year 2000, and the sample might be all infants born on 28 August of that year.
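The relationship between a population parameter and the sample statistic that estimates it can be sketched in a few lines of Python. This is only an illustration: the population below is simulated (the mean of 70 kg and spread of 12 kg are assumed values, not survey data), and the names `population`, `sample`, `population_mean` and `sample_mean` are chosen for this example.

```python
import random
import statistics

# Hypothetical population: 100,000 simulated weights in kg.
# The mean (70) and standard deviation (12) are assumptions for illustration.
rng = random.Random(42)
population = [rng.gauss(70, 12) for _ in range(100_000)]

# Draw a random sample of 500 units without replacement.
sample = rng.sample(population, 500)

population_mean = statistics.fmean(population)  # the parameter
sample_mean = statistics.fmean(sample)          # the statistic estimating it

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two printed values should be close but not identical; the gap between them is sampling error, which shrinks as the sample size grows.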
What is Bias in Statistics?
Bias is the tendency of a sample statistic to systematically over-estimate or under-estimate a population parameter. It is an inaccuracy that can be introduced when the data are created and collected, or by a faulty sample design.
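One way to see what "systematically under-estimate" means is a small simulation. The sketch below uses a standard textbook case that is not taken from the text above: estimating a population variance by dividing by n rather than n - 1. The population (normal with standard deviation 2, so variance 4), the sample size and the number of repeats are all assumptions chosen for illustration.

```python
import random

def variance_divide_by_n(xs):
    """Variance estimate that divides by n (biased low for a sample)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def variance_divide_by_n_minus_1(xs):
    """Variance estimate that divides by n - 1 (unbiased for i.i.d. draws)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

rng = random.Random(0)
TRUE_VARIANCE = 4.0   # assumed population: normal with standard deviation 2
SAMPLE_SIZE = 5
REPEATS = 20_000

biased, unbiased = [], []
for _ in range(REPEATS):
    sample = [rng.gauss(0, 2) for _ in range(SAMPLE_SIZE)]
    biased.append(variance_divide_by_n(sample))
    unbiased.append(variance_divide_by_n_minus_1(sample))

# Averaging over many repeated samples approximates each estimator's expectation.
mean_biased = sum(biased) / REPEATS
mean_unbiased = sum(unbiased) / REPEATS

print(f"true variance:           {TRUE_VARIANCE:.2f}")
print(f"average of divide-by-n:  {mean_biased:.2f}")
print(f"average of divide-by-n-1: {mean_unbiased:.2f}")
```

Any single estimate can land above or below the true value by chance. Bias shows up only in the long-run average: the divide-by-n estimator averages about (n - 1)/n of the true variance, however many times the sampling is repeated.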
Scientific bias can arise due to:
- Failure to account for all the variables (omission, inclusion, procedural measurement and reporting bias)
- Flawed design
- Chronology/Recall/Performance Bias
Ways to avoid biases are given below:
- Clearly define outcome and risk with validated methods
- Select diverse and large sampling groups
- Avoid historical controls
- Test each hypothesis multiple times by multiple methods
The sample design is a procedure drawn up before any data are collected, specifying how a sample will be obtained from a given population. It is also known as a sampling plan.
When respondents answer questions in the way they think the questioner wants them to answer, rather than according to their true beliefs, this is referred to as response bias.
Simple Random Sampling
Simple random sampling (SRS) is a process in which a group of subjects (the sample) is selected from a larger group (the population) in order to study that population. Each unit in the sample is chosen entirely by chance, and each member of the population has an equal and independent chance of being included in the sample.
To draw a simple random sample, assign a unique number to each element in the sampling frame and then use random numbers to select elements until the desired sample size is reached. For example, drawing the winning numbers in a lottery is a simple random sampling technique.
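The numbering-and-drawing procedure can be sketched with Python's standard `random` module. The function name `simple_random_sample`, the frame of 1,000 numbered elements and the seed are assumptions made for this example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Return n distinct elements of frame, drawn without replacement.

    random.Random.sample makes every subset of size n equally likely.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: elements numbered 1 to 1000.
sampling_frame = list(range(1, 1001))

chosen = simple_random_sample(sampling_frame, 10, seed=7)
print(sorted(chosen))
```

Passing a fixed `seed` makes the draw reproducible, which is useful for auditing; omit it to get a fresh draw each run.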
Systematic Sampling
In systematic sampling, the first sampling unit is selected at random, while the remaining units are selected according to a predetermined pattern at regular intervals. Here, the remaining units are not selected independently: once the first unit is chosen, the rest of the sample is fully determined by it.
Suppose the population has N units numbered from 1 to N in some order, and a sample of size n is to be drawn, where k = N/n is the sampling interval. If the random start i is chosen between 1 and k, the systematic sample of size n consists of units i, i + k, i + 2k, ..., i + (n-1)k.
For example, suppose the population of study contains 4000 students in a college and a sample of 200 students is to be drawn. The sampling interval is k = 4000/200 = 20, so the first student is randomly selected from the first 20 and every 20th student after that is included in the sample. This technique is called systematic sampling with a random start.
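The college example can be sketched as follows. The function name `systematic_sample` is hypothetical, and the sketch assumes N is an exact multiple of n so that the interval k is a whole number; other cases need a rule for handling the remainder.

```python
import random

def systematic_sample(N, n, seed=None):
    """Select units i, i + k, ..., i + (n-1)k from units numbered 1..N."""
    if N % n != 0:
        # Assumption of this sketch: the interval must be a whole number.
        raise ValueError("this sketch assumes N is an exact multiple of n")
    k = N // n                 # sampling interval
    rng = random.Random(seed)
    i = rng.randint(1, k)      # random start, 1 <= i <= k (both ends inclusive)
    return [i + j * k for j in range(n)]

# 4000 students, sample of 200: interval k = 20.
selected = systematic_sample(4000, 200, seed=1)
print(selected[:5], "...", selected[-1])
```

Only one random number is drawn; every other selected unit follows from the start `i` and the interval `k`.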
Stratified Sampling
Stratified sampling is a technique in which the entire population is divided into relatively homogeneous subgroups, or strata, and the final units are selected randomly from each of these strata.
For example, a church has 600 women and 400 men as members. A stratified random sample of size 30, allocated in proportion to stratum size, would include a simple random sample of 18 women from the 600 women and another simple random sample of 12 men from the 400 men.
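The church example uses proportional allocation: each stratum contributes to the sample in proportion to its share of the population (600/1000 of 30 is 18, and 400/1000 of 30 is 12). A minimal sketch, assuming the shares come out as whole numbers; the function name `stratified_sample` and the member IDs are made up for illustration.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Draw a simple random sample from each stratum, sized proportionally."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    result = {}
    for name, units in strata.items():
        # Proportional allocation using integer arithmetic.
        quota, remainder = divmod(total_n * len(units), population_size)
        if remainder:
            # Assumption of this sketch: no rounding rule is implemented.
            raise ValueError(f"allocation for {name!r} is not a whole number")
        result[name] = rng.sample(units, quota)
    return result

# Hypothetical member IDs for the two strata.
strata = {
    "women": [f"W{i}" for i in range(1, 601)],
    "men": [f"M{i}" for i in range(1, 401)],
}

sample = stratified_sample(strata, 30, seed=5)
print({name: len(units) for name, units in sample.items()})
```

Within each stratum the draw is an ordinary simple random sample; stratification only fixes how many units come from each subgroup.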
Cluster Sampling
Cluster sampling is a sampling technique in which the entire population is divided into groups, or clusters, and a random sample of these clusters is selected. All observations in the selected clusters are included in the sample. Cluster sampling is used when the researcher cannot get a complete list of the members of the population they want to study but can get a complete list of groups, or clusters, of that population. For example, suppose the population of a study is church members in Hong Kong. There is no list of all church members in Hong Kong, but a list of churches can be prepared and a sample of churches chosen from it. The lists of members can then be obtained from the selected churches.
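A minimal sketch of this procedure is shown below. The church names, membership lists and the function name `cluster_sample` are all invented for illustration; only the list of clusters is needed up front, and member lists are read only for the clusters that are drawn.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick n_clusters clusters and include all of their members."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen_names for m in clusters[name]]
    return chosen_names, members

# Hypothetical data: 20 churches with between 5 and 50 members each.
setup_rng = random.Random(0)
churches = {
    f"church_{c:02d}": [
        f"church_{c:02d}_member_{m}" for m in range(setup_rng.randint(5, 50))
    ]
    for c in range(1, 21)
}

chosen, members = cluster_sample(churches, 4, seed=3)
print(chosen)
print(len(members), "members in the sample")
```

Because clusters differ in size, the total sample size is not fixed in advance; it depends on which clusters are drawn.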