Basics In Statistics Definitions
Sample
It is part of a population called the universe, reference, or parent population
Biostatistics
It is that branch of statistics concerned with mathematical facts & data related to biological events
Variable
It is a state, condition, concept, or event whose value is free to vary within the population
Basics In Statistics Important Notes
1. Measures of central tendency
- Arithmetic mean
- Simplest measure
- Obtained by summing all the observations and dividing by the number of observations
- It is very sensitive to extreme scores.
- Median
- It is the simplest division of the set of measurements into two halves
- When the distribution has an odd number of elements, the middle value is the median; when it has an even number of elements, the average of the two middle scores is the median
- It is insensitive to extreme values.
- Mode
- It is the most frequently occurring value in a set of observations
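The three measures above can be checked on a small data set. The following is a minimal sketch using Python's standard `statistics` module; the scores are hypothetical and chosen so that one extreme value is present.

```python
import statistics

# Hypothetical scores for nine patients; 20 is an extreme score.
scores = [1, 2, 2, 3, 3, 3, 4, 5, 20]

mean = statistics.mean(scores)      # sum of observations / number of observations
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequently occurring value

# Dropping the extreme score shifts the mean noticeably; the median stays at 3.
trimmed = scores[:-1]
mean_trimmed = statistics.mean(trimmed)
median_trimmed = statistics.median(trimmed)

print(mean, median, mode)
print(mean_trimmed, median_trimmed)
```

The comparison illustrates why the mean is described as sensitive to extreme scores while the median is not.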
2. Sampling
- Simple random sampling
- Used when the population is small, homogeneous.
- Systematic sampling
- Used when the population is large, non-homogenous, and scattered
- Multistage sampling
- Employed in large country surveys
- Carried out in several stages
- Multiphase sampling
- Here sampling is done in different phases
- Cluster sampling
- Involves grouping the population and then surveying
- Stratified sampling
- Used when the population is large, nonhomogenous
3. Properties of the normal curve
- Bell-shaped
- Symmetrical
- The height of the curve is maximum at the mean
- Mean = median = mode
- The area under the curve between any two points can be found in terms of the relationship between mean and standard deviation.
Mean ± 1 SD = 68.3% of observations
Mean ± 2 SD = 95.4% of observations
Mean ± 3 SD = 99.7% of observations
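These percentages can be reproduced from the normal distribution itself. The sketch below assumes only the standard result that the area within k standard deviations of the mean equals erf(k / √2); the function name `area_within` is illustrative.

```python
import math

def area_within(k: float) -> float:
    """Fraction of a normal distribution lying within mean ± k SD."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"Mean ± {k} SD = {area_within(k) * 100:.1f}% of observations")
```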
4. Classification of data
- Qualitative data
- It is data with frequency but no magnitude
- Nonparametric tests are used for it
- Quantitative data
- It is data with a magnitude
- Parametric tests are used for it
5. Chi-square test is used
- To test the association between the cause and effect
- To find the goodness of fit
- To test the differences between two/more proportions
6. Tests
Basics In Statistics Long Essays
Question 1. Define sample. What are the ideal requisites of sampling? Describe different sampling methods.
Answer:
Sample:
It is part of a population called the universe, reference, or parent population
Ideal Requisites:
- Efficiency
- Representativeness
- Measurability
- Size-large
- Adequate coverage
- Goal orientation
- Feasibility
- Economic
Sampling Methods:
Probability Sampling:
- Simple Random Sampling
- Each member of the population has an equal chance of being included in the sample
- The member is determined by chance only
- Methods of random selection are
- Lottery method
- Table of random numbers
- Systematic
- It is obtained by selecting one unit at random & then selecting additional units at evenly spaced intervals till an adequate sample size is obtained
- It can be adopted as long as there is no periodicity of occurrence of any particular event in the population
- Stratified Random
- The population to be sampled is subdivided into strata
- A simple random sample is then chosen from it
- Used for a heterogeneous population
- It ensures more representativeness, provides greater accuracy & can concentrate over a wider area
- It reduces sampling variation
Cluster Sampling:
- Useful when a population forms natural groups
- First, a sample of the clusters is selected & then all units in clusters are surveyed
Advantage:
- Simple
- Less expensive
Disadvantage:
Cannot be generalized
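The probability sampling methods described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a survey-grade implementation: the sampling frame of 100 patient IDs, the two age-group strata, and the ten clinics used as clusters are all hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical frame: 100 patient IDs, each tagged with an age group (stratum).
population = [(pid, "child" if pid <= 40 else "adult") for pid in range(1, 101)]

def simple_random(frame, n):
    """Every unit has an equal chance of being included."""
    return rng.sample(frame, n)

def systematic(frame, n):
    """Pick one unit at random, then every k-th unit after it."""
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]

def stratified(frame, n):
    """Simple random sample inside each stratum, proportional to its size."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit[1], []).append(unit)
    sample = []
    for units in strata.values():
        share = round(n * len(units) / len(frame))
        sample.extend(rng.sample(units, share))
    return sample

def cluster_sample(frame, m):
    """Select m whole clusters at random, then survey every unit in them."""
    clusters = {}
    for unit in frame:
        clusters.setdefault((unit[0] - 1) // 10, []).append(unit)  # 10 assumed clinics
    chosen = rng.sample(sorted(clusters), m)
    return [unit for c in chosen for unit in clusters[c]]

print(simple_random(population, 10))
print(systematic(population, 10))
print(stratified(population, 10))
print(len(cluster_sample(population, 2)))
```

Note that the systematic sample depends on the ordering of the frame, which is why the method is only safe when the population shows no periodicity.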
Non-Probability Sampling:
Accidental Sampling:
- It is a matter of taking what you can get
- It is not randomly obtained
Advantage:
It is inexpensive & less time-consuming
Purposive Sampling:
- It is a nonrepresentative subset of some larger population
- A sample is achieved by asking a participant to suggest someone else willing to take part in the study
1. Quota Sampling:
It involves the selection of proportional samples of subgroups within a target population to ensure generalization
2. Dimensional Sampling:
A small sample is selected, then each selected case is examined in detail
3. Mixed Sampling:
Constitutes a combination of both probability & nonprobability sampling
Question 2. Define biostatistics. Write in detail the uses of biostatistics in dental public health.
Answer:
Biostatistics:
- It is that branch of statistics concerned with mathematical facts & data related to biological events
- It deals with the statistical methodologies involved in biological sciences
Uses:
- Measure the state of health of the community
- Identify the health problems
- Compare the health status of one country with another & past status with present
- Predict health trends
- Plan & administer dental health services
- Evaluate the achievement of public health programs
- Fix priorities in public health programs
- Evaluate the efficacy of vaccines, sera, etc
- Measure mortality & morbidity
- Test whether the difference between 2 populations is real or a chance occurrence
- Study correlation between attributes in the same population
- Promote health legislation
- Help the dentist to think quantitatively
Question 3. Define sampling. Classify sampling. Enumerate any one sampling.
Answer:
Sampling:
It is the process or technique of selecting a sample of appropriate characteristics & adequate size
Probability Sampling:
Simple Random Sampling:
- Each member of the population has an equal chance of being included in the sample
- The member is determined by chance only
- Methods of random selection are
- Lottery method
- Table of random numbers
Systematic Sampling:
- It is obtained by selecting one unit at random & then selecting additional units at evenly spaced intervals till an adequate sample size is obtained
- It can be adopted as long as there is no periodicity of occurrence of any particular event in the population
1. Stratified Random:
- The population to be sampled is subdivided into strata
- A simple random sample is then chosen from it
- Used for a heterogeneous population
- It ensures more representativeness, provides greater accuracy & can concentrate over a wider area
- It reduces sampling variation
2. Cluster Sampling:
- Useful when a population forms natural groups
- First, a sample of the clusters is selected & then all units in clusters are surveyed
Advantage:
- Simple
- Less expensive
Disadvantage:
Cannot be generalized
Question 4. Enumerate various measures of dispersion & describe in detail the test of significance.
Answer:
Measures Of Dispersion:
- Range
- It is the difference between the smallest & largest results in a set of data
- Mean deviation
- It is the average of the absolute deviations from the arithmetic mean
- Standard deviation
Tests Of Significance:
It deals with techniques for judging how far the differences between the estimates from different samples are due to sampling variation
1. Standard Error of Mean (SE):
Gives the standard deviation of the mean of several samples from the same population
= standard deviation / √n
2. Standard Error of Proportion:
= √(pq / n)
p & q = proportions of occurrence of an event in 2 groups
n = sample size
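The two standard errors can be computed directly from their formulas, SE of mean = standard deviation / √n and SE of proportion = √(pq / n). The sketch below makes two assumptions that the notes do not state: the standard deviation is the sample form (dividing by n − 1, as `statistics.stdev` does), and q is taken as 1 − p, the proportion in which the event does not occur. The function names are illustrative.

```python
import math
import statistics

def se_mean(values):
    """Standard error of the mean: standard deviation / sqrt(n)."""
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.stdev(values) / math.sqrt(len(values))

def se_proportion(p, n):
    """Standard error of a proportion: sqrt(p * q / n)."""
    q = 1 - p  # assumption: q is the complement of p
    return math.sqrt(p * q / n)

# Hypothetical data: eight measurements, and a 40% prevalence among 100 people.
print(se_mean([2, 4, 4, 4, 5, 5, 7, 9]))
print(se_proportion(0.4, 100))
```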
Standard Error Of Difference Between Two Means:
Indicates whether the samples represent two different universes
Standard Error Of Difference Between Proportions:
Indicates whether the difference is significant or has occurred by chance
Chi-Square Test:
Uses:
- Test whether the difference in the distribution of attributes in different groups is due to sampling variation or not
- Test the significance of the difference between 2 proportions
- Used when there are more than 2 groups to be compared
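As an illustration of how the chi-square statistic is obtained, the sketch below uses the standard Pearson form, the sum of (observed − expected)² / expected over all cells, which the notes do not spell out. The 2 × 2 counts are hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the attribute were distributed alike in every group.
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: caries present / absent in two groups of 50.
observed = [[30, 20],
            [15, 35]]
stat = chi_square(observed)
print(round(stat, 2))
```

The statistic is then compared with the tabulated chi-square value for the appropriate degrees of freedom, (rows − 1) × (columns − 1), to judge whether the difference is likely to be due to sampling variation.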
Z Test:
- Test the significance of differences in means for large samples
‘t’ Test:
Synonym:
Student’s t-test
Uses:
- Used when the sample size is small
- Used to test the hypothesis
- Find the significance of the difference between 2 means
Types:
- Unpaired ‘t’ test
- Applied to unpaired data made on individuals of 2 different samples
- Test if the difference between the means is real or not
- Paired ‘t’ test
- Applied to paired data obtained from one sample only
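The difference between the two types of ‘t’ test can be seen in how the statistic is built. The following sketch uses the textbook forms, which are an assumption here since the notes give no formulas: the unpaired version pools the variances of two independent samples, while the paired version works on the differences within each pair. Data and function names are hypothetical.

```python
import math
import statistics

def unpaired_t(a, b):
    """Student's t for two independent samples, using a pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se_diff = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff

def paired_t(before, after):
    """Student's t for paired observations made on the same individuals."""
    diffs = [x - y for x, y in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

# Hypothetical plaque scores.
print(unpaired_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
print(paired_t([5, 6, 7, 8], [4, 4, 6, 6]))
```

The resulting value is compared with the tabulated t value for na + nb − 2 degrees of freedom (unpaired) or n − 1 degrees of freedom (paired).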
Question 5. Define biostatistics. Describe in detail the normal curve. Write a note on measures of central tendency.
(or) Normal distribution/ Properties of normal curve/ Gaussian curve.
(or) Mean, Median, Mode.
(or) Measures of central tendency.
Answer:
Biostatistics:
- It is that branch of statistics concerned with mathematical facts & data related to biological events
- It deals with the statistical methodologies involved in biological sciences
Normal Curve:
- It is a pattern followed by very many sets of continuous measurements.
- It is characterized by a symmetric, bell-shaped curve
- In a normal curve
- The area between one standard deviation on either side of the mean will include approximately 68% of the values
- The area between two standard deviations on either side of the mean will include approximately 95% of the values
- The area between three standard deviations on either side of the mean will include approximately 99.7% of the values
Characteristics:
- It is smooth, symmetrical bell-shaped
- The maximum number of observations is at the center & gradually decreases at the extremities
- For the standard normal curve, the total area is 1, the mean is 0 & the standard deviation is 1
- Mean, median & mode coincide at center
Basics In Statistics Short Essays
Question 1. Presentation of statistical data.
(or) Pie Chart
(or) Histogram
(or) Pictogram
(or) Uses of biostatistics
Answer:
Tabulation:
- Tables are simple devices used for data presentation
- Prepared manually or mechanically
Types:
1. Simple Table:
One-way table containing only one characteristic of the data
2. Master Table:
Contains all the data obtained from a survey
3. Frequency Distribution Table:
Two-column table
- 1st column: lists classes of data
- 2nd column: lists the frequency of each class
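A frequency distribution table of this kind can be built by counting how many observations fall into each class. The sketch below is illustrative only: the ages and the class width of 10 years are assumptions.

```python
from collections import Counter

# Hypothetical ages of 12 patients.
ages = [6, 7, 12, 15, 18, 21, 22, 25, 31, 34, 38, 45]
width = 10  # assumed class-interval width

def class_label(age):
    """Return the class interval (e.g. '10-19') that an age falls into."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

table = Counter(class_label(a) for a in ages)

# 1st column: class of data; 2nd column: frequency of that class.
for label in sorted(table, key=lambda s: int(s.split("-")[0])):
    print(f"{label:>7}  {table[label]}")
```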
Charts/ Diagrams:
1. Bar Charts:
- It is a diagram of columns/ bars
- The height of the bars determines the value of the particular data
- The width of the bar remains the same
- The bars are separated by spaces
- The bars can be either vertical/ horizontal
Types:
- Simple bar chart
- Represents only one variable
- Multiple bar chart
- Consists of a set of bars of the same width corresponding to the different sections without any gap in between
- Component bar chart
- Individual bars are divided into 2 or more parts
- Used to compare the sub-groups
2. Pie Chart:
- The entire graph looks like a pie & its components are represented by its slices
- It is divided into different sectors corresponding to the frequencies of the variables
- The segments are then shaded/ colored
3. Histogram:
- It is a pictorial presentation of data
- Class intervals are presented on the X-axis & frequencies on the Y axis
- No space occurs between the cells
4. Pictogram:
They are small pictures used for data presentation
5. Line Diagram:
- Used for continuous variable
- Time is represented on the X-axis & value on the Y axis
6. Statistical Maps:
- Refer to the geographic area
- Dot/ point is used to represent the area
Question 2. Types of diagram.
Answer:
1. Bar Charts:
- It is a diagram of columns/ bars
- The height of the bars determines the value of the particular data
- The width of the bar remains the same
- The bars are separated by spaces
- The bars can be either vertical/ horizontal
Types:
- Simple bar chart
- Represents only one variable
2. Multiple bar chart:
Consists of a set of bars of the same width corresponding to the different sections without any gap in between
3. Component bar chart:
- Individual bars are divided into 2 or more parts
- Used to compare the sub-groups
4. Pie Chart:
- The entire graph looks like a pie & its components are represented by its slices
- It is divided into different sectors corresponding to the frequencies of the variables
- The segments are then shaded/ colored
5. Histogram:
- It is a pictorial presentation of data
- Class intervals are presented on the X-axis & frequencies on the Y axis
- No space occurs between the cells
6. Pictogram:
They are small pictures used for data presentation
Question 3. Types of samples/ Probability sampling methods/ Sampling methods.
(or) Cluster sampling
Answer:
Probability Sampling
Simple Random Sampling:
- Each member of the population has an equal chance of being included in the sample
- The member is determined by chance only
- Methods of random selection are
- Lottery method
- Table of random numbers
Systematic Sampling:
- It is obtained by selecting one unit at random & then selecting additional units at evenly spaced intervals till an adequate sample size is obtained
- It can be adopted as long as there is no periodicity of occurrence of any particular event in the population
Stratified Random Sampling:
- The population to be sampled is subdivided into strata
- A simple random sample is then chosen from it
- Used for a heterogeneous population
- It ensures more representativeness, provides greater accuracy & can concentrate over a wider area
- It reduces sampling variation
Cluster Sampling:
- Useful when a population forms natural groups
- First, a sample of the clusters is selected & then all units in clusters are surveyed
Advantage:
- Simple
- Less expensive
Disadvantage:
Cannot be generalized
Non-Probability Sampling:
Accidental Sampling:
- It is a matter of taking what you can get
- It is not randomly obtained
Advantage:
It is inexpensive & less time-consuming
Purposive Sampling:
- It is a nonrepresentative subset of some larger population
- A sample is achieved by asking a participant to suggest someone else willing to take part in the study
Quota Sampling:
It involves the selection of proportional samples of subgroups within a target population to ensure generalization
Dimensional Sampling:
A small sample is selected, then each selected case is examined in detail
Mixed Sampling:
Constitutes a combination of both probability & nonprobability sampling
Question 4. Simple random sampling.
Answer:
Simple random sampling
- Each member of the population has an equal chance of being included in the sample
- The member is determined by chance only
- Methods of random selection are
- Lottery method
- Table of random numbers
Question 5. Multistage sample.
Answer:
Multistage sample
It is a sampling procedure often used when the sampling units can be defined in a hierarchical manner
Multistage sample Steps:
- Select the groups/cluster
- Then subsamples are taken in subsequent stages
- 1st stage: choice of states within countries
- 2nd stage: choice of towns within each state
- 3rd stage: choice of neighborhoods in each town
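The three stages can be sketched as nested random selections over a hierarchy. Everything in this example is hypothetical: the frame of 6 states, 4 towns per state, and 5 neighborhoods per town, as well as the number selected at each stage.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical hierarchy: state -> town -> list of neighborhoods.
frame = {
    f"state{s}": {
        f"town{s}{t}": [f"nbhd{s}{t}{n}" for n in range(1, 6)]
        for t in range(1, 5)
    }
    for s in range(1, 7)
}

def multistage(frame, n_states, n_towns, n_nbhds):
    """Sample states, then towns within them, then neighborhoods within those."""
    selected = []
    for state in rng.sample(sorted(frame), n_states):        # 1st stage
        towns = frame[state]
        for town in rng.sample(sorted(towns), n_towns):      # 2nd stage
            for nbhd in rng.sample(towns[town], n_nbhds):    # 3rd stage
                selected.append((state, town, nbhd))
    return selected

sample = multistage(frame, n_states=2, n_towns=2, n_nbhds=3)
print(sample)
```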
Question 6. Tests of significance.
(or) ‘t’ test.
Answer:
Tests of significance
It deals with techniques for judging how far the differences between the estimates from different samples are due to sampling variation
Standard Error Of Mean (SE):
Gives the standard deviation of the mean of several samples from the same population
Standard Error Of Proportion:
\(=\sqrt{\frac{p q}{n}}\)
p & q = proportions of occurrence of an event in 2 groups
n = sample size
Standard Error Of Difference Between Two Means:
Indicates whether the samples represent two different universes
Standard Error Of Difference Between Proportions:
Indicates whether the difference is significant or has occurred by chance
Chi-Square Test:
Uses:
- Test whether the difference in the distribution of attributes in different groups is due to sampling variation or not
- Test the significance of the difference between 2 proportions
- Used when there are more than 2 groups to be compared
Z Test:
Test the significance of differences in means for large samples
‘t’ Test:
Synonym:
Student’s t-test
Uses:
- Used when the sample size is small
- Used to test the hypothesis
- Find the significance of the difference between 2 means
Types:
- Unpaired ‘t’ test
- Applied to unpaired data made on individuals of 2 different samples
- Test if the difference between the means is real or not
- Paired ‘t’ test
- Applied to paired data obtained from one sample only
Question 7. Statistical analysis.
Answer:
Statistical analysis
- It is based on
- Population
- It is the collection of units of observations that are of interest & is the target of the investigation
- It is essential to identify the population clearly & precisely
- The success of the investigation will depend on the identification of the population
- Variable
- It is a state, condition, concept/ event whose value is free to vary within the population
Classification of Variables:
- Independent
- Manipulated/ treated in a study
- Dependent
- Result of the independent variable
- Confounding
- Confound the effect of the independent variable on the dependent
- Background
- Considered for possible inclusion in the study
- Probability distribution
- It is a link between population & its characteristics
- It is a way to enumerate the different values the variable can have & how frequently each value appears in the population
- It is characterized by parameters i.e. quantities
Question 8. Standard deviation.
Answer:
Standard deviation
- It is the square root of the mean of the squared deviations from the arithmetic mean
- It is the most commonly used measure of dispersion
Synonym:
Root Mean Square Deviation
Calculation:
- Calculate the mean of the series, \(\bar{X}\)
- Take the deviation of each observation from the mean, \(X-\bar{X}\)
- Square these deviations & add them up, \(\sum(X-\bar{X})^{2}\)
- Divide the result by the total number of observations
- Obtain the square root of it (Standard deviation)
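The calculation steps above translate directly into code. This minimal sketch follows the steps as written, dividing by the total number of observations; the data are hypothetical, and it is worth knowing that many texts divide by n − 1 instead when the data are a sample.

```python
import math

def standard_deviation(values):
    """Root mean square deviation, following the five steps in order."""
    mean = sum(values) / len(values)            # step 1: mean of the series
    deviations = [x - mean for x in values]     # step 2: deviations from the mean
    sum_sq = sum(d ** 2 for d in deviations)    # step 3: square and add them up
    variance = sum_sq / len(values)             # step 4: divide by number of observations
    return math.sqrt(variance)                  # step 5: square root

print(standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```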
Significance:
- The greater the standard deviation, the greater the magnitude of dispersion
- The smaller the standard deviation, the higher the degree of uniformity of the observations
Question 9. Bar diagram/ charts.
Answer:
Bar diagram
- It is a diagram of columns/ bars
- The height of the bars determines the value of the particular data
- The width of the bar remains the same
- The bars are separated by spaces
- The bars can be either vertical/ horizontal
Types:
1. Simple bar chart
Represents only one variable
2. Multiple bar chart
Consists of a set of bars of the same width corresponding to the different sections without any gap in between
3. Component bar chart
- Individual bars are divided into 2 or more parts
- Used to compare the sub-groups
Basics In Statistics Short Question And Answers
Question 1. Primary & secondary data.
Answer:
Primary Data:
- Obtained directly from an individual
- It is first-hand information
Advantages:
- Precise information
- Reliable
Disadvantages:
- Time-consuming
- Expensive
Methods:
- Direct personal interviews
- Oral health examination
- Questionnaire
Secondary Data:
- Obtained from outside sources
- Used to serve the purpose of the objective of the study
- Example: Hospital records
Question 2. Frequency polygon.
Answer:
Frequency polygon
Pictorial presentation of data
Method:
- Obtained from a histogram
- Mark the midpoint at the top of each histogram bar
- Next, connect these points with straight lines
- Example: Age-wise prevalence of dental caries
Question 3. Stratified random sampling.
Answer:
Stratified random sampling
- The population to be sampled is subdivided into strata
- A simple random sample is then chosen from it
- Used for a heterogeneous population
- It ensures more representativeness, provides greater accuracy & can concentrate over a wider area
Question 4. Mode.
Answer:
Mode
It is a value occurring with the greatest frequency
Advantages:
- Eliminates extreme variation
- Easily located
- Easy to understand
Disadvantages:
- Uncertain location
- Not exactly defined
- Not useful in a small number of cases
Question 5. Null hypothesis.
Answer:
Null hypothesis
- It asserts that there is no real difference between the two groups under consideration & the difference found is accidental & arises out of sampling variation
- It is the first step in the testing of the hypothesis
Question 6. Variable.
Answer:
Variable
It is a state, condition, concept, or event whose value is free to vary within the population
Classification of Variable:
- Independent
- Manipulated/ treated in a study
- Dependent:
- Result of an independent variable
- Confounding
- Confound the effect of the independent variable on the dependent
- Background
- Considered for possible inclusion in the study
Question 7. Qualitative data.
Answer:
Qualitative data
When data is collected on the basis of attributes/ qualities like sex, it is called qualitative data
Question 8. Chi-square test.
Answer:
Chi-square test Uses:
- Test whether the difference in the distribution of attributes in different groups is due to sampling variation or not
- Test the significance of the difference between 2 proportions
- Used when there are more than 2 groups to be compared
Basics In Statistics Viva Voce
- Mean, median and mode are measures of central tendency
- Range, standard deviation, and coefficient of variation are measures of dispersion
- The range is the difference between the values of the smallest and largest items
- A census is a collection of information from all the individuals in a population
- Sampling is the collection of information from representative units in a sample
- Standard deviation is the most important and widely used measure of studying dispersion
- A bar diagram is used to represent qualitative data
- A histogram is used to depict quantitative data
- A frequency polygon is used to represent the frequency distribution of quantitative data
- A pie diagram is used to show percentage breakdowns for qualitative data
- A line diagram is useful to study the changes in values in the variable over time
- A pictogram is a method of conveying the frequency of occurrence of events to the common man
- The chi-square test is a non-parametric test for qualitative data
- For large samples, z test is preferred
- For small samples, a t-test is preferred
- The value of the mean in a standard normal distribution is zero
- Standard deviation is also called root mean square deviation
- The median is also called the 50th percentile
- The standard error of the mean is the standard deviation of the sample means