## Basics In Statistics Definitions

**Sample**

It is part of a population called the universe, reference, or parent population

**Biostatistics**

It is that branch of statistics concerned with mathematical facts & data related to biological events

**Variable**

It is a state, condition, concept, or event whose value is free to vary within the population

## Basics In Statistics Important Notes

**1. Measures of central tendency**

**Arithmetic mean**

- Simplest measure
- Obtained by summing all the observations and dividing by the number of observations
- It is very sensitive to extreme scores.

**Median**

- It is the simplest division of the set of measurements into two halves
- When the distribution has an odd number of elements, the middle value is the median; when it has an even number of elements, the average of the two middle scores is the median
- It is insensitive to extreme scores.

**Mode**

- It is the most frequently occurring value in a set of observations
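
The three measures can be checked on a small data set. The sketch below uses Python's standard `statistics` module; the scores are made-up values chosen so that one extreme score is present.

```python
import statistics

# Hypothetical scores for nine patients (illustrative data only).
scores = [1, 2, 2, 3, 3, 3, 4, 5, 22]

mean = statistics.mean(scores)      # sum of observations / number of observations
median = statistics.median(scores)  # middle value of the sorted data (odd count)
mode = statistics.mode(scores)      # most frequently occurring value

# Dropping the extreme score (22) changes the mean but not the median.
trimmed = scores[:-1]
mean_without_extreme = statistics.mean(trimmed)
median_without_extreme = statistics.median(trimmed)  # even count: average of two middle scores

print(mean, median, mode)
print(mean_without_extreme, median_without_extreme)
```

The mean moves from 5 to 2.875 when the single extreme score is removed, while the median stays at 3.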

**2. Sampling**

- Simple random sampling
- Used when the population is small, homogeneous.

- Systematic sampling
- Used when the population is large, non-homogeneous, and scattered

- Multistage sampling
- Employed in large country surveys
- Carried out in several stages

- Multiphase sampling
- Here sampling is done in different phases

- Cluster sampling
- Involves grouping the population and then surveying

- Stratified sampling
- Used when the population is large, non-homogeneous

**3. Properties of the normal curve**

- Bell-shaped
- Symmetrical
- The height of the curve is maximum at the mean
- Mean = median = mode
- The area under the curve between any two points can be found in terms of the relationship between mean and standard deviation.

Mean ± 1 SD = 68.3% of observations

Mean ± 2 SD = 95.4% of observations

Mean ± 3 SD = 99.7% of observations
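
These areas can be verified numerically. The sketch below assumes a normal distribution and uses the error function from Python's `math` module; the helper name `area_within` is illustrative.

```python
import math

def area_within(k: float) -> float:
    """Fraction of a normal distribution lying within mean ± k SD."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"Mean ± {k} SD covers {area_within(k) * 100:.1f}% of observations")
```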

**4. Classification of data**

- Qualitative data
- It is data with frequency but no magnitude
- Nonparametric tests are used for it

- Quantitative data
- It is data with a magnitude
- Parametric tests are used for it

**5. Chi-square test is used**

- To test the association between the cause and effect
- To find the goodness of fit
- To test the differences between two/more proportions
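
As an illustration of testing an association, the sketch below computes the chi-square statistic for a 2 x 2 table by hand. The counts are hypothetical and the variable names are illustrative.

```python
# Hypothetical 2 x 2 table (illustrative counts only):
#                 caries   no caries
# high sugar        30        20
# low sugar         15        35
observed = [[30, 20], [15, 35]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if the two attributes were unrelated.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi_square, 2), degrees_of_freedom)
```

The statistic is then compared with the tabulated critical value for the given degrees of freedom (3.84 for 1 degree of freedom at the 5% level).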

**6. Tests**

## Basics In Statistics Long Essays

**Question 1. Define sample. What are the ideal requisites of sampling, describe different sampling methods.**

**Answer:**

**Sample:**

It is part of a population called the universe, reference, or parent population

**Ideal Requisites Of A Sample:**

- Efficiency
- Representativeness
- Measurability
- Size-large
- Adequate coverage
- Goal orientation
- Feasibility
- Economic

**Sampling Methods:**

**Probability Sampling:**

- Simple Random Sampling
- Each member of the population has an equal chance of being included in the sample
- The member is determined by chance only
- Methods of random selection are
- Lottery method
- Table of random numbers

- Systematic
- It is obtained by selecting one unit at random & then selecting additional units at evenly spaced intervals till an adequate sample size is obtained
- It can be adopted as long as there is no periodicity of occurrence of any particular event in the population

- Stratified Random
- The population to be sampled is subdivided into strata
- A simple random sample is then chosen from each stratum
- Used for a heterogeneous population
- It ensures more representativeness, provides greater accuracy & can concentrate over a wider area
- It reduces sampling variation
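
The three probability methods can be sketched with Python's `random` module. The population, the strata names, and the 10% sampling fraction are all assumptions made for illustration.

```python
import random

rng = random.Random(42)           # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical patient IDs 1..100

# Simple random sampling: every member has an equal chance of selection.
simple = rng.sample(population, 10)

# Systematic sampling: random start, then every k-th unit.
k = len(population) // 10
start = rng.randrange(k)
systematic = population[start::k]

# Stratified random sampling: subdivide into strata, then draw a
# simple random sample from each stratum (10% of each here).
strata = {
    "children": population[:40],
    "adults": population[40:],
}
stratified = {
    name: rng.sample(members, len(members) // 10)
    for name, members in strata.items()
}

print(simple, systematic, stratified, sep="\n")
```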

**Cluster Sampling:**

- Useful when a population forms natural groups
- First, a sample of the clusters is selected & then all units in clusters are surveyed
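
A minimal sketch of the two steps, assuming schools as the natural groups. The school names and pupil IDs are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical clusters: four schools and their pupils (made-up IDs).
clusters = {
    "school_A": ["A1", "A2", "A3"],
    "school_B": ["B1", "B2"],
    "school_C": ["C1", "C2", "C3", "C4"],
    "school_D": ["D1", "D2"],
}

# Step 1: select a random sample of clusters.
chosen_clusters = rng.sample(sorted(clusters), 2)

# Step 2: survey every unit inside the chosen clusters.
surveyed = [pupil for name in chosen_clusters for pupil in clusters[name]]
print(chosen_clusters, surveyed)
```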

**Advantage:**

- Simple
- Less expensive

**Disadvantage:**

Cannot be generalized

**Non-Probability Sampling:**

**Accidental Sampling:**

- It is a matter of taking what you can get
- It is not randomly obtained

**Advantage:**

It is inexpensive & less time-consuming

**Purposive Sampling:**

- It is a nonrepresentative subset of some larger population
- A sample is achieved by asking a participant to suggest someone else willing for the study

**1. Quota Sampling:**

It involves the selection of proportional samples of subgroups within a target population to ensure generalization

**2. Dimensional Sampling:**

A small sample is selected then each selected case is examined in detail

**3. Mixed Sampling:**

Constitutes a combination of both probability & nonprobability sampling

**Question 2. Define biostatistics. Write in detail the uses of biostatistics in dental public health.**

**Answer:**

**Biostatistics:**

- It is that branch of statistics concerned with mathematical facts & data related to biological events
- It deals with the statistical methodologies involved in biological sciences

**Uses Of Biostatistics:**

- Measure the state of health of the community
- Identify the health problems
- Compare the health status of one country with another & past status with present
- Predict health trends
- Plan & administer dental health services
- Evaluate the achievement of public health program
- Fix priorities in public health program
- Evaluate the efficacy of vaccines, sera, etc
- Measure mortality & morbidity
- Test whether the difference between 2 populations is real or a chance occurrence
- Study correlation between attributes in the same population
- Promote health legislation
- Help the dentist to think quantitatively

**Question 3. Define sampling. Classify sampling. Enumerate any one sampling.**

**Answer:**

**Sampling:**

It is the process or technique of selecting a sample of appropriate characteristics & adequate size

**Probability Sampling:**

**Simple Random Sampling:**

- Each member of the population has an equal chance of being included in the sample
- The member is determined by chance only
- Methods of random selection are
- Lottery method
- Table of random numbers

**Systematic Sampling:**

- It is obtained by selecting one unit at random & then selecting additional units at evenly spaced intervals till an adequate sample size is obtained
- It can be adopted as long as there is no periodicity of occurrence of any particular event in the population

**Stratified Random Sampling:**

- The population to be sampled is subdivided into strata
- A simple random sample is then chosen from each stratum
- Used for a heterogeneous population
- It ensures more representativeness, provides greater accuracy & can concentrate over a wider area
- It reduces sampling variation

**Cluster Sampling:**

- Useful when a population forms natural groups
- First, a sample of the clusters is selected & then all units in the selected clusters are surveyed

**Advantage:**

- Simple
- Less expensive

**Disadvantage:**

Cannot be generalized

**Question 4. Enumerate various measures of dispersion & describe in detail the test of significance.**

**Answer:**

**Measures Of Dispersion:**

- Range
- It is the difference between the smallest & largest results in a set of data

- Mean deviation
- It is the average of the deviation from the arithmetic mean

- Standard deviation
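
The range and mean deviation defined above can be computed directly. The observations are hypothetical values used only for illustration.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

# Range: difference between the largest and smallest results.
data_range = max(values) - min(values)

# Mean deviation: average of the absolute deviations from the arithmetic mean.
mean = statistics.mean(values)
mean_deviation = sum(abs(x - mean) for x in values) / len(values)

print(data_range, mean_deviation)
```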

**Test Of Significance:**

It deals with the techniques used to determine how far the differences between the estimates of different samples are due to sampling variation

**1. Standard Error of Mean (SE):**

Gives the standard deviation of the means of several samples from the same population

\(=\frac{\text{standard deviation}}{\sqrt{n}}\)

**2. Standard Error of Proportion:**

\(=\sqrt{\frac{p q}{n}}\)

p & q = proportion of occurrence of an event in 2 groups

n = sample size
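
A minimal sketch of both standard errors: SD / √n for the mean and √(pq / n) for a proportion. It assumes q = 1 - p (the proportion in which the event does not occur), which the notes describe only loosely; the function names are illustrative.

```python
import math

def standard_error_of_mean(sd: float, n: int) -> float:
    """Standard deviation divided by the square root of the sample size."""
    return sd / math.sqrt(n)

def standard_error_of_proportion(p: float, n: int) -> float:
    """sqrt(p * q / n), assuming q = 1 - p."""
    q = 1 - p
    return math.sqrt(p * q / n)

# Hypothetical inputs: SD of 2.0 and a 20% prevalence, both with n = 100.
se_mean = standard_error_of_mean(2.0, 100)
se_prop = standard_error_of_proportion(0.2, 100)
print(se_mean, se_prop)
```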

**Standard Error Of Difference Between Two Means:**

Indicates whether the samples represent two different universes

**Standard Error Of Difference Between Proportions:**

Indicates whether the difference is significant or has occurred by chance

**Chi-Square Test:**

**Uses:**

- Test whether the difference in the distribution of attributes in different groups is due to sampling variation or not
- Test the significance of the difference between 2 proportions
- Used when there are more than 2 groups to be compared

**Z Test:**

Tests the significance of differences in means for large samples

**‘t’ Test:**

**Synonym:**

Student’s t-test

**Uses:**

- Used when the sample size is small
- Used to test the hypothesis
- Find the significance of the difference between the 2 proportions

**Types:**

**Unpaired ‘t’ test:**

- Applied to unpaired data from observations made on individuals of 2 different samples
- Tests whether the difference between the means is real or not

**Paired ‘t’ test:**

Applied to paired data obtained from one sample only
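
The two types of ‘t’ test can be sketched as follows. The unpaired version assumes equal variances in both groups (pooled variance), which is one common form of Student's t-test; the data and function names are hypothetical.

```python
import math
import statistics

def unpaired_t(a, b):
    """t statistic for two independent samples, assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff

def paired_t(before, after):
    """t statistic for paired observations made on the same sample."""
    diffs = [x - y for x, y in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

# Hypothetical plaque scores.
group_a = [4, 5, 6, 7, 8]
group_b = [1, 2, 3, 4, 5]
before = [5, 6, 7, 8]
after = [3, 5, 4, 6]

t_unpaired = unpaired_t(group_a, group_b)  # degrees of freedom: na + nb - 2
t_paired = paired_t(before, after)         # degrees of freedom: n - 1
print(t_unpaired, t_paired)
```

Each statistic is compared with the tabulated t value for its degrees of freedom to decide whether the difference is real or due to chance.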

**Question 5. Define biostatistics. Describe in detail the normal curve. Write a note on measures of central tendency.**

**(or) Normal distribution/ Properties of normal curve/ Gaussian curve.**

**(or) Mean, Median, Mode.**

**(or) Measures of central tendency.**

**Answer:**

**Biostatistics:**

- It is that branch of statistics concerned with mathematical facts & data related to biological events
- It deals with the statistical methodologies involved in biological sciences

**Normal Curve:**

- It is a pattern followed by very many sets of continuous measurements.
- It is characterized by a symmetric, bell-shaped curve
- In a normal curve
- The area between one standard deviation on either side of the mean will include approximately 68% of the values
- The area between two standard deviations on either side of the mean will include approximately 95% of the values
- The area between three standard deviations on either side of the mean will include approximately 99.7% of the values

**Characteristics:**

- It is smooth, symmetrical & bell-shaped
- The maximum number of observations is at the center & gradually decreases towards the extremities
- For the standard normal curve, the total area is 1, the mean is 0 & the standard deviation is 1
- Mean, median & mode coincide at the center

## Basics In Statistics Short Essays

**Question 1. Presentation of statistical data.
(or) Pie Chart
(or) Histogram
(or) Pictogram
(or) Uses of biostatistics**

**Answer:**

**Tabulation**

- Tables are simple devices used for data presentation
- Prepared manually or mechanically

**Types:**

**1. Simple Table:**

One-way table containing only one characteristic of the data

**2. Master Table:**

Contains all the data obtained from a survey

**3. Frequency Distribution Table:** Two-column table

- 1st column: lists classes of data
- 2nd column: lists the frequency of each class
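
A frequency distribution table can be built in a few lines. The ages and the class width of 10 years are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical ages of ten patients.
ages = [23, 27, 31, 35, 38, 41, 44, 47, 52, 58]
width = 10  # assumed class width in years

# Map each age to the lower limit of its class, then count per class.
counts = Counter((age // width) * width for age in ages)

# 1st column: class of data; 2nd column: frequency of that class.
for lower in sorted(counts):
    print(f"{lower}-{lower + width - 1}\t{counts[lower]}")
```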

**Charts/ Diagrams:**

**1. Bar Charts:**

- It is a diagram of columns/ bars
- The height of the bars determines the value of the particular data
- The width of the bar remains the same
- The bars are separated by spaces
- The bars can be either vertical/ horizontal

**Types:**

- Simple bar chart
- Represents only one variable

- Multiple bar chart
- Consists of a set of bars of the same width corresponding to the different sections without any gap in between

- Component bar chart
- Individual bars are divided into 2 or more parts
- Used to compare the sub-groups

**2. Pie Chart:**

- The entire graph looks like a pie & its components are represented by its slices
- It is divided into different sectors corresponding to the frequencies of the variables
- The segments are then shaded/ colored
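
Each sector's angle is its share of the total frequency multiplied by 360°. A small sketch with hypothetical frequencies:

```python
# Hypothetical frequencies of oral conditions in a survey.
frequencies = {"caries": 50, "periodontal disease": 30, "malocclusion": 20}

total = sum(frequencies.values())

# Sector angle in degrees = frequency * 360 / total frequency.
angles = {name: count * 360 / total for name, count in frequencies.items()}
print(angles)
```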

**3. Histogram:**

- It is a pictorial presentation of data
- Class intervals are presented on the X-axis & frequencies on the Y axis
- No space occurs between the cells

**4. Pictogram:**

They are small pictures used for data presentation

**5. Line Diagram:**

- Used for continuous variable
- Time is represented on the X-axis & value on the Y axis

**6. Statistical Maps:**

- Refer to the geographic area
- Dot/ point is used to represent the area

**Question 2. Types of diagram.**

**Answer:**

**1. Bar Charts:**

- It is a diagram of columns/ bars
- The height of the bars determines the value of the particular data
- The width of the bar remains the same
- The bars are separated by spaces
- The bars can be either vertical/ horizontal

**Types:**

- Simple bar chart
- Represents only one variable

**2. Multiple bar chart:**

Consist of a set of bars of the same width corresponding to the different sections without any gap in between

**3. Component bar chart:**

- Individual bars are divided into 2 or more parts
- Used to compare the sub-groups

**4. Pie Chart:**

- The entire graph looks like a pie & its components are represented by its slices
- It is divided into different sectors corresponding to the frequencies of the variables
- The segments are then shaded/ colored

**5. Histogram:**

- It is a pictorial presentation of data
- Class intervals are presented on the X-axis & frequencies on the Y axis
- No space occurs between the cells

**6. Pictogram:**

They are small pictures used for data presentation

**Question 3. Types of samples/ Probability sampling methods/ Sampling methods.
(or) Cluster sampling**

**Answer:**

**Probability Sampling**

**Simple Random Sampling:**

- Each member of the population has an equal chance of being included in the sample
- The member is determined by chance only
- Methods of random selection are
- Lottery method
- Table of random numbers

**Systematic Sampling:**

- It is obtained by selecting one unit at random & then selecting additional units at evenly spaced intervals till an adequate sample size is obtained
- It can be adopted as long as there is no periodicity of occurrence of any particular event in the population

**Stratified Random Sampling:**

- The population to be sampled is subdivided into strata
- A simple random sample is then chosen from each stratum
- Used for a heterogeneous population
- It ensures more representativeness, provides greater accuracy & can concentrate over a wider area
- It reduces sampling variation

**Cluster Sampling:**

- Useful when a population forms natural groups
- First, a sample of the clusters is selected & then all units in clusters are surveyed

**Advantage:**

- Simple
- Less expensive

**Disadvantage:**

Cannot be generalized

**Non-Probability Sampling:**

**Accidental Sampling:**

- It is a matter of taking what you can get
- It is not randomly obtained

**Advantage:**

It is inexpensive & less time-consuming

**Purposive Sampling:**

- It is a nonrepresentative subset of some larger population
- A sample is achieved by asking a participant to suggest someone else willing for the study

**Quota Sampling:**

It involves the selection of proportional samples of subgroups within a target population to ensure generalization

**Dimensional Sampling:**

A small sample is selected then each selected case is examined in detail

**Mixed Sampling:**

Constitutes a combination of both probability & nonprobability sampling

**Question 4. Simple random sampling.**

**Answer:**

**Simple random sampling**

- Each member of the population has an equal chance of being included in the sample
- The member is determined by chance only
- Methods of random selection are
- Lottery method
- Table of random numbers

**Question 5. Multistage sample.**

**Answer:**

**Multistage sample**

It is a sampling procedure often used when the sampling units can be defined in a hierarchical manner

**Multistage sample Steps:**

- Select the groups/cluster
- Then subsamples are taken in subsequent stages
- 1st stage: choice of states within countries
- 2nd stage: choice of towns within each state
- 3rd stage: choice of neighborhoods in each town
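
The staged selection can be sketched as nested random draws. The hierarchy below (states, towns, neighborhoods) and the number drawn at each stage are hypothetical.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical hierarchy: state -> town -> neighborhoods.
country = {
    "state_1": {"town_a": ["n1", "n2", "n3"], "town_b": ["n4", "n5"]},
    "state_2": {"town_c": ["n6", "n7"], "town_d": ["n8", "n9", "n10"]},
    "state_3": {"town_e": ["n11", "n12"], "town_f": ["n13", "n14"]},
}

selected = []
for state in rng.sample(sorted(country), 2):                  # 1st stage: states
    for town in rng.sample(sorted(country[state]), 1):        # 2nd stage: towns
        for hood in rng.sample(country[state][town], 1):      # 3rd stage: neighborhoods
            selected.append((state, town, hood))

print(selected)
```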

**Question 6. Tests of significance.**

**(or) ‘t’ test.**

**Answer:**

**Tests of significance**

It deals with the techniques used to determine how far the differences between the estimates of different samples are due to sampling variation

**Standard Error Of Mean (SE):**

Gives the standard deviation of the means of several samples from the same population

**Standard Error Of Proportion:**

\(=\sqrt{\frac{p q}{n}}\)

p & q = proportion of occurrence of an event in 2 groups

n = sample size

**Standard Error Of Difference Between Two Means**

Indicates whether the samples represent two different universes

**Standard Error Of Difference Between Proportions**

Indicates whether the difference is significant or has occurred by chance

**Chi-Square Test**

**Uses:**

- Test whether the difference in the distribution of attributes in different groups is due to sampling variation or not
- Test the significance of the difference between 2 proportions
- Used when there are more than 2 groups to be compared

**Z Test:**

Tests the significance of differences in means for large samples

**‘t’ Test:**

**Synonym:**

Student’s t-test

**Uses:**

- Used when the sample size is small
- Used to test the hypothesis
- Find the significance of the difference between the 2 proportions

**Types:**

**Unpaired ‘t’ test:**

- Applied to unpaired data from observations made on individuals of 2 different samples
- Tests whether the difference between the means is real or not

**Paired ‘t’ test:**

- Applied to paired data obtained from one sample only

**Question 7. Statistical analysis.**

**Answer:**

**Statistical analysis**

- It is based on
- Population
- It is the collection of units of observations that are of interest & is the target of the investigation
- It is essential to identify the population clearly & precisely
- The success of the investigation will depend on the identification of the population

- Variable
- It is a state, condition, concept/ event whose value is free to vary within the population

**Classification of Variables:**

- Independent
- Manipulated/ treated in a study

- Dependent
- Result of the independent variable

- Confounding
- Confounds the effect of the independent variable on the dependent

- Background
- Considered for possible inclusion in the study

- Probability distribution
- It is a link between population & its characteristics
- It is a way to enumerate the different values the variable can have & how frequently each value appears in the population
- It is characterized by parameters i.e. quantities

**Question 8. Standard deviation.**

**Answer:**

**Standard deviation**

- It is the square root of the mean of the squared deviations from the arithmetic mean
- It is the most commonly used measure of dispersion

**Synonym**

Root Mean Square Deviation

**Calculation**

- Calculate the mean of the series, \(\bar{X}\)
- Take the deviations from the mean, \(X-\bar{X}\)
- Square these deviations & add them up, \(\sum(X-\bar{X})^{2}\)
- Divide the result by the total number of observations, \(n\)
- Obtain the square root of it (standard deviation)
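
The calculation steps above can be followed one by one in code. The observations are hypothetical. The sketch divides by the total number of observations, as the steps state; many texts divide by n - 1 instead when the data are a sample.

```python
import math

observations = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

mean = sum(observations) / len(observations)      # step 1: mean of the series
deviations = [x - mean for x in observations]     # step 2: deviations from the mean
sum_of_squares = sum(d ** 2 for d in deviations)  # step 3: square & add up
variance = sum_of_squares / len(observations)     # step 4: divide by number of observations
standard_deviation = math.sqrt(variance)          # step 5: square root

print(mean, sum_of_squares, variance, standard_deviation)
```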

**Significance:**

- The greater the standard deviation, the greater the magnitude of dispersion
- The lesser the standard deviation, the higher the degree of uniformity of the observations

**Question 9. Bar diagram/ charts.**

**Answer:**

**Bar diagram**

- It is a diagram of columns/ bars
- The height of the bars determines the value of the particular data
- The width of the bar remains the same
- The bars are separated by spaces
- The bars can be either vertical/ horizontal

**Types:**

**1. Simple bar chart**

Represents only one variable

**2. Multiple bar chart**

Consists of a set of bars of the same width corresponding to the different sections without any gap in between

**3. Component bar chart**

- Individual bars are divided into 2 or more parts
- Used to compare the sub-groups

## Basics In Statistics Short Question And Answers

**Question 1. Primary & secondary data.**

**Answer:**

**Primary Data:**

- Obtained directly from an individual
- It is first-hand information

**Advantages:**

- Precise information
- Reliable

**Disadvantages:**

- Time-consuming
- Expensive

**Methods:**

- Direct personal interviews
- Oral health examination
- Questionnaire

**Secondary Data:**

- Obtained from outside sources
- Used to serve the purpose of the objective of the study
- Example: Hospital records

**Question 2. Frequency polygon.**

**Answer:**

**Frequency polygon**

Pictorial presentation of data

**Method:**

- Obtained from a histogram
- Mark the midpoint at the top of each histogram bar
- Next, connect these points with straight lines
- Example: Age-wise prevalence of dental caries

**Question 3. Stratified random sampling.**

**Answer:**

**Stratified random sampling**

- The population to be sampled is subdivided into strata
- A simple random sample is then chosen from each stratum
- Used for a heterogeneous population
- It ensures more representativeness, provides greater accuracy & can concentrate over a wider area

**Question 4. Mode.**

**Answer:**

**Mode**

It is a value occurring with the greatest frequency

**Mode Advantage:**

- Eliminates extreme variation
- Easily located
- Easy to understand

**Mode Disadvantage:**

- Uncertain location
- Not exactly defined
- Not useful in a small number of cases

**Question 5. Null hypothesis.**

**Answer:**

**Null hypothesis**

- It asserts that there is no real difference between the two groups under consideration & the difference found is accidental & arises out of sampling variation
- It is the first step in the testing of the hypothesis

**Question 6. Variable.**

**Answer:**

**Variable**

It is a state, condition, concept, or event whose value is free to vary within the population

**Classification of Variable:**

- Independent
- Manipulated/ treated in a study

- Dependent:
- Result of an independent variable

- Confounding
- Confound the effect of the independent variable on the dependent

- Background
- Considered for possible inclusion in the study

**Question 7. Qualitative data.**

**Answer:**

**Qualitative data**

When data is collected on the basis of attributes/ qualities like sex, it is called qualitative data

**Question 8. Chi-square test.**

**Answer:**

**Chi-square test Uses:**

- Test whether the difference in the distribution of attributes in different groups is due to sampling variation or not
- Test the significance of the difference between 2 proportion
- Used when there are more than 2 groups to be compared

## Basics In Statistics Viva Voce

- Mean, median and mode are measures of central tendency
- Range, standard deviation, and coefficient of variation are measures of dispersion
- The range is the difference between the values of the smallest and the largest items
- A census is a collection of information from all the individuals in a population
- Sampling is the collection of information from representative units in a sample
- Standard deviation is the most important and widely used measure of studying dispersion
- A bar diagram is used to represent qualitative data
- A histogram is used to depict quantitative data
- A frequency polygon is used to represent the frequency distribution of quantitative data
- A pie diagram is used to show percentage breakdowns for qualitative data
- A line diagram is useful to study the changes in values in the variable over time
- Pictogram is the method to impress the frequency of occurrence of events to the common man
- The chi-square test is a non-parametric test for qualitative data
- For large samples, z test is preferred
- For small samples, a t-test is preferred
- The value of the mean in a standard normal distribution is zero
- Standard deviation is also called root mean square deviation
- The median is also called the 50th percentile
- The standard error of the mean is the standard deviation of the means of several samples from the same population