Getting started with statistics

Most studies include statistical data to show their results and how significant their findings might be. Understanding basic statistics will help you to interpret evidence correctly, make informed decisions, and ultimately contribute to better care for patients.

Descriptive statistics

Descriptive statistics are used to summarise and describe the main features of a dataset. They help you to present and understand data in a meaningful way.

Measures of central tendency tell you the typical or central value in a dataset.

Mean	The sum of all values in a group divided by the number of values. It is used when data is roughly symmetrically distributed, without extreme outliers.
Median	The value in the middle of a group of values when they are arranged in order. It is used when data is skewed.
Mode	The value that appears most often in a dataset. It is used to identify the most common outcome.
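
As a rough illustration of the three measures above, the following Python sketch uses the standard library `statistics` module. The `stay_days` values are made up for the example and do not come from any study.

```python
import statistics

# Hypothetical lengths of hospital stay in days (made-up data).
# The last value is a deliberate outlier.
stay_days = [2, 3, 3, 4, 5, 6, 19]

mean_stay = statistics.mean(stay_days)      # sum of values / number of values
median_stay = statistics.median(stay_days)  # middle value once sorted
mode_stay = statistics.mode(stay_days)      # most frequent value

print(f"mean={mean_stay}, median={median_stay}, mode={mode_stay}")
```

In this made-up sample the single long stay pulls the mean (6) above the median (4), which is why the median is usually preferred for skewed data.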

Measures of variability help you to understand how spread out the data is.

Range	The difference between the highest and lowest values in a dataset. It is calculated as: Maximum value - Minimum value = Range. The range is the simplest way to describe variation in a set of values, but it can be misleading if there are outliers (i.e. extremely high or low values) that don't fit the typical pattern.
Interquartile range (IQR)	The range of the middle 50% of the data, calculated as the difference between the third quartile (Q3) and the first quartile (Q1). It is generally more reliable than the range because it is not affected by extreme values.
Standard deviation (SD)	A measure of how spread out the values in a dataset are. It tells you how much the values typically differ from the mean of the dataset. A small SD indicates little variation, suggesting consistent treatment results. A large SD indicates more variation, suggesting less predictable outcomes.
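
The measures of variability above can be sketched in Python as follows. The `readings` values are invented for illustration. Quartiles can be calculated using several slightly different conventions, so the IQR from other software may differ a little from the one produced by `statistics.quantiles`.

```python
import statistics

# Hypothetical systolic blood pressure readings in mmHg (made-up data).
# The last value is a deliberate outlier.
readings = [110, 115, 118, 120, 122, 125, 130, 180]

# Range: highest value minus lowest value.
value_range = max(readings) - min(readings)

# With n=4, statistics.quantiles returns the three quartile cut points
# (Q1, median, Q3) using its default "exclusive" method.
q1, q2, q3 = statistics.quantiles(readings, n=4)
iqr = q3 - q1

# Sample standard deviation (divides by n - 1).
sd = statistics.stdev(readings)

print(f"range={value_range}, IQR={iqr}, SD={sd:.1f}")
```

Here the one reading of 180 stretches the range to 70, while the IQR stays much smaller because it only looks at the middle half of the data.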

Inferential statistics

Inferential statistics are used to make predictions, or inferences, about a larger population based on a sample of data from that population. They allow researchers to draw conclusions and make decisions about a population without having to examine every single member of that population.

P-value	The probability of obtaining results at least as extreme as those observed if there were no real effect (the null hypothesis). It helps you to judge whether study results are likely to reflect a real effect or could have occurred by chance. A smaller p-value provides stronger evidence against the null hypothesis. By convention: p ≤ 0.05 - statistically significant. p ≤ 0.01 - highly significant. p ≤ 0.001 - very highly significant.
Confidence interval (CI)	A range of values that is likely to contain the true value of the quantity being estimated. For a 95% CI, if the study were repeated many times, about 95% of the intervals calculated would contain the true value. It reflects the precision and uncertainty of the estimate. Narrow CI - more precise estimate. Wide CI - less precise, greater uncertainty.
Odds ratio (OR)	A statistical measure that compares the odds of an outcome occurring in one group to the odds of it occurring in another. It is most commonly used in case-control studies. Increased odds (OR > 1): exposure is associated with higher odds of an outcome, often something harmful. Neutral (OR = 1): no difference between groups. Protective (OR < 1): exposure is associated with lower odds of a negative outcome. An OR shows an association and does not by itself prove that the exposure causes the outcome.
T-test	Evaluates whether there's a significant difference between the means of two groups for a specific variable.
Analysis of Variance (ANOVA)	Evaluates whether there are significant differences between the means of three or more groups for a particular variable.
Correlation coefficient (r)	Evaluates how strongly two variables are related, indicating whether they tend to change together. Values range from -1 to +1, with values near 0 indicating little or no linear relationship. Positive correlation: both variables move in the same direction (both increase or both decrease). Negative correlation: variables move in opposite directions (one increases while the other decreases).
Risk Ratio (RR) / Relative Risk	Compares the risk of an outcome in an exposed group with the risk in an unexposed group. Commonly used in randomised controlled trials (RCTs) and cohort studies. Increased risk (RR > 1): exposure is associated with a higher risk of the outcome. Neutral (RR = 1): no difference in risk between groups. Protective (RR < 1): exposure is associated with a lower risk of the outcome.
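
To show how the odds ratio and risk ratio relate to the same data, here is a minimal Python sketch based on a 2x2 table. The counts are hypothetical. The confidence interval uses a common log-based approximation, which is an assumption of this example and only one of several available methods.

```python
import math

# Hypothetical 2x2 table (made-up counts for illustration):
#               outcome   no outcome
# exposed        a = 30     b = 70
# unexposed      c = 15     d = 85
a, b, c, d = 30, 70, 15, 85

# Risk ratio: risk in the exposed group / risk in the unexposed group.
risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
risk_ratio = risk_exposed / risk_unexposed

# Odds ratio: odds in the exposed group / odds in the unexposed group.
odds_ratio = (a / b) / (c / d)

# Approximate 95% CI for the odds ratio, calculated on the log scale.
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"RR = {risk_ratio:.2f}")
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

With these invented counts both measures are above 1, and the interval for the odds ratio does not include 1, which is consistent with an association between exposure and outcome. Whether that association is causal depends on the study design, not on the calculation.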

Biostatistical terms

Biostatistical measures related to therapies

These measures are commonly used in clinical research and evidence-based practice to evaluate the effectiveness of treatments or interventions. They help to interpret therapeutic outcomes by comparing the impact of a treatment to a control group.

Relative Risk Reduction (RRR)	The proportional decrease in risk in the treatment group relative to the control group, usually expressed as a percentage. It shows how much the treatment reduces the risk compared to no treatment. It is calculated as: ARR / Risk in control group = RRR
Absolute Risk Reduction (ARR)	The direct difference in risk between the treatment and control groups. It shows the actual reduction in outcome occurrence due to treatment.
Number Needed to Treat (NNT)	The number of patients who must receive a treatment instead of the alternative for one additional person to benefit. It is calculated as: 1 / ARR = NNT, with ARR expressed as a proportion rather than a percentage.
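
The three therapy measures above can be worked through with a short Python sketch. The trial counts are made up for the example. Rounding the NNT up to the next whole patient is a common convention, assumed here.

```python
import math
from fractions import Fraction

# Hypothetical trial counts (made-up numbers, not from any real study).
control_events, control_total = 40, 200
treatment_events, treatment_total = 30, 200

# Fractions keep the arithmetic exact, so rounding the NNT is not
# thrown off by floating-point error.
control_risk = Fraction(control_events, control_total)
treatment_risk = Fraction(treatment_events, treatment_total)

arr = control_risk - treatment_risk  # absolute risk reduction
rrr = arr / control_risk             # relative risk reduction
nnt = math.ceil(1 / arr)             # rounded up to a whole patient

print(f"ARR = {float(arr):.1%}")
print(f"RRR = {float(rrr):.1%}")
print(f"NNT = {nnt}")
```

In this example the risk falls from 20% to 15%, so the ARR is 5 percentage points, the RRR is 25%, and 20 patients would need to be treated for one additional patient to benefit. The same RRR can correspond to very different ARR and NNT values depending on the baseline risk.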

Biostatistical terms related to diagnostic testing

These terms are used to evaluate the accuracy and usefulness of medical tests in distinguishing between individuals who have a disease and individuals who do not. They help healthcare professionals to interpret test results and decide on future care.

Sensitivity (Sn)	The percentage of patients with the disease who receive a positive test result (true positives). It indicates how well a test detects the disease in those who actually have it.
Specificity (Sp)	The percentage of patients without the disease who receive a negative test result (true negatives). It indicates how well a test correctly identifies disease-free individuals.
Positive Predictive Value (PPV)	The percentage of patients with a positive test result who actually have the disease (true positives). It is affected by disease prevalence in the population.
Negative Predictive Value (NPV)	The percentage of patients with a negative test result who do not have the disease (true negatives). It is also influenced by disease prevalence in the population.
Pre-test Probability	The estimated likelihood that a patient has a disease before a diagnostic test is performed.
Post-test Probability	The likelihood of disease in a patient after receiving a test result, incorporating both test performance and pre-test probability.
Likelihood Ratio (LR)	Indicates how much a test result changes the probability of disease. LR > 1: increases disease likelihood. LR = 1: the result does not change the probability of disease. LR < 1: decreases disease likelihood. Tests with LR > 5 or LR < 0.2 are considered most useful.
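
The diagnostic measures above can all be derived from the four cells of a 2x2 table of test results against true disease status. The Python sketch below uses made-up counts, and the helper `post_test_probability` is a hypothetical function written for this example. It converts the pre-test probability to odds, multiplies by the likelihood ratio, and converts back to a probability.

```python
# Hypothetical test results (made-up counts for illustration).
tp, fn = 90, 10    # patients with the disease: test positive / negative
fp, tn = 45, 855   # patients without the disease: test positive / negative

sensitivity = tp / (tp + fn)  # positives among those with the disease
specificity = tn / (tn + fp)  # negatives among those without the disease
ppv = tp / (tp + fp)          # diseased among those testing positive
npv = tn / (tn + fn)          # disease-free among those testing negative

# Likelihood ratios for a positive and a negative test result.
lr_positive = sensitivity / (1 - specificity)
lr_negative = (1 - sensitivity) / specificity


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability using a likelihood ratio."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Pre-test probability set to the prevalence in this sample (100 of 1000).
pre_test = (tp + fn) / (tp + fn + fp + tn)
post_test = post_test_probability(pre_test, lr_positive)

print(f"Sn = {sensitivity:.2f}, Sp = {specificity:.2f}")
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
print(f"LR+ = {lr_positive:.1f}, LR- = {lr_negative:.2f}")
print(f"Post-test probability after a positive result = {post_test:.2f}")
```

When the pre-test probability equals the prevalence in the sample, the post-test probability after a positive result matches the PPV. With a lower pre-test probability the same test would give a lower post-test probability, which is how prevalence affects the predictive values.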
