A
Adjusted R²
Adjusted R² is a modified version of the coefficient of determination that accounts for the number of predictors included in a regression model. Unlike R², which never decreases when additional predictors are added, adjusted R² increases only when new predictors meaningfully improve the model’s explanatory power.
Why it matters: Adjusted R² helps researchers evaluate whether additional predictors improve a model or simply increase complexity without improving explanatory value.
See also: R²; Regression; Model Fit
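As a minimal sketch of the adjustment (in Python, with made-up numbers), the standard formula uses the sample size n and the number of predictors p:

def adjusted_r_squared(r_squared, n, p):
    # Adjusted R² = 1 − (1 − R²)(n − 1) / (n − p − 1)
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

# Hypothetical example: R² = .40 from 50 cases and 5 predictors.
print(round(adjusted_r_squared(0.40, n=50, p=5), 3))  # 0.332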
AIC (Akaike Information Criterion)
AIC is a statistic used to compare competing statistical models. It balances two considerations: how well the model fits the data and how complex the model is. Models that include more predictors often fit the data better but may overfit the sample. AIC penalizes excessive complexity, allowing researchers to identify the model that provides the best balance between accuracy and simplicity.
Why it matters: AIC helps researchers determine which predictive model is most appropriate when multiple models are possible.
See also: Model Fit; Logistic Regression; Predictive Model
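As a rough sketch (the log-likelihood values below are made up), AIC trades model fit against the number of estimated parameters k:

def aic(log_likelihood, k):
    # AIC = 2k − 2 × log-likelihood; smaller values indicate a better balance.
    return 2 * k - 2 * log_likelihood

# Hypothetical comparison: the second model fits slightly better but uses two more parameters.
print(aic(log_likelihood=-120.0, k=3))  # 246.0
print(aic(log_likelihood=-118.5, k=5))  # 247.0 -> the simpler model has the lower (better) AIC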
Alpha (α)
Alpha is the threshold used to determine statistical significance in hypothesis testing. It represents the probability of making a Type I error, meaning the chance of incorrectly rejecting a true null hypothesis. Researchers commonly set alpha at .05, indicating they are willing to accept a 5% chance of concluding that a relationship or difference exists when it actually does not.
Why it matters: Alpha establishes the standard for determining whether results are statistically significant.
See also: p-value; Type I Error; Hypothesis Testing
Alternative Hypothesis (H₁)
The alternative hypothesis is the statement that a relationship, difference, or effect exists between variables. It represents the research expectation that the observed data differ from what would be expected under the null hypothesis. In statistical testing, evidence against the null hypothesis provides support for the alternative hypothesis.
Why it matters: The alternative hypothesis represents the researcher’s proposed explanation or expected pattern in the data and guides the interpretation of statistical test results.
See also: Null Hypothesis; Hypothesis Testing; p-value; Statistical Significance
ANCOVA (Analysis of Covariance)
ANCOVA is a statistical method used to compare group means while controlling for the influence of another variable, known as a covariate. The covariate is included in the analysis to account for variability that might otherwise distort comparisons between groups. By adjusting for this additional variable, ANCOVA allows for a more precise comparison of the groups being studied.
Why it matters: ANCOVA allows researchers to compare groups while accounting for factors that may influence the outcome variable.
See also: ANOVA; Covariate; Regression
ANOVA (Analysis of Variance)
ANOVA is a statistical test used to determine whether the means of three or more groups differ significantly. Rather than comparing each pair of groups individually, ANOVA evaluates all groups simultaneously to determine whether at least one group mean differs from the others. If the overall test is significant, additional analyses called post hoc tests are conducted to determine which groups differ.
Why it matters: ANOVA allows researchers to test differences among multiple groups without increasing the risk of statistical error.
See also: One-Way ANOVA; Factorial ANOVA; Post Hoc Test
Area Under the Curve (AUC)
Area Under the Curve (AUC) is a statistic used to evaluate the performance of a classification model, such as binomial logistic regression. It represents the area under the Receiver Operating Characteristic (ROC) curve, which plots the model’s true positive rate against its false positive rate across different classification thresholds.
AUC values range from 0 to 1. A value of 0.50 indicates that the model performs no better than random guessing, while values closer to 1.00 indicate stronger ability to correctly distinguish between categories of the outcome variable.
Why it matters: AUC provides a summary measure of how well a model can distinguish between the outcome categories across all possible classification thresholds.
See also: ROC Curve; Logistic Regression; Classification; Cutoff Value; Predictive Model
Association
Association refers to a statistical relationship between two variables. When variables are associated, changes in one variable tend to occur alongside changes in another. However, association does not necessarily imply that one variable causes the other.
Why it matters: Understanding association helps researchers identify patterns and relationships within data.
See also: Correlation; Causation; Regression; Chi-Square Test of Association
Assumption
An assumption is a condition that must reasonably hold true for a statistical test to produce valid results. Different statistical tests rely on different assumptions, such as normality, independence of observations, or equal variance among groups.
Why it matters: If assumptions are violated, statistical results may be misleading or inaccurate.
See also: Assumption Check; Normal Distribution; Homogeneity of Variance
Assumption Check
An assumption check is the process of evaluating whether the conditions required for a statistical test are satisfied. Researchers often examine plots or run diagnostic tests before interpreting statistical results.
Why it matters: Checking assumptions ensures that statistical analyses are appropriate for the data being analyzed.
See also: Assumption; Residual Plot; Shapiro–Wilk Test
Autocorrelation
Autocorrelation refers to a pattern in which residuals or observations are correlated with one another across an ordered sequence, such as time. In regression, autocorrelation most often means that the errors are not independent, which violates an important model assumption. Positive autocorrelation occurs when nearby values tend to be similar, whereas negative autocorrelation occurs when nearby values tend to differ.
Why it matters: Autocorrelation can distort standard errors and significance tests, leading researchers to draw inaccurate conclusions from a regression model.
See also: Durbin–Watson Statistic; Residual; Independence of Errors; Regression
B
Bar Chart
A bar chart is a graphical display used to represent categorical data. Each bar represents a category, and the height of the bar corresponds to the value associated with that category, such as a count or mean.
Why it matters: Bar charts allow researchers to quickly compare values across categories.
See also: Histogram; Categorical Variable
Beta (β)
Beta is the probability of committing a Type II error in hypothesis testing. A Type II error occurs when a study fails to detect a real effect or relationship that actually exists.
Why it matters: Beta is inversely related to statistical power (1 − β). Lower beta values indicate a greater likelihood of detecting real effects.
See also: Type II Error; Statistical Power
Binomial Test
The binomial test is a statistical test used when the outcome variable has only two possible categories, such as success/failure or yes/no. The test determines whether the observed proportion differs significantly from an expected proportion.
Why it matters: The binomial test helps researchers evaluate whether observed outcomes differ from theoretical or expected values.
See also: Goodness-of-Fit Test; Probability
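A hedged example of how such a test might be run in Python with SciPy (the counts are hypothetical, and binomtest requires SciPy 1.7 or later):

from scipy import stats

# Hypothetical data: 62 "yes" responses out of 100, tested against an expected proportion of .50.
result = stats.binomtest(k=62, n=100, p=0.5)
print(round(result.pvalue, 4))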
Bonferroni Correction
The Bonferroni correction is a method used to adjust the significance level when multiple statistical tests are performed. The adjustment reduces the risk of incorrectly identifying significant results by dividing the original alpha level by the number of comparisons being made.
Why it matters: The Bonferroni correction helps control the increased risk of Type I errors when conducting multiple comparisons.
See also: Post Hoc Test; Alpha; Familywise Error Rate
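The arithmetic is simple; a sketch with hypothetical numbers:

# Hypothetical adjustment: alpha = .05 divided across 4 pairwise comparisons.
alpha = 0.05
n_comparisons = 4
adjusted_alpha = alpha / n_comparisons
print(adjusted_alpha)  # 0.0125 -> each test must reach p < .0125 to be declared significant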
Boxplot
A boxplot is a graphical representation of the distribution of a variable. It displays the median, quartiles, and potential outliers, providing a visual summary of the spread and central tendency of the data.
Why it matters: Boxplots allow researchers to quickly evaluate variability and identify unusual values in a dataset.
See also: Quartile; Outlier; Distribution
C
Case
A case refers to a single entity or unit included in a dataset or research study. A case may represent an individual person, organization, classroom, school, or other unit being analyzed.
In many datasets, the terms case and observation are used interchangeably, with each case representing one row of data containing values for multiple variables.
Why it matters: Clearly identifying what constitutes a case helps ensure that statistical analyses and conclusions are interpreted at the appropriate level.
See also: Observation; Unit of Analysis; Variable
Categorical Variable
A categorical variable, also known as a nominal variable, represents groups or categories rather than numerical quantities. Examples include gender, school level, or job role.
Why it matters: Categorical variables require different statistical tests than numerical variables.
See also: Nominal Variable; Chi-Square Test of Association
Causation
Causation refers to a relationship in which changes in one variable directly produce changes in another variable. Demonstrating causation typically requires experimental designs that control for alternative explanations.
Why it matters: Distinguishing causation from association helps prevent incorrect conclusions about relationships between variables.
See also: Association; Experimental Design; Internal Validity
Central Limit Theorem (CLT)
The Central Limit Theorem states that when many samples are drawn from a population, the distribution of sample means approaches a normal distribution as the sample size increases. This occurs even if the original population distribution is not normal.
Why it matters: The CLT explains why many statistical methods work reliably with large samples.
See also: Sampling Distribution; Normal Distribution; Standard Error
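A small simulation sketch (using NumPy, with arbitrary settings) illustrates the idea: sample means drawn from a skewed population still pile up in a roughly normal, bell-shaped pattern around the population mean:

import numpy as np

rng = np.random.default_rng(seed=1)

# Draw 5,000 samples of size 50 from a clearly non-normal (exponential) population.
sample_means = [rng.exponential(scale=2.0, size=50).mean() for _ in range(5000)]

# The 5,000 sample means are approximately normally distributed,
# centered near the population mean of 2.0.
print(round(float(np.mean(sample_means)), 2))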
Central Tendency
Central tendency refers to statistical measures that describe the typical or central value of a dataset. The most common measures of central tendency are the mean, median, and mode. These statistics summarize where the center of a distribution lies.
Why it matters: Measures of central tendency provide a simple way to summarize the overall pattern of values in a dataset and are often the first step in understanding data.
See also: Mean; Median; Mode; Descriptive Statistics
Chi-Square (χ²)
Chi-square (χ²) is a statistical test statistic used to evaluate whether observed categorical data differ from what would be expected under a null hypothesis. The chi-square value represents the overall discrepancy between observed frequencies and expected frequencies across categories.
Why it matters: The chi-square statistic forms the basis of several statistical tests used with categorical data, including the chi-square test of independence and the chi-square goodness-of-fit test.
See also: Chi-Square Test of Independence; Goodness-of-Fit Test; Expected Frequency; Observed Frequency
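The statistic itself is a simple sum; a sketch with hypothetical observed and expected counts:

# Hypothetical counts for a four-category variable.
observed = [30, 20, 25, 25]
expected = [25, 25, 25, 25]

# χ² = Σ (observed − expected)² / expected
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi_square)  # 2.0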
Chi-Square Test of Association
The chi-square test of association, also known as the chi-square test of independence, evaluates whether two categorical variables are related. It compares observed frequencies with expected frequencies to determine whether differences are greater than would be expected by chance.
Why it matters: This test allows researchers to examine associations between nominal variables.
See also: Goodness-of-Fit Test; Nominal Variable
Classification Accuracy
Classification accuracy refers to the proportion of observations that a predictive model correctly classifies. In logistic regression and other classification models, accuracy is calculated by comparing predicted outcomes with observed outcomes.
Why it matters: Classification accuracy provides a simple summary of how well a model predicts categorical outcomes.
See also: Sensitivity; Specificity; Logistic Regression
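A minimal sketch of the calculation from hypothetical confusion-matrix counts:

# Hypothetical counts from comparing predicted and observed outcomes.
true_positives, false_negatives = 40, 10
true_negatives, false_positives = 35, 15

total = true_positives + false_negatives + true_negatives + false_positives
accuracy = (true_positives + true_negatives) / total
print(accuracy)  # 0.75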
Cluster Sampling
Cluster sampling is a sampling method in which naturally occurring groups, or clusters, are randomly selected from a population, and all individuals within the selected clusters are included in the sample or further sampled.
Why it matters: Cluster sampling can make data collection more practical and cost-effective when populations are geographically dispersed or organized into natural groups.
See also: Random Sampling; Stratified Sampling; Population
Cochran’s Q Test
Cochran’s Q test is a nonparametric statistical test used to compare three or more related proportions when the outcome variable is dichotomous, such as yes/no or success/failure. It is an extension of McNemar’s test for situations involving more than two related conditions or time points.
Why it matters: Cochran’s Q test allows researchers to determine whether the proportion of cases classified into one category differs across three or more related conditions.
See also: McNemar’s Test; Binomial Test; Repeated Measures Design; Nonparametric Test
Coefficient
A coefficient is a numerical estimate in a regression model that represents the relationship between a predictor variable and the outcome variable. The coefficient indicates the expected change in the outcome variable associated with a one-unit change in the predictor.
Why it matters: Coefficients allow researchers to interpret how strongly predictors are associated with an outcome variable.
See also: Regression; Intercept; Predictor Variable
Cohen’s d
Cohen’s d is an effect size statistic used to estimate the magnitude of the difference between two group means. It expresses the mean difference in units of standard deviation, allowing researchers to evaluate how large a difference is relative to the variability in the data.
Cohen’s d is commonly reported alongside t-tests and pairwise comparisons to indicate the practical importance of group differences.
Why it matters: Cohen’s d helps researchers determine whether a statistically significant difference is also practically meaningful.
See also: Effect Size; t-Test; Mean Difference; Standard Deviation
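A minimal sketch of the calculation for two independent groups, using the pooled standard deviation (all values are hypothetical):

import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    # Pooled standard deviation weights each group's variance by its degrees of freedom.
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

# Hypothetical groups: a difference of half a standard deviation (d = 0.5).
print(cohens_d(mean1=80, mean2=75, sd1=10, sd2=10, n1=30, n2=30))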
Comparative Design
A comparative design is a research design used to examine differences between two or more groups on an outcome variable. The groups already exist and are compared based on characteristics such as school level, role, gender, or program participation. Comparative designs focus on identifying whether meaningful differences are present, rather than establishing causation.
Why it matters: Comparative designs help researchers determine whether outcomes vary across groups, which is especially useful in applied research settings where random assignment is not possible.
See also: Group Comparison; Independent Variable; Outcome Variable; Non-Experimental Design; ANOVA; t-Test
Confidence Interval
A confidence interval is a range of values that likely contains the true population parameter. Confidence intervals are typically reported with a confidence level, such as 95%.
Why it matters: Confidence intervals provide information about the precision and uncertainty of statistical estimates.
See also: Parameter; Standard Error; Inference
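A sketch of a 95% confidence interval for a mean, assuming a small hypothetical sample and using the t distribution from SciPy:

import numpy as np
from scipy import stats

# Hypothetical sample of 25 scores.
scores = np.array([72, 75, 78, 80, 81, 69, 74, 77, 83, 79,
                   76, 70, 82, 73, 75, 78, 84, 71, 77, 80,
                   79, 74, 76, 81, 72])

mean = scores.mean()
se = scores.std(ddof=1) / np.sqrt(len(scores))    # standard error of the mean
t_crit = stats.t.ppf(0.975, df=len(scores) - 1)   # critical t for a 95% interval
print(round(mean - t_crit * se, 2), round(mean + t_crit * se, 2))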
Construct Validity
Construct validity refers to the extent to which a measurement instrument accurately represents the theoretical concept or construct it is intended to measure. Establishing construct validity involves demonstrating that the instrument behaves as expected based on theory and prior research.
Why it matters: Strong construct validity increases confidence that a measurement truly reflects the concept being studied, rather than capturing unrelated factors.
See also: Measurement Validity; Reliability; Scale
Continuous Variable
A continuous variable is a quantitative variable that can take any value within a range, including fractional or decimal values. These variables are typically measured rather than counted.
Continuous variables include interval and ratio variables, and can theoretically take an infinite number of values within their measurement range.
Why it matters: Continuous variables are commonly used in statistical analyses such as correlation, regression, and many parametric tests.
See also: Interval Variable; Ratio Variable; Quantitative Variable; Distribution
Correlation
Correlation is a statistical measure that describes the strength and direction of a relationship between two variables. Correlation coefficients range from −1 to +1.
Why it matters: Correlation helps researchers understand how variables move together.
See also: Scatterplot; Association; Regression
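A minimal sketch of computing a Pearson correlation coefficient for two hypothetical variables with NumPy:

import numpy as np

# Hypothetical paired measurements.
hours_studied = [2, 4, 5, 7, 9]
exam_score = [60, 65, 70, 78, 85]

r = np.corrcoef(hours_studied, exam_score)[0, 1]
print(round(r, 3))  # close to +1, a strong positive correlation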
Covariate
A covariate is a variable included in an analysis to account for its influence on the outcome variable. By controlling for covariates, researchers can obtain clearer estimates of the relationships of interest.
Why it matters: Covariates help reduce confounding influences in statistical analyses.
See also: ANCOVA; Regression; Control Variable
Cramér’s V
Cramér’s V is an effect size statistic used with chi-square tests to measure the strength of association between two categorical variables. The statistic ranges from 0 to 1, where larger values indicate stronger associations between categories.
Cramér’s V is often reported following a Chi-Square Test of Association to describe the magnitude of the relationship between variables.
Why it matters: Cramér’s V helps researchers evaluate how strongly two categorical variables are associated beyond simply determining whether the relationship is statistically significant.
See also: Chi-Square Test of Association; Effect Size; Association
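A minimal sketch of the calculation, where n is the total sample size and the dimensions of the contingency table determine the denominator (all values are hypothetical):

import math

def cramers_v(chi_square, n, n_rows, n_cols):
    # V = sqrt( χ² / (n × (min(rows, cols) − 1)) )
    return math.sqrt(chi_square / (n * (min(n_rows, n_cols) - 1)))

# Hypothetical 2 x 3 table with χ² = 9.2 from 200 cases.
print(round(cramers_v(chi_square=9.2, n=200, n_rows=2, n_cols=3), 3))  # ~0.214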
Cronbach’s Alpha
Cronbach’s alpha is a statistic used to measure the internal consistency reliability of a scale. It evaluates whether multiple items designed to measure the same construct produce consistent results.
Why it matters: Cronbach’s alpha helps researchers determine whether survey items reliably measure a concept.
See also: Internal Consistency Reliability; Reliability
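A sketch of the standard item-variance formula, applied to a small hypothetical set of responses (one row per respondent, one column per item):

import numpy as np

def cronbachs_alpha(items):
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                              # number of items
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the total score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses from 5 people to a 3-item scale.
responses = [[4, 5, 4], [3, 3, 3], [5, 5, 4], [2, 2, 3], [4, 4, 5]]
print(round(cronbachs_alpha(responses), 2))  # ~0.90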
Cross-Sectional Study
A cross-sectional study collects data from participants at a single point in time. These studies are often used to describe patterns or relationships within a population.
Why it matters: Cross-sectional studies provide a snapshot of a population at a particular moment.
See also: Retrospective Study; Descriptive Design
D
Data Visualization
Data visualization refers to the graphical display of data to help researchers and readers understand patterns, trends, and relationships within a dataset. Common forms of data visualization include histograms, boxplots, scatterplots, bar charts, and line graphs, each designed to highlight different features of the data.
Visualizations allow complex numerical information to be communicated in a more intuitive and accessible way. By presenting data visually, researchers can more easily identify patterns such as clustering, trends, outliers, or differences between groups.
Why it matters: Data visualization helps researchers explore and communicate data effectively, making it easier to interpret statistical results and identify meaningful patterns in a dataset.
See also: Histogram; Boxplot; Scatterplot; Distribution; Descriptive Statistics
Degrees of Freedom (df)
Degrees of freedom represent the number of independent pieces of information available to estimate a statistical parameter. In many statistical tests, degrees of freedom are determined by sample size and the number of parameters estimated.
Why it matters: Degrees of freedom influence the shape of statistical distributions and are used when calculating test statistics and p-values.
See also: t Statistic; F Statistic; Hypothesis Testing
Density Plot
A density plot is a graphical representation of the distribution of a continuous variable. Instead of displaying counts within bins like a histogram, a density plot uses a smooth curve to show how values are distributed across the range of the data. The height of the curve represents the relative concentration of observations at different values.
Density plots help visualize features of a distribution such as skewness, peaks, and spread.
Why it matters: Density plots provide a smooth visual summary of how values are distributed and can make patterns in the data easier to interpret than histograms alone.
See also: Histogram; Distribution; Skewness
Dependent Variable
A dependent variable is the variable that is measured as the outcome of interest in a study. It represents the result or response that researchers seek to explain or predict. In many statistical analyses, the dependent variable is influenced by one or more independent or predictor variables.
Why it matters: Clearly identifying the dependent variable helps researchers select appropriate statistical analyses and interpret the relationships between variables.
See also: Independent Variable; Outcome Variable; Predictor Variable; Regression
Descriptive Design
A descriptive research design focuses on describing characteristics, patterns, or conditions within a population. Rather than testing relationships or predicting outcomes, descriptive studies summarize what is happening in the data using statistics such as means, frequencies, or percentages.
Why it matters: Descriptive designs help researchers understand the current state of a phenomenon before exploring relationships or causes.
See also: Descriptive Statistics; Cross-Sectional Study; Explanatory Design
Descriptive Statistics
Descriptive statistics summarize and organize data to describe patterns within a dataset. Common descriptive statistics include measures of central tendency (mean, median, mode) and measures of variability (range, variance, standard deviation).
Why it matters: Descriptive statistics provide the foundation for understanding data before conducting inferential analyses.
See also: Mean; Standard Deviation; Distribution
Deviance
Deviance is a statistic in logistic regression that measures how well a model fits the observed data. Lower deviance values indicate better model fit. Deviance is often used when comparing models.
Why it matters: Deviance helps researchers evaluate whether a logistic regression model adequately explains the data.
See also: Logistic Regression; Model Fit; AIC
Dispersion
Dispersion refers to the degree to which values in a dataset are spread out or vary from one another. Common measures of dispersion include the range, variance, standard deviation, and interquartile range.
Why it matters: Measures of dispersion help researchers understand the variability in data and determine whether values are tightly clustered or widely distributed.
See also: Range; Variance; Standard Deviation; Interquartile Range
Distribution
A distribution describes how values are spread across a dataset. Graphical representations such as histograms and boxplots are often used to visualize distributions.
Why it matters: Understanding the distribution of data helps determine which statistical methods are appropriate.
See also: Histogram; Skewness; Kurtosis
Dummy Coding
Dummy coding is a method for converting categorical variables into numerical form so they can be included in regression analyses. Categories are represented using binary indicators, typically coded as 0 or 1.
Why it matters: Dummy coding allows categorical predictors to be incorporated into regression models.
See also: Categorical Variable; Regression; Reference Group
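A sketch using pandas (the variable name and categories are hypothetical); drop_first leaves one category out of the indicators to serve as the reference group:

import pandas as pd

# Hypothetical nominal variable with three categories.
data = pd.DataFrame({"school_level": ["elementary", "middle", "high", "middle"]})

# One binary indicator per non-reference category ("elementary" is the reference here).
dummies = pd.get_dummies(data["school_level"], prefix="level", drop_first=True)
print(dummies)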
Dunn’s Post Hoc Test
Dunn’s test is a follow-up procedure used after a significant Kruskal–Wallis test to identify which groups differ from one another.
Why it matters: Post hoc tests allow researchers to determine where differences occur after a significant overall test.
See also: Kruskal–Wallis Test; Post Hoc Test
Durbin–Watson Statistic
The Durbin–Watson statistic is used in regression analysis to detect autocorrelation in the residuals. Values close to 2 suggest that residuals are independent, while values substantially above or below 2 may indicate autocorrelation.
Why it matters: Autocorrelation violates regression assumptions and can lead to biased statistical conclusions.
See also: Regression; Residual; Independence of Errors
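A minimal sketch of the statistic itself, applied to a short hypothetical series of residuals ordered in time:

import numpy as np

def durbin_watson(residuals):
    # Sum of squared successive differences divided by the sum of squared residuals;
    # values near 2 suggest independent errors.
    residuals = np.asarray(residuals, dtype=float)
    diff = np.diff(residuals)
    return (diff ** 2).sum() / (residuals ** 2).sum()

# Hypothetical residuals from a time-ordered regression.
print(round(durbin_watson([0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2]), 2))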
E
Effect Size
Effect size is a measure of the magnitude of a relationship or difference between variables. Unlike statistical significance, which indicates whether an effect exists, effect size indicates how large that effect is.
Why it matters: Effect size helps researchers evaluate the practical importance of statistical findings.
See also: Cohen’s d; Eta Squared; Practical Importance
Estimated Marginal Mean
An estimated marginal mean is a predicted group mean calculated from a statistical model after accounting for other variables in the analysis. Unlike a simple group mean calculated directly from the data, estimated marginal means are adjusted based on the model and represent the expected value of the outcome variable for each group while holding other variables constant.
Why it matters: Estimated marginal means allow researchers to compare group outcomes after adjusting for other variables in the model, making comparisons more accurate when additional predictors or covariates are included.
See also: ANOVA; ANCOVA; Covariate; Model Fit
Eta Squared (η²)
Eta squared is an effect size statistic used in ANOVA-type analyses that estimates the proportion of total variance in the outcome variable explained by a factor. Because it uses the total variance in the dataset, eta squared provides a straightforward estimate of how much of the overall variability in the outcome is associated with group differences.
Eta squared is commonly reported in standard ANOVA and factorial ANOVA, where the goal is to understand how much of the total variation in the outcome is explained by the factors being studied.
Why it matters: Eta squared helps researchers evaluate the practical importance of group differences by showing how much of the overall variability in the outcome is associated with the factor.
See also: Effect Size; ANOVA; Partial Eta Squared; Variance
Expected Frequency
Expected frequency represents the number of observations that would be expected in each category if the null hypothesis were true. Expected frequencies are calculated during chi-square analyses to compare observed patterns with theoretical expectations.
Why it matters: Comparing observed and expected frequencies allows researchers to determine whether categorical variables are related.
See also: Observed Frequency; Chi-Square Test
Experimental Design
An experimental design is a research design in which researchers manipulate variables and randomly assign participants to groups. This design allows researchers to test causal relationships.
Why it matters: Experimental designs provide the strongest evidence for causal conclusions.
See also: Quasi-Experimental Design; Internal Validity
Explanatory Design
An explanatory research design examines relationships between variables to understand why outcomes occur. These studies often use statistical methods such as correlation or regression.
Why it matters: Explanatory studies help researchers identify factors associated with important outcomes.
See also: Predictive Design; Regression; Association
External Validity
External validity refers to the extent to which study findings can be generalized beyond the sample to other populations or contexts.
Why it matters: Strong external validity increases confidence that findings apply in real-world settings.
See also: Internal Validity; Representative Sample
F
F Statistic
The F statistic is a test statistic used in ANOVA and regression analyses. It compares the variability explained by a model to the unexplained variability within the data.
Why it matters: The F statistic helps determine whether group differences or model predictors explain a significant portion of variability in the outcome variable.
See also: ANOVA; Mean Square; Sum of Squares
Factor Analysis
Factor analysis is a statistical technique used to identify underlying structures, or factors, that explain patterns of correlations among multiple variables.
Why it matters: Factor analysis helps researchers determine whether groups of variables measure the same underlying construct, which is useful in scale development and validation.
See also: Correlation; Construct Validity; Reliability
Factorial ANOVA
Factorial ANOVA is a statistical test used to examine the effects of two or more independent variables on a single outcome variable. It also allows researchers to evaluate interactions between variables.
Why it matters: Factorial ANOVA helps researchers understand how multiple variables jointly influence outcomes.
See also: ANOVA; Interaction
False Negative
A false negative occurs when a statistical model incorrectly classifies a case as negative when the true outcome is positive.
Why it matters: False negatives represent missed detections in classification models and can affect the interpretation of model performance.
See also: Sensitivity; Classification Accuracy
False Positive
A false positive occurs when a statistical model incorrectly classifies a case as positive when the true outcome is negative.
Why it matters: False positives represent incorrect detections and are important when evaluating classification accuracy.
See also: Specificity; Classification Accuracy
Familywise Error Rate
The familywise error rate refers to the probability of making at least one Type I error when conducting multiple statistical tests.
Why it matters: When multiple comparisons are conducted, the overall probability of falsely identifying a significant result increases. Adjustments such as the Bonferroni correction help control this risk.
See also: Bonferroni Correction; Type I Error
Fisher’s Exact Test
Fisher’s exact test is a statistical test used to examine relationships between categorical variables when sample sizes are small.
Why it matters: This test provides accurate results when expected cell counts are too small for chi-square tests.
See also: Chi-Square Test
Frequency
Frequency refers to the number of times a particular value occurs in a dataset.
Why it matters: Frequencies help describe how often outcomes occur within a dataset.
See also: Distribution; Descriptive Statistics
G
Games–Howell Test
The Games–Howell test is a post hoc comparison procedure used after ANOVA when the assumption of homogeneity of variance is violated or when group sizes differ substantially. Unlike Tukey’s test, the Games–Howell test does not assume equal variances across groups.
Why it matters: The Games–Howell test allows researchers to conduct pairwise comparisons when group variances are unequal, providing a more appropriate method when ANOVA assumptions are violated.
See also: Post Hoc Test; ANOVA; Homogeneity of Variance; Tukey’s Honestly Significant Difference
Generalizability
Generalizability refers to the extent to which findings from a study can be applied to individuals, settings, or populations beyond those included in the sample. When results are generalizable, researchers can reasonably extend conclusions from the sample to the broader population of interest.
Why it matters: Generalizability helps determine whether study findings are relevant outside the specific participants or context in which the research was conducted.
See also: Population; Sample; Sampling; External Validity; Inference
Goodness-of-Fit Test
A goodness-of-fit test evaluates whether the observed distribution of a categorical variable differs from an expected distribution.
Why it matters: These tests help determine whether observed patterns align with theoretical expectations.
See also: Chi-Square Goodness-of-Fit Test
Greenhouse–Geisser Correction
The Greenhouse–Geisser correction is an adjustment applied in repeated measures ANOVA when the assumption of sphericity is violated. The correction adjusts the degrees of freedom used in the F test to produce a more accurate significance test.
Why it matters: Applying this correction helps prevent inflated Type I error rates when sphericity assumptions are not met.
See also: Repeated Measures ANOVA; Sphericity
Group Comparison
Group comparison refers to statistical analyses that examine whether differences exist between two or more groups on an outcome variable. These analyses evaluate whether observed differences in group means or distributions are likely due to chance or reflect meaningful differences in the population.
Why it matters: Group comparison methods allow researchers to determine whether groups differ in meaningful ways, helping answer questions about differences across categories such as treatment conditions, demographic groups, or time points.
See also: Independent Samples t-Test; Analysis of Variance (ANOVA); Nonparametric Test; Independent Variable; Dependent Variable
H
Histogram
A histogram is a graphical representation of the distribution of a continuous variable using bars that represent frequency ranges.
Why it matters: Histograms help researchers visually evaluate distribution shape and variability.
See also: Distribution; Skewness
Homogeneity of Variance
Homogeneity of variance is the assumption that the variability of scores is approximately equal across groups being compared in a statistical analysis. In other words, the spread of values within each group should be similar. This assumption is required for several parametric tests, including t-tests and analysis of variance (ANOVA).
Why it matters: If the variability across groups differs substantially, statistical tests that assume equal variances may produce inaccurate results. Researchers often evaluate this assumption using tests such as Levene’s test.
See also: Assumption; ANOVA; t-Test; Levene’s Test
Homoscedasticity
Homoscedasticity is the assumption that the variability of residuals remains constant across values of the predictor variable.
Why it matters: Violations of homoscedasticity can affect the accuracy of regression estimates.
See also: Residual Plot; Regression
Huynh–Feldt Correction
The Huynh–Feldt correction is an adjustment applied in repeated measures ANOVA when the assumption of sphericity is violated. The correction adjusts the degrees of freedom used to calculate the F statistic, producing a more accurate significance test when variances of the differences between conditions are not equal.
Compared with the Greenhouse–Geisser correction, the Huynh–Feldt adjustment is typically less conservative and may produce slightly larger degrees of freedom.
Why it matters: Applying the Huynh–Feldt correction helps ensure that statistical conclusions from repeated measures ANOVA remain accurate when the sphericity assumption is not satisfied.
See also: Repeated Measures ANOVA; Sphericity; Mauchly’s Test of Sphericity; Greenhouse–Geisser Correction
Hypothesis
A hypothesis is a testable statement predicting a relationship or difference between variables.
Why it matters: Hypotheses guide statistical testing and research design.
See also: Alternative Hypothesis; Null Hypothesis; Inference
Hypothesis Testing
Hypothesis testing is a statistical process used to evaluate whether observed data provide sufficient evidence to support a research claim. The process begins by stating a null hypothesis, which represents the assumption that no relationship or difference exists. Researchers then use sample data to determine whether the observed results are unlikely under the null hypothesis.
If the probability of the observed results occurring under the null hypothesis is sufficiently small, the null hypothesis is rejected in favor of the alternative hypothesis.
Why it matters: Hypothesis testing provides a systematic method for determining whether patterns observed in sample data likely reflect real relationships in the population.
See also: Null Hypothesis; Alternative Hypothesis; p-value; Statistical Significance; Type I Error; Type II Error
I
Independence of Errors
Independence of errors is the assumption that the residuals in a statistical model are not correlated with one another. In other words, the error associated with one observation should not influence the error associated with another observation. This assumption is particularly important in regression analyses and time-ordered data.
Why it matters: If errors are correlated, statistical estimates such as standard errors and significance tests may be inaccurate. Violations of this assumption are often detected through tests for autocorrelation, such as the Durbin–Watson statistic.
See also: Autocorrelation; Residual; Durbin–Watson Statistic; Regression
Independent Samples t-Test
The independent samples t-test is a statistical test used to determine whether the mean of an outcome variable differs between two independent groups. The test evaluates whether the observed difference between group means is likely due to chance or reflects a difference in the population.
Why it matters: The independent samples t-test allows researchers to compare outcomes across two separate groups, such as treatment and control groups or different demographic categories.
See also: t-Test; Group Comparison; Independent Variable; Dependent Variable
Independent Variable
An independent variable is a variable believed to influence or explain changes in an outcome variable.
Why it matters: Independent variables help researchers examine factors associated with outcomes.
See also: Predictor Variable; Outcome Variable
Inference
Statistical inference is the process of drawing conclusions about a population from data collected from a sample.
Why it matters: Inference allows researchers to generalize findings beyond the sample studied.
See also: Sample; Population; Confidence Interval
Influential Observation
An influential observation is a data point that has a substantial impact on the results of a statistical model. Removing or modifying the observation may change the estimated relationships between variables.
Why it matters: Identifying influential observations helps ensure that statistical conclusions are not driven by a small number of extreme cases.
See also: Outlier; Regression Diagnostics
Interaction
An interaction occurs when the effect of one independent variable depends on the level of another variable.
Why it matters: Interactions reveal complex relationships that cannot be explained by individual variables alone.
See also: Factorial ANOVA; Interaction Plot
Interaction Plot
An interaction plot is a graph used to visualize interaction effects between variables.
Why it matters: These plots help researchers interpret how variables jointly influence outcomes.
See also: Interaction; Factorial ANOVA
Intercept
The intercept is the predicted value of the outcome variable when all predictor variables in a regression model are equal to zero.
Why it matters: The intercept establishes the baseline value from which the effects of predictors are estimated.
See also: Regression; Coefficient
Internal Consistency
Internal consistency is a type of measurement reliability that evaluates how consistently the items within a scale measure the same underlying construct. When a set of survey or test items is designed to assess a single concept, internal consistency examines whether those items produce similar patterns of responses.
If the items are all measuring the same construct, participants who score high on one item should generally score high on the other items as well. A common statistic used to evaluate internal consistency is Cronbach’s alpha, which summarizes how closely related the items in a scale are.
Why it matters: Internal consistency helps determine whether a set of items functions as a coherent scale, providing confidence that the measure is reliably capturing the intended construct.
See also: Reliability; Cronbach’s Alpha; Measurement Validity; Construct Validity
Internal Validity
Internal validity refers to the extent to which a study accurately identifies relationships between variables without interference from other factors.
Why it matters: Strong internal validity increases confidence in study conclusions.
See also: Confounding Variable; Experimental Design
Interquartile Range (IQR)
The interquartile range (IQR) measures the spread of the middle 50% of values in a dataset. It is calculated as the difference between the third quartile (Q3) and the first quartile (Q1). Because it focuses on the central portion of the data, the IQR is less influenced by extreme values than some other measures of variability.
Why it matters: The interquartile range provides a robust measure of variability and is commonly used when data contain outliers or are not normally distributed.
See also: Quartile; Boxplot; Dispersion; Outlier
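A minimal sketch with NumPy, using a small hypothetical set of scores that includes one extreme value:

import numpy as np

# Hypothetical scores; 25 is a possible outlier.
scores = [4, 7, 8, 9, 10, 11, 12, 13, 15, 25]

q1, q3 = np.percentile(scores, [25, 75])
print(q1, q3, q3 - q1)  # the IQR spans the middle 50% of the scores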
Interval Variable
An interval variable is a numeric variable in which the differences between values are meaningful and evenly spaced, but the scale does not include a true zero point.
Because the intervals between values are equal, differences between scores can be interpreted meaningfully. However, ratios cannot be interpreted because zero does not represent the complete absence of the variable.
Why it matters: Interval variables allow researchers to calculate statistics such as means, standard deviations, and correlations.
See also: Ratio Variable; Continuous Variable; Mean; Standard Deviation
J
Jitter
Jitter refers to the small random adjustment added to the position of points in a scatterplot or similar visualization to reduce overlap when many observations share the same value. When multiple data points have identical or very similar coordinates, they can appear stacked on top of each other, making it difficult to see how many observations are present.
By slightly shifting the points horizontally, vertically, or both, jitter spreads overlapping observations apart while preserving the overall pattern of the data.
Why it matters: Jitter improves the readability of visualizations by making overlapping data points visible, allowing viewers to better see the distribution of observations.
See also: Scatterplot; Data Visualization; Distribution
K
Kruskal–Wallis Test
The Kruskal–Wallis test is a nonparametric alternative to One-Way ANOVA used to compare three or more groups when assumptions of normality are violated.
Why it matters: This test allows group comparisons when data do not meet parametric assumptions.
See also: Dunn’s Test; Nonparametric Test
Kurtosis
Kurtosis measures the degree to which a distribution is peaked or flat relative to a normal distribution. Distributions with high kurtosis have sharper peaks and heavier tails, while distributions with low kurtosis appear flatter with lighter tails.
Why it matters: Kurtosis helps researchers understand the shape of a distribution and identify whether extreme values occur more or less frequently than expected under normality.
See also: Skewness; Distribution
L
Levene’s Test
Levene’s test is a statistical test used to evaluate the assumption of homogeneity of variance. It examines whether the variability of scores is approximately equal across groups being compared. A statistically significant Levene’s test suggests that group variances differ more than would be expected by chance.
Why it matters: Levene’s test helps researchers determine whether analyses such as t-tests and ANOVA meet the equal-variance assumption. If this assumption is violated, alternative procedures or adjusted interpretations may be needed.
See also: Homogeneity of Variance; ANOVA; Independent Samples t-Test; Games–Howell Test
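A hedged example of running the test with SciPy on three hypothetical groups, one of which is far more spread out than the others:

from scipy import stats

# Hypothetical scores for three groups with visibly different spreads.
group_a = [78, 80, 79, 81, 80, 82]
group_b = [70, 85, 60, 90, 75, 95]
group_c = [80, 81, 79, 80, 82, 78]

statistic, p_value = stats.levene(group_a, group_b, group_c)
print(round(statistic, 2), round(p_value, 4))
# A small p-value suggests the equal-variance assumption is questionable.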
Leverage
Leverage measures how far an observation’s predictor values are from the average predictor values in a regression model. Observations with high leverage have the potential to strongly influence model estimates.
Why it matters: High-leverage observations may disproportionately affect regression results.
See also: Influential Observation; Regression Diagnostics
Linear Regression
Linear regression is a statistical method used to model and estimate the relationship between one outcome variable and one or more predictor variables using a linear equation.
Why it matters: Linear regression allows researchers to examine how changes in predictor variables are associated with changes in an outcome variable and to estimate the strength and direction of those relationships.
See also: Regression Coefficient; Linearity; Predictor Variable; Outcome Variable
Linearity
Linearity refers to a relationship between two variables that can be represented by a straight line, where changes in one variable are associated with proportional changes in another.
Why it matters: Many statistical methods, including correlation and linear regression, assume that relationships between variables are approximately linear.
See also: Correlation; Linear Regression; Scatterplot
Linearity Assumption
The linearity assumption states that the relationship between predictor variables and the outcome variable is linear. This assumption is required for many regression analyses.
Why it matters: If the relationship between variables is not linear, regression estimates may be inaccurate.
See also: Regression; Residual Plot
Log-Likelihood
Log-likelihood is a statistic used in logistic regression to evaluate how well a model explains the observed data. Larger log-likelihood values indicate better model fit.
Why it matters: Log-likelihood is used when comparing competing logistic regression models.
See also: Logistic Regression; Model Fit
Logistic Regression
Logistic regression is a statistical method used when the outcome variable has two categories. The model estimates the probability of an outcome based on predictor variables.
Why it matters: Logistic regression is widely used in applied research when outcomes are binary.
See also: Odds Ratio; Predicted Probability; ROC Curve
Logit
The logit is the natural logarithm of the odds and serves as the underlying scale in logistic regression models.
Why it matters: Logits allow probabilities to be modeled using linear predictors.
See also: Logistic Regression; Odds
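A minimal sketch of the transformation and its inverse (the probability value is arbitrary):

import math

def logit(p):
    # Log-odds: the scale on which logistic regression is linear in its predictors.
    return math.log(p / (1 - p))

def inverse_logit(x):
    # Convert a log-odds value back to a probability.
    return 1 / (1 + math.exp(-x))

print(round(logit(0.8), 3))            # ~1.386
print(round(inverse_logit(1.386), 3))  # ~0.8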
Longitudinal Study
A longitudinal study is a research design in which data are collected from the same participants repeatedly over an extended period of time. Measurements may occur across multiple time points, such as months or years, allowing researchers to examine patterns of change and development.
Longitudinal studies are commonly used to study trends, growth, or long-term effects within the same group of individuals.
Why it matters: Longitudinal designs allow researchers to observe how outcomes change over time rather than capturing only a single point-in-time snapshot.
See also: Repeated Measures Design; Retrospective Study
M
Mann–Whitney U Test
The Mann–Whitney U test is a nonparametric alternative to the independent samples t-test. It compares the distributions of two independent groups when the assumptions of parametric tests are not met.
Why it matters: This test allows researchers to compare groups when data are ordinal or not normally distributed.
See also: Nonparametric Test; Kruskal–Wallis Test
Mauchly’s Test of Sphericity
Mauchly’s test evaluates whether the assumption of sphericity is satisfied in repeated measures ANOVA.
Why it matters: If Mauchly’s test indicates that sphericity is violated, corrections such as the Greenhouse–Geisser adjustment should be applied.
See also: Repeated Measures ANOVA; Sphericity
McNemar’s Test
McNemar’s test is a nonparametric statistical test used to compare two related proportions when the outcome variable is dichotomous. It is commonly used with paired data, such as before-and-after responses from the same participants, to determine whether the proportion of cases in one category changes across two related conditions.
Why it matters: McNemar’s test helps researchers evaluate change in binary outcomes over time or across paired conditions.
See also: Cochran’s Q Test; Binomial Test; Paired Samples Design; Nonparametric Test
Mean
The mean is the arithmetic average of a set of values.
Why it matters: The mean is one of the most commonly used measures of central tendency.
See also: Median; Mode
Mean Difference
The mean difference represents the difference between two group means or between a sample mean and a comparison value.
Why it matters: Mean differences provide the basis for many statistical tests comparing groups.
See also: t-Test; Group Comparison
Mean Rank
Mean rank represents the average rank assigned to observations within a group during nonparametric tests.
Why it matters: Nonparametric tests rely on ranks rather than raw scores to evaluate differences between groups.
See also: Rank; Kruskal–Wallis Test
Mean Square (MS)
Mean square is calculated by dividing a sum of squares value by its corresponding degrees of freedom.
Why it matters: Mean squares are used to calculate the F statistic in ANOVA.
See also: Sum of Squares; ANOVA; F Statistic
Measurement Error
Measurement error is the difference between an observed score and the true value of the variable being measured.
Why it matters: Measurement error can reduce the accuracy of statistical results.
See also: Reliability; Measurement Validity
Measurement Validity
Measurement validity refers to the extent to which a measurement instrument accurately captures the concept it is intended to measure.
Why it matters: Invalid measures can produce misleading conclusions.
See also: Reliability; Construct Validity
Median
The median is the middle value in an ordered dataset.
Why it matters: The median is useful when distributions contain outliers.
See also: Mean; Mode
Mode
The mode is the most frequently occurring value in a dataset.
Why it matters: The mode is useful when analyzing categorical data.
See also: Mean; Median
Model
A model is a simplified mathematical representation of the relationship between variables used to describe, explain, or predict patterns in data.
Why it matters: Statistical models allow researchers to estimate relationships between variables and evaluate how well those relationships explain observed outcomes.
See also: Linear Regression; Predictor Variable; Outcome Variable; Regression Coefficient
Model Fit
Model fit refers to how well a statistical model represents the observed data.
Why it matters: Evaluating model fit helps determine whether the model accurately represents the data.
See also: AIC; Deviance
Multicollinearity
Multicollinearity occurs when predictor variables in a regression model are highly correlated with one another.
Why it matters: High multicollinearity can make regression coefficients difficult to interpret.
See also: Regression; Variance Inflation Factor
Multiple Linear Regression
Multiple linear regression is a statistical method for predicting a continuous outcome variable from multiple predictors.
Why it matters: Multiple regression allows researchers to evaluate the combined influence of several predictors.
See also: Regression; Predictor Variable
Multivariate Analysis of Variance (MANOVA)
Multivariate Analysis of Variance (MANOVA) is an inferential statistical test used to determine whether groups differ across two or more outcome variables simultaneously. It extends the logic of ANOVA, which compares group means on a single outcome variable, by allowing researchers to examine group differences across multiple related outcomes at the same time.
In a MANOVA, one or more independent variables define the groups being compared, while several dependent variables represent the outcomes of interest. The analysis evaluates whether the combined pattern of outcomes differs across groups, taking into account the relationships among those variables.
Why it matters: MANOVA allows researchers to examine group differences across several related outcomes in a single analysis rather than conducting multiple separate tests.
See also: ANOVA; Dependent Variable; Independent Variable; Multivariate Test Statistics; Pillai’s Trace; Wilks’ Lambda
Multivariate Test Statistics
Multivariate test statistics are measures used in analyses such as MANOVA to determine whether groups differ across a set of outcome variables considered together. These statistics evaluate whether the combination of dependent variables varies significantly across groups defined by one or more independent variables.
Several different multivariate test statistics can be used to evaluate these differences, including Pillai’s Trace, Wilks’ Lambda, Hotelling’s Trace, and Roy’s Largest Root. Although they are calculated differently, they all serve the same general purpose: assessing whether the overall multivariate effect of the independent variable is statistically significant.
Why it matters: Multivariate test statistics provide the basis for determining whether groups differ across multiple outcomes simultaneously in analyses such as MANOVA.
See also: MANOVA; Pillai’s Trace; Wilks’ Lambda; Dependent Variable; Independent Variable
N
Nagelkerke’s R²
Nagelkerke’s R² is a pseudo R² statistic used in logistic regression to estimate how well the model explains variation in the outcome variable. Because traditional R² cannot be calculated for logistic regression, Nagelkerke’s R² adjusts another statistic (Cox & Snell R²) so that the value can range from 0 to 1, making interpretation more similar to the R² used in linear regression.
Why it matters: Nagelkerke’s R² provides an approximate indication of how well a logistic regression model accounts for variation in the outcome, helping researchers evaluate the overall explanatory strength of the model.
See also: Logistic Regression; Pseudo R²; Model Fit; R²
Nominal Variable
A nominal variable is a type of categorical variable in which values represent distinct categories without any inherent order or ranking. The categories simply label different groups or classifications.
Why it matters: Nominal variables are typically analyzed using frequency counts, percentages, or statistical tests designed for categorical data.
See also: Categorical Variable; Ordinal Variable; Chi-Square Test; Measurement Scale
Non-Experimental Design
A non-experimental design studies relationships among variables without manipulating them.
Why it matters: Most applied research relies on non-experimental designs when manipulation is impractical or unethical.
See also: Descriptive Design; Explanatory Design
Nonparametric Test
A nonparametric test is a statistical method that does not rely on assumptions about the distribution of the data.
Why it matters: Nonparametric tests provide alternatives when parametric assumptions are violated.
See also: Kruskal–Wallis Test; Mann–Whitney U Test
Non-Probability Sampling
Non-probability sampling is a sampling method in which members of the population are selected without a known or equal chance of being included in the sample. Participants are often selected based on convenience, accessibility, or specific characteristics relevant to the study.
Why it matters: Non-probability sampling is commonly used in applied research when random sampling is not practical. However, because selection probabilities are unknown, results may be less generalizable to the broader population.
See also: Sampling; Probability Sampling; Sample; Population
Normal Distribution
A normal distribution is a symmetrical bell-shaped distribution in which most values cluster around the mean.
Why it matters: Many statistical tests assume normality.
See also: Central Limit Theorem; Skewness
Null Hypothesis
The null hypothesis states that no relationship or difference exists between variables.
Why it matters: Statistical tests evaluate whether evidence supports rejecting the null hypothesis.
See also: Hypothesis; p-value
O
Observation
An observation refers to a single recorded data point in a dataset. Each observation represents one instance of the unit of analysis being measured or studied.
For example, if a dataset includes survey responses from teachers, each teacher’s responses would represent one observation. In a dataset of schools, each school would represent an observation.
Observations are typically organized as rows in a dataset, while variables appear as columns.
Why it matters: Understanding what constitutes an observation helps researchers correctly interpret datasets and conduct statistical analyses.
See also: Unit of Analysis; Variable; Case
Observed Frequency
Observed frequency refers to the actual number of observations recorded in each category of a dataset.
Why it matters: Observed frequencies are compared with expected frequencies in chi-square tests.
See also: Expected Frequency; Chi-Square Test
Odds
Odds represent the ratio of the probability that an event occurs to the probability that it does not occur.
Why it matters: Odds form the basis for interpreting logistic regression models.
See also: Odds Ratio; Logistic Regression
Odds Ratio
An odds ratio compares the odds of an outcome between two groups or conditions. In logistic regression, it describes how much the odds of the outcome are multiplied for a one-unit increase in a predictor.
Why it matters: Odds ratios help interpret the magnitude of relationships in logistic regression.
See also: Logistic Regression; Predicted Probability
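A minimal arithmetic sketch in Python, using made-up probabilities, showing how odds and an odds ratio are calculated:

    # Hypothetical probabilities of an outcome in two groups
    p_group_a = 0.60
    p_group_b = 0.30

    # Odds = probability the event occurs / probability it does not
    odds_a = p_group_a / (1 - p_group_a)   # 1.5
    odds_b = p_group_b / (1 - p_group_b)   # ~0.43

    # The odds ratio compares the odds in group A to the odds in group B
    odds_ratio = odds_a / odds_b           # ~3.5
    print(odds_a, odds_b, odds_ratio)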
Ordinal Variable
An ordinal variable is a categorical variable in which the categories have a meaningful order or ranking, but the distances between categories are not assumed to be equal.
Why it matters: Ordinal variables allow researchers to examine relative ranking or order among observations, but the lack of equal spacing between categories limits the types of statistical analyses that are appropriate.
See also: Nominal Variable; Measurement Scale; Likert Scale; Nonparametric Test
Outcome Variable
The outcome variable is the variable being predicted or explained in an analysis.
Why it matters: The outcome variable represents the main focus of statistical investigation.
See also: Predictor Variable; Dependent Variable
Outlier
An outlier is a data point that differs substantially from other observations in a dataset.
Why it matters: Outliers can influence statistical results and may indicate data errors or unusual cases.
See also: Boxplot; Distribution
Back to top
P
p-value
The p-value represents the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true.
Why it matters: The p-value helps determine whether results are statistically significant.
See also: Alpha; Hypothesis Testing
Paired Difference
A paired difference is the difference between two related measurements taken from the same participant or unit of analysis.
Why it matters: Paired differences are analyzed in paired samples tests and repeated measures designs.
See also: Paired Samples t-Test; Repeated Measures Design
Paired Samples t-Test
A paired samples t-test is a statistical test used to compare the means of two related measurements. The two values typically come from the same participants measured at two different times or under two different conditions. Instead of comparing independent groups, the paired samples t-test evaluates the average difference between paired observations.
Why it matters: The paired samples t-test allows researchers to determine whether a meaningful change or difference occurred within the same group over time or across conditions.
See also: Paired Difference; t-Test; Wilcoxon Signed-Rank Test; Repeated Measures Design
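For example, a paired samples t-test could be run in Python with SciPy; the pre/post scores below are invented for illustration.

    from scipy import stats

    # Hypothetical scores for the same five participants measured twice
    pre = [72, 65, 80, 58, 75]
    post = [78, 70, 85, 60, 79]

    # ttest_rel tests whether the mean paired difference equals zero
    result = stats.ttest_rel(post, pre)
    print(result.statistic, result.pvalue)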
Parameter
A parameter is a numerical characteristic of a population.
Why it matters: Researchers use sample statistics to estimate population parameters.
See also: Statistic; Population
Partial Eta Squared (η²p)
Partial eta squared is an effect size statistic used in ANOVA-type analyses that estimates the proportion of variance explained by a factor after accounting for other variables in the model. Instead of using the total variance in the dataset, partial eta squared isolates the variance associated with a specific effect relative to the variance remaining after other effects are removed.
Partial eta squared is commonly reported in models that include additional predictors, such as ANCOVA, repeated measures ANOVA, or complex factorial models, because it reflects the contribution of a factor while controlling for other variables.
Why it matters: Partial eta squared helps researchers evaluate the strength of a specific effect in models that include covariates or multiple factors.
See also: Effect Size; ANOVA; Eta Squared; ANCOVA; Covariate
Percentile
A percentile is a value that indicates the percentage of observations in a dataset that fall at or below a given point. For example, the 25th percentile represents the value below which 25% of the observations occur.
Percentiles help describe the position of values within a distribution and are commonly used to summarize how data are spread across a range of scores.
Why it matters: Percentiles provide a way to understand where individual values fall relative to the rest of the dataset.
See also: Quartiles; Distribution; Interquartile Range; Descriptive Statistics
Pillai’s Trace
Pillai’s Trace is a multivariate test statistic used in analyses such as MANOVA to determine whether groups differ across a set of outcome variables. It evaluates the proportion of variance in the dependent variables that is explained by the independent variable when the outcomes are considered together.
Among the available multivariate test statistics, Pillai’s Trace is often considered one of the most robust, meaning it tends to perform well even when certain statistical assumptions are slightly violated.
Why it matters: Pillai’s Trace provides evidence about whether group differences exist across multiple outcomes in a multivariate analysis.
See also: MANOVA; Multivariate Test Statistics; Wilks’ Lambda; Dependent Variable
Population
A population is the entire group of individuals, cases, or observations that a researcher is interested in studying. Because it is often impractical to collect data from every member of a population, researchers typically draw a sample and use statistical inference to draw conclusions about the population.
Why it matters: Clearly defining the population helps determine how results from a sample can be interpreted and whether findings can be generalized beyond the study participants.
See also: Sample; Sampling; Inference; Probability Sampling
Post-Hoc Test
A post-hoc test is conducted after a significant overall test to determine which specific groups differ.
Why it matters: Post-hoc tests provide detailed comparisons following ANOVA or similar analyses.
See also: ANOVA; Dunn’s Test
Predicted Probability
Predicted probability is the estimated likelihood that an outcome will occur based on a statistical model.
Why it matters: Predicted probabilities make regression results easier to interpret.
See also: Logistic Regression
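In logistic regression, a predicted probability is obtained by converting the linear combination of coefficients into a probability with the logistic function. A small Python sketch, using hypothetical coefficients and a hypothetical predictor (hours of tutoring):

    import math

    # Hypothetical logistic regression coefficients
    intercept = -2.0
    b_hours = 0.8          # coefficient for hours of tutoring

    hours = 3
    log_odds = intercept + b_hours * hours         # linear predictor
    probability = 1 / (1 + math.exp(-log_odds))    # logistic transformation
    print(round(probability, 3))                   # ~0.60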
Predictor Variable
A predictor variable is used to estimate or explain changes in an outcome variable.
Why it matters: Predictor variables represent potential influences on outcomes.
See also: Outcome Variable; Regression
Probability Sampling
Probability sampling is a sampling method in which each member of the population has a known probability of being selected.
Why it matters: Probability sampling improves the representativeness of a sample and supports generalization to a population.
See also: Sample; Population; Random Sampling
Pseudo R²
Pseudo R² is a statistic used in logistic regression to estimate how much the model improves prediction relative to a baseline model.
Why it matters: Pseudo R² provides an approximate measure of explanatory power in logistic models.
See also: Logistic Regression; Model Fit
Purposive Sampling
Purposive sampling is a non-probability sampling method in which participants are intentionally selected based on specific characteristics or criteria relevant to the research question.
Why it matters: Purposive sampling allows researchers to focus on participants who have particular experiences or knowledge needed for the study.
See also: Non-Probability Sampling; Snowball Sampling; Sampling Bias
Back to top
Q
Q–Q Plot
A Q–Q plot compares the distribution of observed data to a theoretical distribution, often the normal distribution.
Why it matters: Q–Q plots help researchers assess whether data meet normality assumptions.
See also: Normal Distribution; Shapiro–Wilk Test
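One common way to produce a normal Q–Q plot in Python is with SciPy and Matplotlib; the data below are simulated purely for illustration.

    import numpy as np
    from scipy import stats
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(42)
    data = rng.normal(loc=50, scale=10, size=200)  # simulated scores

    # probplot compares sample quantiles to theoretical normal quantiles
    stats.probplot(data, dist="norm", plot=plt)
    plt.show()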
Quartiles
Quartiles divide an ordered dataset into four equal parts, each representing 25% of the observations. The first quartile (Q1) represents the value below which 25% of the observations fall. The second quartile (Q2) represents the median, or the point at which half of the observations fall below and half fall above. The third quartile (Q3) represents the value below which 75% of the observations fall.
Quartiles are often used to summarize distributions and are key components of boxplots and the interquartile range (IQR).
Why it matters: Quartiles help describe the spread and central distribution of data by showing how values are divided across four equal portions of the dataset.
See also: Percentile; Median; Interquartile Range; Boxplot
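A small sketch of quartiles and the interquartile range using NumPy, with made-up values:

    import numpy as np

    scores = np.array([4, 7, 8, 10, 12, 15, 18, 21, 24, 30])

    q1, q2, q3 = np.quantile(scores, [0.25, 0.50, 0.75])
    iqr = q3 - q1   # interquartile range: spread of the middle 50% of the data
    print(q1, q2, q3, iqr)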
Quasi-Experimental Design
A quasi-experimental design evaluates the impact of an intervention without random assignment.
Why it matters: These designs allow researchers to study interventions in real-world settings.
See also: Experimental Design; Internal Validity
Back to top
R
R
R (usually written as a lowercase r) is the symbol commonly used for the Pearson correlation coefficient, which measures the strength and direction of a linear relationship between two continuous variables. Its value ranges from −1 to +1. Values closer to −1 or +1 indicate stronger relationships, whereas values closer to 0 indicate weaker relationships.
Why it matters: R helps researchers summarize the direction and strength of a relationship between two variables in a single statistic.
See also: Correlation; Scatterplot; Pearson Correlation
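For instance, Pearson’s correlation can be computed in Python with SciPy; the paired values below are invented.

    from scipy import stats

    hours_studied = [2, 4, 5, 7, 9]
    exam_score = [60, 68, 70, 80, 88]

    # pearsonr returns the correlation coefficient and its p-value
    r, p = stats.pearsonr(hours_studied, exam_score)
    print(round(r, 3), round(p, 3))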
R² (Coefficient of Determination)
R² represents the proportion of variance in the outcome variable explained by the predictors in a regression model.
Why it matters: R² provides an estimate of how well a model explains variation in the outcome variable.
See also: Regression; Adjusted R²
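As a sketch of the underlying calculation, R² equals one minus the ratio of the residual sum of squares to the total sum of squares; the observed and predicted values below are hypothetical.

    import numpy as np

    observed = np.array([10, 12, 15, 18, 20])
    predicted = np.array([11, 12, 14, 17, 21])   # from some fitted model

    ss_res = np.sum((observed - predicted) ** 2)          # unexplained variation
    ss_tot = np.sum((observed - observed.mean()) ** 2)    # total variation
    r_squared = 1 - ss_res / ss_tot
    print(round(r_squared, 3))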
Random Sampling
Random sampling is a probability sampling method in which individuals are selected from a population using a random process so that each member of the population has an equal chance of being included in the sample.
Why it matters: Random sampling helps produce samples that are more representative of the population, improving the accuracy and generalizability of research findings.
See also: Stratified Sampling; Cluster Sampling; Population; Sampling Bias
Range
The range is the difference between the highest and lowest values in a dataset.
Why it matters: Range provides a simple measure of variability.
See also: Variance; Standard Deviation
Rank
A rank represents the relative position of a value when observations are ordered from smallest to largest.
Why it matters: Nonparametric tests rely on ranks rather than raw data values.
See also: Mean Rank; Nonparametric Test
Ratio Variable
A ratio variable is a numeric variable that has equal intervals between values and a true zero point, meaning zero represents the complete absence of the measured quantity.
Why it matters: Ratio variables allow the full range of statistical analyses because both differences and proportional relationships between values are meaningful.
See also: Interval Variable; Continuous Variable; Mean; Standard Deviation
Reference Group
A reference group (or level) is the category of a categorical variable used as the baseline for comparison in statistical models. The effects of other categories are interpreted relative to this group.
Why it matters: Choosing a reference group determines how results are interpreted in analyses such as regression, where coefficients represent differences compared to the baseline category.
See also: Dummy Coding; Nominal Variable; Regression Coefficient
Regression
Regression is a statistical method used to estimate relationships between variables and predict outcomes.
Why it matters: Regression models are widely used for explanation and prediction.
See also: Multiple Linear Regression; Logistic Regression
Regression Coefficient
A regression coefficient is a numerical value that represents the estimated relationship between a predictor variable and an outcome variable in a regression model. It indicates the expected change in the outcome variable for a one-unit change in the predictor, holding other variables constant.
Why it matters: Regression coefficients help researchers interpret the direction and magnitude of relationships between variables in regression analysis.
See also: Linear Regression; Unstandardized Coefficient (B); Standardized Coefficient (β); Predictor Variable
Reliability
Reliability refers to the consistency of a measurement instrument.
Why it matters: Reliable measurements are necessary for accurate statistical analysis.
See also: Measurement Validity; Cronbach’s Alpha
Repeated Measures Design
A repeated measures design is a research design in which the same participants are measured multiple times under different conditions or at different time points. Because the same individuals are observed repeatedly, each participant serves as their own comparison, reducing variability caused by differences between individuals.
Repeated measures designs are commonly used in studies examining change over time, before-and-after interventions, or multiple treatment conditions experienced by the same participants.
Why it matters: Repeated measures designs allow researchers to detect changes within the same group of participants while controlling for individual differences.
See also: Paired Samples t-Test; Repeated Measures ANOVA; Wilcoxon Signed-Rank Test; McNemar’s Test; Cochran’s Q Test
Research Design
Research design refers to the overall plan or structure used to guide a research study. It outlines how the research problem will be investigated, including how data will be collected, what variables will be examined, and how the results will be analyzed.
Different research designs are used depending on the purpose of the study. For example, researchers may use descriptive, predictive, explanatory, experimental, or quasi-experimental designs depending on whether they aim to describe patterns, examine relationships, or evaluate potential causal effects.
Why it matters: A clear research design ensures that the data collected and the analyses conducted are appropriate for answering the research question.
See also: Research Question; Research Problem; Experimental Design; Non-Experimental Design
Research Problem
A research problem is the broader issue, gap, or concern that motivates a study. It describes the situation or condition that requires investigation and explains why the study is needed. The research problem establishes the context for the study and leads to the development of specific research questions.
While the research problem identifies what issue needs to be understood, the research question specifies what the study will examine to address that issue.
Why it matters: Clearly defining the research problem helps ensure that the study is focused on a meaningful issue and that the research questions and analysis are aligned with the purpose of the study.
See also: Research Question; Research Design; Variables; Inference
Research Question
A research question is a clear, focused statement that defines what a study is attempting to investigate. It identifies the variables or relationships being examined and guides decisions about research design, data collection, and statistical analysis.
Research questions typically specify whether the study is examining differences between groups, relationships between variables, or changes over time.
Why it matters: The research question determines the appropriate research design and statistical methods used in a study.
See also: Research Problem; Hypothesis; Independent Variable; Dependent Variable; Research Design
Residual
A residual is the difference between an observed value and the value predicted by a statistical model.
Why it matters: Residuals help evaluate model accuracy and assumptions.
See also: Residual Plot; Regression
Residual Plot
A residual plot displays residuals against predicted values.
Why it matters: Residual plots help diagnose violations of regression assumptions.
See also: Homoscedasticity; Regression
ROC Curve
A receiver operating characteristic (ROC) curve evaluates the classification performance of a model across possible cutoff values by plotting the true positive rate (sensitivity) against the false positive rate (1 − specificity) at each cutoff.
Why it matters: ROC curves help determine how well a model distinguishes between outcome categories.
See also: AUC; Logistic Regression
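If scikit-learn is available, the points on a ROC curve and the area under the curve can be obtained from observed outcomes and predicted probabilities; the values below are made up for illustration.

    from sklearn.metrics import roc_curve, roc_auc_score

    y_true = [0, 0, 1, 1, 0, 1, 1, 0]                    # observed outcomes
    y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.5]  # predicted probabilities

    fpr, tpr, thresholds = roc_curve(y_true, y_score)    # points on the ROC curve
    auc = roc_auc_score(y_true, y_score)                 # area under the curve
    print(auc)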
Back to top
S
Sample
A sample is a subset of the population used in a study.
Why it matters: Researchers rely on samples to draw conclusions about populations.
See also: Population; Sampling
Sampling
Sampling is the process of selecting a subset of individuals, cases, or observations from a larger population for use in a study. Because researchers often cannot collect data from every member of a population, sampling allows them to gather information from a smaller group and use it to draw conclusions about the larger population.
Why it matters: The quality of a sample influences how well study findings represent the population and affects the generalizability of results.
See also: Sample; Population; Sampling Frame; Probability Sampling; Non-Probability Sampling
Sampling Bias
Sampling bias occurs when the sample selected for a study is not representative of the population due to systematic differences in how participants were chosen.
Why it matters: Sampling bias can lead to inaccurate conclusions because the results may not generalize to the broader population.
See also: Sampling Error; Population; External Validity
Scale
A scale is a set of related items used to measure a single underlying construct, such as attitudes, perceptions, or behaviors.
Why it matters: Scales allow researchers to measure complex concepts that cannot be captured by a single item.
See also: Item; Reliability; Construct Validity
Scatterplot
A scatterplot displays the relationship between two continuous variables.
Why it matters: Scatterplots help visualize correlation and detect outliers.
See also: Correlation; Regression
Sensitivity
Sensitivity is the proportion of actual positive cases that a classification model correctly identifies as positive.
Why it matters: Sensitivity indicates how well a model detects positive cases.
See also: Specificity; ROC Curve
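A minimal arithmetic sketch using hypothetical counts from a classification table; it also shows specificity, which is defined below.

    # Hypothetical classification results
    true_positives = 40
    false_negatives = 10   # positive cases the model missed
    true_negatives = 45
    false_positives = 5    # negative cases flagged as positive

    sensitivity = true_positives / (true_positives + false_negatives)   # 0.80
    specificity = true_negatives / (true_negatives + false_positives)   # 0.90
    print(sensitivity, specificity)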
Skewness
Skewness measures the degree and direction of asymmetry in a distribution. A distribution is positively skewed when the tail extends toward higher values (to the right) and negatively skewed when the tail extends toward lower values (to the left).
Why it matters: Skewness helps researchers understand the shape of a distribution and assess whether data deviate from symmetry, which may affect statistical assumptions.
See also: Distribution; Kurtosis
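Skewness can be estimated in Python with SciPy; the data below are simulated to be positively skewed, purely for illustration.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    incomes = rng.exponential(scale=40000, size=500)  # right-skewed by construction

    print(stats.skew(incomes))   # a positive value indicates a longer right tail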
Snowball Sampling
Snowball sampling is a non-probability sampling method in which current participants recruit additional participants from their social networks.
Why it matters: Snowball sampling can help researchers access hard-to-reach or specialized populations that are difficult to identify through traditional sampling methods.
See also: Non-Probability Sampling; Purposive Sampling; Sampling Bias
Spearman’s Rank-Order Correlation (Spearman’s rho)
Spearman’s rank-order correlation is a nonparametric measure of association that evaluates the strength and direction of a monotonic relationship between two variables. The statistic is calculated using the ranked values of the variables rather than their raw scores.
Why it matters: Spearman’s correlation is useful when variables are ordinal, when assumptions of normality are violated, or when relationships between variables are monotonic but not linear.
See also: Correlation; Pearson Correlation; Nonparametric Test; Rank
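Spearman’s rho can be computed with SciPy; the ordinal-style ratings below are invented.

    from scipy import stats

    satisfaction = [1, 2, 2, 3, 4, 5, 5]   # e.g., Likert-style ratings
    engagement = [2, 1, 3, 3, 4, 4, 5]

    # spearmanr ranks the values internally before computing the correlation
    rho, p = stats.spearmanr(satisfaction, engagement)
    print(round(rho, 3), round(p, 3))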
Specificity
Specificity is the proportion of actual negative cases that a model correctly identifies as negative.
Why it matters: Specificity reflects how well a model avoids false positives.
See also: Sensitivity; ROC Curve
Sphericity
Sphericity is an assumption of repeated measures ANOVA stating that the variances of the differences between all combinations of conditions are equal.
Why it matters: Violations of sphericity can lead to inaccurate statistical conclusions unless corrections are applied.
See also: Mauchly’s Test; Greenhouse–Geisser Correction
Standard Deviation
Standard deviation measures the typical distance of values from the mean and is calculated as the square root of the variance.
Why it matters: It provides a widely used measure of variability.
See also: Variance; Range
Standard Error
The standard error measures how much a sample statistic is expected to vary across repeated samples.
Why it matters: The standard error helps quantify uncertainty in statistical estimates.
See also: Confidence Interval; Sampling Distribution
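A short Python sketch relating the standard deviation, variance, and standard error of the mean for a small made-up sample:

    import numpy as np

    scores = np.array([8, 10, 12, 14, 16])
    n = len(scores)

    variance = scores.var(ddof=1)        # sample variance (n - 1 denominator)
    std_dev = scores.std(ddof=1)         # standard deviation = square root of the variance
    std_error = std_dev / np.sqrt(n)     # expected variability of the sample mean
    print(variance, std_dev, std_error)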
Standardized Coefficient (Beta)
A standardized coefficient expresses the relationship between a predictor and the outcome variable using standardized units.
Why it matters: Standardized coefficients allow researchers to compare the relative importance of predictors measured on different scales.
See also: Regression; Coefficient
Statistical Power
Statistical power refers to the probability that a statistical test will correctly detect a true effect or relationship when one actually exists. In other words, power represents the likelihood that a study will produce a statistically significant result when the alternative hypothesis is true.
Statistical power is influenced by several factors, including sample size, effect size, variability in the data, and the chosen significance level (alpha).
Why it matters: Studies with low statistical power may fail to detect meaningful effects, increasing the likelihood of a Type II error.
See also: Type II Error; Effect Size; Sample Size; Hypothesis Testing
Statistical Significance
Statistical significance indicates that an observed effect is unlikely to have occurred by chance alone.
Why it matters: Significance testing helps researchers evaluate evidence for relationships or differences.
See also: Alpha; p-value
Stratified Sampling
Stratified sampling is a probability sampling method in which a population is divided into subgroups, or strata, based on shared characteristics, and participants are randomly selected from each stratum.
Why it matters: Stratified sampling helps ensure that important subgroups within a population are adequately represented in the sample.
See also: Random Sampling; Cluster Sampling; Population
Sum
A sum refers to the total obtained by adding a set of values together. In statistics, sums are used to calculate many common statistics, including the mean, variance, and sum of squares. Summation is often represented using the Greek symbol Σ (sigma), which indicates that a series of values should be added together.
Why it matters: Many statistical formulas rely on sums of values, making summation a fundamental component of statistical calculations.
See also: Mean; Sum of Squares; Variance
Sum of Squares (SS)
Sum of squares measures the total variability in a dataset and is used to partition variability into components explained by a model and unexplained variability.
Why it matters: Sum of squares values form the basis for ANOVA calculations.
See also: Mean Square; ANOVA; F Statistic
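As a sketch of how total variability is partitioned, the sums of squares for two small hypothetical groups can be computed directly:

    import numpy as np

    group_a = np.array([4, 5, 6])
    group_b = np.array([8, 9, 10])
    all_values = np.concatenate([group_a, group_b])
    grand_mean = all_values.mean()

    # Total variability around the grand mean
    ss_total = np.sum((all_values - grand_mean) ** 2)

    # Within-group variability around each group's own mean
    ss_within = (np.sum((group_a - group_a.mean()) ** 2)
                 + np.sum((group_b - group_b.mean()) ** 2))

    # Between-group variability explained by group membership
    ss_between = ss_total - ss_within
    print(ss_total, ss_between, ss_within)   # 28.0, 24.0, 4.0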
Back to top
T
t Statistic
The t statistic is a test statistic used in t-tests and regression analysis. It represents the difference between an estimate (such as a mean difference or regression coefficient) and the value expected under the null hypothesis, expressed in units of the estimate’s standard error.
Why it matters: The t statistic helps determine whether a difference or relationship is statistically significant.
See also: t-Test; Hypothesis Testing
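A one-sample t statistic can be computed by hand and checked against SciPy; the sample values and comparison value below are hypothetical.

    import numpy as np
    from scipy import stats

    sample = np.array([52, 55, 48, 60, 57, 53])
    mu_0 = 50   # value expected under the null hypothesis

    t_by_hand = (sample.mean() - mu_0) / (sample.std(ddof=1) / np.sqrt(len(sample)))
    t_scipy, p_value = stats.ttest_1samp(sample, popmean=mu_0)
    print(round(t_by_hand, 3), round(t_scipy, 3), round(p_value, 3))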
Tukey’s Honestly Significant Difference (Tukey HSD)
Tukey’s Honestly Significant Difference (HSD) test is a post hoc comparison procedure used after a statistically significant ANOVA result. It compares all possible pairs of group means while controlling the overall Type I error rate.
Tukey’s HSD is appropriate when the assumption of homogeneity of variance is satisfied and group sizes are reasonably similar.
Why it matters: Tukey’s test helps researchers identify which specific group means differ after an ANOVA indicates that at least one group difference exists.
See also: Post-Hoc Test; ANOVA; Mean Difference; Bonferroni Correction
Type I Error
A Type I error occurs when the null hypothesis is incorrectly rejected.
Why it matters: Type I errors represent false positives in statistical testing.
See also: Alpha; Hypothesis Testing
Type II Error
A Type II error occurs when a study fails to detect a real effect.
Why it matters: Type II errors represent false negatives.
See also: Beta; Statistical Power
Back to top
U
Uniform Distribution
A uniform distribution is a type of probability distribution in which all possible outcomes have equal probability within a given range. In this distribution, no value is more likely to occur than another.
When visualized, a uniform distribution appears flat because each outcome occurs with the same frequency.
Why it matters: Uniform distributions are often used in probability theory and simulation to represent situations where every outcome within a range is equally likely.
See also: Probability Distribution; Normal Distribution; Random Variable
Unit of Analysis
The unit of analysis refers to the primary entity being studied in a dataset. It defines what each row or observation in the data represents.
Depending on the research design, the unit of analysis might be an individual, classroom, school, organization, or geographic region. Clearly identifying the unit of analysis helps ensure that statistical conclusions are interpreted at the correct level.
Why it matters: Misidentifying the unit of analysis can lead to incorrect interpretations or conclusions about the data.
See also: Population; Sample; Observation; Research Design
Univariate Analysis
Univariate analysis refers to the examination of a single variable at a time in a dataset. The goal of univariate analysis is to summarize and describe the characteristics of that variable, often using statistics such as the mean, median, standard deviation, or frequency counts.
Researchers commonly use tables and visualizations such as histograms, boxplots, or bar charts to examine the distribution of a single variable.
Why it matters: Univariate analysis provides a foundational understanding of how individual variables are distributed before examining relationships between variables.
See also: Descriptive Statistics; Distribution; Histogram; Central Tendency
Univariate Distribution
A univariate distribution describes how the values of one variable are spread across a dataset. It shows the pattern of frequencies or values that occur for that variable.
Univariate distributions are often examined using visualizations such as histograms, boxplots, density plots, or frequency tables, which help reveal patterns such as skewness, clustering, or outliers.
Why it matters: Understanding the distribution of a single variable helps researchers interpret the data and determine which statistical methods are appropriate for analysis.
See also: Distribution; Histogram; Boxplot; Skewness; Kurtosis
Unstandardized Coefficient (B)
An unstandardized coefficient (B) is a regression coefficient that represents the expected change in the outcome variable for a one-unit change in a predictor variable, while holding other variables in the model constant. The coefficient is expressed in the original units of the variables, which allows the relationship to be interpreted in meaningful, real-world terms.
Why it matters: Unstandardized coefficients allow researchers to interpret the magnitude and direction of relationships between variables using the original measurement units.
See also: Regression Coefficient; Standardized Coefficient (Beta); Predictor Variable; Outcome Variable; Linear Regression
Back to top
V
Validity
Validity refers to the extent to which a study or measurement accurately reflects the concept being examined.
Why it matters: High validity strengthens confidence in research conclusions.
See also: Internal Validity; Measurement Validity
Variable
A variable is a characteristic or attribute that can take different values across observations in a dataset. Variables represent the information that researchers measure, record, or analyze when conducting a study.
Variables may be categorical, such as gender or school type, or numerical, such as age, income, or test scores. In a dataset, variables typically appear as columns, while observations or cases appear as rows.
Researchers often classify variables based on their role in an analysis. For example, a variable may serve as a predictor (independent variable) used to explain or predict another variable, or as an outcome (dependent variable) representing the result being studied.
Why it matters: Variables are the fundamental building blocks of statistical analysis because they represent the information used to describe patterns and relationships in data.
See also: Observation; Case; Predictor Variable; Outcome Variable
Variance
Variance measures the average squared deviation from the mean.
Why it matters: Variance quantifies how spread out the values are in a dataset.
See also: Standard Deviation
Violin Plot
A violin plot is a graphical display that combines elements of a boxplot and a density plot to show the distribution of a continuous variable. The shape of the violin reflects the density of values at different points in the distribution, while the center of the plot often includes markers for the median and quartiles.
Violin plots are often used to compare distributions across groups.
Why it matters: Violin plots allow researchers to see both the spread of the data and the shape of the distribution, providing more information than a boxplot alone.
See also: Boxplot; Density Plot; Distribution; Quartiles
Back to top
W
Wilcoxon Signed-Rank Test
The Wilcoxon signed-rank test is a nonparametric alternative to the paired samples t-test, used when the paired differences violate the normality assumption.
Why it matters: This test allows comparison of related measurements without assuming normality.
See also: Nonparametric Test; Paired Samples t-Test
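For example, the test can be run with SciPy using invented pre/post ratings:

    from scipy import stats

    before = [3, 5, 4, 6, 2, 5, 4]
    after = [4, 6, 5, 7, 3, 6, 5]

    # wilcoxon tests whether the paired differences are centered at zero
    result = stats.wilcoxon(before, after)
    print(result.statistic, result.pvalue)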
Wilks’ Lambda
Wilks’ Lambda is a multivariate test statistic used in analyses such as MANOVA to determine whether groups differ across multiple outcome variables simultaneously. It evaluates the proportion of variance in the dependent variables that is not explained by the independent variable.
Values of Wilks’ Lambda range from 0 to 1. Smaller values indicate that a larger portion of the variance in the outcome variables is associated with the grouping variable, suggesting stronger group differences.
Why it matters: Wilks’ Lambda is commonly reported in multivariate analyses to assess whether groups differ across a set of related outcomes.
See also: MANOVA; Multivariate Test Statistics; Pillai’s Trace; Dependent Variable; Independent Variable
Back to top
X
X-Axis
The x-axis is the horizontal axis of a graph used to display the values of one variable. In many statistical visualizations, the x-axis represents the independent variable, categories, or measurement values used to organize the data.
Why it matters: The x-axis provides the reference line used to position data points or bars horizontally, helping readers interpret how values change across categories or levels of a variable.
See also: Y-Axis; Scatterplot; Data Visualization
Back to top
Y
Y-Axis
The y-axis is the vertical axis of a graph used to display the values of a variable being measured or observed. In many statistical graphs, the y-axis represents the outcome or dependent variable.
For example, in a bar chart comparing average test scores across different classrooms, the classroom categories may appear on the x-axis while the average scores are displayed along the y-axis.
Why it matters: The y-axis shows the magnitude or value of the variable being measured, allowing readers to compare levels, trends, or differences in the data.
See also: X-Axis; Scatterplot; Graph; Data Visualization; Dependent Variable
Back to top
Z
z-Score
A z-score indicates how many standard deviations a value lies above or below the mean.
Why it matters: z-scores standardize values for comparison across different scales.
See also: Standard Deviation; Normal Distribution
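A minimal sketch with a made-up test score and distribution:

    # Hypothetical distribution of test scores
    mean_score = 75
    sd_score = 8

    score = 91
    z = (score - mean_score) / sd_score   # 2.0: two standard deviations above the mean
    print(z)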
Back to top