15: Data Visualization
15.1 Understanding Data through Visuals
Data visualization is a powerful tool for exploring, understanding, and communicating data. It translates complex statistical concepts into intuitive, accessible visuals that can reveal patterns, trends, and outliers not easily detected in raw numbers. This chapter introduces common visualization techniques, including bar plots, histograms, density plots, boxplots, violin plots, and Q–Q plots. These tools are essential for summarizing data, assessing distributions, and identifying relationships within your dataset.
15.2 Bar Plots
Bar plots are one of the simplest and most effective ways to visualize categorical data. They use rectangular bars to represent the frequency of each category: the length or height of each bar corresponds to the count (or other value) associated with that category. Bar plots are especially useful for comparing groups within a dataset and identifying patterns across categories.
How To: Bar Plots
To create bar plots in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.
- Move nominal variables into the Variables box.
- Under the Statistics drop-down, uncheck all options.
- Under Plots, check Bar plot under Bar Plots.
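For readers who want to see what a bar plot encodes, the short Python sketch below counts how often each category occurs and prints a rough text version of the bars. It is only an illustration, not how Jamovi builds its plots; the `responses` data are hypothetical.

```python
from collections import Counter

# Hypothetical nominal variable: one commuting mode per participant.
responses = ["car", "bus", "bike", "car", "car", "bus", "walk", "car", "bike", "bus"]

# Count how often each category occurs; these counts are the bar heights.
counts = Counter(responses)

# Print a crude text bar plot, most frequent category first.
for category, n in counts.most_common():
    print(f"{category:<5} {'#' * n} ({n})")
```

Each row of `#` characters plays the role of one bar, so the longest row marks the most frequent category.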
15.3 Histograms
A histogram is a graphical representation of the distribution of numerical data. It organizes values into intervals, called bins, and shows how many data points fall into each bin. Think of bins as ranges of values. Each bar in the histogram represents one of these ranges, and the height of the bar shows how many values fall within that range.
The number and width of bins are set based on the range of the data and the number of observations. The goal is to balance detail and readability: too few bins can hide important features, while too many can make the plot noisy or hard to interpret.
Histograms are especially useful for visualizing the shape of a distribution. They help you see how data are spread out and whether they cluster in certain ranges. Histograms also make it easier to identify patterns such as normality, skewness, or multimodality, which may not be obvious from raw data alone.
How To: Histograms
To create histograms in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.
- Move interval variables into the Variables box.
- Under Plots, check Histogram under Histograms.
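The binning idea described in this section can be sketched in a few lines of Python. The function below splits the data range into equal-width bins and counts the values in each one. The `scores` data are hypothetical, and the default bin count uses Sturges' rule, which is an assumption made for illustration and not necessarily the rule Jamovi applies.

```python
import math

def histogram_counts(values, n_bins=None):
    """Return (edges, counts) for equal-width bins covering the data range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so a single bin holds everything.
        return [lo, hi], [len(values)]
    if n_bins is None:
        # Sturges' rule: an assumed default, not necessarily what Jamovi uses.
        n_bins = math.ceil(math.log2(len(values))) + 1
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return edges, counts

# Hypothetical interval variable: 16 exam scores.
scores = [52, 55, 61, 63, 64, 66, 67, 68, 70, 71, 71, 73, 75, 78, 80, 84]
edges, counts = histogram_counts(scores, n_bins=4)
for left, right, n in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {'#' * n}")
```

Rerunning the last lines with a different `n_bins` shows the trade-off between detail and readability: fewer bins smooth over features, while more bins leave many bars with only one or two values.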
15.4 Density Plots
A density plot is a smoothed version of a histogram that displays the distribution of continuous data as a continuous curve. It estimates the probability density function of the variable, showing where values are more or less concentrated. Unlike histograms, which group data into bins, density plots do not depend on where bin boundaries fall, so they give a smoother view of the data’s shape, which is especially useful when identifying skewness or multiple peaks. Density plots are particularly helpful when comparing distributions across groups, as overlapping curves make differences easier to see than stacked or side-by-side histograms.
How To: Density Plots
To create density plots in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.
- Move interval variables into the Variables box.
- Under Plots, check Density under Histograms.
TIP: You can create a histogram with an overlaid density plot to visualize both the distribution and its underlying shape.
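One common way to obtain such a smoothed curve is kernel density estimation, in which a small bell-shaped curve is centered on every observation and the curves are averaged. The sketch below assumes a Gaussian kernel and a hand-picked bandwidth of 4; both are illustrative assumptions, since the smoothing settings Jamovi uses are not described here. The `scores` data are hypothetical.

```python
import math

def gaussian_kde(values, bandwidth):
    """Return a function that estimates the density at x using Gaussian kernels."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one bell curve per observation, each centered on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density

# Hypothetical interval variable: 16 exam scores.
scores = [52, 55, 61, 63, 64, 66, 67, 68, 70, 71, 71, 73, 75, 78, 80, 84]

# Bandwidth 4.0 is an assumed value; larger bandwidths give smoother curves.
density = gaussian_kde(scores, bandwidth=4.0)
for x in range(50, 91, 5):
    print(f"{x}: {'#' * round(density(x) * 500)}")
```

The bandwidth plays a role similar to bin width in a histogram: a small value follows every bump in the data, while a large value can blur real features such as a second peak.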
15.5 Boxplots
A boxplot, also known as a box-and-whisker plot, provides a compact visual summary of a dataset’s distribution and variability. It is built from the lower quartile (Q1), median, and upper quartile (Q3), together with the smallest and largest values that are not treated as outliers. The box spans the interquartile range (IQR), which contains the middle 50% of the data. A line inside the box marks the median.
The whiskers extend to the smallest and largest values that lie within 1.5 times the IQR of the quartiles, that is, no lower than Q1 − 1.5 × IQR and no higher than Q3 + 1.5 × IQR. Data points beyond this range are plotted individually and considered outliers, so the whiskers reach the true minimum and maximum only when there are no outliers. Boxplots are especially useful for comparing distributions across groups and identifying asymmetry or extreme values in the data.
How To: Boxplots
To create boxplots in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.
- Move interval or ordinal variables into the Variables box.
- Under Plots, check Box plot under Box plots.
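The numbers behind a boxplot can be computed directly, as in the Python sketch below. It finds the quartiles, applies the 1.5 × IQR rule described above, and separates the whisker ends from the outliers. The `values` data are hypothetical, and the quartile method (`"inclusive"`) is an assumption: statistical packages define quartiles in slightly different ways, so Jamovi's values may differ a little.

```python
import statistics

def boxplot_summary(values):
    """Compute quartiles, whisker ends, and outliers using the 1.5 x IQR rule."""
    # Quartile definitions vary between packages; "inclusive" is an assumed choice.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers stop at the most extreme values still inside the fences.
        "whiskers": (min(inside), max(inside)),
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }

# Hypothetical data with one unusually large value.
values = [2, 4, 5, 6, 7, 8, 9, 10, 30]
summary = boxplot_summary(values)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this example the value 30 lies above the upper fence, so it is reported as an outlier and the upper whisker stops at 10 rather than at the maximum.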
15.6 Violin Plots
A violin plot combines the features of a boxplot and a density plot into a single visualization. It displays the distribution of a variable and its probability density across different values, while also including the boxplot structure inside the plot to show the median and quartiles. Unlike a standard boxplot, a violin plot reveals the full shape of the distribution, including potential skewness or multiple peaks. This combination provides a more detailed and informative view of the data, making it especially useful for comparing distributions across groups.
How To: Violin Plots
To create violin plots in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.
- Move interval variables into the Variables box.
- Under Plots, check Violin, Data, and Mean under Box plots.
TIP: You can create a boxplot with an overlaid violin plot to visualize the distribution, its shape, and any outliers, which is helpful for interval variables.
15.7 Q–Q Plots
A Q–Q plot (quantile–quantile plot) is a graphical tool used to assess whether a dataset follows a specific theoretical distribution, such as the normal distribution. It compares the values in your dataset to the values that would be expected if the data followed that distribution exactly. Each point on the plot represents a single case (or observation) in your dataset. Its position is determined by two numbers: the observed value, ordered by rank within your data (its sample quantile), and the value that rank would correspond to under the theoretical distribution (the theoretical quantile).
If the data follow the expected distribution, the points will fall roughly along a straight diagonal line. If the points curve away from the line, it suggests the data deviate from that distribution. Q–Q plots are especially useful for visually assessing normality, which is a common assumption in many statistical tests.
How To: Q–Q Plots
To create Q–Q plots in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.
- Move interval variables into the Variables box.
- Under Plots, check Q-Q under Q-Q Plots.
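To make the pairing of sample and theoretical quantiles concrete, the Python sketch below builds the points of a normal Q–Q plot by hand. It sorts the data, assigns each value a cumulative probability, and looks up the matching quantile of a normal distribution with the same mean and standard deviation. The `reaction_times` data are hypothetical, and the plotting position (i − 0.5) / n is one common convention assumed here; Jamovi may scale or position its points differently.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(values):
    """Pair each sorted value with its theoretical normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    # Reference distribution: a normal curve matching the sample mean and SD.
    dist = NormalDist(mean(ordered), stdev(ordered))
    points = []
    for i, sample_q in enumerate(ordered, start=1):
        p = (i - 0.5) / n  # plotting position; an assumed convention
        points.append((dist.inv_cdf(p), sample_q))
    return points

# Hypothetical interval variable: nine reaction times in milliseconds.
reaction_times = [310, 325, 330, 342, 350, 358, 371, 380, 402]
points = normal_qq_points(reaction_times)
for theoretical, sample in points:
    print(f"theoretical {theoretical:7.1f}   sample {sample}")
```

If the two columns stay close to each other across all rows, the points would fall near the diagonal line; large gaps at the low or high end correspond to the curved tails seen in Q–Q plots of skewed data.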
15.8 Choosing the Right Data Visualization
The best data visualization depends on the type of variable, the distribution of the data, and the goal of the analysis. For categorical data, bar plots are ideal for showing frequencies or comparisons between groups. For interval data, histograms and density plots are useful for visualizing the shape and spread of the distribution. Histograms group values into intervals, while density plots offer a smoother view of the distribution and are particularly helpful when comparing multiple groups.
Boxplots are useful when you want to summarize key statistics like the median, quartiles, and outliers, especially when comparing distributions across groups. Violin plots combine the summary features of boxplots with the shape information from density plots, offering a more detailed view of the distribution. Violin plots are best used with continuous data and larger sample sizes. Q–Q plots are most appropriate when you need to assess whether your data follow a specific theoretical distribution, such as the normal distribution; normality is often an assumption of parametric tests.
Ultimately, the right visualization is the one that most clearly and accurately communicates the structure of your data. In many cases, using multiple visualizations together, such as pairing a boxplot with a violin plot, can provide a more complete understanding.
Chapter 15 Summary and Key Takeaways
Data visualization is an invaluable tool for exploring, interpreting, and communicating the structure of a dataset. This chapter introduced several key types of visualizations, including bar plots, histograms, density plots, boxplots, violin plots, and Q–Q plots. Each provides unique insights into a dataset’s distribution, central tendency, and variability. Visual tools help identify patterns, detect outliers, and assess assumptions, such as normality, that guide statistical decision-making. In Jamovi, creating these visualizations is straightforward, enabling researchers to interpret their data more effectively and communicate findings clearly.
- Bar plots are ideal for visualizing and comparing frequencies across categorical variables.
- Histograms display the distribution of continuous data, helping reveal patterns like normality, skewness, or multimodality.
- Density plots provide a smoothed view of a distribution and are especially helpful when comparing multiple groups.
- Boxplots summarize key statistics such as the median and interquartile range, while highlighting potential outliers.
- Violin plots combine the structure of a boxplot with the shape of a density plot, offering a more detailed view of the distribution.
- Q–Q plots allow you to visually assess whether your data follow a theoretical distribution, such as the normal distribution.