"

16: Data Visualization

Chapter 16 Guiding Questions

  1. How can visualizations clarify or distort findings?
  2. What principles guide ethical and effective data visualization?
  3. How should charts be selected based on variable type and purpose?
  4. How can visuals support interpretation for non-technical audiences?

16.1 Understanding Data through Visuals

Data visualization is a powerful tool for exploring, understanding, and communicating data. It translates complex statistical concepts into intuitive, accessible visuals that can reveal patterns, trends, and outliers not easily detected in raw numbers. This chapter introduces common visualization techniques, including bar plots, histograms, density plots, boxplots, violin plots, and Q–Q plots. These tools are essential for summarizing data, assessing distributions, and identifying relationships within your dataset.

16.2 Bar Plots

Bar plots are one of the simplest and most effective ways to visualize nominal data. They use rectangular bars to represent the frequency of each category, so the length or height of each bar corresponds to the number of observations in that category. Bar plots are especially useful for comparing groups within a dataset and identifying patterns across categories.

How To: Bar Plots

To create bar plots in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.

  1. Move nominal variables into the Variables box.
  2. Under the Statistics drop-down, uncheck all options.
  3. Under Plots, check Bar plot under Bar Plots.

Understanding the Output

The output from the bar plots is shown below.

 

Jamovi interface displaying bar charts for nominal variables.
Figure 16.1. Bar Plot Results

Each bar represents a category, and the height of the bar indicates how many observations fall into that category. Taller bars reflect categories with more participants, while shorter bars reflect fewer participants.

To interpret the chart, compare the relative heights of the bars rather than focusing only on individual counts. This allows you to quickly identify which category is most common and which categories are less represented. Large differences in bar height indicate uneven distribution across categories, whereas bars of similar height suggest a more balanced distribution.

Because this is a frequency-based plot, it does not show averages or variability; it simply summarizes how observations are distributed across categories.
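
For readers who want to see the counting that sits behind a bar plot, the short Python sketch below tallies category frequencies and prints a text-only bar for each one. It is not part of Jamovi, and the `responses` list is made-up illustrative data rather than data from this chapter.

```python
from collections import Counter

# Hypothetical nominal responses (illustrative only, not from the chapter)
responses = ["Urban", "Rural", "Urban", "Suburban", "Urban", "Rural"]

# Count how many observations fall into each category
counts = Counter(responses)

# Print one text bar per category, most frequent category first
for category, n in counts.most_common():
    print(f"{category:<10} {'#' * n} ({n})")
```

As in the Jamovi output, the longest bar marks the most common category; nothing here describes averages or variability.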

16.3 Histograms

A histogram is a graphical representation of the distribution of interval data. It organizes values into intervals, called bins, and shows how many data points fall into each bin. Think of bins as ranges of values. Each bar in the histogram represents one of these ranges, and the height of the bar shows how many values fall within that range.

The number and width of bins are determined by the data range and the number of observations. The goal is to balance detail and readability: too few bins can hide important features, while too many can make the plot noisy or hard to interpret.
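
The effect of the bin count can be illustrated with a small Python sketch. It assumes equal-width bins that span the minimum to the maximum value, which is one common convention and not necessarily the rule Jamovi applies; the `scores` list and the `bin_counts` helper are hypothetical and exist only for this example.

```python
# Hypothetical interval scores (illustrative only, not from the chapter)
scores = [52, 55, 61, 63, 64, 67, 68, 70, 71, 71, 73, 75, 78, 82, 90]

def bin_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning min(values)..max(values).

    Assumes the values are not all identical, so the bin width is nonzero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts

coarse = bin_counts(scores, 2)  # few bins: only a rough split of the scale
fine = bin_counts(scores, 8)    # many bins: more detail, but sparser bars

print("2 bins:", coarse)
print("8 bins:", fine)
```

Both calls count the same 15 observations; only the grouping changes, which is why the same data can look smooth or noisy depending on the bins.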

Histograms are especially useful for visualizing the shape of a distribution. They help you see how data are spread out and whether they cluster in certain ranges. Histograms also make it easier to identify patterns such as normality, skewness, or multimodality, which may not be obvious from raw data alone.

How To: Histograms

To create histograms in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.

  1. Move interval variables into the Variables box.
  2. Under Plots, check Histogram under Histograms.

Understanding the Output

The output from the histograms is shown below.

 

Jamovi interface showing histograms for interval variables.
Figure 16.2. Histogram Results

To interpret a histogram, begin by examining the overall shape of the distribution. Look for whether the bars form a roughly symmetric pattern, cluster around a central value, or show a noticeable skew to one side. A symmetric shape suggests the data are fairly evenly distributed around the center, while a longer tail on one side indicates skewness.

Next, consider where the highest bars are located. These represent the ranges of values with the greatest concentration of observations. This helps you identify where most participants scored and whether the distribution is concentrated in the lower, middle, or upper portion of the scale.

Also observe the spread of the data across the horizontal axis. A wider spread indicates greater variability, while a tighter cluster of bars suggests less variability. Gaps or isolated bars may indicate potential outliers or less common score ranges.

Finally, assess whether the distribution appears approximately normal (bell-shaped) or deviates from normality. Histograms provide a visual complement to numerical statistics such as mean, standard deviation, skewness, and kurtosis, helping you better understand the overall pattern of the data.

16.4 Density Plots

A density plot is a smoothed version of a histogram that displays the distribution of interval data using a continuous curve. It estimates the probability density function of the variable, showing where values are more or less concentrated. Unlike histograms, which group data into bins, density plots do not depend on where bin boundaries fall, which gives a smoother view of the data’s shape and makes skewness or multiple peaks easier to identify. Density plots are also helpful when comparing distributions across groups, as overlapping curves make differences easier to see than stacked or side-by-side histograms.
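
One common way to estimate such a curve is a Gaussian kernel density estimate, in which a small normal curve is centered on every observation and the curves are averaged. The Python sketch below illustrates that idea under stated assumptions: the `scores` data, the `gaussian_kde` helper, and the `bandwidth` value of 5.0 are all hypothetical, and Jamovi's own smoothing settings may differ.

```python
import math

# Hypothetical interval scores (illustrative only, not from the chapter)
scores = [52, 55, 61, 63, 64, 67, 68, 70, 71, 71, 73, 75, 78, 82, 90]

def gaussian_kde(values, x, bandwidth):
    """Estimate the density at x by averaging normal curves centered on each value."""
    total = 0.0
    for v in values:
        z = (x - v) / bandwidth
        total += math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return total / (len(values) * bandwidth)

bandwidth = 5.0            # assumed smoothing width; larger values give a flatter curve
grid = range(40, 101, 5)   # points on the score scale where the curve is evaluated
density = [gaussian_kde(scores, x, bandwidth) for x in grid]

# The grid point with the highest estimated density marks the peak of the curve
peak_x = max(zip(density, grid))[1]
print("Peak of the estimated curve is near", peak_x)
```

The curve is higher where observations are packed closely together, which is the same information the tallest histogram bars convey.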

How To: Density Plots

To create density plots in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.

  1. Move interval variables into the Variables box.
  2. Under Plots, check Density under Histograms.

TIP: You can create a histogram with an overlaid density plot to visualize both the distribution and its underlying shape.

Understanding the Output

The output from the density plots is shown below.

 

Jamovi interface showing density plots for interval variables.
Figure 16.3. Density Plot Results

To interpret a density plot, focus on the overall shape of the smooth curve. A density plot represents the distribution of an interval variable, showing where values are concentrated across the range of the scale. The higher the curve at a particular point, the greater the concentration of observations in that range.

Begin by identifying where the peak of the curve occurs. This indicates the most common range of values. If the curve has a single peak, the distribution is unimodal; multiple peaks suggest more than one cluster of values.

Next, examine the symmetry of the curve. If the left and right sides mirror each other, the distribution is approximately symmetric. If one tail extends farther than the other, the distribution is skewed in that direction. A longer right tail indicates positive skew, while a longer left tail indicates negative skew.

Also consider the spread of the curve. A wider, flatter curve reflects greater variability, while a narrower, taller curve suggests less variability. Density plots are especially helpful for visually assessing normality and comparing the overall shape of multiple distributions.

16.5 Boxplots

A boxplot, also known as a box-and-whisker plot, provides a compact visual summary of a dataset’s distribution and variability. It displays the minimum, lower quartile (Q1), median, upper quartile (Q3), and maximum. The box spans the interquartile range (IQR), which contains the middle 50% of the data. A line inside the box marks the median.

The whiskers extend to the smallest and largest values within 1.5 times the IQR from the quartiles. Data points beyond this range are plotted individually and considered outliers. Boxplots are especially useful for comparing distributions across groups and identifying asymmetry or extreme values in the data.
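
The 1.5 × IQR rule can be worked through by hand. The Python sketch below does so for a small made-up dataset (the `scores` list is hypothetical and includes one deliberately extreme value). Quartile conventions differ between programs, so the quartiles computed here with Python's `statistics.quantiles` may not match Jamovi's output exactly.

```python
import statistics

# Hypothetical interval scores with one extreme value (illustrative only)
scores = [52, 55, 61, 63, 64, 67, 68, 70, 71, 71, 73, 75, 78, 82, 120]

# Quartiles; the "inclusive" method is one of several conventions
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1

# Fences sit 1.5 * IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme values that are still inside the fences
inside = [v for v in scores if lower_fence <= v <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

# Anything outside the fences is plotted individually as a potential outlier
outliers = [v for v in scores if v < lower_fence or v > upper_fence]

print("Q1, median, Q3:", q1, median, q3)
print("Whiskers:", whisker_low, "to", whisker_high)
print("Potential outliers:", outliers)
```

Note that the whiskers stop at actual data values inside the fences, not at the fences themselves.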

How To: Boxplots

To create boxplots in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.

  1. Move interval or ordinal variables into the Variables box.
  2. Under Plots, check Box plot and Mean under Box Plots.

Understanding the Output

The output from the boxplots is shown below.

 

Jamovi interface showing box plots for interval variables.
Figure 16.4. Boxplot Results

To interpret a boxplot, begin by focusing on the box, which represents the middle 50% of the data. The bottom of the box marks the 25th percentile (Q1), and the top marks the 75th percentile (Q3). The height of the box therefore reflects the interquartile range (IQR), which shows the spread of the central half of the distribution.

The line inside the box represents the median, or the midpoint of the data. If the median is centered within the box, the distribution is relatively symmetric. If it is closer to one edge, this suggests skewness in the direction of the longer portion of the box.

The dot inside the box represents the mean, or average value. When the mean and median are close together, the distribution is fairly symmetric. If the mean is noticeably higher than the median, this suggests positive skew. If the mean is noticeably lower than the median, this suggests negative skew. Because the mean is influenced by extreme values, it may be pulled in the direction of outliers, which is why it does not always align with the median.
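
A small numeric illustration of this point, written in Python with two made-up datasets (both lists are hypothetical): replacing the largest value with an extreme one moves the mean but leaves the median where it was.

```python
import statistics

# Two hypothetical datasets that differ only in their largest value
symmetric = [2, 4, 5, 6, 8]
right_skewed = [2, 4, 5, 6, 30]

for label, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    # A mean above the median is consistent with positive skew
    print(f"{label}: mean = {mean}, median = {median}")
```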

The whiskers extend from the box to the lowest and highest values that are not considered outliers. Longer whiskers indicate greater spread in that direction. If one whisker is noticeably longer than the other, this may suggest skewness.

Any points beyond the whiskers are potential outliers. These are values that fall unusually far from the rest of the data and may warrant further investigation.

Boxplots allow you to quickly assess central tendency, variability, symmetry, and potential outliers, and they are especially useful for comparing distributions across variables or groups.

16.6 Violin Plots

A violin plot combines the features of a boxplot and a density plot into a single visualization. It displays the distribution of a variable and its probability density across different values, and it can optionally include a boxplot inside the violin to show the median and quartiles. Unlike a standard boxplot, a violin plot reveals the full shape of the distribution, including potential skewness or multiple peaks. This provides a more detailed and informative view of the data, making it especially useful for comparing distributions across groups.

How To: Violin Plots

To create violin plots in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.

  1. Move interval variables into the Variables box.
  2. Under Plots, check Violin and Data under Box plots.

TIP: You can create a boxplot with an overlaid violin plot to visualize the distribution, its shape, and any outliers, which is helpful for interval variables.

Understanding the Output

The output from the violin plots is shown below.

 

Jamovi interface showing violin plots for interval variables.
Figure 16.5. Violin Plot Results

A violin plot shows both summary data and the overall shape of a distribution. The width of the violin at any point reflects how concentrated the data are at that value. Wider sections indicate more observations, while narrower sections indicate fewer. This allows you to see where scores cluster and whether the distribution is symmetric or skewed. The spread from top to bottom shows the range of the data, and the overlaid points display individual observations, helping you identify clustering, gaps, or potential outliers.
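
The link between width and concentration can be imitated with a rough text-only violin. The Python sketch below is an illustration under stated assumptions, not a reproduction of Jamovi's plot: the `scores` data, the `density_at` helper, the bandwidth of 4.0, and the display scale of 400 are all hypothetical choices made for this example.

```python
import math

# Hypothetical interval scores (illustrative only, not from the chapter)
scores = [52, 55, 61, 63, 64, 67, 68, 70, 71, 71, 73, 75, 78, 82, 90]

def density_at(values, x, bandwidth):
    """Gaussian kernel density estimate at x (one common smoothing approach)."""
    total = sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
    return total / (len(values) * bandwidth * math.sqrt(2 * math.pi))

levels = range(95, 45, -5)  # score levels from the top of the violin to the bottom
# Half-width of the violin at each level, scaled by an arbitrary display factor
widths = {level: round(density_at(scores, level, 4.0) * 400) for level in levels}

for level in levels:
    # Mirror the half-width on both sides of the center line
    print(f"{level:>3} {('*' * (2 * widths[level])).center(40)}")
```

The widest rows appear where the scores cluster, and the rows taper toward the extremes where observations are sparse.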

16.7 Q–Q Plots

A Q–Q plot (quantile–quantile plot) is a graphical tool used to assess whether a dataset follows a specific theoretical distribution, such as the normal distribution. It compares the values in your dataset to the values that would be expected if the data were perfectly normal. Each point on the plot represents a single case (or observation) in your dataset. Its position is based on how the value ranks within your data (its sample quantile) and the corresponding value it would have if the data followed the theoretical distribution (the theoretical quantile).

If the data follow the expected distribution, the points will fall roughly along a straight diagonal line. If the points curve away from the line, it suggests the data deviate from that distribution. Q–Q plots are especially useful for visually assessing normality, which is a common assumption in many statistical tests.
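
The pairing of sample and theoretical quantiles can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `scores` data are made up, the plotting positions `(i - 0.5) / n` are one common convention among several, and the reference normal distribution uses the sample mean and standard deviation. Jamovi may use different positions and standardized values, so its plot can differ in scale.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical interval scores (illustrative only, not from the chapter)
scores = [52, 55, 61, 63, 64, 67, 68, 70, 71, 71, 73, 75, 78, 82, 90]
n = len(scores)

# Sample quantiles are simply the observed values in rank order
sample_quantiles = sorted(scores)

# Assumed plotting positions: the cumulative proportion assigned to each rank
probabilities = [(i - 0.5) / n for i in range(1, n + 1)]

# Theoretical quantiles from a normal distribution with the sample mean and SD
reference = NormalDist(mean(scores), stdev(scores))
theoretical_quantiles = [reference.inv_cdf(p) for p in probabilities]

# Each pair is one point on the Q-Q plot; similar values lie near the diagonal
for theoretical, observed in zip(theoretical_quantiles, sample_quantiles):
    print(f"theoretical = {theoretical:6.1f}   observed = {observed}")
```

Pairs in which the observed value is far from the theoretical one correspond to points that sit away from the reference line.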

How To: Q–Q Plots

To create Q–Q plots in Jamovi, go to the Analyses tab, select Exploration, then Descriptives.

  1. Move interval variables into the Variables box.
  2. Under Plots, check Q-Q under Q-Q Plots.

Understanding the Output

The output from the Q–Q plots is shown below.

 

Jamovi interface showing Q–Q plots for interval variables.
Figure 16.6. Q–Q Plot Results

To interpret a Q–Q plot, first check whether the points closely follow the diagonal reference line. If they do, the distribution is approximately normal. Small deviations near the center are typically not concerning. However, noticeable departures from the line, especially consistent curves or larger gaps at the tails, suggest skewness, heavy tails, or other departures from normality. Points that fall far from the line at either end may indicate potential outliers. The closer the points align with the reference line, the stronger the evidence that the normality assumption is reasonable.

16.8 Choosing the Right Data Visualization

The best data visualization depends on the type of variable, the distribution of the data, and the goal of the analysis. For nominal data, bar plots are ideal for showing frequencies or comparisons between groups. For continuous data, histograms and density plots are useful for visualizing the shape and spread of the distribution. Histograms group values into intervals, while density plots offer a smoother view of the distribution and are particularly helpful when comparing multiple groups.

Boxplots are useful when you want to summarize key statistics like the median, quartiles, and outliers, especially when comparing distributions across groups. Violin plots combine the summary features of boxplots with the shape information from density plots, offering a more detailed view of the distribution. These are best used with interval data and larger sample sizes. Q–Q plots are most appropriate when you need to assess whether your data follow a specific theoretical distribution, such as normality, which is often a requirement for parametric tests.
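
The guidance above can be restated as a simple lookup table. The Python sketch below is only a summary device: the labels used for variable types and goals, and the `suggest_plots` helper, are hypothetical names chosen for this example, and real decisions will also depend on sample size and audience.

```python
# Guidance from this section restated as a lookup table.
# The (variable type, goal) labels are hypothetical names for this sketch.
PLOT_GUIDE = {
    ("nominal", "frequencies"): ["bar plot"],
    ("interval", "shape"): ["histogram", "density plot"],
    ("interval", "summary"): ["boxplot", "violin plot"],
    ("interval", "normality"): ["Q-Q plot"],
}

def suggest_plots(variable_type, goal):
    """Return the plots suggested for this combination, or an empty list if none match."""
    return PLOT_GUIDE.get((variable_type, goal), [])

print(suggest_plots("interval", "shape"))
print(suggest_plots("nominal", "frequencies"))
```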

Building on the previous chapter, these visuals can be produced within grouped descriptives to examine patterns across groups alongside measures of central tendency, variability, and distribution.

Jamovi includes a separate Plots tab that allows you to customize visualizations to better meet your specific analysis and presentation needs. However, because you can only produce one plot at a time, it is less efficient when creating multiple visuals as part of a larger analysis.

Ultimately, the right visualization is the one that most clearly and accurately communicates the structure of your data. In many cases, using multiple visualizations together, such as pairing a boxplot with a violin plot, can provide a more complete understanding.

Chapter 16 Summary and Key Takeaways

Data visualization is an invaluable tool for exploring, interpreting, and communicating the structure of a dataset. This chapter introduced several key types of visualizations, including bar plots, histograms, density plots, boxplots, violin plots, and Q–Q plots. Each provides unique insights into a dataset’s distribution, central tendency, and variability. Visual tools help identify patterns, detect outliers, and assess assumptions, such as normality, that guide statistical decision-making. In Jamovi, creating these visualizations is straightforward, enabling researchers to interpret their data more effectively and communicate findings clearly.

  • Bar plots are ideal for visualizing and comparing frequencies across nominal variables.
  • Histograms display the distribution of continuous data, helping reveal patterns like normality, skewness, or multimodality.
  • Density plots provide a smoothed view of a distribution and are especially helpful when comparing multiple groups.
  • Boxplots summarize key statistics such as the median and interquartile range, while highlighting potential outliers.
  • Violin plots combine the structure of a boxplot with the shape of a density plot, offering a more detailed view of the distribution.
  • Q–Q plots allow you to visually assess whether your data follow a theoretical distribution, such as the normal distribution.