Principles of posterior visualization

In Bayesian statistics, results take the form of a posterior distribution: a measure of uncertainty quantified in terms of probability. Bayesian statistics is a mature field. However, visualization of posterior distributions has not been treated as a distinct problem. Visualization methods developed for other types of problems are widely used, often inappropriately. Methods for visualizing posteriors over spaces of complex objects (e.g. graphs, phylogenetic trees, clusterings, alignments, covariance matrices) are immature. Here I try to establish a few principles that can be used to filter out improper visualization techniques and to develop correct ones.

Of course, to judge whether one figure is better than another, we need to understand the context. Here I focus on visualization intended for communication. This means that we already have a posterior distribution and want to present it (or some of its features) in a way that is honest, easy to understand, and hard to misinterpret.

Principle 1: Uncertainty should be visualized

In Bayesian statistics, uncertainty is an essential part of the result. Not visualizing it is as wrong as concealing half of the result.

Principle 2: Visualization of variability ≠ Visualization of uncertainty

The boxplot is a striking example of this principle. It is a perfect tool for showing variability in data, but it should not be used for visualizing a posterior distribution. The inner interval of the boxplot (the box) contains almost the same probability mass as the outer intervals (the whiskers), but the two are presented in completely different ways. This deceives the reader, leaving an overconfident impression of the estimates.
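To make the claim about probability mass concrete, here is a minimal sketch that computes how much mass falls inside the box and inside the two whiskers. It assumes a normal posterior and the common Tukey convention of whiskers reaching 1.5 interquartile ranges beyond the quartiles; both are my assumptions for illustration, and the function name `boxplot_masses` is hypothetical.

```python
from statistics import NormalDist


def boxplot_masses(dist: NormalDist) -> tuple[float, float]:
    """Return (mass inside the box, mass inside the two whiskers combined)."""
    q1, q3 = dist.inv_cdf(0.25), dist.inv_cdf(0.75)
    iqr = q3 - q1
    # Assumed Tukey convention: whiskers reach 1.5 * IQR beyond the quartiles.
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    box = dist.cdf(q3) - dist.cdf(q1)
    whiskers = (dist.cdf(upper) - dist.cdf(lower)) - box
    return box, whiskers


box_mass, whisker_mass = boxplot_masses(NormalDist(mu=0.0, sigma=1.0))
print(f"box: {box_mass:.3f}, whiskers: {whisker_mass:.3f}")
```

Under these assumptions the box holds about 0.50 of the probability and the whiskers about 0.49, even though the box is drawn as a filled rectangle and the whiskers as thin lines.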

UV_2

Principle 3: Equal probability = Equal ink

UV_2

Here the same distribution is presented with four different methods united by the same principle: the same probability mass is visualized with the same amount of ink. The ink is represented either by the size of the colored area or by the color concentration. These figures are intuitive and hard to misinterpret.
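One possible reading of this principle, sketched below, is to cut the posterior into bands of equal probability mass and give each band a color concentration inversely proportional to its width, so that width times concentration is the same for every band. This is not the code behind the figures above; the function name `equal_mass_bands`, the band count, and the simulated normal samples are all assumptions made for illustration.

```python
import random
from statistics import quantiles


def equal_mass_bands(samples, n_bands=10):
    """Split posterior samples into bands of (approximately) equal mass.

    Returns a list of (lower, upper, concentration) tuples, where
    concentration = mass / width, so each band carries the same "ink".
    """
    cuts = quantiles(samples, n=n_bands)  # n_bands - 1 interior cut points
    edges = [min(samples), *cuts, max(samples)]
    mass = 1.0 / n_bands
    return [(lo, hi, mass / (hi - lo)) for lo, hi in zip(edges, edges[1:])]


# Hypothetical posterior samples, simulated here only for illustration.
random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(5000)]

bands = equal_mass_bands(samples, n_bands=10)
for lo, hi, concentration in bands:
    print(f"[{lo:6.2f}, {hi:6.2f}]  concentration = {concentration:.3f}")
```

The narrow central bands get a high concentration and the wide tail bands a low one, which could be mapped to opacity or color saturation in a plotting library of your choice.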

Principle 4: Do not overemphasize the point estimate

If the mean (or median, or mode) of the distribution is heavily highlighted, this can give an overconfident impression of the estimates.

UV_1

Here I highlight the means with thin white lines (the color of the background).

Principle 5: Certain estimates should be emphasized over uncertain ones

More certain results are more significant and interesting than vague, improbable ones. Therefore, a visualization should emphasize certain estimates. However, uncertain estimates are naturally represented by wider distributions, which occupy more visual space.

UV_0

Note: These principles (like any visualization principles) are contextual and should be applied (or not) with the goals of the visualization in mind.

11 thoughts on “Principles of posterior visualization”

  1. Rolf says:

    Many thanks for the suggestions – Could you share the code to produce the graphics in this post and in the post on reinventing the histogram? Thanks!
