Principles of posterior visualization

In Bayesian statistics, results take the form of a posterior distribution: a measure of uncertainty quantified in terms of probability. Bayesian statistics is a mature field. However, visualization of posterior distributions has not been understood as a distinct problem. Inappropriate visualization methods, developed for other types of problems, are widely used. Methods for visualizing a posterior over a space of complex objects (e.g. graphs, phylogenetic trees, clusterings, alignments, covariance matrices) are immature. Here I try to establish a few principles that can be used to filter out improper visualization techniques and to develop correct ones.

Of course, to judge whether one figure is better than another, we need to understand the context. Here I focus on visualization intended for communication. This means that we already have a posterior distribution and want to present it (or some of its features) in a way that is honest, easy to understand and hard to misinterpret.

Principle 1: Uncertainty should be visualized

In Bayesian statistics, uncertainty is an essential part of the results. Not visualizing it is as wrong as concealing half of the result.

Principle 2: Visualization of variability ≠ Visualization of uncertainty

The boxplot is a striking example of this principle. It is a perfect tool for showing variability in data, but it should not be used for visualizing a posterior distribution. The inner interval of the boxplot (the box) contains almost the same probability mass as the outer intervals (the whiskers), yet the two are presented in completely different ways. This deceives the reader, leaving an overconfident impression of the estimates.
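The claim about probability mass can be checked numerically. The following is a minimal sketch, assuming a standard normal as a stand-in for real posterior draws and the common Tukey convention of whiskers reaching to 1.5 times the interquartile range; the variable names are illustrative only.

```python
import random
import statistics

random.seed(1)
# Hypothetical posterior draws; a standard normal stands in for real MCMC output.
samples = [random.gauss(0.0, 1.0) for _ in range(20000)]

# The box spans the first to the third quartile.
q1, _, q3 = statistics.quantiles(samples, n=4)
iqr = q3 - q1

# Assumed whisker rule (Tukey): whiskers end at the most extreme draw
# lying within 1.5 * IQR of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

n = len(samples)
box_mass = sum(q1 <= x <= q3 for x in samples) / n
whisker_mass = sum(
    (lower_fence <= x < q1) or (q3 < x <= upper_fence) for x in samples
) / n
outlier_mass = 1.0 - box_mass - whisker_mass

print(f"box: {box_mass:.3f}, whiskers: {whisker_mass:.3f}, outliers: {outlier_mass:.3f}")
```

Under these assumptions the box and the two whiskers each hold roughly half of the draws, although the box is drawn as a filled rectangle and the whiskers as thin lines.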

Principle 3: Equal probability = Equal ink

Here the same distribution is presented with four different methods united by one principle: the same probability mass is visualized with the same amount of ink. The ink is represented either by the size of the colored area or by the color concentration. These figures are intuitive and hard to misinterpret.
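One possible reading of this principle, for the color-concentration variant, is sketched below. It assumes that the posterior is cut into quantile bands of equal probability mass, that each band is drawn as a strip of fixed height, and that "ink" is approximated as strip width times opacity. These are assumptions made for illustration, not a description of how the figures in this post were produced.

```python
import random
import statistics

random.seed(2)
# Hypothetical posterior draws.
samples = [random.gauss(0.0, 1.0) for _ in range(10000)]

# Cut the posterior into bands that each hold the same probability mass.
n_bands = 10
cuts = statistics.quantiles(samples, n=n_bands)
edges = [min(samples)] + cuts + [max(samples)]
widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]

# Assumed ink model: ink = width * opacity for a strip of fixed height.
# Opacity is set inversely proportional to width, so wide (low-density)
# bands are drawn pale and narrow (high-density) bands are drawn saturated.
narrowest = min(widths)
opacity = [narrowest / w for w in widths]
ink = [w * a for w, a in zip(widths, opacity)]

for lo, hi, a in zip(edges[:-1], edges[1:], opacity):
    print(f"[{lo:6.2f}, {hi:6.2f}]  opacity {a:.2f}")
```

Each band carries one tenth of the probability and, under this ink model, the same amount of ink. The opacities could be passed to any plotting library as the transparency of each strip.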

Principle 4: Do not overemphasize the point estimate

If the mean (or median, or mode) of the distribution is heavily highlighted, it can give an overconfident impression of the estimates.

Here I highlight the means with thin white lines (the color of the background).

Principle 5: Certain estimates should be emphasized over uncertain ones

More certain results are more significant and interesting than vague, improbable ones. Therefore, visualization should emphasize certain estimates. However, uncertain estimates are naturally represented by wider distributions, which occupy more visual space.
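One design described in a pingback to this post addresses this tension by making the height of each histogram proportional to the square root of the maximum of the distribution. The sketch below illustrates that scaling for normal posteriors, where the maximum density is known in closed form. The function names, the reference width and the example standard deviations are hypothetical.

```python
import math


def peak_density(sigma):
    """Maximum of a normal density with standard deviation sigma."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


def display_height(sigma, reference_sigma=1.0):
    """Plot height relative to a reference posterior.

    Assumed rule: height is proportional to the square root of the
    peak density, so narrow (certain) posteriors are drawn taller.
    """
    return math.sqrt(peak_density(sigma) / peak_density(reference_sigma))


# Hypothetical posteriors, identified by their standard deviations.
posteriors = {"certain": 0.25, "moderate": 1.0, "vague": 4.0}
heights = {name: display_height(s) for name, s in posteriors.items()}

for name, h in heights.items():
    print(f"{name:9s} relative height {h:.2f}")
```

With a common height for all panels, the vague posterior would dominate the figure simply by being wide. With heights proportional to the peak density itself, it would nearly vanish. The square root is a compromise between the two.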

Note: These principles (like any visualization principles) are contextual and should be applied (or not) with the goals of the visualization in mind.

11 thoughts on “Principles of posterior visualization”

[…] My favourite design in this figure: The height of each histogram is proportional to the square root of the maximum of the distribution. This is made to emphasize the certain estimates (see principles of posterior visualization) […]

April 15, 2015 at 10:12 pm

[…] How to visualize an uncertainty about a time-dependent variable according to the principles of uncertainty visualization? […]

June 23, 2015 at 3:55 pm