How to draw a beautiful histogram chart

BioTuring Team
4 min read · Dec 12, 2017


One of the best-known tools for investigating the distribution of a variable is the histogram. The easiest way to create one in BioVinci is to drop any numeric column into the view area; the histogram then shows up as the default plot.

The simple way to create a Histogram chart

Histograms of multiple variables in one view

It is often useful to visualize two or more variables in one view. BioVinci supports this common case directly: drop multiple variables into the Value placeholder. The short animation below shows this operation.

Create histogram using multiple variables

Additional options

Now let us examine one important factor that determines how your histograms look: the number of bins. By default, we set this value to 30. However, a common choice for the number of bins is the square root of the number of items in your data. In practice, you should try several values of this parameter to find the best visualization.
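To make the square-root rule concrete, here is a minimal Python sketch, independent of BioVinci. The function names and the toy data are made up for illustration, and the equal-width binning is only one reasonable way to count values per bin, not necessarily what BioVinci does internally.

```python
import math


def sqrt_bin_count(n_items):
    """Square-root rule: use roughly sqrt(n) bins for n observations."""
    return max(1, math.ceil(math.sqrt(n_items)))


def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical: everything lands in the first bin.
        width = 1.0
    counts = [0] * n_bins
    for v in values:
        # The maximum value would index one past the end, so clamp it
        # into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


values = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # made-up toy data
n_bins = sqrt_bin_count(len(values))   # 9 observations -> 3 bins
print(n_bins, histogram_counts(values, n_bins))
```

Rerunning the last lines with a different n_bins is the code equivalent of trying several values of the parameter.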

Add mean and median lines

You can add mean or median lines for all the variables in the current histogram by selecting Add -> mean in the menu.

Add mean line in histogram plot
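If you want to check where those lines should fall, the two statistics are easy to compute yourself. The sketch below uses Python's standard statistics module on made-up, right-skewed data; it only illustrates the numbers behind the lines and is not BioVinci code.

```python
import statistics

# Made-up, right-skewed toy data.
values = [1, 2, 2, 3, 3, 3, 4, 20]

mean_x = statistics.mean(values)      # x position of the mean line
median_x = statistics.median(values)  # x position of the median line
print(mean_x, median_x)
```

On skewed data like this, the single large value pulls the mean to the right of the median, which is why it can be worth drawing both lines.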

Switch between density, histogram, and area plot

Besides the histogram, users can switch to other distribution plot types, such as density and area. In the animation below, I switch my histogram to the density type and then to the area type.

How to draw density plot rapidly
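For readers curious about what a density curve represents, here is a minimal sketch of a Gaussian kernel density estimate in plain Python. This is an assumption about the general technique, not a description of how BioVinci computes its density plot; the function name, the bandwidth value, and the data are all illustrative.

```python
import math


def gaussian_kde(values, x, bandwidth):
    """Estimate the density at x as an average of Gaussian bumps,
    one centred on each observation."""
    n = len(values)
    total = 0.0
    for v in values:
        z = (x - v) / bandwidth
        total += math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return total / (n * bandwidth)


values = [1.0, 2.0, 3.0]             # made-up toy data
grid = [i * 0.5 for i in range(9)]   # 0.0, 0.5, ..., 4.0
curve = [gaussian_kde(values, x, bandwidth=0.5) for x in grid]
print(curve)
```

The bandwidth plays the same role for a density plot as the number of bins does for a histogram: smaller values give a bumpier curve, larger values a smoother one.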

Switch between identity, fill, and stack position

Identity mode

By default, each bar in the standard histogram represents the frequency for a range of values of your variable. If you have two or more groups, those bars will overlap. For example, in the figure below we plot the histogram of the variable radius_mean for two groups, Benign and Malignant, which are indicated in the diagnosis column. With identity mode, you can see the two distributions overlapping each other.

How to plot overlaid histograms

Stacked mode

The stacked style is a great way to investigate in detail how each group contributes to the frequency of a particular variable. After dropping one column into the Variable placeholder and another into the Color placeholder, select Position -> stack in the right-click menu to switch to the stacked style.

How to plot a histogram using multiple columns

For instance, users can add more than one variable to the Value placeholder to create a stacked histogram. The values in each bar are accumulated, and its colors show how each variable contributes to the frequency in that range of values.

How to draw stacked style histograms

Fill mode

As with the stacked style, you can scale a stacked histogram up to 100 percent, so that each part of a bar represents its proportion of the total frequency in that range. The animation below illustrates how to switch to the fill mode of the histogram and of the density plot, respectively.

How to plot fill styled histogram
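The difference between the three positions comes down to simple arithmetic on the per-bin frequencies. In identity mode every group's bar starts at zero; the sketch below shows how stack and fill could be derived from the same counts. The function names and the counts are made up for illustration and are not taken from BioVinci.

```python
def stack_segments(group_counts):
    """Stack mode: each group's bar starts where the previous group's ended.

    Returns {group: [(bottom, top), ...]} with one pair per bin.
    """
    n_bins = len(next(iter(group_counts.values())))
    running = [0] * n_bins
    segments = {}
    for group, counts in group_counts.items():
        segments[group] = [(running[i], running[i] + c) for i, c in enumerate(counts)]
        running = [running[i] + c for i, c in enumerate(counts)]
    return segments


def fill_proportions(group_counts):
    """Fill mode: rescale every bin so its stacked bar sums to 1 (100 percent)."""
    totals = [sum(col) for col in zip(*group_counts.values())]
    return {
        group: [c / t if t else 0.0 for c, t in zip(counts, totals)]
        for group, counts in group_counts.items()
    }


# Made-up per-bin frequencies for two groups.
counts = {"Benign": [5, 3, 0], "Malignant": [1, 3, 0]}
print(stack_segments(counts))
print(fill_proportions(counts))
```

Empty bins are given a proportion of 0.0 here to avoid dividing by zero; that is a choice made for this sketch only.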

Small plots send a big message

Don’t be afraid to make your charts smaller to tell a bigger story. Small charts aligned to the same scale capture the overview of your dataset. They reveal the pattern of each individual plot, which you can then easily compare with the others.

You can split your histogram into multiple charts just by dropping a category column into the Split placeholder, as shown in the figure below.

How to plot multiple histograms
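Conceptually, splitting means grouping the values by a category column and then drawing every group on the same axis range so the small charts stay comparable. Here is a minimal Python sketch of that idea; the column names "value" and "group", the helper names, and the rows are all hypothetical.

```python
from collections import defaultdict


def split_values(rows, value_key, split_key):
    """Group the values of one numeric column by a category column."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[split_key]].append(row[value_key])
    return dict(panels)


def shared_range(panels):
    """One (min, max) range for all panels, so every small chart
    uses the same x scale."""
    all_values = [v for values in panels.values() for v in values]
    return min(all_values), max(all_values)


# Made-up rows with hypothetical column names.
rows = [
    {"value": 1.0, "group": "A"},
    {"value": 4.0, "group": "B"},
    {"value": 2.0, "group": "A"},
    {"value": 6.0, "group": "B"},
]
panels = split_values(rows, "value", "group")
print(panels, shared_range(panels))
```

Each entry of panels would become one small histogram, binned over the shared range rather than over its own minimum and maximum.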

You can even drop multiple variables into the Value placeholder and then use the Split option in the right-click menu to do the same thing.

How to plot multiple density plots using multiple columns

Color your histogram then split into groups

Even in the small charts, you can split the data into multiple colors by using both the Color and Split placeholders. In the animation below, I split the data of the Attack variable into groups of Type.1 by putting Type.1 into the Split placeholder. Similarly, I split the data in each small plot into two groups of Legendary by putting Legendary into the Color placeholder.

Split your data into two levels, Split and color

You can do the same with multiple variables, such as Attack from the previous example, after dropping them into the Value placeholder. Then put a character column, such as Type.1 from the previous example, into the Split placeholder.

Another way to split your data into multiple charts using multiple columns

Written by BioTuring Team

At BioTuring, we dream, we think, we code, and we deliver important algorithms and software — to tackle biomedical challenges.
