How to draw a beautiful histogram chart
The histogram is one of the best-known tools for investigating the distribution of a variable. The easiest way to create one in BioVinci is to drop any numeric column into the view area. The histogram then shows up as the default plot.
Histograms of multiple variables in one view
It is often useful to visualize two or more variables in one view. BioVinci supports this common case: just drop multiple variables into the Value placeholder. The short animation below shows this operation.
Additional options
Now let us examine one important factor that determines what your histograms look like: the number of bins. By default, we set this value to 30. However, a common choice for the number of bins is the square root of the number of items in your data. In practice, you should try several values of this parameter to find the best visualization.
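To make the square-root rule concrete, here is a minimal sketch in plain Python. It only illustrates the arithmetic and is not how BioVinci computes its bins internally; the function names and the sample data are made up for this example.

```python
import math
import random


def sqrt_bin_count(n_items):
    """Square-root rule: use about sqrt(number of items) bins."""
    return max(1, math.ceil(math.sqrt(n_items)))


def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * n_bins
    for v in values:
        # The largest value goes into the last bin rather than a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


random.seed(0)
data = [random.gauss(14.0, 3.5) for _ in range(500)]  # made-up sample

# Compare the square-root choice with the default of 30 bins.
for n_bins in (sqrt_bin_count(len(data)), 30):
    counts = histogram_counts(data, n_bins)
    print(n_bins, "bins, tallest bar:", max(counts))
```

For 500 items the rule suggests 23 bins, which is close to the default of 30, so the two histograms should look similar. The difference grows for very small or very large datasets.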
Add mean and median lines
You can add mean or median lines for all the variables in the current histogram just by selecting Add -> mean in the menu.
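If you want to check where those lines should fall, the two statistics are easy to compute by hand. The sketch below uses Python's standard statistics module on made-up values; the variable name radius_mean is borrowed from the example dataset used later in this article, and nothing here reflects BioVinci's internals.

```python
import statistics

# Made-up, right-skewed values for two hypothetical variables.
variables = {
    "radius_mean": [11.2, 12.5, 13.1, 13.8, 14.4, 15.0, 17.9, 24.6],
    "texture_mean": [10.4, 14.3, 17.8, 18.6, 19.0, 21.3, 22.1],
}

# One mean line and one median line per variable in the histogram.
lines = {
    name: {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }
    for name, values in variables.items()
}

for name, stats in lines.items():
    print(f"{name}: mean={stats['mean']:.2f}, median={stats['median']:.2f}")
```

In a right-skewed variable such as the first one, a few large values pull the mean above the median, which is why it can be worth drawing both lines.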
Switch between density, histogram, and area plot
Besides the histogram, you can switch to other distribution plots such as density and area. In the animation below, I switch my histogram to the density type and then to the area type.
Switch between identity, fill, and stack position
Identity mode
By default, each bar in a standard histogram represents the frequency of a range of values of your variable. If you have two or more groups, those bars will overlap. For example, in the figure below we plot the histogram of the variable radius_mean for two groups, Benign and Malignant, which are indicated in the diagnosis column. With identity mode, you can see the two distributions overlapping each other.
Stacked mode
The stacked style is a great way to examine in detail how each group contributes to the frequency of a particular variable. After dropping one column into the Variable placeholder and another into the Color placeholder, simply select Position -> stack in the right-click menu to switch to the stacked style.
You can also add more than one variable to the Value placeholder to create a stacked histogram. The values of the bars are accumulated, and the colors show how each variable contributes to the frequency in each range of values.
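The accumulation behind a stacked histogram can be sketched in a few lines of plain Python. This is an illustration under my own assumptions, not BioVinci's implementation: the bin edges and the values are made up, and only the group names Benign and Malignant come from the example above.

```python
def group_counts(values, edges):
    """Count values per bin; bins are [edges[i], edges[i+1]), last bin closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return counts


# All groups must share the same bin edges, otherwise bars cannot be stacked.
edges = [5, 10, 15, 20, 25, 30]
groups = {  # made-up radius_mean values
    "Benign": [7.8, 9.5, 11.2, 12.0, 12.9, 13.4, 14.1, 16.0],
    "Malignant": [13.7, 15.5, 17.2, 18.9, 19.4, 21.0, 23.5, 27.2],
}

bottoms = [0] * (len(edges) - 1)
stacked = {}
for name, values in groups.items():
    counts = group_counts(values, edges)
    # Each segment starts where the previous group's segment ended.
    stacked[name] = {"bottom": list(bottoms), "height": counts}
    bottoms = [b + c for b, c in zip(bottoms, counts)]

totals = bottoms  # full bar height per bin after stacking every group
print(stacked)
print(totals)
```

In identity mode every group would start at zero, so the segments overlap. In stack mode the second group starts on top of the first, and the full bar height is the total frequency of the bin.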
Fill mode
As with the stacked style, you can scale a stacked histogram up to 100 percent, so that each part of a bar represents its proportion of the total frequency in that range. The animation below illustrates how to switch to fill mode for a histogram and for a density plot, respectively.
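The scaling in fill mode is just a division of each group's count by the bin total. Here is a minimal sketch in plain Python, assuming made-up per-bin counts for the two groups; how BioVinci treats empty bins is not stated in this article, so the zero used for them below is my own choice.

```python
# Made-up per-bin frequencies for two groups over six shared bins.
counts = {
    "Benign": [2, 5, 1, 0, 0, 0],
    "Malignant": [0, 1, 4, 2, 1, 0],
}
n_bins = 6

# Total frequency of each bin across all groups.
totals = [sum(group[i] for group in counts.values()) for i in range(n_bins)]

# Proportion of each bin that belongs to each group.
# Empty bins get 0.0 here to avoid dividing by zero (an assumption).
fill = {
    name: [c / t if t else 0.0 for c, t in zip(group, totals)]
    for name, group in counts.items()
}

for name, proportions in fill.items():
    print(name, [round(p, 2) for p in proportions])
```

Every non-empty bin now sums to 1, that is 100 percent, so fill mode shows composition and hides how many items each bin actually contains.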
Small plots send a big message
Don’t be afraid to make your charts smaller to tell a bigger story. Small charts aligned to the same scale capture an overview of your dataset. They reveal the pattern of each individual plot, so you can easily compare it with the others.
You can split your histogram into multiple charts just by dropping a categorical column into the Split placeholder, as shown in the figure below.
You can even drop multiple variables into the Value placeholder and then use the Split option in the right-click menu to do the same.
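Small charts are only comparable if they share the same bins and the same axis range. The sketch below shows one way to compute that in plain Python. It is an illustration, not BioVinci's implementation: the column names Attack and Type.1 are borrowed from the example dataset used in the next section, while the rows and the category values are made up.

```python
from collections import defaultdict

rows = [  # made-up (Type.1, Attack) pairs
    ("Fire", 52), ("Fire", 84), ("Fire", 100),
    ("Water", 48), ("Water", 65), ("Water", 83), ("Water", 103),
    ("Grass", 49), ("Grass", 62), ("Grass", 82),
]

# Split: one list of values per category, one small chart per list.
panels = defaultdict(list)
for category, value in rows:
    panels[category].append(value)

# Shared bins: computed from all values, not per panel.
all_values = [value for _, value in rows]
lo, hi = min(all_values), max(all_values)
n_bins = 3
width = (hi - lo) / n_bins


def count_bins(values):
    """Count values in the shared equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts


panel_counts = {name: count_bins(values) for name, values in panels.items()}

# Shared y-axis limit so bar heights are comparable across panels.
y_max = max(max(counts) for counts in panel_counts.values())

for name, counts in panel_counts.items():
    print(name, counts)
print("shared y-axis maximum:", y_max)
```

If each panel computed its own bins, a bar at the same position could cover a different range of values in each chart, and the visual comparison would be misleading.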
Color your histogram then split into groups
Even in the small charts, you can split the data into multiple colors by using both the Color and Split placeholders. In the animation below, I split the data of the Attack variable into groups of Type.1 by putting Type.1 into the Split by placeholder. Similarly, I split the data in each small plot into two groups of Legendary by putting Legendary into the Color placeholder.
You can do the same with multiple variables, like Attack from the previous example, after dropping them into the Value placeholder. After that, put a character column, such as Type.1 from the previous example, into the Split placeholder.