Wednesday 7 October 2015

Geometric Means for Citation Counts and Altmetrics

Citation count and altmetric data are typically highly skewed, so the arithmetic mean is not the best measure of central tendency. The problem is that the arithmetic mean can be greatly influenced by individual very high values, and occasional very high values are normal for citation counts and altmetrics.

A simple alternative to the arithmetic mean is the geometric mean. This is calculated by taking the arithmetic mean of the natural logarithm of the data instead of the raw data, and then transforming this mean back by applying the exponential function, which is the inverse (reverse) of the natural logarithm. This reduces the influence of very large values and gives a more stable result. One problem is that uncited articles have a citation count of zero, and the logarithm of zero is undefined. The simplest way around this is to add 1 to the citation counts before taking the natural logarithm. If this step is taken, then 1 should also be subtracted after applying the exponential function.
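Written as a formula, the calculation with the add-1 adjustment might look as follows. The symbols are notation introduced here for illustration and are not from the original post: c_1, ..., c_n are the n citation (or altmetric) counts and G is the resulting average.

```latex
G = \exp\!\left( \frac{1}{n} \sum_{i=1}^{n} \ln(1 + c_i) \right) - 1
```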

In other words, the recommended process for obtaining an average for any citation or altmetric count-type data is as follows.
  1. Add 1 to the citation count/altmetric count data.
  2. Take the natural logarithm of the result.
  3. Calculate the arithmetic mean of this transformed data.
  4. Calculate the exponential function of the result and then subtract 1.
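
The four steps above can be sketched in Python as follows. This is a minimal illustration rather than the spreadsheet's own formulae; the function name offset_geometric_mean is hypothetical, and the input is assumed to be a non-empty list of non-negative counts.

```python
import math


def offset_geometric_mean(counts):
    """Geometric mean with the add-1 adjustment for zero counts.

    Hypothetical helper name; assumes non-negative count data.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    # Steps 1 and 2: add 1, then take the natural logarithm.
    # math.log1p(c) computes ln(1 + c).
    logs = [math.log1p(c) for c in counts]
    # Step 3: arithmetic mean of the transformed data.
    mean_log = sum(logs) / len(logs)
    # Step 4: exponential function, then subtract 1.
    # math.expm1(x) computes exp(x) - 1.
    return math.expm1(mean_log)
```

For example, offset_geometric_mean([0, 3]) averages ln(1) and ln(4), which gives ln(2), and then returns exp(ln(2)) - 1, that is, 1 (up to floating-point rounding).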

Download the spreadsheet here with the calculations.

Here is an example, showing the formulae used in Excel, with some test data:

In the modification below, the citation count of only one article has changed, becoming very large. This has a huge impact on the arithmetic mean, increasing it by over four times, from about 2.4 to about 13.6, but a much smaller impact on the geometric mean, increasing it by about half, from about 1.5 to about 2.2. This illustrates the advantage of the geometric mean.
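
The same effect can be reproduced with a short Python sketch. The counts below are made up for illustration and are not the test data from the spreadsheet, so the resulting means will differ from the figures quoted above; the helper name offset_geometric_mean is also hypothetical.

```python
import math
import statistics


def offset_geometric_mean(counts):
    """exp(mean(ln(1 + c))) - 1, the add-1 geometric mean (hypothetical helper)."""
    return math.expm1(statistics.fmean(math.log1p(c) for c in counts))


# Hypothetical citation counts, not the spreadsheet's test data.
original = [0, 1, 1, 2, 3, 4, 6]
# Same data, except that the last article becomes very highly cited.
modified = [0, 1, 1, 2, 3, 4, 100]

arith_before = statistics.fmean(original)
arith_after = statistics.fmean(modified)
geo_before = offset_geometric_mean(original)
geo_after = offset_geometric_mean(modified)

# Ratio of the mean after the change to the mean before it.
arith_ratio = arith_after / arith_before
geo_ratio = geo_after / geo_before

print(f"Arithmetic mean: {arith_before:.2f} -> {arith_after:.2f}")
print(f"Geometric mean:  {geo_before:.2f} -> {geo_after:.2f}")
```

With these made-up counts, the single large value multiplies the arithmetic mean several times over, while the geometric mean grows by a much smaller factor.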


Here is some evidence that the geometric mean approach works.

Fairclough, R., & Thelwall, M. (2015). More precise methods for national research citation impact comparisons. Journal of Informetrics, 9(4), 895–906.
Thelwall, M., & Fairclough, R. (2015). Geometric journal impact factors correcting for individual highly cited articles. Journal of Informetrics, 9(2), 263–272.