Altmetrics for Evaluations

Wednesday 21 December 2016

Equalised Mean-based Normalised Proportion Cited

The Equalised Mean-based Normalised Proportion Cited (EMNPC) is an indicator to assess the proportion of documents cited (i.e., with a non-zero citation count) to see whether it is above or below the world average or the values for another group. It is designed for web indicators for which a high proportion of documents is uncited, so that average citation indicators, such as MNLCS, are not very accurate. It is particularly suitable for metrics designed by Kayvan Kousha, such as Wikipedia citation counts and syllabus mentions.

EMNPC has a simple formula. For each field f in which group g publishes, calculate the proportion of articles with at least one citation, p_gf. Now for the same set of fields, calculate the proportion of articles published by the world with at least one citation, p_wf. Now sum the first set and divide by the sum of the second set to get EMNPC for the group.

If this has a value greater than 1 then a higher proportion of articles by group g is cited than average for the world, so g's research has above average impact in terms of the proportion cited. Note that this calculation is unfair if the group publishes small numbers of articles in some fields because all fields have equal weight in the above formula. Small fields should therefore be removed.
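The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the spreadsheet: the function name, the dictionary layout and the min_group_articles threshold are all hypothetical, and the post does not say how small a field must be before it is removed.

```python
def emnpc(group_fields, world_fields, min_group_articles=0):
    """Sketch of EMNPC: sum of group proportions cited / sum of world proportions cited.

    Both arguments map a field name to a (cited_count, total_count) pair.
    min_group_articles is a hypothetical cut-off for dropping small fields;
    the default of 0 keeps every field in which the group has articles.
    """
    group_sum = 0.0
    world_sum = 0.0
    for field, (g_cited, g_total) in group_fields.items():
        # Skip fields with no group articles or fewer than the chosen cut-off.
        if g_total == 0 or g_total < min_group_articles:
            continue
        w_cited, w_total = world_fields[field]
        group_sum += g_cited / g_total   # proportion cited for the group in this field
        world_sum += w_cited / w_total   # proportion cited for the world in this field
    if world_sum == 0:
        raise ValueError("No world articles are cited in the fields used.")
    return group_sum / world_sum


# Invented test data: two fields, X and Y.
group = {"X": (3, 10), "Y": (4, 8)}
world = {"X": (200, 1000), "Y": (300, 1000)}
print(emnpc(group, world))  # approximately 1.6, i.e. above the world average
```

Because every retained field contributes one proportion to each sum, a field with only a handful of group articles counts as much as a large one, which is why the cut-off parameter is included.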

This calculation also gives a higher implicit weight to fields with a relatively high proportion cited because these can dominate the numerator and denominator of the formula, but this is necessary to get narrower confidence intervals (in contrast to the similar MNPC indicator).

Here is a worked example in Excel, showing the formulae used.

Confidence intervals

Confidence intervals can be calculated with the formula below.

Here is a worked example in Excel.

Spreadsheet and original publication

Download a spreadsheet containing the EMNPC calculations. This also has MNPC calculations, and a second worksheet repeating the calculations with a continuity correction for greater accuracy.

Additional details can be found in this article.

Thelwall, M. (in press). Three practical field normalised alternative indicator formulae for research evaluation. Journal of Informetrics. 10.1016/j.joi.2016.12.002

Tuesday 20 December 2016

Mean Normalised Log-transformed Citation Score (MNLCS): A field normalised average citation impact indicator for sets of articles from multiple fields and years

The Mean Normalised Log-transformed Citation Score (MNLCS) is a variant of the Mean Normalised Citation Score (MNCS) to assess the average citation impact of a set of articles. This formula compares the average citation impact of articles within a group to the average citation impact of all articles in the fields and years of the group’s articles. A score of above 1 indicates that the group’s articles have a higher average citation impact than normal for the fields and years in which they were published. The MNLCS applies a log transformation to citation counts before processing them because sets of citation counts are typically highly skewed and this transformation prevents individual articles from having too much influence on the results. The MNLCS calculation is as follows.

Log transformation: For each article A in the group to be assessed, replace its citation count c by a log-transformed version, ln(1+c).
Field normalisation: Divide the log transformed citation count ln(1+c) of each article A in the group by the average (arithmetic mean) log transformed citation count ln(1+x) for all articles x in the same field and year as A.
Calculation: MNLCS is the arithmetic mean of all the field-normalised, log-transformed citation counts of articles in the group.
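The three steps above can be sketched in Python as follows. This is a minimal illustration rather than the Webometric Analyst implementation: the function name and the data layout (a key per field and year, with the world list including the group's own articles) are assumptions made for the example.

```python
import math


def mnlcs(group, world):
    """Sketch of MNLCS for a group of articles.

    group: list of (field_year_key, citation_count) pairs, one per group article.
    world: dict mapping each field_year_key to the citation counts of ALL
           articles in that field and year (assumed to include the group's).
    """
    # World average of ln(1+x) for each field and year.
    world_means = {}
    for key, counts in world.items():
        world_means[key] = sum(math.log(1 + c) for c in counts) / len(counts)

    normalised = []
    for key, c in group:
        if world_means[key] == 0:
            # Every article in this field and year is uncited, so the
            # normalisation is undefined.
            raise ValueError(f"World mean is zero for {key!r}")
        # Steps 1 and 2: log-transform, then divide by the world average.
        normalised.append(math.log(1 + c) / world_means[key])

    # Step 3: arithmetic mean of the field-normalised, log-transformed counts.
    return sum(normalised) / len(normalised)


# Invented test data: one field and year containing four articles, one of them the group's.
world = {"X-2016": [1, 1, 1, 3]}
group = [("X-2016", 3)]
print(mnlcs(group, world))  # above 1: the group's article is cited more than is typical for the field
```

A useful sanity check is that a group consisting of every article in the world set gets a score of 1.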

The MNLCS calculations are illustrated below in Excel for a group publishing articles A, C, H and J in a single field containing articles A to K.

And here is an example of MNLCS calculations for a group that is split between two fields, X and Y, with articles A, C, H and J in field X and O, S and U in field Y.

Confidence intervals can be calculated for MNLCS to assess whether two different MNLCS values are statistically significantly different from each other. The formulae use the means and standard errors of the log transformed citation counts (note: not the field normalised versions) for the articles in the group and the world’s articles in the same field and year as the group. The formula is more complicated if the group publishes in multiple fields and years – this post just discusses the simplest case, with formula below.

The confidence interval calculations are illustrated below in Excel. In this case the group MNLCS is 0.885 but its 95% confidence interval is (0.119, 2.081) so the group average is not statistically significantly different from the world MNLCS value, which is always 1.

All of the above calculations (including confidence intervals) are also built into the free software Webometric Analyst for sets of citation counts from Scopus and the Web of Science, as well as for web indicators.

Download a spreadsheet with the above worked examples.

More details are given in this paper:

Thelwall, M. (in press). Three practical field normalised alternative indicator formulae for research evaluation. Journal of Informetrics. 10.1016/j.joi.2016.12.002

Friday 5 August 2016

What do Wikipedia Citation Counts Mean?

A post on this topic can be found on the 3:AM altmetrics workshop blog.

Wednesday 7 October 2015

Geometric Means for Citation Counts and Altmetrics

Citation count and altmetric data is typically highly skewed, so the arithmetic mean is not the best measure of central tendency. The problem is that the arithmetic mean can be greatly influenced by individual very high values and it is normal to have occasional very high values for citation counts or altmetrics.

A simple alternative to the arithmetic mean is to use the geometric mean. This uses the arithmetic mean of the natural logarithm of the data instead of the raw data. This mean is then transformed back by applying the exponential function to it, which is the inverse (reverse) of the natural logarithm. This reduces the influence of very large values and gives a more stable calculation. A problem with this is that uncited articles have a citation count of zero and it is not possible to calculate the log of zero. The simplest way to get round this problem is to add 1 to the citation counts before taking the natural logarithm. If this step is taken then 1 should also be subtracted after applying the exponential function.

In other words, the recommended process for obtaining an average for any citation or altmetric count-type data is as follows.

Add 1 to the citation count/altmetric count data.
Take the natural logarithm of the result.
Calculate the arithmetic mean of this transformed data.
Calculate the exponential function of the result and then subtract 1.
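The four steps above translate directly into code. Here is a minimal Python sketch; the function name and the test data are invented for illustration and are not taken from the spreadsheet.

```python
import math


def geometric_mean_plus_one(counts):
    """Geometric mean of count data using the add-1 offset described above."""
    if not counts:
        raise ValueError("At least one count is needed.")
    # Steps 1 and 2: add 1 to each count and take the natural logarithm.
    logs = [math.log(1 + c) for c in counts]
    # Step 3: arithmetic mean of the transformed data.
    mean_log = sum(logs) / len(logs)
    # Step 4: apply the exponential function, then subtract the 1 added earlier.
    return math.exp(mean_log) - 1


# Invented data: the same set of counts with and without one very highly cited article.
typical = [0, 1, 1, 2, 3, 5]
with_outlier = [0, 1, 1, 2, 3, 500]
for data in (typical, with_outlier):
    arithmetic = sum(data) / len(data)
    print(arithmetic, geometric_mean_plus_one(data))
```

Running this on the two lists shows the arithmetic mean changing far more than the geometric mean when the single large value is introduced.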

Download the spreadsheet here with the calculations.

Here is an example, showing the formulae used in Excel, with some test data:

In the modification below, the citation count of only one article has changed, becoming very large. This has had a huge impact on the arithmetic mean, increasing it by over four times from about 2.4 to about 13.6 and a much smaller impact on the geometric mean, increasing it by about half from about 1.5 to 2.2. This illustrates the advantage of the geometric mean.

Here is some evidence that the geometric mean approach works.

Fairclough, R., & Thelwall, M. (2015). More precise methods for national research citation impact comparisons. Journal of Informetrics, 9(4), 895-906.
Thelwall, M., & Fairclough, R. (2015). Geometric journal impact factors correcting for individual highly cited articles. Journal of Informetrics, 9(2), 263-272.

Tuesday 12 August 2014

Alternative metrics in the future UK Research Excellence Framework

In the previous UK REF, due to be reported in December 2014, individual subject areas (units of assessment) had the option to request citation counts for submitted articles, together with field and time normalization information. Most did not and there was no option to ask for any alternative metrics.

Here are recommendations for uses of alternative metrics, including altmetrics, in national research evaluation exercises, such as the UK Research Excellence Framework (REF). Please leave comments if you see problems with these recommendations.

Alternative metrics should not be provided routinely for all articles. Alternative metrics seem currently to be highly susceptible to spam and to give little added information for typical articles. There are so many different altmetrics that it does not seem to be worth routinely collecting this data for all articles. In fact it is likely to be damaging to routinely collect alternative metrics for articles for evaluation purposes because this will push academics and research support offices towards wasting their time trying to attract tweets etc. to their work. Of course, a certain amount of self-publicity is a good thing and should not be discouraged but if it is measured then it is likely to get out of hand.
Units of assessment should be given the option to provide alternative metrics in special cases. Research may have impacts that are not obvious from reading it or even from citation counts. For example, a research group may maintain a website that is popular with schools, host key software with a substantial uptake or produce books that are in reading lists around the world. To give a specific case, the point of many blogs is to attract a wide audience but how else can you prove that a blog is widely read than by reporting how many readers or visitors it has? You can immediately tell that you are reading something special when you get to Stephen Curry's blog but its real value comes from the number of other people who have come to the same conclusion. Researchers should have the opportunity to present data to support their claim of having a non-standard impact. For units of assessment that do not allow the routine use of citation counts, I think that citation counts should be allowed in special cases (and I did this in my own case). For all units of assessment, I think that alternative metrics should be allowed in special cases. I think that they will be particularly valuable for social impact case studies but can also be useful to demonstrate educational impacts for research.
Assessors should be cautioned to not interpret alternative metrics at face value but to take them as pointers to the potential impact of the research. There are two important problems with interpreting alternative metrics. First, in most cases it is impossible to effectively normalise them for field and discipline and so it is hard to be sure whether any particular number is good or not. Second, this is exacerbated because an alternative metric could partly reflect "important" impact and partly reflect irrelevant impact, such as fun. For example, an article with a funny title could be tweeted for amusement or for the value of its findings. In practice, this means that assessors should use the alternative metrics to guide them to a starting position about the impact of the research but should make their own final judgement, taking into account the limitations of alternative metrics.
Units of assessment submitting any alternative metrics should complete a declaration of honour to state that they have not attempted to game the metrics and to declare any unintentional manipulation of the metrics. Unintentional manipulation might include librarians teaching students how to tweet articles with examples of the university's REF publications. The declaration of honour should include information that the alternative metrics will be made fully public and that it is likely that future researchers in computer science will develop algorithms to detect manipulation of REF metrics and so it will be highly embarrassing for anyone that has submitted manipulated data, even if unintentionally. It is likely that this process will be highly effective because individuals are likely to gain access to the raw data used in services like Twitter and Mendeley and hence discover, for example, the IP addresses of tweeters and can also detect abnormal patterns of the accumulation of a metric over time so that highly sophisticated manipulation strategies would have a chance of detection. This declaration of honour gives a non-trivial degree of risk to the submitting unit of assessment and should act as a deterrent to using metrics in all except the most important cases.

Introduction

The purpose of this blog is to share ideas about uses of alternative metrics for evaluation. It is in part a response to David Colquhoun's blogging against the use of altmetrics. I believe that alternative metrics can be useful in some research evaluation contexts and think it is useful to have a blog covering these contexts.

As part of the Statistical Cybermetrics Research Group, I have been using alternative metrics for research evaluations since 2007 and this seems like a good time to make recommendations for specific applications. Previous evaluations have been for a large UK organisation promoting innovation (Nesta), the EU, the UNDP and individual university departments. All the evaluations so far have had the common factor that the organisations evaluated produce knowledge, but not primarily traditional academic knowledge in the form of journal articles, and need evidence about the wider impact of their outputs. For these, we have used a range of web-based metrics to give evidence of general impact. We always include extensive discussions of the limitations of the metrics used and also recommend the use of content analysis in parallel with the metrics so that the numerical values can be interpreted more accurately.