
Open Access 01.12.2015 | Erratum

Erratum to: Using text mining for study identification in systematic reviews: a systematic review of current approaches

Authors: Alison O’Mara-Eves, James Thomas, John McNaught, Makoto Miwa, Sophia Ananiadou

Published in: Systematic Reviews | Issue 1/2015

Notes
The online version of the original article can be found at http://dx.doi.org/10.1186/2046-4053-4-5.

Erratum

Following publication of our article [1], it has come to our attention that two of the formulae in Table 1 were incorrect. The formulae for the measures of precision and burden have been corrected (Table 1). We are publishing this erratum to update these formulae to the following:
Table 1 Definitions of performance measures reported in the studies. The number in parentheses after each measure is the number of studies reporting it.

Recall (sensitivity) (22): proportion of correctly identified positives amongst all real positives. Formula: \( \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}} \)
Precision (18): proportion of correctly identified positives amongst all items identified as positive. Formula: \( \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}} \)
F measure (10): combines precision and recall; values of β < 1.0 indicate precision is more important than recall, whilst values of β > 1.0 indicate recall is more important than precision. Formula: \( F_{\beta,k} = \frac{(\beta^{2}+1)\,\mathrm{TP}_{k}}{(\beta^{2}+1)\,\mathrm{TP}_{k}+\mathrm{FP}_{k}+\beta^{2}\,\mathrm{FN}_{k}} \), where β specifies the relative importance of recall and precision
ROC (AUC) (10): area under the curve traced out by graphing the true positive rate against the false positive rate; 1.0 is a perfect score and 0.50 is equivalent to a random ordering
Accuracy (8): proportion of agreements to total number of documents. Formula: \( \frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{FP}+\mathrm{FN}+\mathrm{TN}} \)
Work saved over sampling (8): the percentage of papers that the reviewers do not have to read because they have been screened out by the classifier. Formula (at 95% recall): \( \mathrm{WSS} = \frac{\mathrm{TN}+\mathrm{FN}}{N} - 0.05 \)
Time (7): time taken to screen (usually in minutes)
Burden (4): the fraction of the total number of items that a human must screen (active learning). Formula: \( \mathrm{Burden} = \frac{\mathrm{tp}^{T}+\mathrm{tn}^{T}+\mathrm{fp}^{T}+\mathrm{tp}^{U}+\mathrm{fp}^{U}}{N} \)
Yield (3): the fraction of items that are identified by a given screening approach (active learning). Formula: \( \mathrm{Yield} = \frac{\mathrm{tp}^{T}+\mathrm{tp}^{U}}{\mathrm{tp}^{T}+\mathrm{tp}^{U}+\mathrm{fn}^{U}} \)
Utility (5): relative measure of burden and yield that takes into account reviewer preferences for weighting these two concepts (active learning). Formula: \( \frac{\beta \cdot \mathrm{yield} + (1 - \mathrm{burden})}{\beta + 1} \), where β is the user-defined weight
Baseline inclusion rate (2): the proportion of includes in a random sample of items before prioritisation or classification takes place; the number to be screened is determined using a power calculation. Formula: \( \frac{n_{i}}{n_{t}} \), where \( n_{i} \) = number of items included in the random sample and \( n_{t} \) = total number of items in the random sample
Performance (efficiency)^a (2): number of relevant items selected divided by the time spent screening, where relevant items were those marked as included by two or more people. Formula: \( \frac{\text{Selected relevant items}}{\text{Time}} \)
Specificity (2): the proportion of correctly identified negatives (excludes) out of the total number of negatives. Formula: \( \frac{\mathrm{TN}}{\mathrm{TN}+\mathrm{FP}} \)
True positives (2): the number of correctly identified positives (includes). Formula: TP
False negatives (1): the number of incorrectly identified negatives (excludes). Formula: FN
Coverage (1): the ratio of positives in the data pool that are annotated during active learning. Formula: \( \frac{\mathrm{TP}^{L}}{\mathrm{TP}^{L}+\mathrm{FN}^{L}+\mathrm{TP}^{U}+\mathrm{FN}^{U}} \), where L refers to labelled items and U refers to unlabelled items
Unit cost (1): expected time to label an item multiplied by the unit cost of the labeller (salary per unit of time), as calculated from their (known or estimated) salary. Formula: \( \mathrm{time}_{\mathrm{expected}} \times \mathrm{cost}_{\mathrm{unit}} \)
Classification error (1): proportion of disagreements to total number of documents. Formula: 100% − accuracy%
Error (1): total number of falsely classified items divided by the total number of items. Formula: \( \frac{\sum(\mathrm{FP}+\mathrm{FN})}{\sum(\mathrm{TP}+\mathrm{FP}+\mathrm{FN}+\mathrm{TN})} \)
Absolute screening reduction (1): number of items excluded by the classifier that do not need to be manually screened. Formula: TN + FN
Prioritised inclusion rate (1): the proportion of includes out of the total number screened, after prioritisation or classification takes place. Formula: \( \frac{n_{\mathrm{ip}}}{n_{\mathrm{tp}}} \), where \( n_{\mathrm{ip}} \) = number of items included in the prioritised sample and \( n_{\mathrm{tp}} \) = total number of items in the prioritised sample
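The confusion-matrix measures above can be sanity-checked in a few lines of code. The following is a minimal Python sketch, not code from the original article; the function names and example counts are purely illustrative:

```python
# Illustrative only: confusion-matrix measures from Table 1.
# tp, fp, fn, tn are true/false positive/negative counts; n = tp + fp + fn + tn.

def recall(tp: int, fn: int) -> float:
    # Proportion of correctly identified positives amongst all real positives
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    # Proportion of correctly identified negatives out of all real negatives
    return tn / (tn + fp)

def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    # Proportion of agreements to total number of documents
    return (tp + tn) / (tp + fp + fn + tn)

def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    # F_beta combines precision and recall; beta > 1 weights recall more heavily
    return ((beta**2 + 1) * tp) / ((beta**2 + 1) * tp + fp + beta**2 * fn)

def wss(tn: int, fn: int, n: int, target_recall: float = 0.95) -> float:
    # Work saved over sampling at a given recall: (TN + FN)/N - (1 - recall);
    # with target_recall = 0.95 this is the WSS-at-95%-recall entry in Table 1
    return (tn + fn) / n - (1 - target_recall)

# Example: 40 includes found, 10 missed, 900 correctly excluded, 50 wrongly kept
print(recall(40, 10), f_measure(40, 50, 10, beta=2.0), wss(900, 10, 1000))
```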
The corrected formulae are:
Precision \( = \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}} \)
Burden \( = \frac{\mathrm{tp}^{T}+\mathrm{tn}^{T}+\mathrm{fp}^{T}+\mathrm{tp}^{U}+\mathrm{fp}^{U}}{N} \)
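As a worked illustration of the two corrected measures (and of the related yield and utility entries), here is a hedged Python sketch. The variable names tp_t, tn_t, fp_t, tp_u, fp_u and fn_u are our rendering of the superscripted counts in Table 1, reading ^T as counts over manually screened items and ^U as counts over automatically classified items; that reading, like the example numbers, is an assumption, not notation from the article:

```python
# Illustrative sketch of the corrected formulae; not code from the article.
# tp_t, tn_t, fp_t: counts over manually screened items (superscript T);
# tp_u, fp_u, fn_u: counts over automatically classified items (superscript U);
# n: total number of items. This reading of T/U is our assumption.

def precision(tp: int, fp: int) -> float:
    # Corrected: proportion of identified positives that are real positives
    return tp / (tp + fp)

def burden(tp_t: int, tn_t: int, fp_t: int, tp_u: int, fp_u: int, n: int) -> float:
    # Corrected: fraction of all N items that a human must screen
    return (tp_t + tn_t + fp_t + tp_u + fp_u) / n

def yield_(tp_t: int, tp_u: int, fn_u: int) -> float:
    # Fraction of the true includes that the screening approach identifies
    return (tp_t + tp_u) / (tp_t + tp_u + fn_u)

def utility(beta: float, y: float, b: float) -> float:
    # Weighted trade-off between yield y and burden b; beta is user-defined
    return (beta * y + (1 - b)) / (beta + 1)

# Example: 300 items screened manually; the classifier flags 120 more (15 tp + 105 fp)
y = yield_(tp_t=30, tp_u=15, fn_u=5)       # 0.9
b = burden(30, 250, 20, 15, 105, n=1000)   # (300 + 120) / 1000 = 0.42
print(precision(45, 125), y, b, utility(1.0, y, b))
```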
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
The Creative Commons Public Domain Dedication waiver (https://​creativecommons.​org/​publicdomain/​zero/​1.​0/​) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
References
1. O’Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev. 2015;4:5.
Metadata
Title: Erratum to: Using text mining for study identification in systematic reviews: a systematic review of current approaches
Authors: Alison O’Mara-Eves, James Thomas, John McNaught, Makoto Miwa, Sophia Ananiadou
Publication date: 01.12.2015
Publisher: BioMed Central
Published in: Systematic Reviews, Issue 1/2015
Electronic ISSN: 2046-4053
DOI: https://doi.org/10.1186/s13643-015-0031-5
