
Rapid, point‐of‐care antigen tests for diagnosis of SARS‐CoV‐2 infection



Abstract


Background

Accurate rapid diagnostic tests for SARS‐CoV‐2 infection could contribute to clinical and public health strategies to manage the COVID‐19 pandemic. Point‐of‐care antigen and molecular tests to detect current infection could increase access to testing and early confirmation of cases, and expedite clinical and public health management decisions that may reduce transmission.

Objectives

To assess the diagnostic accuracy of point‐of‐care antigen and molecular‐based tests for diagnosis of SARS‐CoV‐2 infection. We consider accuracy separately in symptomatic and asymptomatic population groups.

Search methods

Electronic searches of the Cochrane COVID‐19 Study Register and the COVID‐19 Living Evidence Database from the University of Bern (which includes daily updates from PubMed and Embase and preprints from medRxiv and bioRxiv) were undertaken on 30 Sept 2020. We checked repositories of COVID‐19 publications and included independent evaluations from national reference laboratories, the Foundation for Innovative New Diagnostics and the Diagnostics Global Health website to 16 Nov 2020. We did not apply language restrictions.

Selection criteria

We included studies of people with either suspected SARS‐CoV‐2 infection, known SARS‐CoV‐2 infection or known absence of infection, or those who were being screened for infection. We included test accuracy studies of any design that evaluated commercially produced, rapid antigen or molecular tests suitable for a point‐of‐care setting (minimal equipment, sample preparation, and biosafety requirements, with results within two hours of sample collection). We included all reference standards that define the presence or absence of SARS‐CoV‐2 (including reverse transcription polymerase chain reaction (RT‐PCR) tests and established diagnostic criteria).

Data collection and analysis

Studies were screened independently in duplicate with disagreements resolved by discussion with a third author. Study characteristics were extracted by one author and checked by a second; extraction of study results and assessments of risk of bias and applicability (made using the QUADAS‐2 tool) were undertaken independently in duplicate. We present sensitivity and specificity with 95% confidence intervals (CIs) for each test and pooled data using the bivariate model separately for antigen and molecular‐based tests. We tabulated results by test manufacturer and compliance with manufacturer instructions for use and according to symptom status.

Main results

Seventy‐eight study cohorts were included (described in 64 study reports, including 20 pre‐prints), reporting results for 24,087 samples (7,415 with confirmed SARS‐CoV‐2). Studies were mainly from Europe (n = 39) or North America (n = 20), and evaluated 16 antigen and five molecular assays.

We considered risk of bias to be high in 29 (37%) studies because of participant selection; in 66 (85%) because of weaknesses in the reference standard for absence of infection; and in 29 (37%) for participant flow and timing. Studies of antigen tests were of a higher methodological quality compared to studies of molecular tests, particularly regarding the risk of bias for participant selection and the index test. Characteristics of participants in 35 (45%) studies differed from those in whom the test was intended to be used and the delivery of the index test in 39 (50%) studies differed from the way in which the test was intended to be used. Nearly all studies (97%) defined the presence or absence of SARS‐CoV‐2 based on a single RT‐PCR result, and none included participants meeting case definitions for probable COVID‐19.

Antigen tests

Forty‐eight studies reported 58 evaluations of antigen tests. Estimates of sensitivity varied considerably between studies. There were differences between symptomatic (72.0%, 95% CI 63.7% to 79.0%; 37 evaluations; 15,530 samples, 4410 cases) and asymptomatic participants (58.1%, 95% CI 40.2% to 74.1%; 12 evaluations; 1581 samples, 295 cases). Average sensitivity was higher in the first week after symptom onset (78.3%, 95% CI 71.1% to 84.1%; 26 evaluations; 5769 samples, 2320 cases) than in the second week of symptoms (51.0%, 95% CI 40.8% to 61.0%; 22 evaluations; 935 samples, 692 cases). Sensitivity was high in those with cycle threshold (Ct) values on PCR ≤25 (94.5%, 95% CI 91.0% to 96.7%; 36 evaluations; 2613 cases) compared to those with Ct values >25 (40.7%, 95% CI 31.8% to 50.3%; 36 evaluations; 2632 cases). Sensitivity varied between brands. Using data from instructions for use (IFU) compliant evaluations in symptomatic participants, summary sensitivities ranged from 34.1% (95% CI 29.7% to 38.8%; Coris Bioconcept) to 88.1% (95% CI 84.2% to 91.1%; SD Biosensor STANDARD Q). Average specificities were high in symptomatic and asymptomatic participants, and for most brands (overall summary specificity 99.6%, 95% CI 99.0% to 99.8%).

At 5% prevalence, using data for the most sensitive assays in symptomatic people (SD Biosensor STANDARD Q and Abbott Panbio), positive predictive values (PPVs) of 84% to 90% mean that between 1 in 10 and 1 in 6 positive results will be a false positive, and between 1 in 8 and 1 in 4 cases will be missed. At 0.5% prevalence, applying the same tests in asymptomatic people would result in PPVs of 11% to 28%, meaning that between 7 in 10 and 9 in 10 positive results will be false positives, and between 1 in 3 and 1 in 2 cases will be missed.

No studies assessed the accuracy of repeated lateral flow testing or self‐testing.

Rapid molecular assays

Thirty studies reported 33 evaluations of five different rapid molecular tests. Sensitivities varied according to test brand. Most of the data relate to the ID NOW and Xpert Xpress assays. Using data from evaluations following the manufacturer’s instructions for use, the average sensitivity of ID NOW was 73.0% (95% CI 66.8% to 78.4%) and average specificity 99.7% (95% CI 98.7% to 99.9%; 4 evaluations; 812 samples, 222 cases). For Xpert Xpress, the average sensitivity was 100% (95% CI 88.1% to 100%) and average specificity 97.2% (95% CI 89.4% to 99.3%; 2 evaluations; 100 samples, 29 cases). Insufficient data were available to investigate the effect of symptom status or time after symptom onset.

Authors' conclusions

Antigen tests vary in sensitivity. In people with signs and symptoms of COVID‐19, sensitivities are highest in the first week of illness when viral loads are higher. The assays shown to meet appropriate criteria, such as WHO's priority target product profiles for COVID‐19 diagnostics (‘acceptable’ sensitivity ≥ 80% and specificity ≥ 97%), can be considered as a replacement for laboratory‐based RT‐PCR when immediate decisions about patient care must be made, or where RT‐PCR cannot be delivered in a timely manner. Positive predictive values suggest that confirmatory testing of those with positive results may be considered in low prevalence settings. Due to the variable sensitivity of antigen tests, people who test negative may still be infected.

Evidence for testing in asymptomatic cohorts was limited. Test accuracy studies cannot adequately assess the ability of antigen tests to differentiate those who are infectious and require isolation from those who pose no risk, as there is no reference standard for infectiousness. A small number of molecular tests showed high accuracy and may be suitable alternatives to RT‐PCR. However, further evaluations of the tests in settings as they are intended to be used are required to fully establish performance in practice.

Several important studies in asymptomatic individuals have been reported since the close of our search and will be incorporated at the next update of this review. Comparative studies of antigen tests in their intended use settings and according to test operator (including self‐testing) are required.

Plain language summary


How accurate are rapid tests for diagnosing COVID‐19?

What are rapid point‐of‐care tests for COVID‐19?

Rapid point‐of‐care tests aim to confirm or rule out COVID‐19 infection in people with or without COVID‐19 symptoms. They:

‐ are portable, so they can be used wherever the patient is (at the point of care);

‐ are easy to perform, with a minimum amount of extra equipment or complicated preparation steps;

‐ are less expensive than standard laboratory tests;

‐ do not require a specialist operator or setting; and

‐ provide results ‘while you wait’.

We were interested in two types of commercially available, rapid point‐of‐care tests: antigen and molecular tests. Antigen tests identify proteins on the virus; they come in disposable plastic cassettes, similar to pregnancy tests. Rapid molecular tests detect the virus’s genetic material in a similar way to laboratory methods, but using smaller devices that are easy to transport or to set up outside of a specialist laboratory. Both test nose or throat samples.

Why is this question important?

People with suspected COVID‐19 need to know quickly whether they are infected, so that they can self‐isolate, receive treatment, and inform close contacts. Currently, COVID‐19 infection is confirmed by a laboratory test called RT‐PCR, which uses specialist equipment and often takes at least 24 hours to produce a result.

Rapid point‐of‐care tests could open access to testing for many more people, with and without symptoms, potentially in locations other than healthcare settings. If they are accurate, faster diagnosis could allow people to take appropriate action more quickly, with the potential to reduce the spread of COVID‐19.

What did we want to find out?

We wanted to know whether commercially available, rapid point‐of‐care antigen and molecular tests are accurate enough to diagnose COVID‐19 infection reliably, and to find out if accuracy differs in people with and without symptoms.

What did we do?

We looked for studies that measured the accuracy of any commercially produced, rapid antigen or molecular point‐of‐care test, in people tested for COVID‐19 using RT‐PCR. People could be tested in hospital or the community. Studies could test people with or without symptoms.

Tests had to use minimal equipment, be performed safely without risking infection from the sample, and have results available within two hours of the sample being collected.

What we found

We included 64 studies in the review. They investigated a total of 24,087 nose or throat samples; COVID‐19 was confirmed in 7,415 of these samples. Studies investigated 16 different antigen tests and five different molecular tests. They took place mainly in Europe and North America.

Main results

Antigen tests

In people with confirmed COVID‐19, antigen tests correctly identified COVID‐19 infection in an average of 72% of people with symptoms, compared to 58% of people without symptoms. Tests were most accurate when used in the first week after symptoms first developed (an average of 78% of confirmed cases had positive antigen tests). This is likely to be because people have the most virus in their system in the first days after they are infected.

In people who did not have COVID‐19, antigen tests correctly ruled out infection in 99.5% of people with symptoms and 98.9% of people without symptoms.

Different brands of tests varied in accuracy. Pooled results for one test (SD Biosensor STANDARD Q) met World Health Organization (WHO) standards as ‘acceptable’ for confirming and ruling out COVID‐19 in people with signs and symptoms of COVID‐19. Two more tests met the WHO acceptable standards (Abbott Panbio and BIONOTE NowCheck) in at least one study.

Using summary results for SD Biosensor STANDARD Q, if 1000 people with symptoms had the antigen test, and 50 (5%) of them really had COVID‐19:

‐ 53 people would test positive for COVID‐19. Of these, 9 people (17%) would not have COVID‐19 (false positive result).

‐ 947 people would test negative for COVID‐19. Of these, 6 people (0.6%) would actually have COVID‐19 (false negative result).

In people with no symptoms of COVID‐19 the number of confirmed cases is expected to be much lower than in people with symptoms. Using summary results for SD Biosensor STANDARD Q in a bigger population of 10,000 people with no symptoms, where 50 (0.5%) of them really had COVID‐19:

‐ 125 people would test positive for COVID‐19. Of these, 90 people (72%) would not have COVID‐19 (false positive result).

‐ 9,875 people would test negative for COVID‐19. Of these, 15 people (0.2%) would actually have COVID‐19 (false negative result).
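The worked examples above apply a test's summary sensitivity and specificity to a hypothetical cohort of a given size and prevalence. A minimal sketch of that calculation (the function name and rounding are our own illustration, not part of the review's methods):

```python
def apply_test(population, prevalence, sensitivity, specificity):
    """Apply summary sensitivity/specificity to a hypothetical cohort.

    Returns rounded counts of true/false positives and negatives,
    plus positive and negative predictive values.
    """
    cases = population * prevalence
    non_cases = population - cases
    tp = sensitivity * cases            # infected people who test positive
    fn = cases - tp                     # infected people who test negative
    tn = specificity * non_cases        # uninfected people who test negative
    fp = non_cases - tn                 # uninfected people who test positive
    return {
        "TP": round(tp), "FP": round(fp),
        "FN": round(fn), "TN": round(tn),
        "PPV": tp / (tp + fp),          # chance a positive result is correct
        "NPV": tn / (tn + fn),          # chance a negative result is correct
    }

# Asymptomatic example from the text: SD Biosensor STANDARD Q summary
# estimates (sensitivity 69.2%, specificity 99.1%) in 10,000 people
# at 0.5% prevalence.
result = apply_test(10_000, 0.005, 0.692, 0.991)
```

Run with those inputs, this reproduces the illustrative counts above: about 35 true positives and 90 false positives (125 positives in total), for a PPV near 28%.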

Molecular tests

Although overall results for diagnosing and ruling out COVID‐19 were good (95.1% of infections correctly diagnosed and 99% correctly ruled out), 69% of the studies used the tests in laboratories instead of at the point of care, and few studies followed the test manufacturers' instructions. Most of the data relate to the ID NOW and Xpert Xpress tests. We noted a large difference in COVID‐19 detection between the two tests, but we cannot be certain whether results will remain the same in a real‐world setting. We could not investigate differences between people with and without symptoms, or the effect of time since symptoms began, because the studies did not provide enough information about their participants.

How reliable were the results of the studies?

In general, studies that assessed antigen tests used more rigorous methods than those that assessed molecular tests, particularly when selecting participants and performing the tests. Sometimes studies did not perform the test on the people for whom it was intended and did not follow the manufacturers’ instructions for using the test. Sometimes the tests were not carried out at the point of care. Nearly all the studies (97%) relied on a single negative RT‐PCR result as evidence of no COVID‐19 infection. Results from different test brands varied, and few studies directly compared one test brand with another. Finally, not all studies gave enough information about their participants for us to judge how long they had had symptoms, or even whether or not they had symptoms.

What does this mean?

Some antigen tests are accurate enough to replace RT‐PCR when used in people with symptoms. This would be most useful when quick decisions are needed about patient care, or if RT‐PCR is not available. Antigen tests may be most useful to identify outbreaks, or to select people with symptoms for further testing with PCR, allowing self‐isolation or contact tracing and reducing the burden on laboratory services. People who receive a negative antigen test result may still be infected.

Several point‐of‐care molecular tests show very high accuracy and potential for use, but more evidence of their performance when evaluated in real life settings is required.

We need more evidence on rapid testing in people without symptoms, on the accuracy of repeated testing, testing in non‐healthcare settings such as schools (including self‐testing), and direct comparisons of test brands, with testers following manufacturers’ instructions.

How up‐to‐date is this review?

This review updates our previous review and includes evidence published up to 30 September 2020.

Authors' conclusions

Implications for practice

We consider the implications for practice for this review separately for symptomatic and for asymptomatic testing.

In the Role of index test(s) section, we suggested that for symptomatic individuals, and if sufficiently accurate, point‐of‐care testing could be used either to replace laboratory‐based RT‐PCR or as a triage to RT‐PCR. As point‐of‐care tests are more accessible and provide a result more quickly than RT‐PCR, their use may in theory increase case detection and speed up isolation and contact tracing, leading to a reduction in disease spread and a reduced burden on laboratory services.

The evidence included to date suggests that:

1. For diagnosis in symptomatic individuals in the first few days of symptoms, the most accurate rapid antigen tests are a useful alternative to laboratory‐based RT‐PCR where immediate results are required for timely patient management or where there are significant logistical or financial challenges in delivering RT‐PCR in a timely manner. Rapid antigen tests are only sufficiently sensitive in the first week since onset of symptoms.

Antigen tests vary in sensitivity, and only those shown to meet appropriate criteria, such as WHO's priority target product profiles for COVID‐19 diagnostics (i.e. sensitivity ≥ 80% and specificity ≥ 97%; WHO 2020c), could be considered as a rational substitute for RT‐PCR.

Tests had high specificity, thus in symptomatic populations (where prevalence is likely to be high) the risk of false positives is low. However, a test with 80% sensitivity relative to RT‐PCR misses 20% of the infections that RT‐PCR would detect. The possibility of false negative results should therefore be considered in those with a high clinical suspicion of COVID‐19, particularly if tested several days after onset of symptoms when viral load levels may have fallen.

2. Rapid antigen tests may be used simultaneously in combination with RT‐PCR for symptomatic people, particularly where RT‐PCR turn‐around times are slow, to exploit the benefits of earlier results and consequent contact‐tracing and isolation. Given the risk of false‐negative results, isolation may be required until RT‐PCR‐negative results are obtained. Similarly, for investigation of local outbreaks, rapid antigen testing in a clearly defined population may establish cases and contacts that require isolation whilst awaiting results from RT‐PCR.

In other circumstances, rapid antigen tests may be used to triage people for follow‐on RT‐PCR testing (rather than everyone receiving a PCR test), depending on prevalence and on the consequences of false positive and false negative results.

Where prevalence is low, positive rapid test results require confirmatory testing to avoid unnecessary quarantine measures (PPVs of around 85% to 90% for antigen assays mean that between 1 in 10 and 1 in 7 positive results will be falsely positive). If unverified, negative rapid test results should be delivered with appropriate advice on self‐isolation procedures for the duration of symptoms, in order to minimise the effect on transmission of infection from missed cases. RT‐PCR tests should still be considered for people with a high clinical suspicion of COVID‐19 and a negative rapid test result.

Where prevalence is higher (i.e. 20% or higher), false positives are less of a concern (PPVs are 96% to 100%) but the impact from false negative results becomes increasingly important, and all test negatives may be considered for verification. At 20% prevalence, and using data for the more sensitive of our three exemplar assays, between 3% and 6% of those with negative rapid test results are missed cases of SARS‐CoV‐2 (24 to 50 cases missed out of a total of 200 cases). The lower the NPV, the greater the potential effect on transmission of infection from missed cases, and the greater the impact from delays in commencing contact tracing. For scenarios in which positive results do not receive confirmatory testing, it is important that assays with high specificities (in the range of 99% to 100%) are selected in order to minimise the impact of false positive results at higher prevalences of disease.

3. We identified virtually no evidence for mass screening of asymptomatic individuals with no known exposure using rapid antigen tests. A small study screening travellers returning from high‐risk countries (Cerutti 2020) identified only five SARS‐CoV‐2 infections (prevalence of 3%), with a reported sensitivity of antigen testing for detecting infection of 40%. However, important larger studies have been published since the end of our search, as mentioned above.

The key focus in mass screening is identification of individuals who are or will become infectious. PCR‐positives define those who had detectable viral particles on their swab; this group will include most of those who are or will become infectious, but also individuals post‐infection with residual viral particles. Without a reference standard for infectiousness, test accuracy studies cannot assess the ability of the test to detect the infectious subgroup of infections, and cannot provide evidence as to how well rapid antigen tests differentiate between individuals requiring isolation and those who pose no risk. The effectiveness of mass screening using these tests will only be established through outcome studies, such as cluster‐randomised community trials.

Given the low false positive rate of rapid tests, when they are used during an outbreak those testing positive have a high chance of being true positives, and thus the test can be used to identify cases requiring isolation. Consideration should be given to whether test positives should be confirmed with PCR to identify false positives. With a 1% prevalence, a test with 40% sensitivity and 99.6% specificity would yield as many false positives as true positives.
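The break‐even claim above can be checked directly. This sketch (the cohort size is our own illustrative choice) shows why 1% prevalence with 40% sensitivity and 99.6% specificity gives roughly equal numbers of true and false positives:

```python
population = 10_000   # illustrative cohort size
prevalence = 0.01     # 1% truly infected
sensitivity = 0.40
specificity = 0.996

cases = population * prevalence                  # 100 infected people
true_positives = sensitivity * cases             # 0.40 * 100 = 40
false_positives = (1 - specificity) * (population - cases)
# 0.004 * 9900 = 39.6 — almost exactly as many false as true positives,
# so the positive predictive value is only about 50%.
```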

However, the low and variable sensitivity, and lack of evidence that those who test negative are not, or will not become, infectious indicates that those who are rapid antigen test‐negative cannot be considered free of risk of being, or of becoming, infectious. In any screening or mass testing programme people testing negative may still have a non‐negligible risk of infection.

4. We did not find any evidence of test accuracy in at‐risk asymptomatic groups, such as contacts of confirmed cases, hospital workers, or during local outbreaks at schools, workplaces, or care homes. The impact of low‐sensitivity tests in these settings is greater than in mass screening, as there will be higher numbers of false negatives, which could either create new outbreaks or will increase the severity of existing outbreaks. Positive cases will be more likely to be true positives than in mass screening settings.

5. We did not find any evidence evaluating the repeated use of tests. Although serial testing (over a number of days), or combinations of different rapid tests (e.g. an antigen test followed by a rapid molecular test) on the same sample, are proposed to overcome the limitations of low test sensitivity, all of these strategies require validation. Use of multiple tests may increase false positive results, and there are likely to be many individuals with repeated false negative results, reducing the expected benefit of subsequent tests. It is unlikely that models will be able to predict how well repeated tests and test combinations would work.

6. Some rapid molecular tests showed promising accuracy levels approximating those of laboratory‐based RT‐PCR, and thus may have a role in small‐capacity settings where obtaining test results within two hours will enable appropriate decision making. Results for Xpert Xpress, COVID Nudge and SAMBA II all showed high sensitivity and specificity. However, we identified methodological concerns with many of the evaluations, such that we cannot be certain how the tests will perform when used in a point‐of‐care setting. Any application in practice should be accompanied by a proper evaluation to ascertain performance in real‐world settings. Rapid molecular tests do not have all the logistical advantages of rapid antigen tests, and the resource implications of their use at scale are potentially high, but they may be well suited to some testing scenarios. There is no evidence for use of rapid molecular tests in asymptomatic populations.

Our conclusions are in line with those in the first version of this review despite the increase in the evidence base. Ultimately, decisions around rapid testing will be driven not only by diagnostic accuracy but by acceptable levels of test complexity, time to result, access and acceptability to those being tested, and how test results influence individual behaviour, all of which might vary according to the setting in which the tests are to be used.

Implications for research

There is now a considerable volume of research on point‐of‐care tests for SARS‐CoV‐2 infection. However, further well‐designed prospective and comparative evaluations of individual tests and testing strategies in clinically relevant settings are urgently needed. Studies should recruit consecutive series of eligible participants, clearly describe participants' clinical status, and document time from symptom onset or time since exposure. Point‐of‐care tests must be conducted in accordance with manufacturer instructions for use, and across the spectrum of point‐of‐care settings and test operators.

Evaluations are needed both of individual tests and of strategies that use repeated tests. For molecular assays, field trials are needed, not only to demonstrate test accuracy in these groups but also acceptability and ease of use outside of centralised laboratories.

We observed a number of studies of molecular assays employing discrepant analysis, in particular to confirm the disease status of samples with false positive results. There is a considerable risk that this type of selective re‐testing leads to distorted results. If there is sufficient concern about the reliability of a single RT‐PCR test, then all samples should be tested with two RT‐PCR assays. Finally, any future research study needs to be clear about eligibility and exclusion decisions throughout the whole diagnostic pathway, and should conform to the updated Standards for Reporting of Diagnostic Accuracy (STARD) guideline (Bossuyt 2015).

Consideration needs to be given to the best method for evaluating mass screening programmes. Whilst test accuracy studies help indicate which tests are likely to detect the greatest numbers of cases with the fewest false positives, whether detecting asymptomatic cases leads to worthwhile reductions in disease spread will only be properly answered by studies of impact, not accuracy.

Summary of findings

Summary of findings 1. Diagnostic accuracy of point‐of‐care antigen and molecular‐based tests for the diagnosis of SARS‐CoV‐2 infection

Question

What is the diagnostic accuracy of rapid point‐of‐care antigen and molecular‐based tests for the diagnosis of SARS‐CoV‐2 infection?

Population

Adults or children with suspected:

  • current SARS‐CoV‐2 infection

or populations undergoing screening for SARS‐CoV‐2 infection, including

  • asymptomatic contacts of confirmed COVID‐19 cases

  • community screening

Index test

Any rapid antigen or molecular‐based test for diagnosis of SARS‐CoV‐2 meeting the following criteria:

  • portable or mains‐powered device

  • minimal sample preparation requirements

  • minimal biosafety requirements

  • no requirement for a temperature‐controlled environment

  • test results available within 2 hours of sample collection

Target condition

Detection of current SARS‐CoV‐2 infection

Reference standard

For COVID‐19 cases: positive RT‐PCR alone or clinical diagnosis of COVID‐19 based on established guidelines or combinations of clinical features

For non‐COVID‐19 cases: negative RT‐PCR or pre‐pandemic sources of samples

Action

False negative results mean missed cases of COVID‐19 infection, with either delayed or no confirmed diagnosis and increased risk of community transmission due to false sense of security

False positive results lead to unnecessary self‐isolation or quarantine, with the potential for new infection to be acquired

Quantity of evidence

| Sample type | Number of studies | Total samples | Samples from confirmed SARS‐CoV‐2 cases |
| --- | --- | --- | --- |
| Respiratory | 77 | 24,418 | 7484 |
| Non‐respiratory | 1 | 79 | 29 |

Limitations in the evidence

Risk of bias

(based on 78 studies)

Participants: high (29) or unclear (27) risk in 56 studies (72%)

Index test (antigen tests): high (0) or unclear (19) risk in 19 studies (40% of 48 studies)

Index test (molecular tests): high (3) or unclear (22) risk in 25 studies (83% of 30 studies)

Reference standard: high (66) or unclear (6) risk in 72 studies (92%)

Flow and timing: high (29) or unclear (36) risk in 65 studies (83%)

Concerns about applicability

(based on 78 studies)

Participants: high concerns in 35 studies (45%)

Index test (antigen tests): high concerns in 23 studies (48% of 48 studies)

Index test (molecular tests): high concerns in 16 studies (53% of 30 studies)

Reference standard: high concerns in 76 studies (97%)

Findings: antigen tests

| Group | Evaluations (studies) | Samples (SARS‐CoV‐2 cases) | Sensitivity, % (95% CI) [range] | Specificity, % (95% CI) [range] |
| --- | --- | --- | --- | --- |
| Symptomatic | 37 (27) | 15,530 (4410) | 72.0 (63.7 to 79.0) [0% to 100%] | 99.5 (98.5 to 99.8) [8% to 100%] |
| Symptomatic (up to 7 days from onset of symptoms)a | 26 (21) | 2320 (2320) | 78.3 (71.1 to 84.1) [15% to 95%] | |
| Asymptomatic | 12 (10) | 1581 (295) | 58.1 (40.2 to 74.1) [29% to 85%] | 98.9 (93.6 to 99.8) [14% to 100%] |

Examples of pooled results for individual antigen tests using data for evaluations compliant with manufacturer instructions for use according to symptom status

Symptomatic participants

| Test | Evaluations | Samples | SARS‐CoV‐2 cases | Sensitivity, % (95% CI) | Specificity, % (95% CI) |
| --- | --- | --- | --- | --- | --- |
| Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip | 3 | 780 | 414 | 34.1 (29.7 to 38.8) | 100 (99.0 to 100) |
| Abbott ‐ Panbio Covid‐19 Ag | 3 | 1094 | 252 | 75.1 (57.3 to 87.1) | 99.5 (98.7 to 99.8) |
| SD Biosensor ‐ STANDARD Q COVID‐19 Ag | 3 | 1947 | 336 | 88.1 (84.2 to 91.1) | 99.1 (97.8 to 99.6) |

Asymptomatic participants

| Test | Evaluations | Samples | SARS‐CoV‐2 cases | Sensitivity, % (95% CI) | Specificity, % (95% CI) |
| --- | --- | --- | --- | --- | --- |
| Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip | 2 | 45 | 14 | 28.6 (8.4 to 58.1) | 100 (88.8 to 100) |
| Abbott ‐ Panbio Covid‐19 Ag | 1 | 474 | 47 | 48.9 (35.1 to 62.9) | 98.1 (96.3 to 99.1) |
| SD Biosensor ‐ STANDARD Q COVID‐19 Ag | 1 | 127 | 13 | 69.2 (38.6 to 90.9) | 99.1 (95.2 to 100) |

Symptomatic participants: average sensitivity and specificity (and 95% CIs) applied to a hypothetical cohort of 1000 patients where 50, 100 and 200 have COVID‐19 infection

| Test | Prevalence | TP (95% CI) | FP (95% CI) | FN (95% CI) | TN (95% CI) | PPV | 1 – NPV |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Coris Bioconcept | 5% | 17 (15 to 19) | 0 (0 to 10) | 33 (31 to 35) | 950 (941 to 950) | 100% | 3.4% |
| Coris Bioconcept | 10% | 34 (30 to 39) | 0 (0 to 9) | 66 (61 to 70) | 900 (891 to 900) | 100% | 6.8% |
| Coris Bioconcept | 20% | 68 (59 to 78) | 0 (0 to 8) | 132 (122 to 141) | 800 (792 to 800) | 100% | 14.1% |
| Abbott ‐ Panbio Covid‐19 Ag | 5% | 38 (29 to 44) | 5 (2 to 12) | 12 (6 to 21) | 945 (938 to 948) | 89% | 1.3% |
| Abbott ‐ Panbio Covid‐19 Ag | 10% | 75 (57 to 87) | 5 (2 to 12) | 25 (13 to 43) | 896 (888 to 898) | 94% | 2.7% |
| Abbott ‐ Panbio Covid‐19 Ag | 20% | 150 (115 to 174) | 4 (2 to 10) | 50 (26 to 85) | 796 (790 to 798) | 97% | 5.9% |
| SD Biosensor ‐ STANDARD Q COVID‐19 Ag | 5% | 44 (42 to 46) | 9 (4 to 21) | 6 (4 to 8) | 941 (929 to 946) | 84% | 0.6% |
| SD Biosensor ‐ STANDARD Q COVID‐19 Ag | 10% | 88 (84 to 91) | 8 (4 to 20) | 12 (9 to 16) | 892 (880 to 896) | 92% | 1.3% |
| SD Biosensor ‐ STANDARD Q COVID‐19 Ag | 20% | 176 (168 to 182) | 7 (3 to 18) | 24 (18 to 32) | 793 (782 to 797) | 96% | 2.9% |

Asymptomatic participants: average sensitivity and specificity (and 95% CIs) applied to a hypothetical cohort of 10,000 patients where 50, 100 and 200 have COVID‐19 infection

| Test | Prevalence | TP (95% CI) | FP (95% CI) | FN (95% CI) | TN (95% CI) | PPV | 1 – NPV |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Coris Bioconcept | 0.5% | 14 (4 to 29) | 0 (0 to 1114) | 36 (21 to 46) | 9950 (8836 to 9950) | 100% | 0.4% |
| Coris Bioconcept | 1% | 29 (8 to 58) | 0 (0 to 1109) | 71 (42 to 92) | 9900 (8791 to 9900) | 100% | 0.7% |
| Coris Bioconcept | 2% | 57 (17 to 116) | 0 (0 to 1098) | 143 (84 to 183) | 9800 (8702 to 9800) | 100% | 1.4% |
| Abbott ‐ Panbio Covid‐19 Ag | 0.5% | 24 (18 to 31) | 189 (90 to 368) | 26 (19 to 32) | 9761 (9582 to 9860) | 11% | 0.3% |
| Abbott ‐ Panbio Covid‐19 Ag | 1% | 49 (35 to 63) | 188 (89 to 366) | 51 (37 to 65) | 9712 (9534 to 9811) | 21% | 0.5% |
| Abbott ‐ Panbio Covid‐19 Ag | 2% | 98 (70 to 126) | 186 (88 to 363) | 102 (74 to 130) | 9614 (9437 to 9712) | 34% | 1.0% |
| SD Biosensor ‐ STANDARD Q COVID‐19 Ag | 0.5% | 35 (19 to 45) | 90 (0 to 478) | 15 (5 to 31) | 9860 (9472 to 9950) | 28% | 0.2% |
| SD Biosensor ‐ STANDARD Q COVID‐19 Ag | 1% | 69 (39 to 91) | 89 (0 to 475) | 31 (9 to 61) | 9811 (9425 to 9900) | 44% | 0.3% |
| SD Biosensor ‐ STANDARD Q COVID‐19 Ag | 2% | 138 (77 to 182) | 88 (0 to 470) | 62 (18 to 123) | 9712 (9330 to 9800) | 61% | 0.6% |

Findings: rapid molecular tests

| Evaluations (studies) | Samples | SARS‐CoV‐2 cases | Average sensitivity (95% CI) [Range] | Average specificity (95% CI) [Range] |
| --- | --- | --- | --- | --- |
| 29 (26) | 4351 | 1787 | 95.1 (90.5 to 97.6) [57% to 100%] | 98.8 (98.3 to 99.2) [92% to 100%] |

Pooled results for individual tests, using data from evaluations compliant with the manufacturer's instructions for use

| Tests | Evaluations | Samples | SARS‐CoV‐2 cases | Sensitivity (95% CI) | Specificity (95% CI) |
| --- | --- | --- | --- | --- | --- |
| Abbott ‐ ID NOW | 4 | 812 | 222 | 73.0 (66.8 to 78.4) | 99.7 (98.7 to 99.9) |
| Cepheid ‐ Xpert Xpress | 2 | 100 | 29 | 100 (88.1 to 100) | 97.2 (89.4 to 99.3) |
| DRW ‐ SAMBA II | 1 | 149 | 33 | 87.9 (71.8 to 96.6) | 97.4 (92.6 to 99.5) |
| DNANudge COVID Nudge | 1 | 386 | 71 | 94.4 (86.2 to 98.4) | 100 (98.8 to 100) |

Average sensitivity and specificity (and 95% CIs) applied to a hypothetical cohort of 1000 patients where 50, 100 and 200 have COVID‐19 infection

| Tests | Prevalence | TP (95% CI) | FP (95% CI) | FN (95% CI) | TN (95% CI) | PPV^b | 1 – NPV^c |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ID NOW | 5% | 37 (33 to 39) | 3 (1 to 12) | 14 (11 to 17) | 947 (938 to 949) | 93% | 1.4% |
| | 10% | 73 (67 to 78) | 3 (1 to 12) | 27 (22 to 33) | 897 (888 to 899) | 96% | 2.9% |
| | 20% | 146 (134 to 157) | 2 (1 to 10) | 54 (43 to 66) | 798 (790 to 799) | 98% | 6.3% |
| Xpert Xpress | 5% | 50 (44 to 50) | 27 (7 to 101) | 0 (0 to 6) | 923 (849 to 943) | 65% | 0.0% |
| | 10% | 100 (88 to 100) | 25 (6 to 95) | 0 (0 to 12) | 875 (805 to 894) | 80% | 0.0% |
| | 20% | 200 (176 to 200) | 22 (6 to 85) | 0 (0 to 24) | 778 (715 to 794) | 90% | 0.0% |
| SAMBA II | 5% | 44 (36 to 48) | 25 (5 to 70) | 6 (2 to 14) | 925 (880 to 945) | 64% | 0.6% |
| | 10% | 88 (72 to 97) | 23 (5 to 67) | 12 (3 to 28) | 877 (833 to 896) | 79% | 1.4% |
| | 20% | 176 (144 to 193) | 21 (4 to 59) | 24 (7 to 56) | 779 (741 to 796) | 89% | 3.0% |
| COVID Nudge | 5% | 47 (43 to 49) | 0 (0 to 11) | 3 (1 to 7) | 950 (939 to 950) | 100% | 0.3% |
| | 10% | 94 (86 to 98) | 0 (0 to 11) | 6 (2 to 14) | 900 (889 to 900) | 100% | 0.6% |
| | 20% | 189 (172 to 197) | 0 (0 to 10) | 11 (3 to 28) | 800 (790 to 800) | 100% | 1.4% |

1 – NPV: 1 – negative predictive value (the percentage of people with negative results who are infected); Ag: antigen; CI: confidence interval; FN: false negative; FP: false positive; IFU: [manufacturers'] instructions for use; PPV: positive predictive value (the percentage of people with positive results who are infected); RT‐PCR: reverse transcription polymerase chain reaction; TN: true negative; TP: true positive

^a Specificity only estimated in 8 of 26 evaluations by time after symptom onset.
^b PPV (positive predictive value) defined as the percentage of positive rapid test results that are truly positive according to the reference standard diagnosis.
^c 1 – NPV (negative predictive value), where NPV is defined as the percentage of negative rapid test results that are truly negative according to the reference standard diagnosis.
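The hypothetical‐cohort figures above follow directly from the average sensitivity and specificity estimates and the assumed prevalence. A minimal sketch of that arithmetic (illustrative only; the function name is ours, and the review's confidence intervals come from the pooled analyses, not from this calculation):

```python
def hypothetical_cohort(sensitivity, specificity, prevalence, n=1000):
    """Split a cohort of n people into expected TP/FP/FN/TN counts and
    derive predictive values from sensitivity, specificity and prevalence."""
    infected = n * prevalence
    uninfected = n - infected
    tp = sensitivity * infected        # infected people the test detects
    fn = infected - tp                 # infected people the test misses
    tn = specificity * uninfected      # uninfected people correctly negative
    fp = uninfected - tn               # uninfected people wrongly positive
    ppv = tp / (tp + fp)               # share of positives truly infected
    npv = tn / (tn + fn)               # share of negatives truly uninfected
    return {"TP": round(tp), "FP": round(fp), "FN": round(fn),
            "TN": round(tn), "PPV": ppv, "1-NPV": 1 - npv}

# Abbott ID NOW at 10% prevalence (sensitivity 73.0%, specificity 99.7%)
cohort = hypothetical_cohort(0.730, 0.997, 0.10)
```

At 10% prevalence, the ID NOW point estimates reproduce the tabulated 73 true positives, 3 false positives, a PPV of about 96% and a 1 – NPV of about 2.9%.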

Background

Severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) and the resulting COVID‐19 pandemic present important diagnostic evaluation challenges. These include: understanding the value of signs and symptoms in predicting possible infection; assessing whether existing biochemical and imaging tests can identify infection or people needing critical care; and evaluating whether in vitro diagnostic tests can accurately identify and rule out current SARS‐CoV‐2 infection, and identify those with past infection, with or without immunity.

We are creating and maintaining a suite of living systematic reviews to cover the roles of tests and patient characteristics in the diagnosis of COVID‐19. This review is the first update of a review summarising evidence of the accuracy of rapid antigen and molecular tests that are suitable for use at the point of care. In some scenarios the tests could potentially be used as alternatives to standard laboratory‐based molecular assays, such as reverse transcription polymerase chain reaction (RT‐PCR) assays, that are relied on for identifying current infection; in others, they may be used where no testing is currently done. If sufficiently accurate, point‐of‐care tests have the potential to greatly expand access to, and the speed of, testing. In turn, they may have greater impact on public health than laboratory‐based molecular methods as they are less expensive, provide results more quickly and do not require the same technical expertise and laboratory capacity. These tests can be undertaken locally, avoiding the need for centralised testing facilities that rarely meet the needs of patients, caregivers, health workers and society as a whole, especially in low‐ and middle‐income countries. As these are rapid tests, their results can be returned within the same clinical encounter, facilitating timely decisions concerning the need for isolation and contact tracing activities.

Target condition being diagnosed

COVID‐19 is the disease caused by infection with the SARS‐CoV‐2 virus. The key target conditions for this suite of reviews are current SARS‐CoV‐2 infection, current COVID‐19 disease, and past SARS‐CoV‐2 infection. The tests included in this review concern the identification of current infection, as defined by reference standard methods of diagnosis, including molecular assays such as RT‐PCR, or internationally recognised clinical guidelines for diagnosis of SARS‐CoV‐2. In the context of test evaluation, and throughout this review, we use the term 'reference standard' to denote the best available method (test or tests) for diagnosing the target condition, as opposed to other uses of the term in diagnostic virology (such as reference methods or reference materials).

For current infection, the severity of the disease is of ultimate importance for patient outcomes. However, rapid testing does not establish severity of disease, and for this review we consider the role of point‐of‐care tests for detecting SARS‐CoV‐2 infection of any severity, distinguishing only between symptomatic and asymptomatic infection.

COVID‐19 public health interventions focus on reducing disease transmission, thus it is important to identify and isolate people who are infected before or whilst they are infectious. It is reasonably presumed that people with symptoms who meet national criteria for COVID‐19 testing, or who are identified through contact tracing, have a high enough risk of being infectious to ask them to isolate. However, assessing the risk of an individual being infectious in asymptomatic screening is more difficult, as there is no reference standard test for being ‘infectious’. Using RT‐PCR status as a reference standard (as is done for the target condition of ‘infection’) will ensure that infectious people are not missed, but because RT‐PCR continues to detect viral RNA days and weeks after the onset of infection, it will wrongly classify some people as infectious. Alternative reference standards that have been proposed for infectiousness include assessing the viability of the virus using viral culture, or using a value of the cycle threshold (Ct value) from RT‐PCR results to group individuals above or below a particular value (as a proxy for viral load) as more or less likely to be infectious. Converting Ct values (also known as quantification cycle (Cq) or crossing point (Cp) values) into direct quantitative values of viral load (viral copies per cell) is possible but challenging, as the relationship between Ct values and viral load varies between machines and laboratories. Thus comparison at fixed Ct values is unlikely to be comparable across studies. Viral culture is unsuitable as a reference standard because it is technically complex and often unreliable, which leads to it being an insensitive test (the failure to culture virus potentially being a result of the culture technique and not an indicator of non‐infectiousness).
The suitability of RT‐PCR is limited, as the relationship between viral load (inversely indicated by the Ct value) and risk of infectiousness is a continuum without a meaningful cut‐point (virus has been cultured from samples with Ct values as high as 35 (Singanayagam 2020)). Similarly, those with low viral loads at the onset of infection will be missed. A preferable alternative, of tracking contacts for evidence of secondary infections, requires longitudinal follow‐up and is better considered as a question about risk of transmission, which can be addressed using predictive modelling approaches (taking into account host, agent and environmental factors). This is in contrast to the diagnostic test accuracy paradigm, which can only determine whether individuals are infected at a single point in time.

For these reasons, this review only focuses on the target condition of 'infection' for both symptomatic and asymptomatic applications of tests. We do report results where they are presented split by an RT‐PCR Ct value to report on accuracy according to groups with higher and lower viral load, but advise caution on their interpretation considering the lack of standardisation of PCR Ct values. Given the current state of the scientific knowledge we do not consider it appropriate to consider these as groups which are defined as 'infectious' and 'not infectious'.

RT‐PCR carries a very small risk of false positive results for infection and a higher risk of false negative results. False positive results may result from failures in sampling or laboratory protocols (e.g. mislabelling), contamination during sampling or processing, or low‐level reactions during PCR (Healy 2020; Mayers 2020). At times when SARS‐CoV‐2 infections have been rare, population prevalence surveys using RT‐PCR have shown test positivity rates of 0.44% (95% credible interval: 0.22% to 0.76%) (August 2020; ONS 2020), and 0.077% (0.065%, 0.092%) (June to July 2020; Riley 2020 React‐1 study). These values can be used to place an upper bound on the possible false positive rate of RT‐PCR of less than 0.077% (as the total numbers testing positive will comprise both true positive and false positive RT‐PCR results). The World Health Organization (WHO) recently issued a notice of concern regarding interpretation of specimens at or near the limit for PCR positivity (i.e. those with high cycle threshold (Ct) values), citing potential difficulties in distinguishing the presence of the target virus from these types of background ‘noise’ (WHO 2020a). False negative rates have been estimated by looking at individuals with symptoms who initially test negative but test positive on a subsequent test. These rates have been estimated to be as high as 20% to 30% in the first week after symptom onset (Arevalo‐Rodriguez 2020; Yang 2020a; Zhao 2020; Kucirka 2020). Including probable COVID‐19 cases within the target condition, as defined by internationally recognised clinical guidelines for diagnosis of SARS‐CoV‐2, will partially mitigate these missed cases.

Index test(s)

The primary consideration for the eligibility of tests for inclusion in this review is that they should detect current infection and should have the capacity to be performed at the ‘point of care’ or in a ‘near‐patient’ testing role. There is an ongoing debate around the specific use and definitions of these terms, therefore for the purposes of this review, we consider ‘point‐of‐care’ and ‘near patient’ to be synonymous, but for consistency and avoidance of confusion, we use the term ‘point‐of‐care’ throughout.

We have adapted a definition of point‐of‐care testing, namely that it “refers to decentralized testing that is performed by a minimally trained healthcare professional near a patient and outside of central laboratory testing” (WHO 2018), with the additional caveat that test results must be available within a single clinical encounter (Pai 2012). Our criteria for defining a point‐of‐care test are therefore:

  • the equipment for running and/or reading the assay must be portable or easily transported, although mains power may be required;

  • minimal sample preparation requirements, for example, single‐step mixing, with no requirement for additional equipment or precise sample volume transfer unless a disposable automatic fill or graduated transfer device is used;

  • minimal biosafety requirements, for example, personal protective equipment (PPE) for sample collector and test operator, good ventilation and a biohazard bag for waste disposal;

  • no requirement for a temperature‐controlled environment; and

  • test results available within two hours of sample collection.

Tests for detection of current infection that are currently suitable for use at the point of care include antigen tests and molecular‐based tests. Both types of test use the same respiratory‐tract samples acquired by swabbing, washing or aspiration as for laboratory‐based RT‐PCR. Rapid antigen tests use lateral flow immunoassays, which are disposable devices, usually in the form of plastic cassettes akin to a pregnancy test. Viral antigen is captured by dedicated antibodies that are either colloidal gold‐ or fluorescent‐labelled. Antigen detection is indicated by visible lines appearing on the test strip (colloidal gold‐based immunoassays, or CGIA), or through fluorescence, which can be detected using an immunofluorescence analyser (fluorescence immunoassays or FIA). Molecular‐based tests to detect viral ribonucleic acid (RNA) have historically been laboratory‐based assays using RT‐PCR technology (see Alternative test(s)). In recent years, automated, single‐step RT‐PCR methods have been developed, as well as other nucleic acid amplification methods, such as isothermal amplification, that do not require the sophisticated thermal cycling involved in RT‐PCR (Green 2020). These technological advances have allowed molecular technologies to be developed that are suitable for use in a point‐of‐care context (Kozel 2017); however, they still require small portable machines and many take longer to produce results than antigen tests.

Following the emergence of COVID‐19 there has been prolific industry activity to develop accurate tests. The Foundation for Innovative New Diagnostics (FIND) and the Johns Hopkins Center for Health Security have maintained online lists of available tests for SARS‐CoV‐2 (FIND 2020). At the time of writing (5 January 2021), FIND listed 129 rapid antigen tests, 118 of which are described as "commercialised" and 92 of which have been identified as having regulatory approval. These numbers are a substantial increase on the 48 listed, 32 commercialised and 21 with regulatory approval at the time of our original review (19 July 2020). A total of 142 molecular tests were described as automated, including both laboratory‐based assays and assays suitable for use outside of a laboratory setting (i.e. near or at the point of care). Further information from FIND indicates that 53 of the 142 assays were categorised as point‐of‐care or near point‐of‐care tests, including 43 with regulatory approval. This classification was based on the information provided to FIND by the test manufacturers and does not necessarily mean that these tests meet the criteria for point‐of‐care tests that we have specified for this review. The numbers of tests of these types will continue to increase over time.

Given the urgent need to identify the evidence base for tests that are available for purchase, the focus of this first update of the review is on tests that are commercially produced. All commercially produced assays are supplied with a specific product code and product inserts or instructions for use (IFU) sheets that document the intended use of the test; sample storage, preparation, and testing procedures; who should deliver the test and in whom; and any restrictions around the type of samples that can be used.

There are many proposals for serial testing with lateral flow tests to detect infection, rather than a single use. In this case it would be appropriate to evaluate the accuracy of the strategy rather than a single test.

Clinical pathway

Patients may be tested for SARS‐CoV‐2 when they present with symptoms, have had known exposure to a confirmed case, or in a screening context, with no known exposure to SARS‐CoV‐2. The standard approach to diagnosis of SARS‐CoV‐2 infection is through laboratory‐based testing of swab samples taken from the upper respiratory (e.g. nasopharynx, oropharynx) or lower respiratory tract (e.g. bronchoalveolar lavage or sputum) with RT‐PCR. RT‐PCR is the primary method for detecting infection during the acute phase of the illness while the virus is still present. Both the WHO and the China CDC (National Health Commission of the People's Republic of China) have produced case definitions for COVID‐19 that include the presence of convincing clinical evidence (some including positive serology tests) when RT‐PCR is negative (Appendix 1).

Prior test(s)

Signs and symptoms are used in the initial diagnosis of suspected SARS‐CoV‐2 infection and to help identify those requiring tests. A number of key symptoms have been suggested as indicators of mild to moderate COVID‐19, including: cough, fever greater than 37.8 °C, headache, breathlessness, muscle pain, fatigue, and loss of sense of smell and taste (Struyf 2021). However, the recently published review of signs and symptoms found good evidence for the accuracy of these symptoms, alone or in combination, to be lacking (Struyf 2021).

Where people are asymptomatic but are being tested as part of screening (e.g. universal testing of students as part of a risk‐reduction effort) or on the basis of epidemiological risk factors, such as exposure to someone with confirmed SARS‐CoV‐2 or following travel to more highly endemic countries, no prior tests will have been conducted.

Role of index test(s)

For most settings in which testing for acute SARS‐CoV‐2 infection in symptomatic individuals takes place, results of molecular laboratory‐based RT‐PCR tests are unlikely to be available within a single clinical encounter. Point‐of‐care tests potentially have a role either as a replacement for RT‐PCR (if sufficiently accurate), or as a means of triaging and rapid management (quarantine or treatment, or both), with confirmatory RT‐PCR testing for those with negative rapid test results (CDC 2020; WHO 2020b). Obtaining quick results within a healthcare visit will allow faster decisions about isolation and healthcare interventions for those with positive test results, and allow contact tracing to begin in a more timely manner. Modelling studies suggest contact tracing is most effective if it starts within 24 hours of case detection, with delays in testing (e.g. due to laboratory turnaround time for reporting PCR results) leading to reductions in the proportion of onward transmissions per index case that can be prevented by track and trace (Kretzschmar 2020).

If sufficiently accurate, negative rapid test results in symptomatic patients could allow faster return to work or school, therefore conferring important economic and educational implications. Negative results also allow immediate consideration of other causes of symptoms, which may be time‐sensitive, for example bacterial pneumonia or thrombo‐embolism.

For asymptomatic individuals, if accurate, rapid tests may also be considered for screening at‐risk (exposed) populations, for example in hospital workers or in local outbreaks.

Rapid tests, particularly antigen tests which can be more easily delivered at scale, could also be used for mass screening purposes as recently piloted in Slovakia and in Liverpool UK (University of Liverpool 2020), or used in a more targeted fashion such as single test application at airports or for border entry, to allow entry to large public gatherings, or screening students as a risk‐reduction strategy (Ferguson 2020). Preliminary data on the rollout of such a policy in the UK has highlighted the many challenges in such an approach (Deeks 2020a; Nabavi 2021), and the requirement for full and proper field trial evaluations. Frequent repeated use of antigen tests in asymptomatic individuals with no known exposure to identify COVID‐19 cases has also been proposed (Larremore 2020), but field trial evaluations would be required to determine whether promising results from modelling studies can be borne out in practical settings (Crozier 2021).

Alternative test(s)

This review is one of seven that cover the range of tests and clinical characteristics being considered in the management of COVID‐19 (Deeks 2020b; McInnes 2020), five of which have already been published (Deeks 2020c; Salameh 2020; Stegeman 2020; Struyf 2021), including the first iteration of this review (Dinnes 2020). Full details of the alternative tests and evidence of their accuracy is summarised in these reviews. The SARS‐CoV‐2‐specific biomarker tests that might be considered as alternatives to point‐of‐care tests are considered here.

Laboratory‐based molecular tests

RT‐PCR tests for SARS‐CoV‐2 identify viral ribonucleic acid (RNA). Reagents for RT‐PCR were rapidly produced once the viral RNA sequence was published (Corman 2020). Testing is undertaken in central laboratories and can be very labour‐intensive, with several points along the path of performing a single test where errors may occur, although some automation of parts of the process is possible. The amplification process requires thermal cycling equipment to allow multiple temperature changes within a cycle, with cycles repeated up to 40 times until viral DNA is detected (Carter 2020). Although the amplification process for RT‐PCR can be completed in a relatively short timeframe, the stages of extraction, sample processing and data management (including reporting) mean that test results are typically only available within 24 to 48 hours. Where testing is undertaken in a centralised laboratory, transport times increase this further. The time to result for fully automated RT‐PCR assays is shorter than for manual RT‐PCR; however, most assays still require sample preparation steps that make them unsuitable for use at the point of care. Other nucleic acid amplification methods, including loop‐mediated isothermal amplification (LAMP), or CRISPR‐based nucleic acid detection methods, that allow amplification at a constant temperature are now commercially available (Chen 2020). These methods have the potential to reduce the time to produce test results after extraction and sample processing to minutes, but the time for the whole process may still be significant. Laboratory‐based molecular tests are most often applied to upper and lower respiratory samples although they are also being used on faecal and urine samples.

Antibody tests

Serology tests to measure antibodies to SARS‐CoV‐2 have been evaluated in people with active infection and in convalescent cases (Deeks 2020c). Antibodies are formed by the body's immune system in response to infections, and can be detected in whole blood, plasma or serum. Antibody tests are available for laboratory use including enzyme‐linked immunosorbent assay (ELISA) methods, or more advanced chemiluminescence immunoassays (CLIA). There are also rapid lateral flow assays (LFAs) for antibody testing that use a minimal amount of whole blood, plasma or serum on a testing strip as opposed to the respiratory specimens that are used for rapid antigen tests; all assays for antibody detection are considered in Deeks 2020c.

Rationale

It is essential to understand the clinical accuracy of tests and clinical features to identify the best way they can be used in different settings to develop effective diagnostic and management pathways for SARS‐CoV‐2 infection and disease. The suite of Cochrane living systematic reviews summarises evidence on the clinical accuracy of different tests and diagnostic features. Estimates of accuracy from these reviews will help inform diagnosis, screening, isolation, and patient‐management decisions.

Summary of the previous version of the review

The first iteration of this review (Dinnes 2020), included 22 publications reporting on a total of 18 study cohorts with 3198 unique samples, 1775 of which had confirmed SARS‐CoV‐2 infection. We identified data for eight commercial tests (four antigen and four molecular) and one in‐house antigen test.

We did not find any studies at low risk of bias and had concerns about applicability of results across all studies. We judged patient selection to be at high risk of bias in 50% of the studies because of deliberate oversampling of samples with confirmed SARS‐CoV‐2 infection (sample enrichment) and unclear in 38% (7/18) because of poor reporting. Sixteen (89%) studies used only a single, negative RT‐PCR to confirm the absence of SARS‐CoV‐2 infection, risking missed infections. There was a lack of information on blinding of the index test (n = 11), and about participant exclusions from analyses (n = 10). We did not observe differences in methodological quality between antigen and molecular test evaluations.

The eight evaluations of antigen tests reported considerable variation in sensitivity across studies (from 0% to 94%) with less variation in specificities (from 90% to 100%). The average sensitivity was 56.2% (95% CI 29.5 to 79.8%) and average specificity was 99.5% (95% CI 98.1% to 99.9%) (based on 943 samples, 596 with confirmed SARS‐CoV‐2). Data for individual antigen tests were limited with no more than two studies for any test.

We observed less variation in sensitivities across 13 evaluations of rapid molecular assays (range 68% to 100%) with similar variation in specificities (range 92% to 100%). Average sensitivity was 95.2% (95% CI 86.7% to 98.3%) and specificity 98.9% (95% CI 97.3% to 99.5%) based on a total of 2255 samples.

We were able to calculate pooled results for only two molecular tests: ID NOW (Abbott Laboratories; 5 evaluations) and Xpert Xpress (Cepheid Inc; 6 evaluations). Summary sensitivity for the Xpert Xpress assay (99.4%, 95% CI 98.0% to 99.8%) was 22.6 (95% CI 18.8 to 26.3) percentage points higher than that of ID NOW (76.8%, 95% CI 72.9% to 80.3%), whilst the specificity of Xpert Xpress (96.8%, 95% CI 90.6% to 99.0%) was marginally lower than that of ID NOW (99.6%, 95% CI 98.4% to 99.9%); a difference of −2.8 percentage points (95% CI 6.4 percentage points lower to 0.8 higher).
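For a single evaluation, sensitivity and specificity with approximate confidence intervals can be computed directly from the 2×2 table of index test results against the reference standard. A minimal sketch (the pooled estimates in this review come from hierarchical meta‐analysis models, not this calculation; the function names and counts below are illustrative, and the Wilson score interval is one common choice of approximate 95% CI):

```python
import math

def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (approx. 95% CI)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
    return centre - half, centre + half

def accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity, each with a Wilson 95% CI, from a 2x2 table."""
    return {
        "sensitivity": (tp / (tp + fn), wilson_ci(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_ci(tn, tn + fp)),
    }

# Illustrative counts: 90 true positives, 10 false negatives,
# 5 false positives, 95 true negatives
result = accuracy(tp=90, fp=5, fn=10, tn=95)
```

The Wilson interval behaves sensibly even when a cell is zero (e.g. 100% observed sensitivity), which is why it is often preferred to the simple normal approximation for small evaluations like those pooled above.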

Changes in the evidence base since the previous version

There has been a considerable increase in the number of evaluations available of antigen tests, and a lesser rise in the number of evaluations of molecular tests. More studies report key population features such as setting, and symptom status, and there has been an increase in direct swab testing as would occur in a point‐of‐care setting. However, due to the nature of sampling and the use of direct swab testing, few comparative studies are available. This review considers the available evidence in relevant population groups and settings according to test brand and compliance with manufacturer IFUs. We used the WHO's priority target product profiles for COVID‐19 diagnostics (i.e. acceptable performance criterion of sensitivity ≥ 80% and specificity ≥ 97%, or desirable criterion of ≥ 80% sensitivity and ≥ 99% specificity; WHO 2020c) as a benchmark against which to consider test performance.

We will update this review as often as is feasible to ensure that it provides current evidence about the accuracy of point‐of‐care tests.

This review follows a generic protocol that covers six of the seven Cochrane COVID‐19 diagnostic test accuracy reviews (Deeks 2020b). The Background and Methods sections of this review therefore use some text that was originally published in the protocol (Deeks 2020b), and text that overlaps some of our other reviews (Deeks 2020c; Struyf 2021).

Objectives

To assess the diagnostic accuracy of rapid point‐of‐care antigen and molecular‐based tests to determine if a person presenting in the community or in primary or secondary care has current SARS‐CoV‐2 infection, and to consider accuracy separately in symptomatic and asymptomatic population groups.

We estimated accuracy overall and separately according to symptom status (symptomatic and asymptomatic). Although we might expect to see differences in accuracy for testing of asymptomatic individuals with an epidemiological exposure to SARS‐CoV‐2 (targeted screening) compared to testing of asymptomatic individuals in a population screening setting, we did not anticipate finding sufficient numbers of studies for each testing application to allow any such difference to be explored. We will revisit this decision in subsequent iterations of this review.

Secondary objectives

Where data are available, we will investigate potential sources of heterogeneity that may influence diagnostic accuracy (either by stratified analysis or meta‐regression) according to test method and index test, participant or sample characteristics (duration of symptoms and viral load), study setting, study design and reference standard used.

We investigated adherence to manufacturers' IFUs in sensitivity analyses.

Methods

Criteria for considering studies for this review

Types of studies

We applied broad eligibility criteria to include all patient groups (that is, if patient population was unclear, we included the study) and all variations of a test.

We included studies of all designs that produce estimates of test accuracy or provide data from which we can compute estimates, including the following.

  • Studies restricted to participants confirmed to either have (or to have had) the target condition (to estimate sensitivity) or confirmed not to have (or have had) the target condition (to estimate specificity). These types of studies may be excluded in future review updates.

  • Single‐group studies, which recruit participants before disease status has been ascertained

  • Multi‐group studies, where people with and without the target condition are recruited separately (often referred to as two‐gate or diagnostic case‐control studies)

  • Studies based on either patients or samples

We excluded studies from which we could not extract data to compute either sensitivity or specificity.

We carefully considered the limitations of different study designs in the quality assessment and analyses.

We included studies reported in published journal papers, as preprints, and publicly available reports from independent bodies.

Participants

We included studies recruiting people presenting with suspicion of current SARS‐CoV‐2 infection or those recruiting populations where tests were used to screen for disease (for example, contact tracing or community screening).

We also included studies that recruited people known to have SARS‐CoV‐2 infection and known not to have SARS‐CoV‐2 infection (i.e. cases only or multi‐group studies).

We excluded small studies with fewer than 10 samples or participants. Although the size threshold of 10 is arbitrary, such small studies are likely to give unreliable estimates of sensitivity or specificity and may be biased.

Index tests

We included studies evaluating any rapid antigen or molecular‐based test for diagnosis of SARS‐CoV‐2, if it met the criteria outlined in the Background, that is:

  • requiring minimal equipment;

  • minimal sample preparation and biosafety considerations;

  • results available within two hours of sample collection; and

  • should be commercially produced (with test name and manufacturer or distributor documented).

All sample types (respiratory or non‐respiratory) were eligible. Strategies based on multiple applications of a test were also eligible for inclusion.

Target conditions

The target condition was current SARS‐CoV‐2 infection (either symptomatic or asymptomatic). We also refer to SARS‐CoV‐2 infection as ‘COVID‐19 infection’, particularly in the Plain Language Summary and summary of findings Table 1.

Reference standards

We anticipated that studies would use a range of reference standards to define both the presence and absence of SARS‐CoV‐2 infection. For the QUADAS‐2 (Quality Assessment tool for Diagnostic Accuracy Studies; Whiting 2011), assessment we categorised each method of defining the presence of SARS‐CoV‐2 according to the risk of bias (the chances that it would misclassify the presence or absence of infection) and whether it defined COVID‐19 in an appropriate way that reflected cases encountered in practice. Likewise, we considered the risk of bias in definitions of the absence of SARS‐CoV‐2, and whether the definition captured all those who might be tested in practice.

Evaluations of molecular tests generally consider agreement between molecular assays, for example, agreement of a new rapid test with a more standard RT‐PCR test. For the purposes of this review, we considered RT‐PCR to be the ‘reference standard’ for SARS‐CoV‐2 infection, and present results as ‘sensitivity’ and ‘specificity’ rather than percentage agreement. The results of further RT‐PCR analysis of discrepant cells (samples whose rapid test and RT‐PCR results disagreed) were also considered in sensitivity analyses. As discrepant analysis involves retesting only a sub‐sample of patients selected according to index and reference standard results, it can introduce bias (Hadgu 1999). Retesting of all samples with a second test in a composite reference standard would be preferable when there are concerns over the accuracy of the first reference test.

Search methods for identification of studies

Electronic searches

We used two main sources for our electronic searches through 30 September 2020, which were devised with the help of an experienced Cochrane Information Specialist with diagnostic test accuracy review expertise (RSp). These searches aimed to identify all articles related to COVID‐19 and SARS‐CoV‐2 and were not restricted to those evaluating a particular type of test. Thus, the searches used no terms that specifically focused on an index test, diagnostic accuracy or study methodology.

Cochrane COVID‐19 Study Register searches

We used the Cochrane COVID‐19 Study Register (covid-19.cochrane.org/), for searches conducted from inception of the Register to 28 March 2020. At that time, the register was populated by searches of PubMed, as well as trials registers at US National Institutes of Health Ongoing Trials Register ClinicalTrials.gov (clinicaltrials.gov) and the WHO International Clinical Trials Registry Platform (apps.who.int/trialsearch).

Search strategies were designed for maximum sensitivity, to retrieve all human studies on COVID‐19 and with no language limits. See Appendix 2.

COVID‐19 Living Evidence Database from the University of Bern

From 28 March 2020, we used the COVID‐19 Living Evidence database from the Institute of Social and Preventive Medicine (ISPM) at the University of Bern (www.ispm.unibe.ch) as the primary source of records for the Cochrane COVID‐19 diagnostic test accuracy reviews. This search includes PubMed, Embase, and preprints indexed in the bioRxiv and medRxiv databases. The search strategies are described on the ISPM website (ispmbern.github.io/covid-19/). See Appendix 3. To ensure comprehensive coverage, we also downloaded records from the ‘Bern feed’ from 1 January to 28 March 2020 and de‐duplicated them against those obtained via the Cochrane COVID‐19 Study Register.

Due to the increased volume of published and preprint articles, from 25 May 2020 onwards we used artificial intelligence text analysis to conduct an initial classification of documents as relevant or irrelevant, based on their title and abstract information (Appendix 4).

The decision to focus primarily on the Bern feed was because of the exceptionally large numbers of COVID‐19 studies available only as preprints. We are continuing to monitor the coverage of the Cochrane COVID‐19 Study Register and may move back to it as the primary source of records for subsequent review updates.

Other electronic sources

Prior to 28 March 2020 (when we began using the ‘Bern feed’), we identified Embase records through the Centers for Disease Control and Prevention (CDC), Stephen B Thacker CDC Library, COVID‐19 Research Articles Downloadable Database (cdc.gov/library/researchguides/2019novelcoronavirus/researcharticles.html), and de‐duplicated them against results from the Cochrane COVID‐19 Study Register. See Appendix 5.

We also checked our search results against two additional repositories of COVID‐19 publications up to 30 September 2020.

Both repositories allow their contents to be filtered according to studies potentially relating to diagnosis, and both have agreed to provide us with updates of new diagnosis studies added.

Searching other resources

We also contacted or accessed the websites of independent research groups undertaking test evaluations (for example, UK Public Health England (PHE), the Société Française de Microbiologie (SFM), and the Dutch National Institute for Public Health and the Environment (RIVM)), accessed studies co‐ordinated by FIND (finddx.org/covid-19/sarscov2-eval), and accessed the Diagnostics Global Health listing of manufacturer‐independent evaluations of antigen‐detecting rapid diagnostic tests (Ag‐RDTs) for SARS‐CoV‐2 (diagnosticsglobalhealth.org). We last accessed these additional resources on 16 November 2020.

We appeal to researchers to supply details of additional published or unpublished studies, which we will consider for inclusion in future updates, at the following email address: [email protected].

Data collection and analysis

Selection of studies

A team of experienced systematic review authors from the University of Birmingham screened the titles and abstracts of all records retrieved from the literature searches following the application of artificial intelligence text analysis (described in Electronic searches). Two review authors independently screened studies in Covidence. A third, senior review author resolved any disagreements. We tagged all records selected as potentially eligible according to the Cochrane COVID‐19 diagnostic test accuracy review(s) for which they might be eligible and we then exported them to separate Covidence reviews for each review title.

We obtained the full texts for all studies flagged as potentially eligible. Two review authors independently screened the full texts for one of the COVID‐19 biomarker reviews (molecular, antigen or antibody tests). We resolved any disagreements on study inclusion through discussion with a third review author.

Data extraction and management

One review author extracted the characteristics of each study, which a second review author checked. Items that we extracted are listed in Appendix 6.

Both review authors independently performed data extraction of 2x2 contingency tables of the numbers of true positives, false positives, false negatives and true negatives, resolving disagreements by discussion. Where possible, we separately extracted data according to symptom status (symptomatic, asymptomatic, mixed symptom status or not reported), viral load (high or low, according to Ct cut‐offs defined within each study), time post‐symptom onset (week one versus week two) and, for molecular assays, before and after re‐analysis of samples in discrepant cells. For categorisation by symptom status, we classed studies reporting at least 75% of participants as symptomatic as ‘mainly symptomatic’. We considered studies with less than 75% symptomatic participants to report ‘mixed’ groups, along with those that reported recruiting both symptomatic and asymptomatic participants but did not provide the percentages in each group. We considered studies that provided no information on the symptom status of included participants as ‘not reported’. We also coded evaluations according to compliance with manufacturer IFUs, based on three aspects of testing:

  1. sample type (use of any sample not explicitly mentioned on the IFU scored 'No'; otherwise scored 'Yes');

  2. provision of instructions for samples in viral transport medium (VTM) (scored only for evaluations using samples in VTM; scored 'Yes' if specific instructions were provided, and 'Unclear' if VTM was used but instructions for use of samples in VTM were not documented in the IFU); and

  3. timing between sample collection and testing (scored 'Yes' only if all tests were carried out within the specified time period, e.g. immediate on‐site testing or, for testing in laboratories, if all tests were reported to have been carried out within the specified time period; scored 'Unclear' if the time frame for testing was not reported, and 'No' if any testing was carried out beyond the maximum stipulated timeframe).
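As an illustration of the accuracy measures computed from these extracted 2x2 tables, a minimal Python sketch (the counts are hypothetical, not taken from any included study):

```python
# Hypothetical 2x2 contingency table for one test evaluation
# (illustrative counts only, not from any included study).
tp, fp, fn, tn = 90, 5, 10, 295  # true positives, false positives, false negatives, true negatives

sensitivity = tp / (tp + fn)  # proportion of RT-PCR-positive samples detected by the rapid test
specificity = tn / (tn + fp)  # proportion of RT-PCR-negative samples correctly negative

print(f"sensitivity = {sensitivity:.3f}, specificity = {specificity:.3f}")
# prints: sensitivity = 0.900, specificity = 0.983
```

Subgroup analyses (by symptom status, viral load or time post‐symptom onset) simply repeat the same calculation on the counts extracted for each subgroup.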

We encourage study authors to contact us regarding missing details on the included studies ([email protected]).

Assessment of methodological quality

Two review authors independently assessed risk of bias and applicability concerns using the QUADAS‐2 checklist tailored to this review (Appendix 7; Whiting 2011). The two review authors resolved any disagreements by discussion.

Ideally, studies examining the use of tests in symptomatic people should prospectively recruit a representative sample of participants presenting with signs and symptoms of COVID‐19, either in community, primary care or hospital settings, and should clearly record the time of testing after the onset of symptoms. Studies in asymptomatic people at risk of infection should document time from exposure. Studies applying tests in a screening setting should document eligibility criteria for screening, particularly if a targeted approach is used, and should take care to record any previous confirmed or suspected SARS‐CoV‐2 infection or any relevant epidemiological exposures.

Studies should perform tests in their intended use setting, using appropriate samples with or without viral transport medium and within the time period following specimen collection indicated in the IFU document. Tests should be performed by relevant personnel (e.g. healthcare workers) and should be interpreted blinded to the final diagnosis (presence or absence of SARS‐CoV‐2). The reference standard diagnosis should be made blinded to the result of the rapid test, and should not incorporate it. If the reference standard includes clinical diagnosis of COVID‐19 for RT‐PCR‐negative patients, established criteria should be used. Studies including samples from participants known not to have COVID‐19 should use pre‐pandemic sources or, if contemporaneous samples are used, should require at least two RT‐PCR‐negative tests to confirm the absence of infection.

Data should be reported for all study participants, including those with inconclusive rapid test results or an uncertain final diagnosis of COVID‐19. Studies should report whether results relate to participants (one sample per participant) or samples (multiple samples per participant).

Statistical analysis and data synthesis

We analysed rapid antigen and molecular tests separately. Studies often referred to ‘samples’ rather than ‘patients’, especially for the rapid molecular tests; however, for many studies we do not suspect that inclusion of multiple samples per participant was a significant issue. For consistency of terminology throughout the review, we refer to results on a per‐sample basis. If studies evaluated multiple tests in the same samples, we included them multiple times. We present estimates of sensitivity and specificity per study for each test brand using paired forest plots, and summarise results using average sensitivity and specificity in tables as appropriate. As heterogeneity is apparent in many analyses, these point estimates must be interpreted as the average of a distribution of values.

We did not make any formal comparisons between antigen assay brands because of the large number of different assays and small study numbers for many of them. We did however carry out a formal comparison (based on between‐study comparisons) for studies using two brands of molecular tests (ID NOW (Abbott Laboratories) and Xpert Xpress (Cepheid Inc)).

We estimated summary sensitivities and specificities with 95% confidence intervals (CI) using the bivariate model (Reitsma 2005), via the meqrlogit command of Stata/SE 16.0. When few studies were available, we simplified models, first by assuming no correlation between sensitivity and specificity estimates and second by setting near‐zero variance estimates of the random effects to zero (Takwoingi 2017). Where there was only one study per test, we reported individual sensitivities and specificities with 95% CIs constructed using the binomial exact method.

Where studies presented only estimates of sensitivity or of specificity, we fitted univariate, random‐effects, logistic regression models. In a number of instances where there was 100% sensitivity or specificity for all evaluations, we computed estimates and 95% CIs by summing the counts of TP, FP, FN and TN across 2x2 tables. These analyses are clearly marked in the tables. We present all estimates with 95% confidence intervals.
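The binomial exact method referred to here is the Clopper–Pearson interval. The review's analyses were run in Stata, but as an illustration, the interval can be reproduced from the beta distribution; a minimal Python sketch (using SciPy, with hypothetical counts):

```python
from scipy.stats import beta


def exact_ci(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Clopper-Pearson (binomial exact) confidence interval for k successes in n trials."""
    lower = 0.0 if k == 0 else beta.ppf(alpha / 2, k, n - k + 1)
    upper = 1.0 if k == n else beta.ppf(1 - alpha / 2, k + 1, n - k)
    return lower, upper


# e.g. a test detecting 90 of 100 RT-PCR-confirmed samples
lo, hi = exact_ci(90, 100)
print(f"sensitivity 0.90 (95% CI {lo:.3f} to {hi:.3f})")
```

The same function handles the boundary cases that arise when all evaluations show 100% sensitivity or specificity, since the interval is clipped to 0 and 1 at k = 0 and k = n.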

Investigations of heterogeneity

We examined heterogeneity between studies by visually inspecting the forest plots of sensitivity and specificity. Where adequate data were available, we investigated heterogeneity related to symptom status, time post‐symptom onset, viral load, test brand and test method by including indicator variables in the random‐effects logistic regression models. We report absolute differences in sensitivity or specificity, with P values, from these models. Where only one study was available per test, or when tests were directly compared after summing counts across 2x2 tables, we compared tests using the two‐sample test of proportions. Few studies reported specificity estimates by time after symptom onset; therefore, for this variable and for analyses by viral load, we considered only effects on sensitivity.
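The two‐sample test of proportions used for these comparisons was computed in Stata; the sketch below (Python with SciPy, hypothetical counts) illustrates the statistic, assuming the usual pooled‐variance z form:

```python
from math import sqrt

from scipy.stats import norm


def two_sample_prop_test(k1: int, n1: int, k2: int, n2: int):
    """Pooled two-sample z-test comparing proportions k1/n1 and k2/n2 (two-sided)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)                      # proportion under the null of equality
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * norm.sf(abs(z))                       # two-sided P value
    return p1 - p2, z, p_value


# e.g. comparing sensitivities of two tests: 73/100 versus 95/100 samples detected
diff, z, p = two_sample_prop_test(73, 100, 95, 100)
print(f"difference = {diff:.2f}, z = {z:.2f}, P = {p:.1e}")
```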

Sensitivity analyses

We performed four sensitivity analyses.

  1. We estimated summary sensitivities and specificities according to test brand and symptom status using only studies that were compliant with the IFU.

  2. We estimated sensitivity with and without studies that only evaluated samples with RT‐PCR‐confirmed SARS‐CoV‐2 (and thus did not estimate specificity).

  3. We performed the same analysis for specificity in studies that only evaluated RT‐PCR‐negative control samples.

  4. We made comparisons between analyses using the primary reference standard and analyses using results adjusted after retesting of samples with discrepant results with a second RT‐PCR test (discrepant analysis).

Assessment of reporting bias

We made no formal assessment of reporting bias but have indicated where we were aware that study results were available but unpublished.

Summary of findings

We summarised key findings in a 'Summary of findings' table indicating the strength of evidence for each test and findings, and highlighted important gaps in the evidence.

Updating

We are aware of additional studies published since the electronic searches were conducted on 30 September 2020 and plan to update this review. We have already conducted the next search to 1 January 2021.

Results

Results of the search

We screened 34,742 unique records (published or preprints) for inclusion in the complete suite of reviews to assist in the diagnosis of COVID‐19 (Deeks 2020b; McInnes 2020). Of 1749 records selected for further assessment for inclusion in any of the four molecular, antigen or antibody test reviews, we identified 199 full‐text reports requiring assessment for inclusion in this review; 90 for the first iteration of the review and 109 for this review update. See Figure 1 for the PRISMA flow diagram of search and eligibility results (McInnes 2018; Moher 2009).


Study flow diagram


We included 64 reports in this review, and we excluded 135 publications that did not meet our inclusion criteria. Exclusions were mainly based on index test (n = 85) or ineligible study designs (n = 26), for example, designs that did not allow estimation of test accuracy. The reasons for exclusion of all 135 publications are provided in Characteristics of excluded studies. Appendix 8 provides a list of studies evaluating eligible tests but excluded for other reasons (n = 5), and studies evaluating technologies not yet suitable for use at the point of care (n = 41).

Of the 64 study reports, 18 were available only as preprints, 38 were published papers and eight were publicly available reports, either from independent reference laboratories (one from Public Health England and two identified via the SFM) or independent evaluations co‐ordinated by FIND (n = 5).

We contacted the authors of 10 study reports for further information (Blairon 2020; Courtellemont 2020; Diao 2020; Gibani 2020; Gremmels 2020(a); Linares 2020; Nash 2020; Porte 2020a; Schildgen 2020 [A]; Weitzel 2020 [A]), and received replies and the requested information with one exception (Linares 2020). We also contacted the evaluation teams at FIND and Public Health England and received additional information about study methods from FIND and some additional data from Public Health England.

The 64 included study reports relate to 78 separate studies. Please note when naming studies, we use the letters [A], [B], [C] etc. in square brackets to indicate data on different tests evaluated in the same study and (a), (b), (c) to indicate data from different participant cohorts from the same study report. For example, the five included reports from FIND correspond to eight ‘studies’ because three reports separately provided data from more than one evaluation centre.

Of the 78 studies, 77 reported data for respiratory samples and one (Szymczak 2020), reported data for faecal samples. The main results, Tables and Figures focus on the respiratory samples, with Szymczak 2020 reported separately.

Description of included studies

The 77 studies using respiratory samples included a total of 24,418 unique samples, with 7484 samples with RT‐PCR‐confirmed SARS‐CoV‐2 (some samples were analysed by more than one index test). Forty‐eight studies evaluated antigen tests (Albert 2020; Alemany 2020; Billaud 2020; Blairon 2020; Cerutti 2020; Courtellemont 2020; Diao 2020; Fenollar 2020(a); Fenollar 2020(b); FIND 2020a; FIND 2020b; FIND 2020c (BR); FIND 2020c (CH); FIND 2020d (BR); FIND 2020d (DE); FIND 2020e (BR); FIND 2020e (DE); Fourati 2020 [A]; Gremmels 2020(a); Gremmels 2020(b); Gupta 2020; Kruger 2020(a); Kruger 2020(b); Kruger 2020(c); Lambert‐Niclot 2020; Linares 2020; Liotti 2020; Mak 2020; Mertens 2020; Nagura‐Ikeda 2020; Nash 2020; PHE 2020(a); PHE 2020(b); PHE 2020(c) [non‐HCW tested]; PHE 2020(d) [HCW tested]; PHE 2020(d) [Lab tested]; PHE 2020(e); Porte 2020a; Porte 2020b [A]; Schildgen 2020 [A]; Scohy 2020; Shrestha 2020; Takeda 2020; Van der Moeren 2020(a); Van der Moeren 2020(b); Veyrenche 2020; Weitzel 2020 [A]; Young 2020) and 29 studies evaluated molecular tests (Assennato 2020; Broder 2020; Chen 2020a; Collier 2020; Cradic 2020(a); Cradic 2020(b); Dust 2020; Ghofrani 2020; Gibani 2020; Goldenberger 2020; Harrington 2020; Hogan 2020; Hou 2020; Jin 2020; Jokela 2020; Lephart 2020 [A]; Lieberman 2020; Loeffelholz 2020; Mitchell 2020; Moore 2020; Moran 2020; Rhoads 2020; Smithgall 2020 [A]; SoRelle 2020; Stevens 2020; Thwe 2020; Wolters 2020; Wong 2020; Zhen 2020 [A]). Summary study characteristics are presented in Table 1 with further details of study design and index test details in Appendix 9 and Appendix 10 for antigen assays and Appendix 11 and Appendix 12 for molecular assays. Full details are provided in the Characteristics of included studies table.

Table 1. Description of studies

| Participants (No. of studies (%)) | Antigen tests | Rapid molecular |
| --- | --- | --- |
| Number of studies | 48 | 29 |
| Sample size (by test type): median (IQR) | 291.5 (155 to 502.5) | 104 (75 to 172) |
| Sample size (by test type): range | 56 to 1676 | 19 to 524 |
| Number of COVID‐19 cases (by test type): median (IQR) | 99.5 (45.5 to 128.5) | 50 (20 to 88) |
| Number of COVID‐19 cases (by test type): range | 0 to 951 | 6 to 220 |
| Setting: COVID‐19 test centre | 22 (46) | 0 (0) |
| Setting: contacts | 4 (8) | 0 (0) |
| Setting: hospital A&E | 3 (6) | 3 (10) |
| Setting: hospital inpatient | 2 (4) | 2 (7) |
| Setting: laboratory‐based | 11 (23) | 20 (69) |
| Setting: mixed | 4 (8) | 4 (14) |
| Setting: unclear | 2 (4) | 0 (0) |
| Symptom status: asymptomatic | 3 (6) | 0 (0) |
| Symptom status: symptomatic | 16 (33) | 12 (41) |
| Symptom status: mainly symptomatic (a) | 11 (23) | 0 (0) |
| Symptom status: mixed | 8 (17) | 3 (10) |
| Symptom status: not reported | 10 (21) | 14 (48) |
| Recruitment structure: single group – sensitivity and specificity | 29 (60) | 17 (59) |
| Recruitment structure: two or more groups – sensitivity and specificity | 10 (21) | 7 (24) |
| Recruitment structure: unclear | 2 (4) | 2 (7) |
| Recruitment structure: single group – sensitivity only | 6 (13) | 3 (10) |
| Recruitment structure: single group – specificity only | 1 (2) | 0 (0) |
| Reference standard for COVID‐19 cases: all RT‐PCR‐positive | 47 (98) | 29 (100) |
| Reference standard for non‐COVID‐19 (no. of studies) | 42 | 26 |
| Reference standard for non‐COVID‐19: COVID suspects (single RT‐PCR‐negative) | 39 (93) | 24 (92) |
| Reference standard for non‐COVID‐19: COVID suspects (double+ RT‐PCR‐negative) | 1 (2) | 1 (4) |
| Reference standard for non‐COVID‐19: current other disease (RT‐PCR‐negative) | 0 (0) | 1 (4) |
| Reference standard for non‐COVID‐19: pre‐pandemic (not described) | 1 (2) | 0 (0) |
| Reference standard for non‐COVID‐19: pre‐pandemic other disease | 1 (2) | 0 (0) |

| Tests (No. of evaluations (%)) | Antigen tests | Rapid molecular |
| --- | --- | --- |
| Total number of test evaluations | 58 | 32 |
| Number of tests per study: 1 | 44 (92) | 26 (90) |
| Number of tests per study: 2 | 1 (2) | 3 (10) |
| Number of tests per study: 3 | 1 (2) | 0 (0) |
| Number of tests per study: 4 | 1 (2) | 0 (0) |
| Number of tests per study: 5 | 1 (2) | 0 (0) |
| Test method: CGIA | 41 (71) | 0 (0) |
| Test method: FIA | 9 (16) | 0 (0) |
| Test method: LFA (alkaline phosphatase labelled) | 2 (3) | 0 (0) |
| Test method: LFA (not otherwise specified) | 6 (10) | 0 (0) |
| Test method: automated RT‐PCR | 0 (0) | 18 (56) |
| Test method: isothermal amplification | 0 (0) | 13 (41) |
| Test method: other molecular (PCR + LFA) | 0 (0) | 1 (3) |
| Sample type: NP alone | 30 (52) | 16 (50) |
| Sample type: NP + OP combined | 12 (21) | 2 (6) |
| Sample type: nasal alone | 2 (3) | 2 (6) |
| Sample type: OP alone | 1 (2) | 1 (3) |
| Sample type: two or more of NP, nasal or OP | 8 (14) | 8 (25) |
| Sample type: saliva | 1 (2) | 1 (3) |
| Sample type: other | 3 (5) | 0 (0) |
| Sample type: mixed (including lower respiratory) | 4 (7) | 1 (3) |
| Sample type: not specified | 0 (0) | 1 (3) |
| Sample storage: direct | 28 (48) | 7 (22) |
| Sample storage: VTM | 20 (35) | 12 (38) |
| Sample storage: saline | 1 (2) | 0 (0) |
| Sample storage: direct or VTM | 0 (0) | 1 (3) |
| Sample storage: VTM or PBS | 1 (2) | 0 (0) |
| Sample storage: VTM or other | 0 (0) | 6 (19) |
| Sample storage: not specified | 8 (14) | 6 (19) |
| Sample collection: HCW | 15 (26) | 2 (6) |
| Sample collection: trained non‐HCW | 3 (5) | 0 (0) |
| Sample collection: self‐collected | 6 (10) | 0 (0) |
| Sample collection: HCW or self‐collection | 0 (0) | 1 (3) |
| Sample collection: not specified | 34 (59) | 29 (91) |
| Sample testing: HCW (on‐site) | 13 (22) | 0 (0) |
| Sample testing: trained non‐HCW (on‐site) | 3 (5) | 0 (0) |
| Sample testing: HCW or on‐site laboratory personnel | 0 (0) | 1 (3) |
| Sample testing: not specified (on‐site testing) | 5 (9) | 1 (3) |
| Sample testing: laboratory staff | 12 (21) | 4 (13) |
| Sample testing: not stated (laboratory setting) | 15 (26) | 16 (50) |
| IFU compliance: no | 16 (28) | 16 (50) |
| IFU compliance: yes | 29 (50) | 9 (28) |
| IFU compliance: unclear | 13 (22) | 7 (22) |

A&E: accident and emergency department; CGIA: colloidal gold immunoassay; CI: confidence intervals; DRW: Diagnostics for the Real World; FIA: fluorescent immunoassay; HCW: healthcare worker; IFU: instructions for use; IQR: inter‐quartile range; LFA: lateral flow assay; NP: nasopharyngeal; OP: oropharyngeal; PBS: phosphatase‐buffered saline; RT‐PCR: reverse transcription polymerase chain reaction; VTM: viral transport medium

(a) ‘Mainly symptomatic’ indicates ≥ 75% of included participants reported as symptomatic.

The median sample size of the included studies was 182 (interquartile range (IQR) 104 to 400), and the median number of SARS‐CoV‐2‐confirmed samples was 63 (IQR 38 to 119). Sample sizes for antigen test evaluations were larger than those for molecular test evaluations (median 291.5 (IQR 155 to 502.5) versus 104 (IQR 75 to 172)). Half of the studies (39/77, 51%) were conducted in Europe, 20 in North America, seven in South America and seven in Asia; one study included samples from more than one country, and in one the country of sample origin was unclear.

Participant characteristics
Antigen tests

Over half of the antigen test studies included samples from participants presenting in the community for COVID‐19 testing: at community test centres (22/48, 46%), at emergency departments (3, 6%), or as part of contact tracing or outbreak investigations (4, 8%) (Table 1). Eleven antigen test studies (23%) selected samples from those submitted to laboratories for routine RT‐PCR testing, with limited detail of the participants providing the samples (‘laboratory‐based’ studies); the remaining studies included multiple (4, 8%) or unclear (2, 4%) settings. Over half of antigen test studies were conducted in symptomatic (16, 33%) or mainly symptomatic (11, 23%) populations, with only three (6%) exclusively in asymptomatic populations: two in asymptomatic contacts of confirmed cases (Fenollar 2020(b); Shrestha 2020) and one in staff screening, in which all participants were RT‐PCR‐negative (PHE 2020(e)). The remaining antigen studies included samples from populations with mixed symptom status (8, 17%) or provided no information regarding symptom status (10, 21%). Of the 10 that provided no information, seven were laboratory‐based studies providing no details of the settings from which the tested samples had been obtained, one included samples from a COVID‐19 test centre, one was an outbreak investigation and in one the study setting could not be derived. There were no studies evaluating strategies of multiple tests.

A total of 13 studies provided accuracy data for people with no symptoms at the time of testing (3 studies exclusively in asymptomatic populations, and 10 studies providing subgroup data for people with no reported symptoms); one study provided only specificity data. Of the 12 datasets reporting both sensitivity and specificity, one (Alemany 2020), purportedly described preventive screening of the general population (although the reported prevalence of 24% is very high for such a scenario), one (Cerutti 2020), described targeted traveller screening, four (Billaud 2020; Fenollar 2020(b); Gupta 2020; Shrestha 2020), tested contacts of confirmed cases (one as part of an outbreak investigation) and the remaining six datasets were subgroups of samples from people presenting for routine testing. We identified one additional asymptomatic dataset in a report of several substudies but we did not include it as participants underwent antigen testing up to five days after a positive PCR test and it was not possible to determine the time point at which symptom status was recorded; it was also not possible to determine which 'substudy’ the data related to (PHE 2020(d) [HCW tested]; PHE 2020(d) [Lab tested]).

Thirty‐one of the 48 studies evaluating antigen tests reported results for SARS‐CoV‐2‐confirmed samples above and below a Ct value from the reference standard RT‐PCR. The median proportion of participants with 'high' viral load was 52% (IQR 35% to 60%). The most commonly used threshold was 24 or 25 Ct or less (29 studies; 36/58 test evaluations); 11 studies (15/58 test evaluations) reported results at a threshold between 31 and 33 Ct or less; and 13 studies (13 evaluations) reported other thresholds, including less than 28 Ct (n = 3), 30 Ct (n = 5), 31 Ct (n = 3), or 35 Ct (n = 2).

Molecular tests

In contrast, studies evaluating molecular tests were mainly laboratory‐based (20, 69%), with three (10%) including samples from participants presenting to emergency department or urgent care settings, two (7%) in hospital inpatients, and four (14%) including samples from participants presenting in multiple settings. Twelve of the 29 studies (41%) included only samples from symptomatic patients, three (10%) reported mixed symptom status and 14 (48%) provided no information regarding symptom status. Of the 14 that provided no information, one was based in a hospital Accident and Emergency department, and the remaining 13 were laboratory‐based studies, only three of which gave any details of the settings from which the tested samples had been obtained (one reported inclusion of samples from both inpatients and outpatients, one from inpatients and ambulatory patients, and one from inpatients and emergency department patients, but none provided the number of samples from each source). There were no studies evaluating strategies of multiple tests.

Five studies evaluating molecular assays reported proportions with high viral load, ranging from 33% to 80% (median 46%). All five studies reported results above and below a Ct value of 30.

Study design and reference standards

Table 1 shows a similar distribution of study designs between those evaluating antigen and molecular tests. Overall, 60% of studies (n = 46) used a ‘single group’ design to estimate both sensitivity and specificity, and 22% (n = 17) used a ‘two group’ design with separate selection of RT‐PCR‐positive and RT‐PCR‐negative samples. In four studies (5%), the design could not be fully determined, but deliberate separate sampling of RT‐PCR‐positive and RT‐PCR‐negative samples had probably been used.

Nine studies included only samples with confirmed SARS‐CoV‐2, thus only allowing estimation of sensitivity (six antigen and three molecular assay studies), and one study included only SARS‐CoV‐2‐negative samples allowing estimation of specificity only. All studies defined the presence or absence of SARS‐CoV‐2 infection based on RT‐PCR. Of the 68 studies that included SARS‐CoV‐2‐negative samples, 63 (93%) required a single, negative PCR to confirm absence of infection and two (3%) required two negative PCR results. The remaining three studies used pre‐pandemic samples (n = 2) or contemporaneous samples with other respiratory infections.

Thirty‐three studies (43%) obtained paired swabs for index and reference standard, 39 (51%) used the same swab for point‐of‐care and RT‐PCR testing (18 antigen and 21 molecular studies), and in the remaining five studies either a mix of paired and same swabs was used (n = 1) or this information could not be determined from the study report.

Index tests

Seventy studies evaluated only one test; seven compared two or more tests in the same participants (four with two tests each, one with three, one with four and one with five tests). In total, the 77 studies using respiratory samples reported 90 test evaluations. Appendix 13 provides details extracted from the manufacturers’ instructions for use documents for all included tests.

Antigen tests

Forty‐eight studies reported 58 evaluations of antigen tests: 41 of CGIAs, nine of FIAs, two of an alternative type of LFA using alkaline phosphatase‐labelled antibodies, and six where the assay type could not be determined. Studies evaluated 16 different commercially produced assays; full assay identification details are documented in Appendix 13. One study reported the development of the Shenzhen Bioeasy assay (Diao 2020), but it is not clear whether the commercially available assay is identical to the one reported in the study or whether it has undergone further refinement. One study reported evaluating a Roche SARS‐CoV‐2 assay, which appears to be the SD Biosensor STANDARD Q (Schildgen 2020 [A]). Only 12 studies provided product codes for the tests evaluated (FIND 2020a; FIND 2020b; FIND 2020c (BR); FIND 2020c (CH); FIND 2020d (BR); FIND 2020d (DE); FIND 2020e (BR); FIND 2020e (DE); Gremmels 2020(a); Gremmels 2020(b); Porte 2020a; Weitzel 2020 [A]). The study reports or manufacturer IFUs for 11 assays reported targeting the nucleocapsid protein; this information was not reported for the Beijing Savant, Bionote, Biosynex, Liming Bio‐Products, or RapiGEN Inc assays (Appendix 13). We were unable to identify any information online for the Beijing Savant, E25Bio or Liming Bio‐Products assays.

Multiple combinations of sample types, and of direct swab testing or swabs in viral transport medium or saline, were reported across the studies (Table 1). Forty‐one of 58 evaluations used nasopharyngeal (n = 30), oropharyngeal (n = 1) or nasal (n = 2; type of nasal sample not reported) samples, or combinations of nasopharyngeal, nasal or oropharyngeal samples (n = 8: nasopharyngeal or nasal mid‐turbinate in one, nasopharyngeal or combined naso‐ and oropharyngeal in two, naso‐ or oropharyngeal in two, and naso‐ or oropharyngeal or combined naso‐ and oropharyngeal samples in three). Thirteen evaluations used combined naso‐ and oropharyngeal samples for all participants, one used saliva samples and three evaluations (from one study) used bronchoalveolar lavage or throat wash samples. Of the six studies using nasal samples either alone (n = 2) or for at least some participants (n = 4), one reported that these were nares swabs and the remaining five did not specify the type of nasal sample. Almost half of the evaluations used direct swab testing (n = 28, 48%), 22 (38%) tested samples in viral transport medium, saline or other medium, and in eight (14%) this information was not provided.

IFUs for five assays explicitly recommend against using any transport medium for swab testing (assays from Becton Dickinson, Bionote, Quidel and SD Biosensor; Appendix 13), one (Coris BioConcept) states that viral transport medium may be used, and the other nine do not mention use of transport medium, although two of the nine (from AAZ and Biosynex) imply that viral transport medium should not be used (using statements such as "use within one hour, stored in clean unused plastic tube"). We considered 29 of 58 antigen evaluations (50%) to be compliant with manufacturer IFUs in terms of sample type, use of viral transport medium and time interval between collection and testing. Sixteen evaluations were not compliant with IFUs; nine used viral transport medium, four used frozen samples, four tested sample types not listed on the IFUs, and in two testing was not always conducted within the one‐hour period specified in the IFU. For the remaining 13 evaluations, either no IFU was available (n = 4), viral transport medium or saline was used but the IFU did not specifically address whether viral transport medium was recommended or not (n = 7), or insufficient detail was provided in the study (n = 2).

Samples were collected by healthcare workers in 15 (26%) evaluations, by trained non‐healthcare workers such as firefighters or Ministry of Health employees in three (5%), were self‐collected in six (10%), and collection was not described in 34 (59%). Sample testing was conducted 'on‐site', immediately or within one hour of collection, in 21 (36%) evaluations: by the same healthcare workers who collected the samples (n = 13), by the trained non‐healthcare workers who collected them (n = 3), or by personnel not described (n = 5). In the remaining 27 evaluations (47%), testing was conducted by laboratory staff (n = 12) or was inferred to be by laboratory staff (n = 15). For the latter group, testing took place on receipt of samples at the laboratory, with some studies reporting delays of up to six hours between collection and testing.

Molecular tests

Twenty‐nine studies reported 32 evaluations of five different commercially available rapid molecular tests: 13 evaluating ID NOW (Abbott Laboratories), 15 evaluating Xpert Xpress (Cepheid Inc), two of SAMBA II (Diagnostics for the Real World), and one evaluation each of Accula (Mesa Biotech Inc.) and COVID Nudge (DNANudge). None of the studies reported product codes for the tests evaluated. One study of Xpert Xpress used the 'research use only' (RUO) version of the test but reported that the RUO version contains the same reagents as the 'emergency use authorisation' (EUA) version. The RUO test allows the user to view the amplification curves for the RdRp gene as well as for the E‐gene and N2 targets whereas the EUA version restricts the amplification curves to E and N2 only. ID NOW and SAMBA II use isothermal techniques, Xpert Xpress and COVID Nudge are based on RT‐PCR, and Accula is described as a PCR plus lateral flow assay.

Multiple combinations of sample types and use of direct swab testing or swabs in viral transport medium or saline were reported across the studies (Table 1). The sample types used included combined naso‐ and oropharyngeal samples (n = 2), nasopharyngeal samples alone (n = 16), nasal alone (n = 2), oropharyngeal samples alone (n = 1), or a combination of two or more of nasopharyngeal, nasal or oropharyngeal samples (n = 8). One evaluation used throat saliva or lower respiratory tract specimens, one used saliva samples alone and one did not specify the sample type used. Of the six studies using nasal samples either alone (n = 2) or for at least some participants (n = 4), one reported using nares swabs, and the remaining five did not specify the type of nasal sample used.

Eight evaluations (25%) reported direct swab testing in some (n = 1) or all (n = 7) samples, 18 (56%) used swabs in viral transport medium only (n = 12) or in viral transport medium or some other transport medium (n = 6), and six (19%) did not report whether they used any transport medium.

Sample collection was described in only three evaluations (9%) (Gibani 2020; Harrington 2020; Rhoads 2020; Table 1); the remaining studies did not describe sample collection, but it is likely that samples were collected as part of routine care by healthcare workers. Of the studies reporting sample collection, one clearly described testing as conducted on‐site by medical personnel or by laboratory personnel at local laboratories (Harrington 2020), and a second implied that testing occurred as soon as possible after collection, possibly by the same healthcare worker (Gibani 2020). Four evaluations (12.5%) stated that laboratory staff carried out the tests. In 16 of the remaining 26 studies, testing by laboratory staff was inferred, based on delays between collection and testing of 18 hours to seven days (n = 10) or on reported use of archived or frozen samples (n = 6). The remaining eight evaluations provided no useful information regarding who carried out the test (Assennato 2020; Dust 2020; Ghofrani 2020; Jin 2020; Jokela 2020; Moran 2020; Rhoads 2020; SoRelle 2020).

The IFUs for two of the five assays permit samples stored in transport medium (Xpert Xpress and SAMBA II); two explicitly recommend against the use of viral transport medium (ID NOW and Accula), although at the time of the test evaluations some viral transport media were documented as acceptable for ID NOW; and one does not mention the use of viral transport medium (COVID Nudge). Although immediate sample testing is preferred, all manufacturers document an acceptable period of refrigerated storage, ranging from eight hours (COVID Nudge) to seven days (Xpert Xpress). See Appendix 13.

We considered only nine of 32 (28%) evaluations to be compliant with manufacturer IFUs in regard to sample type, use of viral transport medium and time interval between collection and testing. Sixteen evaluations were not compliant with IFUs; eight used viral transport medium, six used frozen samples, and two tested samples not listed on the IFUs. For the remaining seven evaluations, either the testing interval from sample collection was unclear (n = 5) or saline was used but the IFU did not specifically address whether this was recommended or not (n = 2).

Methodological quality of included studies

We report the overall methodological quality assessed using the QUADAS‐2 tool for all included studies (n = 78) in Figure 2 (Whiting 2011). See Appendix 14 for separate summary plots by test method and for a plot of study‐level ratings by quality domain. We explain how we reached these judgements in the Characteristics of included studies table.


Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies. Numbers in the bars indicate the number of studies


We considered whether the findings of individual studies were at risk of bias, and whether there were concerns that results might not apply to standard use of the tests. We did not judge any study to be at low risk of bias overall, although in 11 of 78 studies the only concern was that a single negative RT‐PCR, rather than the preferred two negative tests, was used to confirm the absence of SARS‐CoV‐2 infection. All studies raised concerns regarding the applicability of their results, but in 13 of 78 studies the only concern was the reliance on RT‐PCR alone to identify SARS‐CoV‐2 cases (nine of these 13 are in common with the 11 using a single negative RT‐PCR).

Participant selection

We judged 22 studies (28%) to be at low risk of bias, and 29 (37%) at high risk of bias because of deliberate sampling of participants based on the reference standard result (n = 25; 16 two‐group studies and nine that only included samples with confirmed SARS‐CoV‐2 infection or absence of infection) or use of convenience sampling (n = 4). In 27 studies (35%) the risk of bias was unclear because of poor reporting of recruitment procedures or inclusion criteria (Figure 2).

A third (27/78) of studies were likely to have selected an appropriate patient group, recruiting participants from COVID‐19 test centres, urgent care or emergency departments or identified through contact tracing. We had high concerns about the applicability of the selected participants in almost half of studies (35/78). Recruited participants were unlikely to be similar to those in whom the test would be used in clinical practice because of deliberate sampling (n = 25) or sample inclusion based on the availability of residual and sometimes frozen samples, or both (n = 22).

Index tests

Poor reporting meant we could not clearly assess whether there was a risk of bias through performance of the index test in 41 (53%) studies. In general, antigen test studies were of a higher methodological standard for the index test domain compared to studies of molecular tests (Figure 2).

For antigen tests, we observed low risk of bias in 60% of studies (29/48). Risk of bias was unclear in the remaining studies because we could not judge whether interpretation of the index test was undertaken with knowledge of the reference standard result. For molecular tests, risk of bias was low in only 17% of studies (5/30). We observed high risk of bias in three studies (Moran 2020; Smithgall 2020 [A]; Wolters 2020) because they did not follow the manufacturer's prespecified threshold for the Xpert Xpress test (re‐testing of samples with presumptive positive results). Risk of bias was unclear in 73% (22/30) of studies because they did not report blinding to the reference standard; six of these also did not report how they handled presumptive positive results on Xpert Xpress.

Fourteen studies (18%), including 13 antigen and one molecular test study, conducted testing as would be expected in practice (low concern regarding applicability). We had high concerns about applicability in half of all studies (39/78): 48% (23/48) of antigen and 53% (16/30) of molecular studies. Twenty‐seven (11 antigen and 16 molecular) did not comply with manufacturers' IFUs, and a further 10 (all antigen studies) did not carry out tests as would occur in practice (i.e. trained, centralised laboratory staff carried out testing). In another two antigen studies concerns about applicability were high because the tests were not available for purchase (Diao 2020; Nash 2020). Of the remaining 25 studies (12 antigen and 13 molecular), 16 conducted the test within the manufacturer IFU, but none clearly described the setting for testing or the personnel conducting the test.

Reference standards

Six studies were at low risk of bias for the reference standard. Although 12 used an appropriate reference standard, half (6/12) did not clearly implement blinding of the reference standard to the index test. High risk of bias (66/78) was present because studies did not use an adequate reference standard (Figure 2); they used either a single negative RT‐PCR to define absence of SARS‐CoV‐2 infection (n = 64) or the index test formed part of a composite reference standard (n = 2).

A total of 36 studies reported blinded RT‐PCR interpretation, two (with composite reference standard) did not implement blinding, and 40 (51%) provided insufficient information about blinding of the reference standard to the index test to judge risk of bias.

We judged 76 of the 78 studies to raise concerns about applicability (97%) because of defining the presence of SARS‐CoV‐2 infection based on a single RT‐PCR‐positive result. These studies will have excluded individuals who are RT‐PCR‐negative but have exposure and clinical features that meet the case definitions for COVID‐19.

Flow and timing

Only 13 (17%) studies (all of antigen tests) were at low risk of bias for participant flow and timing (Figure 2). Twenty‐nine (37%) were at high risk of bias (19 antigen and 10 molecular) because of exclusion of samples following invalid index test results (n = 23), delays between 'paired' swabs of up to three days (n = 4), use of different reference standards (n = 3), or reporting of results per sample instead of per patient (n = 2); these categories are not mutually exclusive.

We judged risk of bias to be unclear for 36 (46%) studies, primarily because of a lack of clarity about participant inclusion in, and exclusion from, analyses (n = 34): these studies reported no missing data or indeterminate test results and provided no Standards for Reporting Diagnostic Accuracy Studies (STARD)‐style participant flow diagram and checklist (Bossuyt 2015) to fully account for all samples.

Conflicts of interest

In 27 studies all authors declared no conflicts of interest, although one study that reported the validation of a new test included a co‐author affiliated with the test manufacturing company. Of these 27 studies, 19 were independent evaluations published by FIND or were from national reference laboratories. Twenty studies did not provide a conflict of interest statement, including 13 published studies and one study that reported affiliations with the test manufacturer. In the 12 remaining studies at least one author declared potential conflicts of interest in relation to the test.

Twenty‐six studies provided no funding statement, 12 reported no funding sources to declare, and the remainder (n = 40) reported one or more funding sources.

Findings

Of the 78 included studies, eight reported evaluations of more than one test using the same samples and one reported evaluations of three tests using different samples (Table 1). To include all results from all tests in these analyses we have treated results from different tests of the same samples within a study as separate data points, such that data are available on 91 test evaluations (58 evaluations of antigen tests in 48 studies and 33 evaluations of rapid molecular tests in 30 studies).

As previously stated, 77 of the 78 studies reported data for respiratory samples and one (Szymczak 2020) reported data for non‐respiratory (faecal) samples. The main results, tables and figures focus on the respiratory samples, with Szymczak 2020 reported separately.

The results tables identify where estimates are based on multiple assessments of the same samples by including both the number of test evaluations and the number of studies. Nine datasets are from ‘cases only’ studies reporting only sensitivity estimates (six for antigen tests and three for molecular assays), and one antigen test evaluation is for ‘non‐COVID‐19’ cases reporting only specificity. Summary results are presented for studies providing both sensitivity and specificity data and then adding in the data from sensitivity‐ or specificity‐only evaluations. The numbers of true positives, false positives, and total samples with and without confirmed SARS‐CoV‐2 infection are based on test result counts.
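The per‐evaluation sensitivity and specificity figures in the tables that follow derive directly from these 2×2 counts (true positives, false negatives, false positives, true negatives). As a minimal illustrative sketch only (the review's pooled estimates come from meta‐analysis models, not from single 2×2 tables, and the counts below are hypothetical), accuracy for one evaluation with approximate Wilson score 95% CIs can be computed as:

```python
import math

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score 95% confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

def accuracy(tp: int, fp: int, fn: int, tn: int):
    """Sensitivity and specificity (with Wilson 95% CIs) from a 2x2 table."""
    sens, sens_ci = tp / (tp + fn), wilson_ci(tp, tp + fn)
    spec, spec_ci = tn / (tn + fp), wilson_ci(tn, tn + fp)
    return sens, sens_ci, spec, spec_ci

# Hypothetical evaluation: 80 true positives, 20 false negatives,
# 5 false positives, 395 true negatives
sens, sens_ci, spec, spec_ci = accuracy(tp=80, fp=5, fn=20, tn=395)
print(f"sensitivity {sens:.1%} ({sens_ci[0]:.1%} to {sens_ci[1]:.1%})")
print(f"specificity {spec:.1%} ({spec_ci[0]:.1%} to {spec_ci[1]:.1%})")
```

The Wilson interval is one common choice of binomial CI and may differ slightly from the interval methods used in individual study reports.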

We present results for antigen tests overall and by subgroup in Table 2. Table 3 and Table 4 present results by test brand overall and by symptom status, and give results of sensitivity analyses restricting by compliance with manufacturer IFU. Forest plots of study data for the primary analysis are in Figure 3 and for subgroup analyses by symptom status and time after symptom onset are in Figure 4 and Figure 5. Appendix 15 provides forest plots for study data according to Ct value and study design. Individual plots by test brand are provided in Figure 6 for test brands with three or more evaluations and Figure 7 for test brands with one or two evaluations. Figure 8 shows data from studies comparing the accuracy of two or more antigen assays. Full identification details for studies of antigen‐based assays are provided in Appendix 9 and Appendix 10.

Table 2. Antigen tests: summary of sensitivity and specificity analyses

Subgroup | Evaluations | Samples | Cases | Average sensitivity, % (95% CI) | Average specificity, % (95% CI)

Overall analysis
Evaluations reporting both sensitivity and specificity | 51 | 21,614 | 6136 | 68.9 (61.8 to 75.1) | 99.6 (99.0 to 99.8)
Evaluations reporting sensitivity dataa | 57 | 22,605 | 7127 | 67.7 (60.8 to 74.0) | N/A
Evaluations reporting specificity dataa | 52 | 22,152 | 6136 | N/A | 99.5 (99.0 to 99.8)

Subgroup analyses (with sensitivity analyses restricting to direct comparisons)

Symptom status (all)
Symptomatic | 37 | 15,530 | 4410 | 72.0 (63.7 to 79.0) | 99.5 (98.5 to 99.8)
Asymptomatic | 12 | 1581 | 295 | 58.1 (40.2 to 74.1) | 98.9 (93.6 to 99.8)
Difference | | | | −13.8 (−33.1 to 5.4), P = 0.159 | −0.6 (−2.6 to 1.4), P = 0.551
Symptomatic: direct comparison | 9 | 2437 | 890 | 68.0 (51.4 to 81.1) | 99.2 (83.9 to 100)
Asymptomatic: direct comparison | 9 | 1182 | 213 | 53.6 (35.0 to 71.3) | 99.2 (85.5 to 100)
Difference | | | | −14.4 (−38.8 to 10.0), P = 0.246 | −0.01 (−3.2 to 3.2), P = 0.995
Mixed symptoms or not reported | 19 | 6220 | 2392 | 63.0 (52.2 to 72.6) | 98.4 (98.0 to 98.8)

Time post‐symptom onset (sensitivity only)
Week 1 | 26 | 5769 | 2320 | 78.3 (71.1 to 84.1)a | N/A
Week 2 | 22 | 935 | 692 | 51.0 (40.8 to 61.0)a | N/A
Difference | | | | −27.3 (−32.8 to −21.9), P < 0.0001 | N/A
Week 1: direct comparison | 22 | 4978 | 2164 | 76.6 (68.2 to 83.4)a | N/A
Week 2: direct comparison | 22 | 935 | 692 | 48.8 (37.9 to 59.8)a | N/A
Difference | | | | −27.9 (−33.3 to −22.5), P < 0.0001 | N/A

Ct value (sensitivity only)
Higher viral load (< or ≤ 25 Ct threshold)b | 36 | 2613 | 2613 | 94.5 (91.0 to 96.7)a | N/A
Lower viral load (> or ≥ 25 Ct threshold)b | 36 | 2632 | 2632 | 40.7 (31.8 to 50.3)a | N/A
Difference | | | | −53.8 (−63.6 to −44.1), P < 0.0001 | N/A
Higher viral load (≤ 32 or 33 Ct threshold)c | 15 | 2127 | 2127 | 82.5 (74.0 to 88.6)a | N/A
Lower viral load (> 32 or 33 Ct threshold)c | 15 | 346 | 346 | 8.9 (3.3 to 21.7)a | N/A
Difference | | | | −73.5 (−84.7 to −62.4), P < 0.0001 | N/A

Study design
Single group: sensitivity and specificity | 29 | 15,336 | 3536 | 72.1 (64.8 to 78.3) | 99.6 (99.1 to 99.8)
Two or more groups: sensitivity and specificity | 20 | 5729 | 2396 | 64.1 (48.5 to 77.2) | 97.3 (96.7 to 97.8)
Difference | | | | −8.0 (−24.2 to 8.2), P = 0.334 | −2.3 (−2.9 to −1.6), P < 0.0001
Unclear | 2 | 549 | 204 | 65.2 (39.6 to 84.3) | 96.3 (88.0 to 98.9)

Test method
CGIA | 36 | 17,448 | 5085 | 64.0 (55.7 to 71.6) | 99.0 (98.8 to 99.2)
FIA | 9 | 2820 | 712 | 79.6 (67.5 to 88.0) | 97.7 (95.3 to 98.8)
Difference | | | | 15.6 (2.6 to 28.5), P = 0.019 | −1.3 (−3.0 to 0.3), P = 0.113
LFA (not otherwise specified) | 5 | 1184 | 277 | 78.0 (46.0 to 93.7) | 96.0 (94.5 to 97.1)
LFA (ALP) | 1 | 162 | 62 | 80.6 (68.6 to 89.6) | 100 (96.4 to 100)

ALP: alkaline phosphatase labelled; CGIA: colloidal gold immunoassay; CI: confidence intervals; Ct: cycle threshold; FIA: fluorescent immunoassay; LFA: lateral flow assay; N/A: not applicable

aSeparate pooling of sensitivity or specificity, or both.
bThreshold for 'higher' viral load was < 25 Ct in 18 evaluations and ≤ 25 Ct in 18 evaluations.
cThreshold for 'higher' viral load was ≤ 33 Ct in 13 evaluations and < 32 Ct in 2 evaluations.

Table 3. Antigen tests: summary data by test brand and compliance with manufacturers' instructions for use

Each row gives: number of evaluations; samples (cases) | average sensitivity, % (95% CI) | average specificity, % (95% CI)

AAZ ‐ COVID‐VIRO (2 studies not pooled)
  All (study 1): 1; 632 (295) | sensitivity 61.7 (55.9 to 67.3) | specificity 100 (98.9 to 100)
  All (study 2): 1; 248 (101) | sensitivity 96.0 (90.2 to 98.9) | specificity 86.4 (79.8 to 91.5)
  IFU‐compliant (study 2): 1; 248 (101) | sensitivity 96.0 (90.2 to 98.9) | specificity 86.4 (79.8 to 91.5)

Abbott ‐ Panbio Covid‐19 Ag
  All: 10; 5509 (1849) | sensitivity 72.0 (60.6 to 81.1) | specificity 99.3 (99.0 to 99.6)
  IFU‐compliant: 5; 1776 (362) | sensitivity 72.0 (56.5 to 83.5) | specificity 99.2 (98.5 to 99.5)
  All (including sensitivity‐only cohort): 11; 2031 (2031) | sensitivity 72.8 (62.6 to 81.0)a
  IFU‐compliant (including sensitivity‐only cohort): 6; 544 (544) | sensitivity 73.5 (61.1 to 83.0)a

Becton Dickinson ‐ BD Veritor
  All: 2; 602 (55) | sensitivity 82.3 (62.1 to 93.0) | specificity 99.5 (98.3 to 99.8)
  All (including sensitivity‐only cohort): 3; 180 (180) | sensitivity 79.4 (72.9 to 84.7)a

BIONOTE ‐ NowCheck COVID‐19 Ag
  All: 1; 400 (102) | sensitivity 89.2 (81.5 to 94.5) | specificity 97.3 (94.8 to 98.8)
  IFU‐compliant: 1; 400 (102) | sensitivity 89.2 (81.5 to 94.5) | specificity 97.3 (94.8 to 98.8)

Biosynex ‐ Biosynex COVID‐19 Ag BSS
  All: 1; 634 (297) | sensitivity 59.6 (53.8 to 65.2) | specificity 100 (98.9 to 100)

Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip
  All: 7; 1781 (707) | sensitivity 39.7 (31.3 to 48.7) | specificity 98.3 (97.4 to 98.9)
  IFU‐compliant: 7; 1781 (707) | sensitivity 39.7 (31.3 to 48.7) | specificity 98.3 (97.4 to 98.9)

E25Bio ‐ DART (N‐based)
  All: 1; 190 (100) | sensitivity 80.0 (70.8 to 87.3) | specificity 91.1 (83.2 to 96.1)

Fujirebio ‐ ESPLINE SARS‐CoV‐2 (2 studies not pooled)
  All (study 1): 1; 162 (62) | sensitivity 80.6 (68.6 to 89.6) | specificity 100 (96.4 to 100)
  All (study 2, sensitivity only): 1; 103 (103) | sensitivity 11.6 (6.2 to 19.5)

Innova Medical Group ‐ Innova SARS‐CoV‐2 Ag
  All: 3; 2945 (596) | sensitivity 47.9 (34.3 to 61.8) | specificity 99.8 (99.5 to 99.9)
  IFU‐compliant: 1; 1676 (372) | sensitivity 57.5 (52.3 to 62.6) | specificity 99.6 (99.1 to 99.9)
  All (including sensitivity‐only cohorts): 5; 1017 | sensitivity 59.0 (43.4 to 73.0)a
  IFU‐compliant (including sensitivity‐only cohorts): 3; 793 | sensitivity 69.1 (58.3 to 78.2)a
  All (including specificity‐only cohort): 4; 2887 | specificity 99.8 (99.5 to 99.9)a
  IFU‐compliant (including specificity‐only cohort): 2; 1842 | specificity 99.7 (99.3 to 99.9)a

Liming Bio‐Products ‐ StrongStep® COVID‐19 Ag
  All: 1; 19 (9) | sensitivity 0 (0 to 33.6) | specificity 90.0 (55.5 to 99.7)

Quidel Corporation ‐ SOFIA SARS Ag
  All: 1; 64 (32) | sensitivity 93.8 (79.2 to 99.2) | specificity 96.9 (83.8 to 99.9)

RapiGEN ‐ BIOCREDIT COVID‐19 Ag
  All: 5; 2010 (310) | sensitivity 63.3 (45.7 to 78.0) | specificity 99.5 (99.1 to 99.8)
  IFU‐compliant: 3; 1828 (189) | sensitivity 73.0 (57.4 to 84.4) | specificity 99.8 (99.4 to 99.9)
  All (including sensitivity‐only cohort): 6; 470 (470) | sensitivity 57.7 (39.8 to 73.8)a

Roche ‐ SARS‐CoV‐2
  All: 1; 73 (42) | sensitivity 88.1 (74.4 to 96.0) | specificity 19.4 (7.5 to 37.5)

Savant Biotech ‐ Huaketai SARS‐CoV‐2 N Protein
  All: 1; 109 (78) | sensitivity 16.7 (9.2 to 26.8) | specificity 100 (88.8 to 100)

SD Biosensor ‐ STANDARD F COVID‐19 Ag
  All: 4; 1552 (295) | sensitivity 72.6 (54.0 to 85.7) | specificity 97.5 (96.4 to 98.2)
  IFU‐compliant: 2; 1129 (159) | sensitivity 75.5 (68.2 to 81.5) | specificity 97.2 (96.0 to 98.1)

SD Biosensor ‐ STANDARD Q COVID‐19 Ag
  All: 6; 3480 (821) | sensitivity 79.3 (69.6 to 86.6) | specificity 98.5 (97.9 to 98.9)
  IFU‐compliant: 4; 2522 (421) | sensitivity 85.8 (80.5 to 89.8) | specificity 99.2 (98.2 to 99.6)

Shenzhen Bioeasy Biotech ‐ 2019‐nCoV Ag
  All: 3; 965 (177) | sensitivity 86.2 (72.4 to 93.7) | specificity 93.8 (91.9 to 95.3)
  IFU‐compliant: 1; 727 (15) | sensitivity 66.7 (38.4 to 88.2) | specificity 93.1 (91.0 to 94.9)
  Development‐phase publication: 1; 239 (208) | sensitivity 67.8 (61.0 to 74.1) | specificity 100 (88.8 to 100)

Ag: antigen; CI: confidence interval; IFU: [manufacturers'] instructions for use; N: nucleoprotein

aSeparate pooling of sensitivity or specificity.
b2x2 tables combined prior to calculating estimates.

Table 4. Antigen tests: summary data by symptom status, test brand and compliance with manufacturers' instructions for use

Each row gives: number of evaluations; samples (cases) | average sensitivity, % (95% CI) | average specificity, % (95% CI)

SYMPTOMATIC participants by test

AAZ ‐ COVID‐VIRO (2 studies not pooled)
  All (study 1): 1; 632 (295) | sensitivity 61.7 (55.9 to 67.3) | specificity 100 (98.9 to 100)
  All (study 2): 1; 248 (101) | sensitivity 96.0 (90.2 to 98.9) | specificity 86.4 (79.8 to 91.5)
  IFU‐compliant (study 2): 1; 248 (101) | sensitivity 96.0 (90.2 to 98.9) | specificity 86.4 (79.8 to 91.5)

Abbott ‐ Panbio Covid‐19 Ag
  All: 8; 3699 (1162) | sensitivity 74.1 (60.8 to 84.0) | specificity 99.8 (99.5 to 99.9)
  IFU‐compliant: 3; 1094 (252) | sensitivity 75.1 (57.3 to 87.1) | specificity 99.5 (98.7 to 99.8)
  All (including sensitivity‐only cohort): 9; 1344 (1344) | sensitivity 74.8 (63.4 to 83.6)a
  IFU‐compliant (including sensitivity‐only cohort): 4; 434 (434) | sensitivity 76.2 (63.6 to 85.4)a

Becton Dickinson ‐ BD Veritor
  All: 2; 602 (55) | sensitivity 82.3 (62.1 to 93.0) | specificity 99.5 (98.3 to 99.8)
  All (including sensitivity‐only cohort): 3; 180 (180) | sensitivity 79.4 (72.9 to 84.7)a

BIONOTE ‐ NowCheck COVID‐19 Ag
  All: 1; 400 (102) | sensitivity 89.2 (81.5 to 94.5) | specificity 97.3 (94.8 to 98.8)
  IFU‐compliant: 1; 400 (102) | sensitivity 89.2 (81.5 to 94.5) | specificity 97.3 (94.8 to 98.8)

Biosynex ‐ Biosynex COVID‐19 Ag BSS
  All: 1; 634 (297) | sensitivity 59.6 (53.8 to 65.2) | specificity 100 (98.9 to 100)

Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip
  All: 3; 780 (414) | sensitivity 34.1 (29.7 to 38.8)a | specificity 100 (99.0 to 100)a,b
  IFU‐compliant: 3; 780 (414) | sensitivity 34.1 (29.7 to 38.8)a | specificity 100 (99.0 to 100)a,b

Fujirebio ‐ ESPLINE SARS‐CoV‐2
  All (sensitivity only): 1; 88 (88) | sensitivity 11.4 (5.6 to 19.9)

Innova Medical Group ‐ Innova SARS‐CoV‐2 Ag
  All: 2; 2794 (550) | sensitivity 56.2 (52.0 to 60.3) | specificity 99.8 (99.5 to 99.9)
  IFU‐compliant: 1; 1676 (372) | sensitivity 57.5 (52.3 to 62.6) | specificity 99.6 (99.1 to 99.9)
  All (including sensitivity‐only cohorts): 4; 971 (971) | sensitivity 65.5 (54.8 to 74.9)a
  IFU‐compliant (including sensitivity‐only cohorts): 3; 793 (793) | sensitivity 69.1 (58.3 to 78.2)a

Liming Bio‐Products ‐ StrongStep® COVID‐19 Ag
  All: 1; 19 (9) | sensitivity 0 (0 to 33.6) | specificity 90.0 (55.5 to 99.7)

Quidel Corporation ‐ SOFIA SARS Ag
  All: 1; 64 (32) | sensitivity 93.8 (79.2 to 99.2) | specificity 96.9 (83.8 to 99.9)

RapiGEN ‐ BIOCREDIT COVID‐19 Ag
  All: 3; 608 (206) | sensitivity 58.4 (36.3 to 77.5) | specificity 96.4 (82.8 to 99.3)
  IFU‐compliant: 1; 476 (117) | sensitivity 74.4 (65.5 to 82.0) | specificity 98.9 (97.2 to 99.7)

Roche ‐ SARS‐CoV‐2
  All: 1; 23 (10) | sensitivity 100 (69.2 to 100) | specificity 7.7 (0.2 to 36.0)

Savant Biotech ‐ Huaketai SARS‐CoV‐2 N Protein
  All: 1; 109 (78) | sensitivity 16.7 (9.2 to 26.8) | specificity 100 (88.8 to 100)

SD Biosensor ‐ STANDARD F COVID‐19 Ag
  All: 3; 1193 (191) | sensitivity 78.0 (71.6 to 83.3) | specificity 97.2 (96.0 to 98.1)
  IFU‐compliant: 2; 1129 (159) | sensitivity 75.5 (68.2 to 81.5) | specificity 97.2 (96.0 to 98.1)

SD Biosensor ‐ STANDARD Q COVID‐19 Ag
  All: 5; 2760 (731) | sensitivity 80.1 (68.5 to 88.1) | specificity 98.1 (97.4 to 98.6)
  IFU‐compliant: 3; 1947 (336) | sensitivity 88.1 (84.2 to 91.1) | specificity 99.1 (97.8 to 99.6)

Shenzhen Bioeasy Biotech ‐ 2019‐nCoV Ag
  All: 3; 965 (177) | sensitivity 86.2 (72.5 to 93.7) | specificity 93.8 (91.9 to 95.3)
  IFU‐compliant: 1; 727 (15) | sensitivity 66.7 (38.4 to 88.2) | specificity 93.1 (91.0 to 94.9)

ASYMPTOMATIC participants by test

Abbott ‐ Panbio Covid‐19 Ag
  All: 6; 1097 (190) | sensitivity 58.1 (41.7 to 72.9) | specificity 98.4 (92.2 to 99.7)
  IFU‐compliant: 2; 474 (47) | sensitivity 48.9 (35.1 to 62.9) | specificity 98.1 (96.3 to 99.1)

Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip
  All: 1; 45 (14) | sensitivity 28.6 (8.4 to 58.1) | specificity 100 (88.8 to 100)
  IFU‐compliant: 1; 45 (14) | sensitivity 28.6 (8.4 to 58.1) | specificity 100 (88.8 to 100)

Fujirebio ‐ ESPLINE SARS‐CoV‐2
  All: 1; 15 (15) | sensitivity 13.3 (1.7 to 40.5) | specificity N/A

RapiGEN ‐ BIOCREDIT COVID‐19 Ag
  All: 2; 140 (60) | sensitivity 63.2 (21.7 to 91.4) | specificity 98.9 (82.9 to 99.9)
  IFU‐compliant: 1; 113 (47) | sensitivity 85.1 (71.7 to 93.8) | specificity 100 (94.6 to 100)

Roche ‐ SARS‐CoV‐2
  All: 1; 27 (13) | sensitivity 84.6 (54.6 to 98.1) | specificity 14.3 (1.8 to 42.8)

SD Biosensor ‐ STANDARD Q COVID‐19 Ag
  All: 2; 272 (18) | sensitivity 61.1 (37.9 to 80.2) | specificity 99.6 (97.3 to 99.9)
  IFU‐compliant: 1; 127 (13) | sensitivity 69.2 (38.6 to 90.9) | specificity 99.1 (95.2 to 100)

Ag: antigen; CI: confidence interval; N: nucleoprotein; N/A: not applicable

aSeparate pooling of sensitivity or specificity.
b2x2 tables combined prior to calculating estimates.


Forest plot of studies evaluating antigen tests. BR: Brazil; CH: Switzerland; DE: Germany; HCW: healthcare worker; Lab: laboratory



Forest plot of data for antigen tests according to symptom status. A&E: accident and emergency; BR: Brazil; CH: Switzerland; DE: Germany; HCW: healthcare worker; Lab: laboratory



Forest plot of antigen test evaluations by week post symptom onset (pso). A&E: accident and emergency; Ag: antigen; BR: Brazil; CH: Switzerland; DE: Germany



Forest plot by test brand for assays with ≥ 3 evaluations. BR: Brazil; CGIA: colloidal‐gold immunoassay; CH: Switzerland; DE: Germany; FIA: fluorescent immunoassay; HCW: healthcare worker; IFU: instructions for use; Lab: laboratory; LFA: lateral flow assay



Forest plot by test brand for assays with < 3 evaluations; CGIA: colloidal‐gold immunoassay; FIA: fluorescent immunoassay; IFU: instructions for use; LFA: lateral flow assay



Forest plot of studies reporting comparative data. CGIA: colloidal‐gold immunoassay; FIA: fluorescent immunoassay; LFA: lateral flow assay; nos: not otherwise specified


Results for molecular tests overall and by subgroup are reported in Table 5. The forest plot of study data for the primary analysis is in Figure 9, and forest plots for subgroup analyses by Ct value and study design, and for sensitivity analyses before and after discrepant analysis, are in Appendix 16. Individual plots by test brand are provided in Figure 10. Full identification details for studies of molecular‐based assays are provided in Appendix 11 and Appendix 12. Appendix 17 provides forest plots for study data according to Ct value and discrepant analysis.

Table 5. Molecular tests: summary of sensitivity and specificity analyses

Test or subgroup | Evaluations | Samples | Cases | Average sensitivity, % (95% CI) | Average specificity, % (95% CI)

Overall analysis
Evaluations reporting both sensitivity and specificity | 29 | 4351 | 1787 | 95.1 (90.5 to 97.6) | 98.8 (98.3 to 99.2)
Evaluations reporting sensitivity dataa | 32 | 4537 | 1973 | 95.5 (91.5 to 97.7) | N/A

Subgroup analyses (with sensitivity analyses restricting to direct comparisons)

Viral load (sensitivity only)
High viral load (≤ 30 Ct) | 6 | 204 | 204 | 100 (98.2 to 100)a,b | N/A
Low viral load (> 30 Ct) | 6 | 149 | 149 | 95.6 (55.7 to 99.7) | N/A

By study design
Single group: sensitivity and specificity | 18 | 2899 | 976 | 93.2 (85.5 to 97.0) | 99.4 (98.4 to 99.8)
Two or more groups: sensitivity and specificity | 9 | 1265 | 718 | 97.2 (90.7 to 99.2) | 99.3 (96.5 to 99.8)
Difference | | | | 4.0 (−2.2 to 10.1), P = 0.211 | −0.2 (−1.3 to 1.0), P = 0.771
Unclear designs | 2 | 187 | 93 | 93.2 (71.0 to 98.7)a | 100 (96.2 to 100)a,b

Test brand
Abbott – ID NOW | 12 | 1853 | 634 | 78.6 (73.7 to 82.8) | 99.8 (99.2 to 99.9)
Cepheid – Xpert Xpress | 13 | 1691 | 911 | 99.1 (97.7 to 99.7) | 97.9 (94.6 to 99.2)
Difference | | | | 19.8 (14.9 to 24.7), P < 0.0001 | −1.9 (−3.8 to −0.1), P = 0.036
Abbott – ID NOW (including sensitivity‐only cohort) | 13 | 1949 | 730 | 81.5 (75.2 to 86.5)a | N/A
Cepheid – Xpert Xpress (including sensitivity‐only cohorts) | 15 | 1781 | 1001 | 99.1 (97.8 to 99.6)a | N/A
DNANudge – COVID Nudge | 1 | 386 | 71 | 94.4 (86.2 to 98.4) | 100 (98.8 to 100)
Diagnostics for the Real World – SAMBA II | 2 | 321 | 121 | 96.0 (81.1 to 99.3) | 97.0 (93.5 to 98.6)
Mesa Biotech – Accula | 1 | 100 | 50 | 68.0 (53.3 to 80.5) | 100 (92.9 to 100)

Test brand (restricted to IFU‐compliant evaluations)
Abbott – ID NOW | 4 | 812 | 222 | 73.0 (66.8 to 78.4) | 99.7 (98.7 to 99.9)
Cepheid – Xpert Xpress | 2 | 100 | 29 | 100 (88.1 to 100)a | 97.2 (89.4 to 99.3)a
DRW – SAMBA II | 1 | 149 | 33 | 87.9 (71.8 to 96.6) | 97.4 (92.6 to 99.5)
DNANudge – COVID Nudge | 1 | 386 | 71 | 94.4 (86.2 to 98.4) | 100 (98.8 to 100)

Discrepant analysis
Before discrepant analysis | 6 | 1533 | 623 | 97.9 (88.1 to 99.7) | 97.8 (96.6 to 98.6)
After discrepant analysis | 6 | 1533 | 632 | 99.2 (93.6 to 99.9) | 99.6 (98.8 to 99.8)
Difference | | | | 1.3 (−2.8 to 5.4), P = 0.528 | 1.8 (0.7 to 2.8), P = 0.001

CI: confidence interval; Ct: cycle threshold; IFU: [manufacturers'] instructions for use; N/A: not applicable

aSeparate pooling of sensitivity or specificity.
b2x2 tables combined prior to calculating estimates.


Forest plot of studies evaluating rapid molecular tests. A&E: accident and emergency



Forest plot by test brand for molecular assays. A&E: accident and emergency; IFU: instructions for use


Accuracy of antigen tests overall and by subgroup

Results showed high levels of heterogeneity in sensitivity. Average sensitivity was 68.9% (95% CI 61.8% to 75.1%) and average specificity was 99.6% (95% CI 99.0% to 99.8%) across the 51 evaluations of antigen tests reporting both sensitivity and specificity (based on 21,614 samples, including 6136 samples with confirmed SARS‐CoV‐2; Table 2; Figure 3). Adding the six 'sensitivity only' datasets and the single 'specificity only' dataset had a negligible impact on results (Table 2). In the sections below we show substantial differences between subgroups of studies according to symptom status, timing, test method and brand; this average value is therefore unlikely to predict the performance of a test accurately in any given setting and should not be used for that purpose.
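To see what the summary figures could mean in use, it can help to work through the predictive values they imply at different prevalences. The sketch below is illustrative only: the two prevalence values are assumptions chosen for contrast, not data from the review, applied to the pooled antigen‐test averages of 68.9% sensitivity and 99.6% specificity reported above.

```python
def predictive_values(sens: float, spec: float, prev: float) -> tuple[float, float]:
    """Positive and negative predictive value at a given prevalence."""
    tp = sens * prev            # true positives per person tested
    fp = (1 - spec) * (1 - prev)  # false positives
    fn = (1 - sens) * prev        # false negatives
    tn = spec * (1 - prev)        # true negatives
    return tp / (tp + fp), tn / (tn + fn)

# Pooled antigen-test averages from Table 2
SENS, SPEC = 0.689, 0.996

# Assumed prevalences for contrast (hypothetical, not from the review):
# e.g. symptomatic presenters (20%) versus mass screening (0.5%)
for prev in (0.20, 0.005):
    ppv, npv = predictive_values(SENS, SPEC, prev)
    print(f"prevalence {prev:.1%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At the higher assumed prevalence most positive results are true positives, whereas at the lower assumed prevalence roughly half of positive results would be false positives despite the very high specificity; this compounds the between‐subgroup variation in sensitivity described in the following sections.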

Subgroup analysis by symptom status

Subgroup analysis by symptom status suggests that average test sensitivity to detect infection is 13.8 percentage points lower in asymptomatic (58.1%, 95% CI 40.2% to 74.1%; based on 12 evaluations, 1581 samples and 295 cases) compared to symptomatic (72.0%, 95% CI 63.7% to 79.0%; based on 37 evaluations, 15,530 samples and 4410 cases) participants (95% CI for the difference in sensitivity: 33.1 percentage points lower to 5.4 percentage points higher; Table 2; Figure 4). Restricting the comparison by symptom status to the nine evaluations reporting data for both symptomatic and asymptomatic subgroups (thus ensuring the comparison is made between the same tests used in the same way) showed a similar difference in sensitivity (14.4 percentage points lower in asymptomatic participants, 95% CI 38.8 lower to 10.0 percentage points higher; Table 2). Average results for the 19 evaluations in participants with mixed symptom status (n = 10) or symptom status not reported (n = 9) were between those observed for the symptomatic and asymptomatic subgroups: sensitivity 63.0% (95% CI 52.2% to 72.6%) and specificity 98.4% (95% CI 98.0% to 98.8%) (6220 samples; 2392 cases).

We did not observe any important differences in specificity according to symptom status (Table 2).

Subgroup analysis by time from symptom onset

We pooled data by time from symptom onset separately for sensitivity and specificity because the majority of evaluations did not report these data for people without SARS‐CoV‐2 (Table 2; Figure 5). Sensitivity was 78.3% (95% CI 71.1% to 84.1%) (26 evaluations; 5769 samples, 2320 cases) in the first seven days after symptom onset compared to 51.0% (95% CI 40.8% to 61.0%) (22 evaluations; 935 samples, 692 cases) in the second week of symptoms (a decrease of 27.3 percentage points, 95% CI −32.8 to −21.9 percentage points). This difference remained on restriction to the 22 evaluations reporting data for people in both week one and week two of symptoms (removing other between‐study differences; Table 2).

We did not observe any differences in specificity according to time after symptom onset (Table 2).

Subgroup analysis by Ct value

A total of 36 evaluations reported sensitivity according to Ct value, using a threshold of 24 (n = 18) or 25 (n = 18) Ct or less to define higher viral load (Table 2; Appendix 15). Summary sensitivity in those with higher viral load was 94.5% (95% CI 91.0% to 96.7%) (based on 2613 cases), compared to 40.7% (95% CI 31.8% to 50.3%) in those with lower viral load (based on 2632 cases); that is, sensitivity was 53.8 percentage points lower for those with lower viral load (95% CI 63.6 to 44.1 percentage points lower). Applying a Ct threshold of ≤ 33 (n = 13) or < 32 (n = 2) led to a bigger difference in sensitivity, although the number of samples in the lower viral load subgroup was considerably smaller: sensitivity associated with higher viral load was 82.5% (95% CI 74.0% to 88.6%) (based on 2127 samples) and with lower viral load was 8.9% (95% CI 3.3% to 21.7%) (based on 346 samples), a difference of 73.5 percentage points (95% CI 84.7 to 62.4 percentage points lower).

Subgroup analysis by study design

We did not observe any clear differences in average sensitivity or specificity when studies were grouped by study design (15,336 samples and 3536 cases in 29 single‐group studies and 5729 samples and 2396 cases in 20 two‐group studies; Table 2; Appendix 15). Average sensitivity was lower in two‐group studies (64.1%, 95% CI 48.5% to 77.2%) compared to single‐group studies (72.1%, 95% CI 64.8% to 78.3%); however, the confidence intervals overlapped and the difference was within what might be expected by chance (8.0 percentage points lower, 95% CI from 24.2 percentage points lower to 8.2 higher). Average specificities were 2.3 percentage points lower in the two‐group studies (95% CI from 2.9 to 1.6 percentage points lower), at 97.3% (95% CI 96.7% to 97.8%) compared to 99.6% (95% CI 99.1% to 99.8%) in single‐group studies.

Subgroup analysis by test method

We observed differences in accuracy according to test method (Table 2). The majority of evaluations (n = 36; 17,448 samples, 5085 cases) used a colloidal gold immunoassay (CGIA); average sensitivity was lower for these (64.0%, 95% CI 55.7% to 71.6%) than for fluorescence immunoassays (FIAs) (79.6%, 95% CI 67.5% to 88.0%; n = 9; 2820 samples, 712 cases; absolute difference of 15.6 percentage points, 95% CI 2.6 to 28.5 percentage points). We also observed marginal differences in specificity, with estimates of 99.0% (95% CI 98.8% to 99.2%) for CGIA and 97.7% (95% CI 95.3% to 98.8%) for FIA, a difference of 1.3 percentage points (95% CI from 3.0 percentage points lower to 0.3 higher). Results for lateral flow assays where the method could not be determined (n = 5) and for the single evaluation of an alkaline phosphatase (ALP)‐labelled assay were heterogeneous but largely within the range observed for the other assay types (Table 2).

Results by test brand according to symptom status and IFU compliance

Results by test brand overall and sensitivity analyses by IFU compliance (based on sample type, use of viral transport medium, and time period between sample collection and test procedure) are reported in Table 3. Results by test brand for symptomatic and asymptomatic subgroups overall and by IFU compliance are in Table 4. Given the mixed settings in which asymptomatic individuals were tested (Results of the search), the data for asymptomatic subgroups cannot be considered applicable to any particular scenario for asymptomatic testing. Only three studies reported direct comparisons of tests, two using nasopharyngeal or oropharyngeal samples (Fourati 2020 [A]; Weitzel 2020 [A]).

We observed considerable heterogeneity in sensitivities for all assays.

AAZ – COVID‐VIRO

Two evaluations of the COVID‐VIRO assay included 880 samples, 396 of which were SARS‐CoV‐2‐positive (Figure 7). We did not pool the studies due to heterogeneity in both sensitivity and specificity, although both were conducted in symptomatic or mainly symptomatic participants using nasopharyngeal samples.

In one study that compared antigen assays using nasopharyngeal samples in viral transport medium, sensitivity was 61.7% (95% CI 55.9% to 67.3%) and specificity (in pre‐pandemic samples) 100% (95% CI 98.9% to 100%; 632 samples, 295 cases; Fourati 2020 [E]).

The second study used direct swab testing in compliance with the manufacturer’s IFU. Twenty participants in the study who previously tested positive on PCR retested negative with PCR at the time of the antigen test. All twenty samples showed weak lines on antigen testing. We considered these as false positives in the review (based on the negative result of the concurrent PCR test) whereas the study authors considered them to be true positives. With our re‐calculation, the test demonstrated sensitivity of 96.0% (95% CI 90.2% to 98.9%) and specificity of 86.4% (95% CI 79.8% to 91.5%; Courtellemont 2020). Sensitivity in this study may have been inflated by the inclusion of hospitalised, confirmed SARS‐CoV‐2‐positive participants.
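The effect of such reclassification on accuracy estimates can be sketched with a small 2×2 calculation. The counts below are hypothetical, chosen only to mirror the direction of the reported results, and are not taken from the study:

```python
def accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity from a 2x2 table versus RT-PCR."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts (illustration only, not the study's data):
# 20 weak-line antigen positives had a negative concurrent RT-PCR.
# Counting them as false positives (the review's approach) versus
# true positives (the study authors' approach) moves them between
# the FP and TP cells and shifts both accuracy estimates.
as_false_pos = accuracy(tp=96, fp=20, fn=4, tn=127)   # ~ (0.960, 0.864)
as_true_pos = accuracy(tp=116, fp=0, fn=4, tn=127)    # ~ (0.967, 1.000)
```

Treating the discordant results as true positives raises both sensitivity and specificity, which is why the choice of reference classification matters for how a test's accuracy is reported.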

Abbott – Panbio Covid‐19 Ag

We identified 11 evaluations of the Panbio assay, including 5691 unique samples, with 2031 SARS‐CoV‐2‐positive cases (Figure 6). One of the 11 evaluations included only SARS‐CoV‐2‐positive cases (n = 182 samples). Studies were conducted in community COVID‐19 test centres or emergency departments (n = 6), in contacts of confirmed cases (n = 2), or were laboratory‐based evaluations (n = 2). The setting was not clear in one study. Participants were reportedly symptomatic (n = 5), asymptomatic (n = 1), with mixed symptom status (n = 4), or symptom status was not reported (n = 1). Nine evaluations used nasopharyngeal samples (Albert 2020; Billaud 2020; Fenollar 2020(b); FIND 2020b; Fourati 2020 [C]; Gremmels 2020(a); Gremmels 2020(b); Linares 2020), one (Alemany 2020) tested nasopharyngeal or nasal samples, and one (Schildgen 2020 [A]) used bronchoalveolar lavage or throat wash samples. Only three of the 11 evaluations reported product codes for the assays used, one of which was for the assay for use with nasopharyngeal swabs (41FK10) and two (from the same study report) were for the assay for use with nasal swabs (41FK11), although the study reports using nasopharyngeal samples (Gremmels 2020(a); Gremmels 2020(b)).

Five of the 11 evaluations complied with manufacturer IFU for the test. Reasons for non‐compliance included use of viral transport medium, frozen storage, type of swab tested, or lack of clear reporting of test procedures used.

The average sensitivity and specificity of the Panbio assay were:

  • 72.0% (95% CI 60.6% to 81.1%) and 99.3% (95% CI 99.0% to 99.6%) overall (n = 10; 5509 samples; 1849 cases; Table 3);

  • 74.1% (95% CI 60.8% to 84.0%) and 99.8% (95% CI 99.5% to 99.9%) in symptomatic people (n = 8; 3699 samples, 1162 cases); and

  • 58.1% (95% CI 41.7% to 72.9%) and 98.4% (95% CI 92.2% to 99.7%) in asymptomatic people (n = 6; 1097 samples, 190 cases; Table 4).

Restricting to IFU‐compliant evaluations, average sensitivities and specificities were:

  • 72.0% (95% CI 56.5% to 83.5%) and 99.2% (95% CI 98.5% to 99.5%) overall (n = 5; 1776 samples, 362 cases; Table 3);

  • 75.1% (95% CI 57.3% to 87.1%) and 99.5% (95% CI 98.7% to 99.8%) in symptomatic people (n = 3; 1094 samples, 252 cases); and

  • 48.9% (95% CI 35.1% to 62.9%) and 98.1% (95% CI 96.3% to 99.1%) in asymptomatic people (n = 2; 474 samples, 47 cases; Table 4).

The addition of one evaluation that reported sensitivity only in symptomatic participants led to only marginal differences in average sensitivity (Fenollar 2020(a); Table 4).

Becton Dickinson ‐ BD Veritor

We identified three evaluations of the BD Veritor assay, including 727 unique samples, with 180 SARS‐CoV‐2‐positive cases (Figure 6). One of the three evaluations included only SARS‐CoV‐2‐positive cases (n = 125 samples). Studies were conducted in community COVID‐19 test centres (n = 2), or in multiple settings (n = 1). All participants were symptomatic. Two evaluations used combined naso‐ and oropharyngeal samples and one tested nasal samples.

None of the evaluations complied with manufacturer IFU for the test because the interval between sample collection and testing was greater than the maximum of one hour.

Average sensitivity and specificity of the BD Veritor assay were:

Adding the ‘cases only’ evaluation reduced average sensitivity to 79.4% (95% CI 72.9% to 84.7%) (n = 3; 180 cases; Van der Moeren 2020(b)).

The BD Veritor assay requires interpretation using a Veritor Analyzer device, but Van der Moeren 2020(a) found that visual inspection of the test device resulted in the same sensitivity as with the Analyzer device, and similar specificity (100% compared to 99% using the Analyzer device).

BIONOTE ‐ NowCheck COVID‐19 Ag

We identified a single IFU‐compliant evaluation of the NowCheck assay in symptomatic participants (FIND 2020a; Figure 7). The study included 400 samples with 102 SARS‐CoV‐2‐positive cases, from participants presenting at a community‐based COVID‐19 test centre.

The sensitivity and specificity in this study were 89.2% (95% CI 81.5% to 94.5%) and 97.3% (95% CI 94.8% to 98.8%; Table 3; Table 4).

Biosynex ‐ Biosynex COVID‐19 Ag BSS

We identified a single evaluation of the Biosynex assay in symptomatic participants (Fourati 2020 [D]), including 634 samples, 297 with confirmed SARS‐CoV‐2 (Figure 7). The evaluation was not in compliance with the manufacturer’s IFU because samples were stored in viral transport medium and frozen prior to testing. The setting in which participants presented for testing was not reported.

Observed sensitivity was 59.6% (95% CI 53.8% to 65.2%) and specificity 100% (95% CI 98.9% to 100%; Table 3; Table 4).

Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip

The seven evaluations of the Coris Bioconcept assay included 1781 samples, with 707 SARS‐CoV‐2‐positive cases (Blairon 2020; Fourati 2020 [A]; Kruger 2020(b); Lambert‐Niclot 2020; Mertens 2020; Scohy 2020; Veyrenche 2020; Figure 6). Five of the seven were laboratory‐based evaluations with limited detail regarding study participants. One study recruited from community‐based COVID‐19 test centres and one included samples from hospital inpatients. Three studies included only or mainly symptomatic participants, one was in a mixed group and three did not report symptom status.

All evaluations tested naso‐ or oropharyngeal swabs and were compliant with the manufacturer IFU; however, it is worth noting that the IFU for this assay permits the use of viral transport medium and freezing of samples, although immediate testing is recommended.

The average sensitivity and specificity of the COVID‐19 Ag Respi‐Strip were:

  • 39.7% (95% CI 31.3% to 48.7%) and 98.3% (95% CI 97.4% to 98.9%) overall (n = 7; 1781 samples, 707 cases; Table 3);

  • 34.1% (95% CI 29.7% to 38.8%) and 100% (95% CI 99.0% to 100%) in symptomatic people (n = 3; 780 samples, 414 cases); and

  • 28.6% (95% CI 8.4% to 58.1%) and 100% (95% CI 88.8% to 100%) in asymptomatic people (n = 1; 45 samples, 14 cases; Scohy 2020; Table 4).

E25Bio ‐ DART (nasopharyngeal)

We identified a single evaluation of the E25Bio DART assay that included 190 samples, 100 with SARS‐CoV‐2 (Nash 2020; Figure 7). The symptom status of included participants was not reported and the manufacturer IFU is not yet available as the assay has been submitted for Emergency Use Authorisation (EUA) approval with the US Food and Drug Administration (FDA).

Sensitivity was 80.0% (95% CI 70.8% to 87.3%) and specificity 91.1% (95% CI 83.2% to 96.1%; Table 3).

Fujirebio ‐ ESPLINE SARS‐CoV‐2

We included two eligible evaluations, with a total of 265 samples, 165 of which were SARS‐CoV‐2‐positive (Nagura‐Ikeda 2020; Takeda 2020; Figure 7). One study reported only sensitivity data (Nagura‐Ikeda 2020).

Takeda 2020 reported sensitivity of 80.6% (95% CI 68.6% to 89.6%) and specificity of 100% (95% CI 96.4% to 100%) in nasopharyngeal samples (162 samples, 62 cases; Table 3). They did not report symptom status of participants and provided insufficient detail to allow us to judge IFU compliance.

Nagura‐Ikeda 2020 evaluated the assay using saliva samples in symptomatic participants (not within IFU specifications); the ESPLINE assay correctly identified 12 of 103 PCR‐positive samples (sensitivity 11.6%, 95% CI 6.2% to 19.5%; Table 3; Table 4).

Innova Medical Group ‐ Innova SARS‐CoV‐2 Ag

We included one report that evaluated the Innova assay as six separate substudies; three reported both sensitivity and specificity (PHE 2020(a); PHE 2020(b); PHE 2020(c) [non‐HCW tested]), two reported sensitivity alone (PHE 2020(d) [HCW tested]; PHE 2020(d) [Lab tested]), and one reported specificity alone (PHE 2020(e); Figure 6). The studies reported a total of 3904 participants, including 1017 SARS‐CoV‐2‐positive cases. Detail regarding symptom status was limited; however, we coded the study populations as symptomatic (samples from hospital inpatients in PHE 2020(a)); mainly symptomatic for samples from COVID‐19 testing centres (PHE 2020(c) [non‐HCW tested]; PHE 2020(d) [HCW tested]; PHE 2020(d) [Lab tested]), although data on symptom status were reported for only two of these studies (PHE 2020(d) [HCW tested]; PHE 2020(d) [Lab tested]); not reported for the outbreak investigation in PHE 2020(b); and asymptomatic staff screening for PHE 2020(e). The study authors for the outbreak evaluation study did not report the sensitivity value of 28.3% (95% CI 16.0% to 43.5%) in the publications but provided it to us on request.

All evaluations used naso‐ or oropharyngeal samples, two in viral transport medium (PHE 2020(a); PHE 2020(b)), and four using direct swab testing in compliance with manufacturer IFU (PHE 2020(c) [non‐HCW tested]; PHE 2020(d) [HCW tested]; PHE 2020(d) [Lab tested]; PHE 2020(e)).

For studies reporting both sensitivity and specificity, average sensitivity and specificity were:

  • 47.9% (95% CI 34.3% to 61.8%) and 99.8% (95% CI 99.5% to 99.9%) overall (n = 3; 2945 samples, 596 cases; Table 3); and

  • 56.2% (95% CI 52.0% to 60.3%) and 99.8% (95% CI 99.5% to 99.9%) in symptomatic people (n = 2; 2794 samples, 550 cases; Table 4).

Only one of the three studies that reported both sensitivity and specificity was compliant with the manufacturer IFU; its sensitivity and specificity were:

  • 57.5% (95% CI 52.3% to 62.6%) and 99.6% (95% CI 99.1% to 99.9%) overall (n = 1; 1676 samples, 372 cases).

Summary results from the four IFU‐compliant evaluations were calculated as follows:

  • average sensitivity across three evaluations of mainly symptomatic participants 69.1% (95% CI 58.3% to 78.2%; n = 3; 793 cases; Table 3; Table 4);

  • average specificity from two evaluations of 99.7% (95% CI 99.3% to 99.9%; n = 2; 1842 samples with no SARS‐CoV‐2; Table 3).

Adding data from single‐group evaluations in either RT‐PCR‐positive or RT‐PCR‐negative participants:

  • average sensitivity was 59.0% (95% CI 43.4% to 73.0%) (n = 5; 1015 cases);

  • average specificity was 99.8% (95% CI 99.5% to 99.9%) (n = 4; 2887 RT‐PCR‐negative samples) (Table 4).

Results for each of the three IFU‐compliant evaluations by test operator were (Figure 6):

  • sensitivity of 57.5% (95% CI 52.3% to 62.6%) and specificity 99.6% (95% CI 99.1% to 99.9%), when the test was used by self‐trained, non‐healthcare workers (n = 1; 1676 samples, 372 cases; PHE 2020(c) [non‐HCW tested]);

  • sensitivity of 70.0% (95% CI 63.5% to 75.9%) when the test was used by healthcare workers (n = 1; 223 cases; PHE 2020(d) [HCW tested]);

  • sensitivity of 78.8% (95% CI 72.4% to 84.3%) when the test was used by laboratory scientists (n = 1; 198 cases; PHE 2020(d) [Lab tested]).

Liming Bio‐Products ‐ StrongStep® COVID‐19 Ag

We identified a single evaluation of the StrongStep assay in 19 symptomatic participants, with nine SARS‐CoV‐2‐positive samples (Weitzel 2020 [B]; Figure 7). We could not identify the manufacturer’s IFU for this assay. The study authors terminated the evaluation early following poor initial results for this assay.

Sensitivity was 0% (95% CI 0% to 33.6%) and specificity 90.0% (95% CI 55.5% to 99.7%; 19 samples, 9 cases; Table 3; Table 4).
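Single‐study estimates like these, with exact binomial confidence intervals, can be reproduced directly from the 2×2 counts. A minimal Clopper–Pearson sketch using only the standard library (reproducing, for example, 0 of 9 cases detected giving 0% with an upper limit of 33.6%):

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def _solve(f, target, increasing):
    """Bisect for p in [0, 1] with f(p) = target; f monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided 100*(1 - alpha)% CI for x successes out of n."""
    lower = 0.0 if x == 0 else _solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if x == n else _solve(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper

sens_lo, sens_hi = clopper_pearson(0, 9)    # 0/9 cases detected -> (0.0, ~0.336)
spec_lo, spec_hi = clopper_pearson(9, 10)   # 9/10 negatives correct -> (~0.555, ~0.997)
```

With so few samples the exact intervals are very wide, which is why a 0% observed sensitivity here is still compatible with a true sensitivity of up to about a third.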

Quidel Corporation ‐ SOFIA SARS Antigen

We identified a single evaluation of the SOFIA assay in symptomatic participants, including 64 samples with 32 SARS‐CoV‐2‐positive cases (Porte 2020b [A]; Figure 7). The study used combined naso‐ and oropharyngeal swab samples in viral transport medium, therefore the evaluation was not compliant with the manufacturer IFU.

Sensitivity was 93.8% (95% CI 79.2% to 99.2%) and specificity was 96.9% (95% CI 83.8% to 99.9%; Table 3; Table 4).

RapiGEN ‐ BIOCREDIT COVID‐19 Ag

We identified six evaluations of the RapiGen BIOCREDIT assay; these reported data for 2170 samples, with 470 confirmed SARS‐CoV‐2‐positive cases (FIND 2020e (BR); FIND 2020e (DE); Mak 2020; Schildgen 2020 [A]; Shrestha 2020; Weitzel 2020 [A]; Figure 6). One laboratory‐based study included cases only (n = 160). The other evaluations included participants from community‐based COVID‐19 test centres (n = 2), emergency departments (n = 1), contact tracing (n = 1), or did not clearly report the setting (n = 1). Two studies included only symptomatic participants, two reported including both symptomatic and asymptomatic participants (mixed group), and one did not report symptom status. All evaluations apart from one (Schildgen 2020 [A]) tested nasopharyngeal or combined naso‐ or oropharyngeal samples.

Only three of the six evaluations complied with manufacturer IFU, with non‐compliance because of the use of viral transport medium, or the type of swab tested.

The average sensitivity and specificity of the BIOCREDIT assay were:

  • 63.3% (95% CI 45.7% to 78.0%) and 99.5% (95% CI 99.1% to 99.8%) overall (n = 5; 2010 samples, 310 cases; Table 3);

  • 58.4% (95% CI 36.3% to 77.5%) and 96.4% (95% CI 82.8% to 99.3%) in symptomatic people (n = 3; 608 samples, 206 cases);

  • 63.2% (95% CI 21.7% to 91.4%) and 98.9% (95% CI 82.9% to 99.9%) in asymptomatic people (n = 2; 140 samples, 60 cases) (Table 4).

Restricting to IFU‐compliant evaluations, average sensitivities and specificities were:

  • 73.0% (95% CI 57.4% to 84.4%) and 99.8% (95% CI 99.4% to 99.9%) overall (n = 3; 1828 samples, 189 cases; Table 3);

  • 74.4% (95% CI 65.5% to 82.0%) and 98.9% (95% CI 97.2% to 99.7%) in symptomatic people (n = 1; 476 samples, 117 cases);

  • 85.1% (95% CI 71.7% to 93.8%) and 100% (95% CI 94.6% to 100%) in asymptomatic people (n = 1; 113 samples, 47 cases; Shrestha 2020; Table 4).

The addition of one evaluation that reported sensitivity only led to a decrease in overall average sensitivity of 5.6 percentage points (Mak 2020; Table 4).

Roche ‐ SARS‐CoV‐2

According to the manufacturer IFU, the Roche SARS‐CoV‐2 assay is available under a partnership with SD Biosensor.

There was a single evaluation of the Roche assay using 73 bronchoalveolar lavage or throat wash samples (not covered by the IFU) in participants with mixed symptom status (Figure 7); 42 of the 73 samples were RT‐PCR‐positive (Schildgen 2020 [A]).

Overall, using bronchoalveolar lavage or throat wash samples, the sensitivity and specificity were 88.1% (95% CI 74.4% to 96.0%) and 19.4% (95% CI 7.5% to 37.5%) (73 samples, 42 cases; Table 3). Only the results for the subgroup of 50 throat wash samples could be separated by symptom status:

  • in symptomatic participants, sensitivity was 100% (95% CI 69.2% to 100%) and specificity was 7.7% (95% CI 0.2% to 36.0%) with 23 throat wash samples and 10 cases;

  • in asymptomatic participants, sensitivity was 84.6% (95% CI 54.6% to 98.1%) and specificity was 14.3% (95% CI 1.8% to 42.8%) (27 throat wash samples, 13 cases; Table 4).

Savant Biotech ‐ Huaketai SARS‐CoV‐2 N Protein

We identified a single evaluation of the Huaketai assay in 109 symptomatic participants, using combined naso‐ or oropharyngeal swabs in viral transport medium (Weitzel 2020 [C]; Figure 7). We could not obtain the manufacturer IFU.

Sensitivity was 16.7% (95% CI 9.2% to 26.8%) and specificity was 100% (95% CI 88.8% to 100%; 109 samples, 78 cases; Table 3; Table 4).

SD Biosensor ‐ STANDARD F COVID‐19 Ag

We identified four evaluations of the STANDARD F assay; these reported data for 1552 samples, with 295 confirmed SARS‐CoV‐2‐positive cases (FIND 2020d (BR); FIND 2020d (DE); Liotti 2020; Porte 2020b [B]; Figure 6). Three evaluations included all or mainly symptomatic participants from community‐based COVID‐19 test centres and one was a laboratory‐based study that did not provide details regarding symptom status.

All evaluations tested nasopharyngeal or combined naso‐ or oropharyngeal samples; however, only two complied with the manufacturer IFU. Reasons for non‐compliance were the use of viral transport medium or a lack of information concerning viral transport medium.

The average sensitivity and specificity of the STANDARD F COVID‐19 Ag assay were:

  • 72.6% (95% CI 54.0% to 85.7%) and 97.5% (95% CI 96.4% to 98.2%) overall (n = 4; 1552 samples, 295 cases; Table 3);

  • 78.0% (95% CI 71.6% to 83.3%) and 97.2% (95% CI 96.0% to 98.1%) in symptomatic people (n = 3; 1193 samples, 191 cases; Table 4).

No data for asymptomatic people were available.

Restricting to IFU‐compliant evaluations, average sensitivity and specificity were:

  • 75.5% (95% CI 68.2% to 81.5%) and 97.2% (95% CI 96.0% to 98.1%), both studies in symptomatic people (n = 2; 1129 samples, 159 cases; Table 4).

SD Biosensor ‐ STANDARD Q COVID‐19 Ag

We identified six evaluations of the STANDARD Q assay; these reported data for 3480 samples, with 821 confirmed SARS‐CoV‐2‐positive cases (Figure 6). Four evaluations included participants from community‐based COVID‐19 test centres, one was a laboratory‐based study, and one included multiple settings. Four evaluations included symptomatic or mainly symptomatic participants, and two included mixed symptomatic and asymptomatic participants.

All evaluations tested nasopharyngeal or combined naso‐ or oropharyngeal samples; four were compliant with the manufacturer’s IFU, while the other two used samples in viral transport medium.

The average sensitivity and specificity of the STANDARD Q COVID‐19 Ag assay were:

  • 79.3% (95% CI 69.6% to 86.6%) and 98.5% (95% CI 97.9% to 98.9%) overall (n = 6; 3480 samples, 821 cases; Table 3);

  • 80.1% (95% CI 68.5% to 88.1%) and 98.1% (95% CI 97.4% to 98.6%) in symptomatic people (n = 5; 2760 samples, 731 cases); and

  • 61.1% (95% CI 37.9% to 80.2%) and 99.6% (95% CI 97.3% to 99.9%) in asymptomatic people (n = 2; 272 samples, 18 cases; Table 4).

Restricting to IFU‐compliant evaluations, average sensitivities and specificities were:

  • 85.8% (95% CI 80.5% to 89.8%) and 99.2% (95% CI 98.2% to 99.6%) overall (n = 4; 2522 samples, 421 cases; Table 3);

  • 88.1% (95% CI 84.2% to 91.1%) and 99.1% (95% CI 97.8% to 99.6%) in symptomatic people (n = 3; 1947 samples, 336 cases); and

  • 69.2% (95% CI 38.6% to 90.9%) and 99.1% (95% CI 95.2% to 100%) in asymptomatic people (n = 1; 127 samples, 13 cases; Table 4).

Shenzhen Bioeasy Biotech ‐ 2019‐nCoV Ag

We included three evaluations of the Bioeasy FIA; these included 965 samples with 177 SARS‐CoV‐2‐positive cases (Kruger 2020(a); Porte 2020a; Weitzel 2020 [D]; Figure 6). Studies were conducted in hospital emergency departments (n = 2) or a community COVID‐19 test centre (n = 1). Participants in the studies were all or mainly symptomatic.

Two evaluations used combined naso‐ or oropharyngeal swabs and one tested either nasopharyngeal or oropharyngeal swabs. Two evaluations used swabs in viral transport medium, which was not documented as suitable for use in the manufacturer IFU.

The average sensitivity and specificity of the Shenzhen Bioeasy assay were:

  • 86.2% (95% CI 72.4% to 93.7%) and 93.8% (95% CI 91.9% to 95.3%) overall (all symptomatic; n = 3; 965 samples, 177 cases; Table 3; Table 4).

The single IFU‐compliant evaluation (Kruger 2020(a)) reported sensitivity of 66.7% (95% CI 38.4% to 88.2%) and specificity of 93.1% (95% CI 91.0% to 94.9%; 727 samples, 15 cases).

We also included an additional study that reported the development of this assay but we did not pool data with the other evaluations as it was a development and not a validation study (Diao 2020; Figure 7). Sensitivity was 67.8% (95% CI 61.0% to 74.1%) and specificity was 100% (95% CI 88.8% to 100%; 239 samples, 208 cases).

Direct test comparisons

Three studies reported direct comparisons of different antigen assays in naso‐ or oropharyngeal samples; however none of the studies had any assay comparisons in common. All three studies utilised swabs in viral transport medium and all were conducted in symptomatic participants. We cannot derive any clear conclusions about comparative performance of tests from these studies.

Figure 8 shows variable diagnostic performance between, and to some extent within, studies. Four of the five assays in Fourati 2020 [A] demonstrated sensitivities in the range of 55% to 62% (SD Biosensor STANDARD Q, Abbott Panbio Covid‐19 Ag, Biosynex COVID‐19 Ag, AAZ – COVID‐VIRO), with one outlier (Coris Bioconcept – Covid‐19 Ag) at 35% (maximum of 297 cases). Specificity was 100% for all assays apart from SD Biosensor STANDARD Q (specificity 93%; 337 pre‐pandemic samples).

In Porte 2020b [A] (32 cases), both assays (SD Biosensor STANDARD F and Quidel Sofia SARS Antigen) had sensitivities over 90%, with specificities of 97% (32 non‐COVID‐19 samples).

Weitzel 2020 [A] observed a range in assay sensitivities from 0% for the Liming Bio‐Products assay (based on only nine cases), to 17% (for Savant Biotech – Huaketai SARS‐CoV‐2 N), 62% (RapiGEN – BIOCREDIT COVID‐19 Ag) and 85% for Shenzhen Bioeasy Biotech – 2019 nCov Ag (78 to 80 cases for the latter three assays). Specificities were 100% for all assays (based on 30 to 31 samples) apart from the one from Liming Bio‐Products (specificity 90% based on 10 samples).

Accuracy of rapid molecular tests overall and by subgroup

Average sensitivity and specificity for the 29 rapid molecular test evaluations that included samples with and without SARS‐CoV‐2 were 95.1% (95% CI 90.5% to 97.6%) and 98.8% (95% CI 98.3% to 99.2%; 4351 samples, 1781 with confirmed SARS‐CoV‐2; Table 5). Adding the three 'cases only' studies made little difference to the average sensitivity (95.5%, 95% CI 91.5% to 97.7%; 1973 cases).

Figure 9 demonstrates heterogeneity in sensitivity estimates (ranging from 57% to 100%), with consistently high specificities (92% to 100%, but with upper limits of 95% CIs of 99% or 100% in every study).

Subgroup analyses by viral load

We extracted sensitivity data according to viral load from 10 evaluations of molecular tests, six of which reported data at a Ct threshold for higher viral load of 30 or less (Jokela 2020; Lieberman 2020; Mitchell 2020; Smithgall 2020 [A]; Smithgall 2020 [B]; Wolters 2020), four using Xpert Xpress and two using ID NOW (Appendix 16).

All sensitivity estimates for the higher viral load subgroups were 100% (based on 204 samples with confirmed SARS‐CoV‐2), with a 95% CI for the average of 98.2% to 100%. For the lower viral load group, average sensitivity was 95.6% (95% CI 55.7% to 99.7%) (149 samples with confirmed SARS‐CoV‐2; Table 5).

We observed a similar pattern for the studies using alternative Ct thresholds to define higher and lower viral load (Appendix 17).

Subgroup analysis by study design

We did not observe any clear differences in average sensitivity or specificity when studies were separated by study design (2899 samples and 976 cases in 18 single‐group studies and 1265 samples and 718 cases in nine two‐group studies; Table 5; Appendix 17). Average sensitivity was higher in two‐group studies (97.2%, 95% CI 90.7% to 99.2%) compared to single‐group studies (93.2%, 95% CI 85.5% to 97.0%); a difference of 4.0 percentage points (95% CI from 2.2 percentage points lower to 10.1 higher). Average specificities had almost identical point estimates, at 99.4% (95% CI 98.4% to 99.8%) and 99.3% (95% CI 96.5% to 99.8%) respectively (Table 5).

Abbott – ID NOW

Thirteen studies evaluated the ID NOW assay, with 1949 samples and 730 confirmed SARS‐CoV‐2 cases; one study included only SARS‐CoV‐2‐positive cases (n = 36; Figure 10). Seven evaluations were laboratory‐based, three recruited participants from emergency department settings and three were conducted in multiple settings. Seven studies included only symptomatic participants, two included both symptomatic and asymptomatic people, and four did not report symptom status.

Eleven evaluations used nasopharyngeal or nasal swab samples, one was conducted using saliva samples and one did not specify the sample type. Only four evaluations were compliant with manufacturer IFUs; lack of compliance was based on the use of viral transport medium, sample type, and interval between sample collection and testing.

Pooled analyses demonstrated average sensitivity and specificity of:

  • 78.6% (95% CI 73.7% to 82.8%) and 99.8% (95% CI 99.2% to 99.9%) overall (n = 12; 1853 samples, 634 cases); and

  • 73.0% (95% CI 66.8% to 78.4%) and 99.7% (95% CI 98.7% to 99.9%), restricted to evaluations that were compliant with the manufacturer’s IFU (n = 4; 812 samples, 222 cases; Table 5).

Average sensitivity increased to 81.5% (95% CI 75.2% to 86.5%) with the addition of the cases‐only study (730 cases; Rhoads 2020).

Cepheid Inc – Xpert Xpress

The Xpert Xpress assay was evaluated in 15 studies using respiratory specimens, with 1781 samples and 1001 confirmed SARS‐CoV‐2 cases; two of the studies included only SARS‐CoV‐2‐positive cases (n = 90; Figure 10). Thirteen evaluations were laboratory‐based, one recruited participants from emergency department settings and one included samples from hospital inpatients. Three studies included only symptomatic participants, one included both symptomatic and asymptomatic people (mixed symptom status), and 11 did not report symptom status.

Fourteen evaluations used nasopharyngeal, oropharyngeal or nasal swab samples, and one was conducted using throat saliva or lower respiratory samples. Only three evaluations were compliant with manufacturer IFUs. Lack of compliance with the IFU was because of the use of frozen samples (n = 8), sample type (n = 1), or concerns about the interval between sample collection and testing (n = 3).

Pooled analyses demonstrated average sensitivity and specificity of:

  • 99.1% (95% CI 97.7% to 99.7%) and 97.9% (95% CI 94.6% to 99.2%) overall (n = 13; 1691 samples, 911 with confirmed SARS‐CoV‐2);

  • 100% (95% CI 88.1% to 100%) and 97.2% (95% CI 89.4% to 99.3%), restricted to evaluations that were compliant with the manufacturer’s IFU (n = 2; 100 samples, 29 cases; Table 5).

Average sensitivity did not change with addition of two cases‐only studies (99.1%, 95% CI 97.8% to 99.6%; n = 15; 1001 cases; Broder 2020; Chen 2020a).

One additional study considered accuracy in non‐respiratory samples using Xpert Xpress (Szymczak 2020). Sensitivity in stool samples obtained up to 33 days after symptom onset was 93.1% (95% CI 77.2% to 99.1%) and specificity was 96.0% (95% CI 86.3% to 99.5%; 79 samples, 29 cases).

Comparison of ID NOW with Xpert Xpress

Comparing the overall pooled results between ID NOW and Xpert Xpress, the average sensitivity of Xpert Xpress was 19.8 (95% CI 14.9 to 24.7) percentage points higher than that of ID NOW (P < 0.0001; Table 5).

The average specificity of Xpert Xpress was marginally lower than that of ID NOW, a difference of −1.9 percentage points (95% CI −3.8 to −0.1).

DNAnudge – COVID Nudge

We included one evaluation of COVID Nudge with a total of 386 participants and 71 SARS‐CoV‐2‐positive cases (Gibani 2020; Figure 10). Participants were recruited from multiple settings including hospital inpatients (n = 88), accident and emergency (n = 15) and healthcare workers and their families (n = 280). All participants were symptomatic and direct testing of nasopharyngeal samples was used (within manufacturer IFU).

The sensitivity of the COVID Nudge assay was 94.4% (95% CI 86.2% to 98.4%) and specificity was 100% (95% CI 98.8% to 100%; 386 samples and 71 cases; Table 5).

Diagnostics for the Real World (DRW) – SAMBA II

We included two evaluations of SAMBA II with 321 samples (121 with confirmed SARS‐CoV‐2; Figure 10). All participants were symptomatic. One study conducted direct testing of combined naso‐ or oropharyngeal samples from hospital inpatients and the other obtained combined naso‐ or oropharyngeal samples in viral transport medium from Public Health England (PHE). It was not reported whether the PHE samples were stored or frozen prior to testing, so we could not determine whether they complied with the IFU for the assay.

The average sensitivity and specificity of SAMBA II were 96.0% (95% CI 81.1% to 99.3%) and 97.0% (95% CI 93.5% to 98.6%; 2 studies; 321 samples, 121 with confirmed SARS‐CoV‐2; Table 5).

In the IFU‐compliant evaluation, sensitivity was 87.9% (95% CI 71.8% to 96.6%) and specificity was 97.4% (95% CI 92.6% to 99.5%; 149 samples, 33 cases; Collier 2020; Table 5).

Mesa Biotech – Accula

We included one evaluation of the Accula assay with a total of 100 samples (50 SARS‐CoV‐2 positive; Hogan 2020; Figure 10). The study was laboratory‐based and symptom status was not reported.

The study used nasopharyngeal samples in viral transport medium or saline, therefore the evaluation was not compliant with IFU requirements.

The sensitivity and specificity of the Accula test were 68.0% (95% CI 53.3% to 80.5%) and 100% (95% CI 92.9% to 100%; 100 samples, 50 cases; Table 5).

Sensitivity analysis of the impact of discrepant analysis

Six evaluations of molecular tests (in 1533 samples) reported results before and after discrepant analysis where selected samples were re‐tested with either the same (Collier 2020; Harrington 2020; Moran 2020; Stevens 2020), or an alternative RT‐PCR assay (Assennato 2020; Loeffelholz 2020). Four studies also reported re‐testing of samples with the index test (Assennato 2020; Collier 2020; Harrington 2020; Moran 2020; Appendix 16; Appendix 17).

Discrepant analysis reduces the number of samples classified as false negative or false positive errors. In these studies, discrepant analysis reduced the false negative proportion (1 − sensitivity) from 2.1% to 0.8% and the false positive proportion (1 − specificity) from 2.2% to 0.4%. Three of the five studies reporting initially false positive results reported zero false positives after sample re‐testing, and one reported a drop in false positives from 11 to 3 (Loeffelholz 2020; Appendix 16). Three of the four studies that re‐tested initially false negative results reported reclassification as true negative on re‐testing; in the other, the single false negative remained a false negative. Given the bias inherent in choosing the reference test dependent on the observed results, we caution against relying on these findings.

An additional study tested all samples with two different RT‐PCR assays, and hence used a more accurate reference standard in all samples, not just samples with discrepant results (Moore 2020). Six initial true negatives were reclassified as false negatives after the second RT‐PCR. Had discrepant analysis been undertaken, these misclassifications would have been missed, further underlining the methodological flaws inherent to discrepant analysis.
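The direction of this bias can be shown with a small numerical sketch (all numbers are invented for illustration and are not taken from any included study): because only discordant samples are re‐tested, reclassification can only move results towards agreement with the index test, so both accuracy estimates can only rise.

```python
def accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity (%) from a 2x2 table, rounded to 1 dp."""
    sensitivity = 100 * tp / (tp + fn)
    specificity = 100 * tn / (tn + fp)
    return round(sensitivity, 1), round(specificity, 1)

# Hypothetical 2x2 table before discrepant analysis (invented numbers)
tp, fp, fn, tn = 95, 10, 15, 880
print(accuracy(tp, fp, fn, tn))  # (86.4, 98.9)

# Discrepant analysis: only the 25 discordant samples are re-tested with
# the reference test. Suppose re-testing agrees with the index result for
# 8 of the false positives (now counted as true positives) and 9 of the
# false negatives (now counted as true negatives).
tp, fp, fn, tn = tp + 8, fp - 8, fn - 9, tn + 9
print(accuracy(tp, fp, fn, tn))  # (94.5, 99.8)
```

Concordant errors are never re‐checked, so the re‐testing step can only improve the apparent sensitivity and specificity, which is why results after discrepant analysis should be treated with caution.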

Other sources of heterogeneity

We also planned to evaluate the effect of sample type and reference standard.

For sample type, the use of variable combinations of sample types, with or without viral transport media, created numerous sparse subgroups (Appendix 18). Instead, we considered study compliance with manufacturer IFU requirements, which is a more pragmatic classification.

All studies used RT‐PCR alone as the reference standard for diagnosing SARS‐CoV‐2 infection.

Publication bias

We did not formally test for publication bias in the pattern of results, but we did note that the identities of tests not meeting the PHE assessment criteria were not reported because of confidentiality agreements (PHE 2020(a)).

Discussion

This is the second iteration of a Cochrane living review summarising the accuracy of point‐of‐care antigen and molecular tests for detecting current SARS‐CoV‐2 infection. This version of the review is based on published journal articles or studies available as preprints from 1 January 2020 up until 30 September 2020. In addition, we included evaluations of antigen assays that were available as independent national reference laboratory publications or that were co‐ordinated and published by FIND, and journal articles that were listed on the Diagnostics Global Health website to 16 November 2020.

Summary of main results

We included data from 77 studies using respiratory specimens, including 24,418 samples (7484 samples with confirmed SARS‐CoV‐2), and one study of faecal specimens (79 samples, 29 with confirmed SARS‐CoV‐2). Forty‐eight studies (reporting 58 test evaluations) considered antigen tests; 30 studies (reporting 33 test evaluations) considered rapid molecular tests, including the single study (evaluation) in faecal samples. Key findings are presented in the summary of findings Table 1.

We summarise seven key findings from this review:

1. Despite a considerable increase in the number of studies evaluating point‐of‐care tests, particularly antigen tests, there are still no published or preprint reports of accuracy for a significant number of commercially produced point‐of‐care tests. This review located evaluations for 16 antigen tests (three of which we could not identify as available for purchase) and five molecular assays. These represent a small proportion of assays currently on the market (118 commercialised antigen tests and 53 molecular assays).

2. The new studies have more robust and appropriate study designs than those in the first version of this review, particularly for antigen tests, where there are now studies recruiting participants from community‐based COVID‐19 testing clinics. Reporting of key details, such as setting and symptom status, has improved, and studies are now evaluating direct swab testing as would occur in a point‐of‐care setting. However, concerns about risk of bias and applicability of results remain, and further improvements in study methods and reporting are needed before strong conclusions can be drawn about the accuracy of many antigen and molecular tests reviewed here. As it is not known whether these limitations will lead to over‐ or underestimates of test accuracy, estimates should be interpreted cautiously in the context of their methodological limitations and the settings in which they were conducted. More direct comparisons of test brands are needed, with evaluations undertaken in the intended use settings for these tests.

Particular methodological concerns include the use of deliberate sampling according to known presence or absence of SARS‐CoV‐2 infection; use of anonymised samples submitted to laboratories for routine RT‐PCR testing (with no setting or participant details); and no information on symptoms or time from symptom onset. Differences in case‐mix related to symptomatic status, time post‐symptom onset and distribution of viral load are likely to have contributed to the observed variation in accuracy.

RT‐PCR was the reference standard in all studies; no study defined the presence of COVID‐19 using clinical or radiological features in participants with a negative RT‐PCR result.

3. Studies frequently did not follow the manufacturer’s instructions or did not use the test at the point of care. Fewer than half conducted the tests according to the manufacturers' IFU (41% (37/91); 29/58 antigen test evaluations and 8/33 molecular test evaluations). Reasons for non‐compliance included use of frozen samples, use of viral transport media, or lengthy intervals between sample collection and testing. Almost a third of studies (23/78) undertook on‐site, direct swab testing immediately or within an hour of sample collection, and trained laboratory staff conducted tests in 16 (21%) studies. In 31 (40%) studies the test operator and setting for the test procedure were not clearly described, but we inferred that tests were carried out in a centralised laboratory setting, for example based on reported delays between collection and testing or reported use of archived or frozen samples.

4. For antigen test evaluations in symptomatic participants, we observed considerable heterogeneity in sensitivities (and to a lesser extent in specificities). Whilst the average sensitivity was 72.0% (95% CI 63.7% to 79.0%) and specificity was 99.5% (95% CI 98.5% to 99.8%), average sensitivity decreased with time since onset of symptoms, being higher in the first week (78.3%, 95% CI 71.1% to 84.1%) than when testing was done later (51.0%, 95% CI 40.8% to 61.0%). Sensitivity was high in those with higher viral loads, defined by Ct values ≤ 25 (94.5%, 95% CI 91.0% to 96.7%), compared to those with lower viral loads (40.7%, 95% CI 31.8% to 50.3%). Focusing on studies that used the test in accordance with the manufacturer’s instructions, sensitivities for different brands varied from 34% to 96% (either based on pooled results or single studies). WHO have set a minimum 'acceptable' sensitivity requirement of 80%, and acceptable and ideal (or 'desirable') specificity requirements of 97% and 99% respectively (WHO 2020c). Only one assay (SD Biosensor STANDARD Q) met the WHO acceptable criterion for sensitivity based on pooled results of several studies. One further test (BIONOTE NowCheck) also met the acceptable sensitivity criterion, but only one study evaluated it. Abbott Panbio met the sensitivity criterion in individual studies but not overall. The acceptable performance criterion of 97% specificity was met by all three tests, and two tests met the desirable criterion of more than 99% specificity (Abbott Panbio and SD Biosensor STANDARD Q).

Considerable heterogeneity in sensitivities remained after restricting analyses by test brand and symptom status, suggesting an effect not only of participant characteristics but also of setting, sample type and collection method, sample storage and preparation, and testing procedures, which cannot be easily unpicked. The PHE studies included in this review allow some consideration of the effect of test operator experience on the accuracy of the Innova test, although different samples were tested by each test operator, such that only an indirect comparison of sensitivity can be made. Sensitivity increased from 57.5% (95% CI 52.3% to 62.6%; 372 samples) when testing was conducted on‐site by trained non‐healthcare workers (PHE 2020(c) [non‐HCW tested]), to 70.0% (95% CI 63.5% to 75.9%; 223 samples) in samples tested on‐site by healthcare workers (PHE 2020(d) [HCW tested]), to 78.8% (95% CI 72.4% to 84.3%; 198 samples) for those tested by laboratory scientists (PHE 2020(d) [Lab tested]). The effect of test operator on accuracy has been observed for rapid diagnostic tests for other infectious diseases such as malaria (Boyce 2018; Landier 2018), and is worthy of further investigation for diagnosis of SARS‐CoV‐2.

5. Twelve studies evaluated the accuracy of antigen tests in asymptomatic people for detection of SARS‐CoV‐2 infection defined by PCR status. As discussed, this does not address the issue of whether the test is identifying those who are infectious (as there is no reference standard that can be used). The average sensitivity for detecting infection in asymptomatic participants was 58.1% (95% CI 40.2% to 74.1%) with specificity of 98.9% (95% CI 93.6% to 99.8%), both lower than in symptomatic people. Only half of the studies reported clearly defined asymptomatic cohorts (e.g. preventive screening in the general population (n = 1), in returning travellers (n = 1), or in contacts of confirmed cases (n = 4)); the other six reported asymptomatic subgroups from mixed symptom cohorts. Only one of the 12 studies provided data by viral load (Fenollar 2020(b)); 5% (1/22) of RT‐PCR‐positive samples had a Ct value of 25 or less, but 50% (11/22) had Ct values of 30 or less. No information on time after exposure to infection was reported.

6. For rapid molecular assays there were differences between test brands. Most data were for the ID NOW and Xpert Xpress assays; average sensitivity for ID NOW was 78.6% (95% CI 73.7% to 82.8%) and for Xpert Xpress 99.1% (95% CI 97.7% to 99.7%). Specificity for ID NOW was 99.8% (95% CI 99.2% to 99.9%) and for Xpert Xpress 97.9% (95% CI 94.6% to 99.2%). These differences are beyond those expected by chance (P < 0.0001).

We were not able to investigate the effects of symptomatic status or time from symptom onset: 12 of 29 evaluations were from symptomatic populations, three from 'mixed' symptomatic and asymptomatic populations (percentage from each group not reported), and the remaining 14 evaluations provided no information on symptom status (2/14 recruited from A&E and 12 were laboratory‐based). These and other methodological limitations in the studies mean that we do not know how the assays would perform in any specific clinical setting when used in people suspected of having SARS‐CoV‐2 infection on the basis of symptoms, or of exposure to a confirmed case in the absence of symptoms. It is likely, however, that some difference in sensitivity between ID NOW and Xpert Xpress would be maintained in the absence of bias. The difference in specificity between the tests is small (ID NOW being 1.9 percentage points more specific than Xpert Xpress), but potentially important, especially if used in a low‐prevalence setting. However, this difference in specificity would not be an issue should test‐positives be confirmed by a laboratory‐based RT‐PCR assay.

7. There are proposals for repeated use of antigen tests in different asymptomatic groups, such as school children and staff, hospital and care home workers, and even the general public, with a variety of different testing strategies. We found no data or studies evaluating the accuracy of any of these serial screening strategies.

We did not formally compare antigen with molecular assays because there were no head‐to‐head comparisons of the two test types. Instead, we illustrate predicted numbers of true positives, false positives, false negatives and true negatives, applying summary estimates of test accuracy to a hypothetical cohort of people suspected of SARS‐CoV‐2 infection across a range of prevalences of SARS‐CoV‐2 infection (summary of findings Table 1). For both antigen and molecular assays, we only use summary data from evaluations conducted in accordance with manufacturers’ IFUs, and for antigen tests we used separate results from symptomatic and asymptomatic participants.

Illustration of predicted effect of antigen testing by symptom status

For antigen test evaluations in symptomatic people, we selected three assays representing the range in observed average sensitivities: Coris Bioconcept COVID‐19 Ag Respi‐Strip (34.1%, 95% CI 29.7% to 38.8%), Abbott ‐ Panbio Covid‐19 Ag (75.1%, 95% CI 57.3% to 87.1%), and SD Biosensor ‐ STANDARD Q COVID‐19 Ag (88.1%, 95% CI 84.2% to 91.1%). Average specificities for the same three assays were 100% (95% CI 99.0% to 100%), 99.5% (95% CI 98.7% to 99.8%) and 99.1% (95% CI 97.8% to 99.6%) respectively. Applied to a cohort of 1000 people with signs and symptoms of COVID‐19, in whom 50 people had confirmed infection (prevalence of 5%), for the three assays above we predicted that:

  • 17, 43 or 53 people would have a positive test result, of which 0, 5 and 9 would be false positives (positive predictive values (PPV) 100%, 88.4% and 83.0%, respectively), and

  • 33, 12 and 6 people with negative test results would be falsely negative (negative predictive values (NPV) 96.6%, 98.7%, and 99.4%).

Increasing the prevalence to 10% or 20% increases PPV and decreases NPV. As there is considerable heterogeneity in the estimates of sensitivity, the values observed in practice could vary considerably from these figures, as shown by the estimates derived from the confidence intervals (summary of findings Table 1).
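The predicted counts in these illustrations follow from applying average sensitivity and specificity directly to the cohort arithmetic. A minimal sketch (the function name is ours; summary of findings Table 1 remains the authority, and rounding may differ slightly from the published figures):

```python
def predicted_counts(sensitivity, specificity, prevalence, cohort=1000):
    """Predicted TP, FP, FN, TN counts plus PPV and NPV (%) for a
    hypothetical cohort, given average test accuracy and prevalence."""
    cases = round(prevalence * cohort)
    non_cases = cohort - cases
    tp = round(sensitivity * cases)      # infected, test positive
    fn = cases - tp                      # infected, test negative
    tn = round(specificity * non_cases)  # uninfected, test negative
    fp = non_cases - tn                  # uninfected, test positive
    ppv = round(100 * tp / (tp + fp), 1) if tp + fp else 100.0
    npv = round(100 * tn / (tn + fn), 1)
    return tp, fp, fn, tn, ppv, npv

# SD Biosensor STANDARD Q in symptomatic people
# (average sensitivity 88.1%, specificity 99.1%) at 5% prevalence:
print(predicted_counts(0.881, 0.991, 0.05))  # (44, 9, 6, 941, 83.0, 99.4)
```

Re‐running the same function at a prevalence of 0.10 or 0.20, or with a cohort of 10,000 at 0.005, reproduces the pattern described in the text: PPV rises and NPV falls as prevalence increases.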

For antigen test evaluations in asymptomatic participants, there were considerably fewer data from IFU‐compliant evaluations. We selected the same three exemplars; average sensitivities for identification of any infection (whether infectious or not) were lower than for symptomatic populations: 28.6% (95% CI 8.4% to 58.1%) for the Coris Bioconcept assay; 48.9% (95% CI 35.1% to 62.9%) for the Abbott assay; and 69.2% (95% CI 38.6% to 90.9%) for the SD Biosensor assay. Average specificities for the same three assays were: 100% (95% CI 88.8% to 100%), 98.1% (95% CI 96.3% to 99.1%), and 99.1% (95% CI 95.2% to 100%).

Applying the average values to a larger cohort of 10,000 people asymptomatic for COVID‐19, at a lower prevalence of 0.5%, in whom 50 people had confirmed infection (infectious or not):

  • 14, 213 or 125 individuals would have a positive test result, of which 0, 189 and 90 would be false positives (PPVs of 100%, 11% and 28%, respectively), and

  • 36, 26 and 15 people with negative test results would be falsely negative (NPVs 99.6%, 99.7%, and 99.8%).

We derived the summary estimates used in these calculations from asymptomatic participants identified for testing in a number of scenarios, and they cannot be directly translated to a particular setting, such as mass screening. The confidence intervals for the average estimates used in these calculations are also extremely wide for both sensitivities and specificities, such that the numbers of false positives and false negatives observed in practice could differ substantially from these figures. Increasing the prevalence of confirmed SARS‐CoV‐2 infection to 1% or 2% makes little difference to the absolute number of false positive results for these assays, but has a large relative effect when considered in relation to the number of positive test results (PPVs for the Abbott and SD Biosensor assays increasing to 40% and 61% at 2% prevalence).

Illustration of predicted effect of rapid molecular tests for symptomatic testing

For molecular assays, data from IFU‐compliant evaluations were available for four of the five assays: ID NOW (Abbott Laboratories), Xpert Xpress (Cepheid Inc), SAMBA II (Diagnostics for the Real World) and COVID Nudge (DNAnudge). Average sensitivities were derived as 73.0% (95% CI 66.8% to 78.4%), 100% (95% CI 88.1% to 100%), 87.9% (95% CI 71.8% to 96.6%) and 94.4% (95% CI 86.2% to 98.4%). Average specificities were 99.7% (95% CI 98.7% to 99.9%), 97.2% (95% CI 89.4% to 99.3%), 97.4% (95% CI 92.6% to 99.5%) and 100% (95% CI 98.8% to 100%), respectively (summary of findings Table 1).

Data by symptom status for these assays were very limited, therefore we assumed that the intended use is most likely to be for diagnosis of acute infection in symptomatic individuals and have applied the average estimates of accuracy to a hypothetical cohort of 1000 people, at prevalences of 5%, 10% and 20% (summary of findings Table 1). If 50 of 1000 people had confirmed infection (5% prevalence):

  • 40, 77, 69 and 47 individuals would have a positive test result, of which 3, 27, 25 or 0 would be false positive (PPVs of 93.0%, 64.9%, 63.8%, and 100% respectively).

  • 14, 0, 6 and 3 people with negative test results would be falsely negative (NPVs 98.6%, 100%, 99.4% and 99.7%).

Increasing the prevalence of confirmed SARS‐CoV‐2 infection to 10% or 20% has a large relative effect when considered in relation to the number of positive test results for both Xpert Xpress and SAMBA II (PPVs were 64.9% and 63.8% at 5% prevalence compared to 90.1% and 89.3% at 20% prevalence). Less variation in PPV was observed for ID NOW and COVID Nudge because of their higher observed specificities. The NPV of the molecular assays is not affected to the same degree by these prevalence changes because of their relatively high sensitivities and the relatively low‐prevalence scenarios being considered.

Across all exemplar assays in the summary of findings Table 1, we observed the widest variation in NPV for the Coris Bioconcept antigen assay in symptomatic participants (86% to 97%), demonstrating that even in a low‐prevalence setting, tests with poor sensitivity can have a considerable impact on the level of confidence that can be had in a negative test result.

Strengths and weaknesses of the review

Our review used a broad search, screening all articles concerning COVID‐19 or SARS‐CoV‐2. We undertook all screening and eligibility assessments, QUADAS‐2 assessments (Whiting 2011), and data extraction of study findings independently and in duplicate. Although it is possible that the use of artificial intelligence text analysis to identify studies most relevant to diagnostic questions may have led to some eligible studies being missed, we believe that the multi‐stranded search strategy used will have identified most, if not all, relevant literature. Whilst we have reasonable confidence in the completeness and accuracy of the findings up until the search date, should errors be noted, please inform us at [email protected] so that we can verify and correct them in our next update.

We undertook a careful assessment of sample preparation and biosafety requirements as well as time to test result, to ensure that included tests were suitable for use at the point of care. The application of these index test criteria led to the exclusion of 39 of the 85 studies that we excluded on the basis of the index tests evaluated. Evaluations of alternative laboratory‐based molecular technologies are under consideration for inclusion in another review in our series of Cochrane COVID‐19 diagnostic test accuracy reviews. Furthermore, for this iteration of the review, we explicitly considered whether the test evaluations were conducted in accordance with the manufacturer IFU, regarding the sample types used, the use of viral transport medium and the permitted time between sample collection and testing.

We did not consider any manufacturer statements on the intended use of the tests by population, but we are aware that some IFUs recommend testing only in symptomatic people and within certain time frames after symptom onset (e.g. the Innova assay). Where possible, however, we did provide data separately for symptomatic and asymptomatic participants and identified clear trends towards lower sensitivities in asymptomatic individuals for detection of infection. We were unable to assess the accuracy of antigen tests for identification of infectious individuals, as there is no established reference standard for infectiousness (and it seems unlikely that one will ever be established). We have presented results by Ct value where it was reported by the individual studies. We recognise the limitations of this approach, and given the extent to which RT‐PCR Ct values vary between assays (Vogels 2020), and between laboratories, we strongly caution against the direct application of our results in high and low Ct value subgroups to any particular clinical context. There is no 'step change' in 'infectiousness' according to any fixed Ct value; increasing numbers of studies demonstrate successful viral culture in individuals considered to have 'low' viral load (Jaafar 2020; Singanayagam 2020), and, more importantly, that transmission of infection does occur from index cases with low viral loads (high RT‐PCR Ct values) (Lee 2021; Marks 2021). Ultimately, viral load on its own is only one factor influencing an individual's ability to transmit infection, 'infectiousness' being modified by host factors such as the health of an individual’s immune system or presence of comorbidities, and environmental risk factors including closeness and length of contact with others.

Weaknesses of the review primarily reflect the weaknesses in the primary studies and their reporting. Although study quality improved in comparison to the first iteration of this review, many studies continue to omit descriptions of participants and key aspects of study design and execution. In order to include data for all tests in pooled analyses we had to include some samples multiple times. We have been explicit about these issues where they arose. It is possible that eligible studies have been missed by our search strategy; however, we believe the risk to be very low given our broad approach to identification of literature. Despite our best efforts to be as comprehensive as possible, new evaluations are continuously becoming available and it is impossible for any published and peer‐reviewed systematic review to be fully up to date.

Around a quarter (18/78) of the studies we have included are currently only available as preprints and, as yet, have not undergone peer review. As published versions of these studies are identified in the future, we will double‐check study descriptions, methods and findings, and update the review as required.

Applicability of findings to the review question

There are an increasing number of roles and testing strategies for which antigen and rapid molecular assays are considered, and it is likely that the performance of these tests needs to be considered separately for each of the use cases.

Our review shows that antigen tests do not appear to perform as well in asymptomatic populations as in symptomatic populations for detecting infection. The amount of data available for asymptomatic populations is less than that for symptomatic populations, and is also based on asymptomatic individuals tested in a range of scenarios, from preventive or targeted screening, to contact tracing or testing at dedicated COVID‐19 test centres, which may explain some of the observed variability. It is also not clear whether individuals in these studies were truly cases of asymptomatic infection as opposed to pre‐ or post‐symptomatic, or were even mildly symptomatic and mislabelled as asymptomatic. Incomplete symptom assessment and lack of adequate follow‐up to identify subsequent development of symptoms or previous history of symptoms can all contribute to inappropriate classification of individuals as having asymptomatic infection (Meyerowitz 2020). As the studies in our review did not systematically attempt to identify pre‐ or post‐symptomatic individuals, it may be more appropriate to consider the estimates of test accuracy for asymptomatic populations as primarily representing accuracy in those without clearly defined symptoms at the time of testing.

We are aware that several important studies in asymptomatic individuals have been reported since the close of our search. In mass screening in Liverpool, Innova was positive in 28 of 70 PCR‐detected cases (sensitivity for infection 40.0%, 95% CI 28.5% to 52.4%) and in 26 of 39 with Ct values less than 25 (sensitivity 66.7%, 95% CI 49.8% to 80.9%). Screening of University of Birmingham students found 2 of 7185 students positive with Innova, with an estimated sensitivity of 3.2% (95% CI 0.6% to 15.6%) for detecting any infection, 9.1% (95% CI 1.0% to 49.1%) for Ct values less than 30, and 100% (95% CI 15.8% to 100%) for Ct less than 25 (Ferguson 2020). BinaxNOW (which uses the same test strip as Panbio) has been tested in asymptomatic groups: in San Francisco the test detected 7 of 11 PCR‐positive cases (sensitivity 63.6%, 95% CI 30.8% to 89.1%), and 6 of 6 with Ct values less than 30 (100%, 95% CI 54.1% to 100%; Pilarowski 2021); in a drive‐through centre in Massachusetts it detected the virus in 70 of 107 adults (sensitivity 65.4%, 95% CI 55.6% to 74.4%) and 40 of 57 children (70.2%, 95% CI 56.6% to 81.6%); no breakdown by viral load is available (Pollock 2020). The specificity of the tests in all studies has remained high (above 99%). This selection of results is not based on a systematic search (this will occur in the next update), but these results suggest that emerging evidence is illustrating a range of sensitivity values for the ability of the tests to detect infection, with high detection rates only in groups with very high viral loads.

Given the superior test performance characteristics for symptomatic populations in the first week of symptoms and in those with higher viral loads, the observed poorer performance in those without symptoms is perhaps not surprising. Evidence suggests that higher viral loads are observed in the first week of illness, beginning two days prior to the development of symptoms (Cevik 2021). Viral load patterns in asymptomatic people are less clear, but similarly high titres of SARS‐CoV‐2 have been observed at the onset of infection, with a suggestion of faster clearance (Cevik 2021). However, variation in viral trajectories means that even if an asymptomatic person can identify a clear contact with a confirmed case of SARS‐CoV‐2 infection, it is not possible to pinpoint when (or even if) that individual will have a sufficient viral load to be detected on antigen testing. A serial testing policy would be likely to identify at least some infected asymptomatic contacts, but comes at the cost of increased numbers of false positives, especially in low‐prevalence settings. There were no evaluations of serial testing in any of the studies.

For molecular tests, we observed a lack of studies undertaken in intended use settings, with most data being from laboratory testing. Although more evidence is available for accuracy in symptomatic people, applicability issues regarding the way in which the tests are carried out and in how cases of SARS‐CoV‐2 infection are defined remain, and it is not yet possible to determine how tests will perform in practice.

We recommend caution in applying the results outside of the individual study (or closely related) contexts and use case scenarios.

Figure 1. Study flow diagram

Figure 2. Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies. Numbers in the bars indicate the number of studies

Forest plot of studies evaluating antigen tests. BR: Brazil; CH: Switzerland; DE: Germany; HCW: healthcare worker; Lab: laboratory

Figures and Tables -
Figure 3

Forest plot of studies evaluating antigen tests. BR: Brazil; CH: Switzerland; DE: Germany; HCW: healthcare worker; Lab: laboratory

Forest plot of data for antigen tests according to symptom status. A&E: accident and emergency; BR: Brazil; CH: Switzerland; DE: Germany; HCW: healthcare worker; Lab: laboratory

Figures and Tables -
Figure 4

Forest plot of data for antigen tests according to symptom status. A&E: accident and emergency; BR: Brazil; CH: Switzerland; DE: Germany; HCW: healthcare worker; Lab: laboratory

Forest plot of antigen test evaluations by week post symptom onset (pso). A&E: accident and emergency; Ag: antigen; BR: Brazil; CH: Switzerland; DE: Germany

Figures and Tables -
Figure 5

Forest plot of antigen test evaluations by week post symptom onset (pso). A&E: accident and emergency; Ag: antigen; BR: Brazil; CH: Switzerland; DE: Germany

Forest plot by test brand for assays with ≥ 3 evaluations. BR: Brazil; CGIA: colloidal‐gold immunoassay; CH: Switzerland; DE: Germany; FIA: fluorescent immunoassay; HCW: healthcare worker; IFU: instructions for use; Lab: laboratory; LFA: lateral flow assay

Figures and Tables -
Figure 6

Forest plot by test brand for assays with ≥ 3 evaluations. BR: Brazil; CGIA: colloidal‐gold immunoassay; CH: Switzerland; DE: Germany; FIA: fluorescent immunoassay; HCW: healthcare worker; IFU: instructions for use; Lab: laboratory; LFA: lateral flow assay

Forest plot by test brand for assays with < 3 evaluations; CGIA: colloidal‐gold immunoassay; FIA: fluorescent immunoassay; IFU: instructions for use; LFA: lateral flow assay

Figures and Tables -
Figure 7

Forest plot by test brand for assays with < 3 evaluations; CGIA: colloidal‐gold immunoassay; FIA: fluorescent immunoassay; IFU: instructions for use; LFA: lateral flow assay

Forest plot of studies reporting comparative data. CGIA: colloidal‐gold immunoassay; FIA: fluorescent immunoassay; LFA: lateral flow assay; nos: not otherwise specified

Figures and Tables -
Figure 8

Forest plot of studies reporting comparative data. CGIA: colloidal‐gold immunoassay; FIA: fluorescent immunoassay; LFA: lateral flow assay; nos: not otherwise specified

Forest plot of studies evaluating rapid molecular tests. A&E: accident and emergency

Figures and Tables -
Figure 9

Forest plot of studies evaluating rapid molecular tests. A&E: accident and emergency

Forest plot by test brand for molecular assays. A&E: accident and emergency; IFU: instructions for use

Figures and Tables -
Figure 10

Forest plot by test brand for molecular assays. A&E: accident and emergency; IFU: instructions for use

Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Figures and Tables -
Figure 11

Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Figures and Tables -
Figure 12

Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study

Figures and Tables -
Figure 13

Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study

Forest plot of antigen test evaluations by study design. BR: Brazil; CH: Switzerland; DE: Germany; HCW: healthcare worker

Figures and Tables -
Figure 14

Forest plot of antigen test evaluations by study design. BR: Brazil; CH: Switzerland; DE: Germany; HCW: healthcare worker

Forest plot of studies evaluating antigen tests: higher versus lower viral load (< or > 25 Ct). BR: Brazil; CH: Switzerland; Ct: cycle threshold; DE: Germany; HCW: healthcare worker

Figures and Tables -
Figure 15

Forest plot of studies evaluating antigen tests: higher versus lower viral load (< or > 25 Ct). BR: Brazil; CH: Switzerland; Ct: cycle threshold; DE: Germany; HCW: healthcare worker

Forest plot of studies evaluating antigen tests: higher versus lower viral load (< or > 32/33 Ct threshold). BR: Brazil; CH: Switzerland; ; Ct: cycle threshold; DE: Germany

Figures and Tables -
Figure 16

Forest plot of studies evaluating antigen tests: higher versus lower viral load (< or > 32/33 Ct threshold). BR: Brazil; CH: Switzerland; ; Ct: cycle threshold; DE: Germany

Forest plot of studies evaluating antigen tests: higher versus lower viral load (other Ct thresholds). Ct: cycle threshold; HCW: healthcare worker

Figures and Tables -
Figure 17

Forest plot of studies evaluating antigen tests: higher versus lower viral load (other Ct thresholds). Ct: cycle threshold; HCW: healthcare worker

Forest plot of molecular test evaluations by study design

Figures and Tables -
Figure 18

Forest plot of molecular test evaluations by study design

Forest plot of studies evaluating rapid molecular tests: high versus low viral load (30 Ct threshold). Ct: cycle threshold

Figures and Tables -
Figure 19

Forest plot of studies evaluating rapid molecular tests: high versus low viral load (30 Ct threshold). Ct: cycle threshold

Forest plot of studies evaluating rapid molecular tests: high versus low viral load (other Ct thresholds). Ct: cycle threshold

Figures and Tables -
Figure 20

Forest plot of studies evaluating rapid molecular tests: high versus low viral load (other Ct thresholds). Ct: cycle threshold

Rapid molecular assays before and after discrepant analysis

Figures and Tables -
Figure 21

Rapid molecular assays before and after discrepant analysis

Test 1. Antigen tests ‐ All
Test 2. Antigen tests ‐ symptomatic
Test 3. Antigen tests ‐ asymptomatic
Test 4. Antigen tests ‐ mixed symptoms or not reported
Test 5. Antigen tests ‐ Ct values < or ≤ 25
Test 6. Antigen tests ‐ Ct values > 25
Test 7. Antigen tests ‐ Ct values < or ≤ 32/33
Test 8. Antigen tests ‐ Ct values > 32/33
Test 9. Antigen tests ‐ other Ct thresholds for 'higher' viral load
Test 10. Antigen tests ‐ other Ct thresholds for 'lower' viral load
Test 11. Antigen tests ‐ week 1 after symptom onset
Test 12. Antigen tests ‐ week 2 after symptom onset
Test 13. Molecular tests ‐ all
Test 14. Molecular tests ‐ all (before discrepant analysis)
Test 15. Molecular tests ‐ all (after discrepant analysis)
Test 16. Molecular tests ‐ Ct values < or ≤ 30
Test 17. Molecular tests ‐ Ct values > 30
Test 18. Molecular tests ‐ other Ct thresholds for 'higher' viral load
Test 19. Molecular tests ‐ other Ct thresholds for 'lower' viral load
Test 20. Molecular tests ‐ other sites
Test 21. Antigen tests ‐ direct comparisons
Test 22. AAZ ‐ COVID‐VIRO (CGIA)
Test 23. Abbott ‐ Panbio Covid‐19 Ag (CGIA)
Test 24. Becton Dickinson ‐ BD Veritor (LFA – method not specified)
Test 25. BIONOTE ‐ NowCheck COVID‐19 Ag (LFA – method not specified)
Test 26. Biosynex ‐ Biosynex COVID‐19 Ag BSS (CGIA)
Test 27. Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip (CGIA)
Test 28. E25Bio ‐ DART (NP) (CGIA)
Test 29. Fujirebio ‐ ESPLINE SARS‐CoV‐2 [LFA (ALP)]
Test 30. Inhouse (Bioeasy co‐author) ‐ n/a (FIA)
Test 31. Innova Medical Group ‐ Innova SARS‐CoV‐2 Ag (CGIA)
Test 32. Liming Bio‐Products ‐ StrongStep® COVID‐19 Ag (CGIA)
Test 33. Quidel Corporation ‐ SOFIA SARS Antigen (FIA)
Test 34. RapiGEN ‐ BIOCREDIT COVID‐19 Ag (CGIA)
Test 35. Roche ‐ SARS‐CoV‐2 (LFA – method not specified)
Test 36. Savant Biotech ‐ Huaketai SARS‐CoV‐2 N Protein (LFA – method not specified)
Test 37. SD Biosensor ‐ STANDARD F COVID‐19 Ag (FIA)
Test 38. SD Biosensor ‐ STANDARD Q COVID‐19 Ag (CGIA)
Test 39. Shenzhen Bioeasy Biotech ‐ 2019‐nCoV Ag (FIA)
Test 40. Abbott ‐ ID NOW (Isothermal PCR)
Test 41. Cepheid ‐ Xpert Xpress (Automated RT‐PCR)
Test 42. DNANudge – COVID Nudge (Automated RT‐PCR)
Test 43. DRW ‐ SAMBA II (Automated RT‐PCR)
Test 44. Mesa Biotech ‐ Accula (other molecular)
Test 45. Antigen test evaluations ‐ Single group design
Test 46. Antigen test evaluations ‐ Two group design
Test 47. Antigen test evaluations ‐ Unclear design
Test 48. Molecular test evaluations ‐ Single group design
Test 49. Molecular test evaluations ‐ Two group design
Test 50. Molecular test evaluations ‐ Unclear design

Summary of findings 1. Diagnostic accuracy of point‐of‐care antigen and molecular‐based tests for the diagnosis of SARS‐CoV‐2 infection

Question

What is the diagnostic accuracy of rapid point‐of‐care antigen and molecular‐based tests for the diagnosis of SARS‐CoV‐2 infection?

Population

Adults or children with suspected:

  • current SARS‐CoV‐2 infection

or populations undergoing screening for SARS‐CoV‐2 infection, including

  • asymptomatic contacts of confirmed COVID‐19 cases

  • community screening

Index test

Any rapid antigen or molecular‐based test for diagnosis of SARS‐CoV‐2 meeting the following criteria:

  • portable or mains‐powered device

  • minimal sample preparation requirements

  • minimal biosafety requirements

  • no requirement for a temperature‐controlled environment

  • test results available within 2 hours of sample collection

Target condition

Detection of current SARS‐CoV‐2 infection

Reference standard

For COVID‐19 cases: positive RT‐PCR alone or clinical diagnosis of COVID‐19 based on established guidelines or combinations of clinical features

For non‐COVID‐19 cases: negative RT‐PCR or pre‐pandemic sources of samples

Action

False negative results mean missed cases of COVID‐19 infection, with either delayed or no confirmed diagnosis and increased risk of community transmission due to false sense of security

False positive results lead to unnecessary self‐isolation or quarantine, with the potential for new infection to be acquired

Quantity of evidence

Respiratory samples: 77 studies; 24,418 total samples; 7484 samples from confirmed SARS‐CoV‐2 cases

Non‐respiratory samples: 1 study; 79 total samples; 29 samples from confirmed SARS‐CoV‐2 cases

Limitations in the evidence

Risk of bias

(based on 78 studies)

Participants: high (29) or unclear (27) risk in 56 studies (72%)

Index test (antigen tests): high (0) or unclear (19) risk in 19 studies (40% of 48 studies)

Index test (molecular tests): high (3) or unclear (22) risk in 25 studies (83% of 30 studies)

Reference standard: high (66) or unclear (6) risk in 72 studies (92%)

Flow and timing: high (29) or unclear (36) risk in 65 studies (83%)

Concerns about applicability

(based on 78 studies)

Participants: high concerns in 35 studies (45%)

Index test (antigen tests): high concerns in 23 studies (48% of 48 studies)

Index test (molecular tests): high concerns in 16 studies (53% of 30 studies)

Reference standard: high concerns in 76 studies (97%)

Findings: antigen tests

Symptomatic: 37 evaluations (27 studies); 15,530 samples (4410 SARS‐CoV‐2 cases); average sensitivity 72.0% (95% CI 63.7 to 79.0) [range 0% to 100%]; average specificity 99.5% (95% CI 98.5 to 99.8) [range 8% to 100%]

Symptomatic, up to 7 days from onset of symptomsa: 26 evaluations (21 studies); 2320 samples (2320 cases); average sensitivity 78.3% (95% CI 71.1 to 84.1) [range 15% to 95%]

Asymptomatic: 12 evaluations (10 studies); 1581 samples (295 cases); average sensitivity 58.1% (95% CI 40.2 to 74.1) [range 29% to 85%]; average specificity 98.9% (95% CI 93.6 to 99.8) [range 14% to 100%]

Examples of pooled results for individual antigen tests using data for evaluations compliant with manufacturer instructions for use, according to symptom status

Symptomatic participants
Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip: 3 evaluations; 780 samples (414 cases); sensitivity 34.1% (95% CI 29.7 to 38.8); specificity 100% (95% CI 99.0 to 100)
Abbott ‐ Panbio Covid‐19 Ag: 3 evaluations; 1094 samples (252 cases); sensitivity 75.1% (95% CI 57.3 to 87.1); specificity 99.5% (95% CI 98.7 to 99.8)
SD Biosensor ‐ STANDARD Q COVID‐19 Ag: 3 evaluations; 1947 samples (336 cases); sensitivity 88.1% (95% CI 84.2 to 91.1); specificity 99.1% (95% CI 97.8 to 99.6)

Asymptomatic participants
Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip: 2 evaluations; 45 samples (14 cases); sensitivity 28.6% (95% CI 8.4 to 58.1); specificity 100% (95% CI 88.8 to 100)
Abbott ‐ Panbio Covid‐19 Ag: 1 evaluation; 474 samples (47 cases); sensitivity 48.9% (95% CI 35.1 to 62.9); specificity 98.1% (95% CI 96.3 to 99.1)
SD Biosensor ‐ STANDARD Q COVID‐19 Ag: 1 evaluation; 127 samples (13 cases); sensitivity 69.2% (95% CI 38.6 to 90.9); specificity 99.1% (95% CI 95.2 to 100)

Symptomatic participants: average sensitivity and specificity (and 95% CIs) applied to a hypothetical cohort of 1000 patients where 50, 100 and 200 have COVID‐19 infection

Coris Bioconcept
5% prevalence: TP 17 (15 to 19); FP 0 (0 to 10); FN 33 (31 to 35); TN 950 (941 to 950); PPV 100%; 1 – NPV 3.4%
10% prevalence: TP 34 (30 to 39); FP 0 (0 to 9); FN 66 (61 to 70); TN 900 (891 to 900); PPV 100%; 1 – NPV 6.8%
20% prevalence: TP 68 (59 to 78); FP 0 (0 to 8); FN 132 (122 to 141); TN 800 (792 to 800); PPV 100%; 1 – NPV 14.1%

Abbott ‐ Panbio Covid‐19 Ag
5% prevalence: TP 38 (29 to 44); FP 5 (2 to 12); FN 12 (6 to 21); TN 945 (938 to 948); PPV 89%; 1 – NPV 1.3%
10% prevalence: TP 75 (57 to 87); FP 5 (2 to 12); FN 25 (13 to 43); TN 896 (888 to 898); PPV 94%; 1 – NPV 2.7%
20% prevalence: TP 150 (115 to 174); FP 4 (2 to 10); FN 50 (26 to 85); TN 796 (790 to 798); PPV 97%; 1 – NPV 5.9%

SD Biosensor ‐ STANDARD Q COVID‐19 Ag
5% prevalence: TP 44 (42 to 46); FP 9 (4 to 21); FN 6 (4 to 8); TN 941 (929 to 946); PPV 84%; 1 – NPV 0.6%
10% prevalence: TP 88 (84 to 91); FP 8 (4 to 20); FN 12 (9 to 16); TN 892 (880 to 896); PPV 92%; 1 – NPV 1.3%
20% prevalence: TP 176 (168 to 182); FP 7 (3 to 18); FN 24 (18 to 32); TN 793 (782 to 797); PPV 96%; 1 – NPV 2.9%

Asymptomatic participants: average sensitivity and specificity (and 95% CIs) applied to a hypothetical cohort of 10,000 patients where 50, 100 and 200 have COVID‐19 infection

Coris Bioconcept
0.5% prevalence: TP 14 (4 to 29); FP 0 (0 to 1114); FN 36 (21 to 46); TN 9950 (8836 to 9950); PPV 100%; 1 – NPV 0.4%
1% prevalence: TP 29 (8 to 58); FP 0 (0 to 1109); FN 71 (42 to 92); TN 9900 (8791 to 9900); PPV 100%; 1 – NPV 0.7%
2% prevalence: TP 57 (17 to 116); FP 0 (0 to 1098); FN 143 (84 to 183); TN 9800 (8702 to 9800); PPV 100%; 1 – NPV 1.4%

Abbott ‐ Panbio Covid‐19 Ag
0.5% prevalence: TP 24 (18 to 31); FP 189 (90 to 368); FN 26 (19 to 32); TN 9761 (9582 to 9860); PPV 11%; 1 – NPV 0.3%
1% prevalence: TP 49 (35 to 63); FP 188 (89 to 366); FN 51 (37 to 65); TN 9712 (9534 to 9811); PPV 21%; 1 – NPV 0.5%
2% prevalence: TP 98 (70 to 126); FP 186 (88 to 363); FN 102 (74 to 130); TN 9614 (9437 to 9712); PPV 34%; 1 – NPV 1.0%

SD Biosensor ‐ STANDARD Q COVID‐19 Ag
0.5% prevalence: TP 35 (19 to 45); FP 90 (0 to 478); FN 15 (5 to 31); TN 9860 (9472 to 9950); PPV 28%; 1 – NPV 0.2%
1% prevalence: TP 69 (39 to 91); FP 89 (0 to 475); FN 31 (9 to 61); TN 9811 (9425 to 9900); PPV 44%; 1 – NPV 0.3%
2% prevalence: TP 138 (77 to 182); FP 88 (0 to 470); FN 62 (18 to 123); TN 9712 (9330 to 9800); PPV 61%; 1 – NPV 0.6%
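The hypothetical‐cohort numbers above follow directly from sensitivity, specificity and prevalence. A minimal sketch of that arithmetic, using the pooled symptomatic estimates for SD Biosensor STANDARD Q quoted above (point estimates only; the CIs in the tables come from applying the interval bounds the same way):

```python
def cohort(n: int, prevalence: float, sensitivity: float, specificity: float):
    """Expected true/false positives and negatives when a test with the given
    accuracy is applied to n people at the given prevalence."""
    cases = n * prevalence
    non_cases = n - cases
    tp = cases * sensitivity          # infected, test positive
    fn = cases - tp                   # infected, test negative (missed)
    tn = non_cases * specificity      # uninfected, test negative
    fp = non_cases - tn               # uninfected, test positive
    ppv = tp / (tp + fp)              # share of positive results that are true
    one_minus_npv = fn / (fn + tn)    # share of negative results that are missed cases
    return tp, fp, fn, tn, ppv, one_minus_npv

# SD Biosensor STANDARD Q, symptomatic: sensitivity 88.1%, specificity 99.1%
tp, fp, fn, tn, ppv, miss = cohort(1000, 0.05, 0.881, 0.991)
print(f"TP {tp:.2f}, FP {fp:.2f}, FN {fn:.2f}, TN {tn:.2f}")
print(f"PPV {ppv:.0%}, 1 - NPV {miss:.1%}")
```

Rounding the first line reproduces the table's 44 / 9 / 6 / 941, and the second line its PPV of 84% and 1 – NPV of 0.6%; note that PPV is computed from the unrounded counts.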

Findings: rapid molecular tests

All evaluations: 29 evaluations (26 studies); 4351 samples (1787 SARS‐CoV‐2 cases); average sensitivity 95.1% (95% CI 90.5 to 97.6) [range 57% to 100%]; average specificity 98.8% (95% CI 98.3 to 99.2) [range 92% to 100%]

Pooled results for individual tests using data from evaluations compliant with manufacturers' instructions for use

Abbott ‐ ID NOW: 4 evaluations; 812 samples (222 cases); sensitivity 73.0% (95% CI 66.8 to 78.4); specificity 99.7% (95% CI 98.7 to 99.9)
Cepheid ‐ Xpert Xpress: 2 evaluations; 100 samples (29 cases); sensitivity 100% (95% CI 88.1 to 100); specificity 97.2% (95% CI 89.4 to 99.3)
DRW ‐ SAMBA II: 1 evaluation; 149 samples (33 cases); sensitivity 87.9% (95% CI 71.8 to 96.6); specificity 97.4% (95% CI 92.6 to 99.5)
DNANudge ‐ COVID Nudge: 1 evaluation; 386 samples (71 cases); sensitivity 94.4% (95% CI 86.2 to 98.4); specificity 100% (95% CI 98.8 to 100)

Average sensitivity and specificity (and 95% CIs) applied to a hypothetical cohort of 1000 patients where 50, 100 and 200 have COVID‐19 infection (TP, FP, FN and TN shown with 95% CIs; PPVb; 1 – NPVc)

Abbott ‐ ID NOW
5% prevalence: TP 37 (33 to 39); FP 3 (1 to 12); FN 14 (11 to 17); TN 947 (938 to 949); PPV 93%; 1 – NPV 1.4%
10% prevalence: TP 73 (67 to 78); FP 3 (1 to 12); FN 27 (22 to 33); TN 897 (888 to 899); PPV 96%; 1 – NPV 2.9%
20% prevalence: TP 146 (134 to 157); FP 2 (1 to 10); FN 54 (43 to 66); TN 798 (790 to 799); PPV 98%; 1 – NPV 6.3%

Cepheid ‐ Xpert Xpress
5% prevalence: TP 50 (44 to 50); FP 27 (7 to 101); FN 0 (0 to 6); TN 923 (849 to 943); PPV 65%; 1 – NPV 0.0%
10% prevalence: TP 100 (88 to 100); FP 25 (6 to 95); FN 0 (0 to 12); TN 875 (805 to 894); PPV 80%; 1 – NPV 0.0%
20% prevalence: TP 200 (176 to 200); FP 22 (6 to 85); FN 0 (0 to 24); TN 778 (715 to 794); PPV 90%; 1 – NPV 0.0%

DRW ‐ SAMBA II
5% prevalence: TP 44 (36 to 48); FP 25 (5 to 70); FN 6 (2 to 14); TN 925 (880 to 945); PPV 64%; 1 – NPV 0.6%
10% prevalence: TP 88 (72 to 97); FP 23 (5 to 67); FN 12 (3 to 28); TN 877 (833 to 896); PPV 79%; 1 – NPV 1.4%
20% prevalence: TP 176 (144 to 193); FP 21 (4 to 59); FN 24 (7 to 56); TN 779 (741 to 796); PPV 89%; 1 – NPV 3.0%

DNANudge ‐ COVID Nudge
5% prevalence: TP 47 (43 to 49); FP 0 (0 to 11); FN 3 (1 to 7); TN 950 (939 to 950); PPV 100%; 1 – NPV 0.3%
10% prevalence: TP 94 (86 to 98); FP 0 (0 to 11); FN 6 (2 to 14); TN 900 (889 to 900); PPV 100%; 1 – NPV 0.6%
20% prevalence: TP 189 (172 to 197); FP 0 (0 to 10); FN 11 (3 to 28); TN 800 (790 to 800); PPV 100%; 1 – NPV 1.4%

1 – NPV: 1 – negative predictive value (the percentage of people with negative results who are infected); Ag: antigen; CI: confidence interval; FN: false negative; FP: false positive; IFU: [manufacturers'] instructions for use; PPV: positive predictive value (the percentage of people with positive results who are infected); RT‐PCR: reverse transcription polymerase chain reaction; TN: true negative; TP: true positive

aSpecificity only estimated in 8 of 26 evaluations by time after symptom onset.
bPPV (positive predictive value) defined as the percentage of positive rapid test results that are truly positive according to the reference standard diagnosis.
c1‐NPV (negative predictive value), where NPV is defined as the percentage of negative rapid test results that are truly negative according to the reference standard diagnosis.

Table 1. Description of studies

Values are number of studies (%) unless stated otherwise, given as antigen tests / rapid molecular tests.

Number of studies: 48 / 29

Sample size (by test type)
Median (IQR): 291.5 (155 to 502.5) / 104 (75 to 172)
Range: 56 to 1676 / 19 to 524

Number of COVID‐19 cases (by test type)
Median (IQR): 99.5 (45.5 to 128.5) / 50 (20 to 88)
Range: 0 to 951 / 6 to 220

Setting
COVID‐19 test centre: 22 (46) / 0 (0)
Contacts: 4 (8) / 0 (0)
Hospital A&E: 3 (6) / 3 (10)
Hospital inpatient: 2 (4) / 2 (7)
Laboratory‐based: 11 (23) / 20 (69)
Mixed: 4 (8) / 4 (14)
Unclear: 2 (4) / 0 (0)

Symptom status
Asymptomatic: 3 (6) / 0 (0)
Symptomatic: 16 (33) / 12 (41)
Mainly symptomatica: 11 (23) / 0 (0)
Mixed: 8 (17) / 3 (10)
Not reported: 10 (21) / 14 (48)

Study design: recruitment structure
Single group – sensitivity and specificity: 29 (60) / 17 (59)
Two or more groups – sensitivity and specificity: 10 (21) / 7 (24)
Unclear: 2 (4) / 2 (7)
Single group – sensitivity only: 6 (13) / 3 (10)
Single group – specificity only: 1 (2) / 0 (0)

Reference standard for COVID‐19 cases
All RT‐PCR‐positive: 47 (98) / 29 (100)

Reference standard for non‐COVID‐19 (no. of studies = 42 / 26)
COVID suspects (single RT‐PCR‐negative): 39 (93) / 24 (92)
COVID suspects (double+ RT‐PCR‐negative): 1 (2) / 1 (4)
Current other disease (RT‐PCR‐negative): 0 (0) / 1 (4)
Pre‐pandemic (not described): 1 (2) / 0 (0)
Pre‐pandemic other disease: 1 (2) / 0 (0)

Tests: values below are number of evaluations (%)

Total number of test evaluations: 58 / 32

Number of tests per study
1: 44 (92) / 26 (90)
2: 1 (2) / 3 (10)
3: 1 (2) / 0 (0)
4: 1 (2) / 0 (0)
5: 1 (2) / 0 (0)

Test method
CGIA: 41 (71) / 0 (0)
FIA: 9 (16) / 0 (0)
LFA (alkaline phosphatase labelled): 2 (3) / 0 (0)
LFA (not otherwise specified): 6 (10) / 0 (0)
Automated RT‐PCR: 0 (0) / 18 (56)
Isothermal amplification: 0 (0) / 13 (41)
Other molecular (PCR + LFA): 0 (0) / 1 (3)

Sample type
NP alone: 30 (52) / 16 (50)
NP + OP combined: 12 (21) / 2 (6)
Nasal alone: 2 (3) / 2 (6)
OP alone: 1 (2) / 1 (3)
Two or more of NP, nasal or OP: 8 (14) / 8 (25)
Saliva: 1 (2) / 1 (3)
Other: 3 (5) / 0 (0)
Mixed (including lower respiratory): 4 (7) / 1 (3)
Not specified: 0 (0) / 1 (3)

Sample storage
Direct: 28 (48) / 7 (22)
VTM: 20 (35) / 12 (38)
Saline: 1 (2) / 0 (0)
Direct or VTM: 0 (0) / 1 (3)
VTM or PBS: 1 (2) / 0 (0)
VTM or other: 0 (0) / 6 (19)
Not specified: 8 (14) / 6 (19)

Sample collection
HCW: 15 (26) / 2 (6)
Trained non‐HCW: 3 (5) / 0 (0)
Self‐collected: 6 (10) / 0 (0)
HCW or self‐collection: 0 (0) / 1 (3)
Not specified: 34 (59) / 29 (91)

Sample testing
HCW (on‐site): 13 (22) / 0 (0)
Trained non‐HCW (on‐site): 3 (5) / 0 (0)
HCW or on‐site laboratory personnel: 0 (0) / 1 (3)
Not specified (on‐site testing): 5 (9) / 1 (3)
Laboratory staff: 12 (21) / 4 (13)
Not stated (laboratory setting): 15 (26) / 16 (50)

IFU compliance
No: 16 (28) / 16 (50)
Yes: 29 (50) / 9 (28)
Unclear: 13 (22) / 7 (22)

A&E: accident and emergency department; CGIA: colloidal gold immunoassay; CI: confidence intervals; DRW: Diagnostics for the Real World; FIA: fluorescent immunoassay; HCW: healthcare worker; IFU: instructions for use; IQR: inter‐quartile range; LFA: lateral flow assay; NP: nasopharyngeal; OP: oropharyngeal; PBS: phosphate‐buffered saline; RT‐PCR: reverse transcription polymerase chain reaction; VTM: viral transport medium

a'Mainly' symptomatic indicates ≥ 75% of included participants reported as symptomatic.
Table 2. Antigen tests: summary of sensitivity and specificity analyses

Overall analysis
Evaluations reporting both sensitivity and specificity: 51 evaluations; 21,614 samples; 6136 cases; average sensitivity 68.9% (95% CI 61.8 to 75.1); average specificity 99.6% (95% CI 99.0 to 99.8)
Evaluations reporting sensitivity dataa: 57 evaluations; 22,605 samples; 7127 cases; average sensitivity 67.7% (95% CI 60.8 to 74.0)
Evaluations reporting specificity dataa: 52 evaluations; 22,152 samples; 6136 cases; average specificity 99.5% (95% CI 99.0 to 99.8)

Subgroup analyses (with sensitivity analyses restricting to direct comparisons)

Symptom status (all)
Symptomatic: 37 evaluations; 15,530 samples; 4410 cases; sensitivity 72.0% (63.7 to 79.0); specificity 99.5% (98.5 to 99.8)
Asymptomatic: 12 evaluations; 1581 samples; 295 cases; sensitivity 58.1% (40.2 to 74.1); specificity 98.9% (93.6 to 99.8)
Difference: sensitivity −13.8 (−33.1 to 5.4), P = 0.159; specificity −0.6 (−2.6 to 1.4), P = 0.551
Symptomatic, direct comparison: 9 evaluations; 2437 samples; 890 cases; sensitivity 68.0% (51.4 to 81.1); specificity 99.2% (83.9 to 100)
Asymptomatic, direct comparison: 9 evaluations; 1182 samples; 213 cases; sensitivity 53.6% (35.0 to 71.3); specificity 99.2% (85.5 to 100)
Difference: sensitivity −14.4 (−38.8 to 10.0), P = 0.246; specificity −0.01 (−3.2 to 3.2), P = 0.995
Mixed symptoms or not reported: 19 evaluations; 6220 samples; 2392 cases; sensitivity 63.0% (52.2 to 72.6); specificity 98.4% (98.0 to 98.8)

Time post‐symptom onset (sensitivity only)
Week 1: 26 evaluations; 5769 samples; 2320 cases; sensitivity 78.3% (71.1 to 84.1)a
Week 2: 22 evaluations; 935 samples; 692 cases; sensitivity 51.0% (40.8 to 61.0)a
Difference: −27.3 (−32.8 to −21.9), P < 0.0001
Week 1, direct comparison: 22 evaluations; 4978 samples; 2164 cases; sensitivity 76.6% (68.2 to 83.4)a
Week 2, direct comparison: 22 evaluations; 935 samples; 692 cases; sensitivity 48.8% (37.9 to 59.8)a
Difference: −27.9 (−33.3 to −22.5), P < 0.0001

Ct value (sensitivity only)
Higher viral load (< or ≤ 25 Ct threshold)b: 36 evaluations; 2613 samples (2613 cases); sensitivity 94.5% (91.0 to 96.7)a
Lower viral load (> or ≥ 25 Ct threshold)b: 36 evaluations; 2632 samples (2632 cases); sensitivity 40.7% (31.8 to 50.3)a
Difference: −53.8 (−63.6 to −44.1), P < 0.0001
Higher viral load (< 32 or ≤ 33 Ct threshold)c: 15 evaluations; 2127 samples (2127 cases); sensitivity 82.5% (74.0 to 88.6)a
Lower viral load (> 32 or 33 Ct threshold)c: 15 evaluations; 346 samples (346 cases); sensitivity 8.9% (3.3 to 21.7)a
Difference: −73.5 (−84.7 to −62.4), P < 0.0001

Study design
Single group, sensitivity and specificity: 29 evaluations; 15,336 samples; 3536 cases; sensitivity 72.1% (64.8 to 78.3); specificity 99.6% (99.1 to 99.8)
Two or more groups, sensitivity and specificity: 20 evaluations; 5729 samples; 2396 cases; sensitivity 64.1% (48.5 to 77.2); specificity 97.3% (96.7 to 97.8)
Difference: sensitivity −8.0 (−24.2 to 8.2), P = 0.334; specificity −2.3 (−2.9 to −1.6), P < 0.0001
Unclear: 2 evaluations; 549 samples; 204 cases; sensitivity 65.2% (39.6 to 84.3); specificity 96.3% (88.0 to 98.9)

Test method
CGIA: 36 evaluations; 17,448 samples; 5085 cases; sensitivity 64.0% (55.7 to 71.6); specificity 99.0% (98.8 to 99.2)
FIA: 9 evaluations; 2820 samples; 712 cases; sensitivity 79.6% (67.5 to 88.0); specificity 97.7% (95.3 to 98.8)
Difference: sensitivity 15.6 (2.6 to 28.5), P = 0.019; specificity −1.3 (−3.0 to 0.3), P = 0.113
LFA (not otherwise specified): 5 evaluations; 1184 samples; 277 cases; sensitivity 78.0% (46.0 to 93.7); specificity 96.0% (94.5 to 97.1)
LFA (ALP): 1 evaluation; 162 samples; 62 cases; sensitivity 80.6% (68.6 to 89.6); specificity 100% (96.4 to 100)

ALP: alkaline phosphatase labelled; CGIA: colloidal gold immunoassay; CI: confidence intervals; Ct: cycle threshold; FIA: fluorescent immunoassay; LFA: lateral flow assay; N/A: not applicable

aSeparate pooling of sensitivity or specificity, or both.
bThreshold for 'higher' viral load was < 25 Ct in 18 evaluations and ≤ 25 Ct in 18 evaluations.
cThreshold for 'higher' viral load was ≤ 33 Ct in 13 evaluations and < 32 Ct in 2 evaluations.
Table 3. Antigen tests: summary data by test brand and compliance with manufacturers' instructions for use

Each entry gives: number of evaluations; samples (cases); average sensitivity, % (95% CI); average specificity, % (95% CI), for all evaluations and, where available, for IFU‐compliant evaluations.

AAZ ‐ COVID‐VIRO (2 studies, not pooled)
All: 1; 632 (295); sensitivity 61.7 (55.9 to 67.3); specificity 100 (98.9 to 100)
All: 1; 248 (101); sensitivity 96.0 (90.2 to 98.9); specificity 86.4 (79.8 to 91.5). IFU‐compliant: 1; 248 (101); sensitivity 96.0 (90.2 to 98.9); specificity 86.4 (79.8 to 91.5)

Abbott ‐ Panbio Covid‐19 Ag
All: 10; 5509 (1849); sensitivity 72.0 (60.6 to 81.1); specificity 99.3 (99.0 to 99.6). IFU‐compliant: 5; 1776 (362); sensitivity 72.0 (56.5 to 83.5); specificity 99.2 (98.5 to 99.5)
Including sensitivity‐only cohort, All: 11; 2031 (2031); sensitivity 72.8 (62.6 to 81.0)a. IFU‐compliant: 6; 544 (544); sensitivity 73.5 (61.1 to 83.0)a

Becton Dickinson ‐ BD Veritor
All: 2; 602 (55); sensitivity 82.3 (62.1 to 93.0); specificity 99.5 (98.3 to 99.8)
Including sensitivity‐only cohort, All: 3; 180 (180); sensitivity 79.4 (72.9 to 84.7)a

BIONOTE ‐ NowCheck COVID‐19 Ag
All: 1; 400 (102); sensitivity 89.2 (81.5 to 94.5); specificity 97.3 (94.8 to 98.8). IFU‐compliant: 1; 400 (102); sensitivity 89.2 (81.5 to 94.5); specificity 97.3 (94.8 to 98.8)

Biosynex ‐ Biosynex COVID‐19 Ag BSS
All: 1; 634 (297); sensitivity 59.6 (53.8 to 65.2); specificity 100 (98.9 to 100)

Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip
All: 7; 1781 (707); sensitivity 39.7 (31.3 to 48.7); specificity 98.3 (97.4 to 98.9). IFU‐compliant: 7; 1781 (707); sensitivity 39.7 (31.3 to 48.7); specificity 98.3 (97.4 to 98.9)

E25Bio ‐ DART (N‐based)
All: 1; 190 (100); sensitivity 80.0 (70.8 to 87.3); specificity 91.1 (83.2 to 96.1)

Fujirebio ‐ ESPLINE SARS‐CoV‐2 (2 studies, not pooled)
All: 1; 162 (62); sensitivity 80.6 (68.6 to 89.6); specificity 100 (96.4 to 100)
All: 1; 103 (103); sensitivity 11.6 (6.2 to 19.5)

Innova Medical Group ‐ Innova SARS‐CoV‐2 Ag
All: 3; 2945 (596); sensitivity 47.9 (34.3 to 61.8); specificity 99.8 (99.5 to 99.9). IFU‐compliant: 1; 1676 (372); sensitivity 57.5 (52.3 to 62.6); specificity 99.6 (99.1 to 99.9)
Including sensitivity‐only cohorts, All: 5; 1017; sensitivity 59.0 (43.4 to 73.0)a. IFU‐compliant: 3; 793; sensitivity 69.1 (58.3 to 78.2)a
Including specificity‐only cohort, All: 4; 2887; specificity 99.8 (99.5 to 99.9)a. IFU‐compliant: 2; 1842; specificity 99.7 (99.3 to 99.9)a

Liming Bio‐Products ‐ StrongStep® COVID‐19 Ag
All: 1; 19 (9); sensitivity 0 (0 to 33.6); specificity 90.0 (55.5 to 99.7)

Quidel Corporation ‐ SOFIA SARS Ag
All: 1; 64 (32); sensitivity 93.8 (79.2 to 99.2); specificity 96.9 (83.8 to 99.9)

RapiGEN ‐ BIOCREDIT COVID‐19 Ag
All: 5; 2010 (310); sensitivity 63.3 (45.7 to 78.0); specificity 99.5 (99.1 to 99.8). IFU‐compliant: 3; 1828 (189); sensitivity 73.0 (57.4 to 84.4); specificity 99.8 (99.4 to 99.9)
Including sensitivity‐only cohort, All: 6; 470 (470); sensitivity 57.7 (39.8 to 73.8)a

Roche ‐ SARS‐CoV‐2
All: 1; 73 (42); sensitivity 88.1 (74.4 to 96.0); specificity 19.4 (7.5 to 37.5)

Savant Biotech ‐ Huaketai SARS‐CoV‐2 N Protein
All: 1; 109 (78); sensitivity 16.7 (9.2 to 26.8); specificity 100 (88.8 to 100)

SD Biosensor ‐ STANDARD F COVID‐19 Ag
All: 4; 1552 (295); sensitivity 72.6 (54.0 to 85.7); specificity 97.5 (96.4 to 98.2). IFU‐compliant: 2; 1129 (159); sensitivity 75.5 (68.2 to 81.5); specificity 97.2 (96.0 to 98.1)

SD Biosensor ‐ STANDARD Q COVID‐19 Ag
All: 6; 3480 (821); sensitivity 79.3 (69.6 to 86.6); specificity 98.5 (97.9 to 98.9). IFU‐compliant: 4; 2522 (421); sensitivity 85.8 (80.5 to 89.8); specificity 99.2 (98.2 to 99.6)

Shenzhen Bioeasy Biotech ‐ 2019‐nCoV Ag
All: 3; 965 (177); sensitivity 86.2 (72.4 to 93.7); specificity 93.8 (91.9 to 95.3). IFU‐compliant: 1; 727 (15); sensitivity 66.7 (38.4 to 88.2); specificity 93.1 (91.0 to 94.9)
Development‐phase publication, All: 1; 239 (208); sensitivity 67.8 (61.0 to 74.1); specificity 100 (88.8 to 100)

Ag: antigen; CI: confidence interval; IFU: [manufacturers'] instructions for use; N: nucleoprotein

aSeparate pooling of sensitivity or specificity.
b2 × 2 tables combined prior to calculating estimates.
Table 4. Antigen tests: summary data by symptom status, test brand and compliance with manufacturers' instructions for use

SYMPTOMATIC participants by test

| Test brand | All: evaluations; samples (cases) | All: average sensitivity, % (95% CI) | All: average specificity, % (95% CI) | IFU‐compliant: evaluations; samples (cases) | IFU‐compliant: average sensitivity, % (95% CI) | IFU‐compliant: average specificity, % (95% CI) |
| --- | --- | --- | --- | --- | --- | --- |
| AAZ ‐ COVID‐VIRO (2 studies not pooled) | 1; 632 (295) | 61.7 (55.9 to 67.3) | 100 (98.9 to 100) | | | |
| AAZ ‐ COVID‐VIRO (2 studies not pooled) | 1; 248 (101) | 96.0 (90.2 to 98.9) | 86.4 (79.8 to 91.5) | 1; 248 (101) | 96.0 (90.2 to 98.9) | 86.4 (79.8 to 91.5) |
| Abbott ‐ Panbio Covid‐19 Ag | 8; 3699 (1162) | 74.1 (60.8 to 84.0) | 99.8 (99.5 to 99.9) | 3; 1094 (252) | 75.1 (57.3 to 87.1) | 99.5 (98.7 to 99.8) |
| Abbott ‐ Panbio Covid‐19 Ag, including sensitivity‐only cohort | 9; 1344 (1344) | 74.8 (63.4 to 83.6)a | | 4; 434 (434) | 76.2 (63.6 to 85.4)a | |
| Becton Dickinson ‐ BD Veritor | 2; 602 (55) | 82.3 (62.1 to 93.0) | 99.5 (98.3 to 99.8) | | | |
| Becton Dickinson ‐ BD Veritor, including sensitivity‐only cohort | 3; 180 (180) | 79.4 (72.9 to 84.7)a | | | | |
| BIONOTE ‐ NowCheck COVID‐19 Ag | 1; 400 (102) | 89.2 (81.5 to 94.5) | 97.3 (94.8 to 98.8) | 1; 400 (102) | 89.2 (81.5 to 94.5) | 97.3 (94.8 to 98.8) |
| Biosynex ‐ Biosynex COVID‐19 Ag BSS | 1; 634 (297) | 59.6 (53.8 to 65.2) | 100 (98.9 to 100) | | | |
| Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip | 3; 780 (414) | 34.1 (29.7 to 38.8)a | 100 (99.0 to 100)a,b | 3; 780 (414) | 34.1 (29.7 to 38.8)a | 100 (99.0 to 100)a,b |
| Fujirebio ‐ ESPLINE SARS‐CoV‐2 | 1; 88 (88) | 11.4 (5.6 to 19.9) | | | | |
| Innova Medical Group ‐ Innova SARS‐CoV‐2 Ag | 2; 2794 (550) | 56.2 (52.0 to 60.3) | 99.8 (99.5 to 99.9) | 1; 1676 (372) | 57.5 (52.3 to 62.6) | 99.6 (99.1 to 99.9) |
| Innova Medical Group ‐ Innova SARS‐CoV‐2 Ag, including sensitivity‐only cohorts | 4; 971 (971) | 65.5 (54.8 to 74.9)a | | 3; 793 (793) | 69.1 (58.3 to 78.2)a | |
| Liming Bio‐Products ‐ StrongStep® COVID‐19 Ag | 1; 19 (9) | 0 (0 to 33.6) | 90.0 (55.5 to 99.7) | | | |
| Quidel Corporation ‐ SOFIA SARS Ag | 1; 64 (32) | 93.8 (79.2 to 99.2) | 96.9 (83.8 to 99.9) | | | |
| RapiGEN ‐ BIOCREDIT COVID‐19 Ag | 3; 608 (206) | 58.4 (36.3 to 77.5) | 96.4 (82.8 to 99.3) | 1; 476 (117) | 74.4 (65.5 to 82.0) | 98.9 (97.2 to 99.7) |
| Roche ‐ SARS‐CoV‐2 | 1; 23 (10) | 100 (69.2 to 100) | 7.7 (0.2 to 36.0) | | | |
| Savant Biotech ‐ Huaketai SARS‐CoV‐2 N Protein | 1; 109 (78) | 16.7 (9.2 to 26.8) | 100 (88.8 to 100) | | | |
| SD Biosensor ‐ STANDARD F COVID‐19 Ag | 3; 1193 (191) | 78.0 (71.6 to 83.3) | 97.2 (96.0 to 98.1) | 2; 1129 (159) | 75.5 (68.2 to 81.5) | 97.2 (96.0 to 98.1) |
| SD Biosensor ‐ STANDARD Q COVID‐19 Ag | 5; 2760 (731) | 80.1 (68.5 to 88.1) | 98.1 (97.4 to 98.6) | 3; 1947 (336) | 88.1 (84.2 to 91.1) | 99.1 (97.8 to 99.6) |
| Shenzhen Bioeasy Biotech ‐ 2019‐nCoV Ag | 3; 965 (177) | 86.2 (72.5 to 93.7) | 93.8 (91.9 to 95.3) | 1; 727 (15) | 66.7 (38.4 to 88.2) | 93.1 (91.0 to 94.9) |

ASYMPTOMATIC participants by test

| Test brand | All: evaluations; samples (cases) | All: average sensitivity, % (95% CI) | All: average specificity, % (95% CI) | IFU‐compliant: evaluations; samples (cases) | IFU‐compliant: average sensitivity, % (95% CI) | IFU‐compliant: average specificity, % (95% CI) |
| --- | --- | --- | --- | --- | --- | --- |
| Abbott ‐ Panbio Covid‐19 Ag | 6; 1097 (190) | 58.1 (41.7 to 72.9) | 98.4 (92.2 to 99.7) | 2; 474 (47) | 48.9 (35.1 to 62.9) | 98.1 (96.3 to 99.1) |
| Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip | 1; 45 (14) | 28.6 (8.4 to 58.1) | 100 (88.8 to 100) | 1; 45 (14) | 28.6 (8.4 to 58.1) | 100 (88.8 to 100) |
| Fujirebio ‐ ESPLINE SARS‐CoV‐2 | 1; 15 (15) | 13.3 (1.7 to 40.5) | N/A | | | |
| RapiGEN ‐ BIOCREDIT COVID‐19 Ag | 2; 140 (60) | 63.2 (21.7 to 91.4) | 98.9 (82.9 to 99.9) | 1; 113 (47) | 85.1 (71.7 to 93.8) | 100 (94.6 to 100) |
| Roche ‐ SARS‐CoV‐2 | 1; 27 (13) | 84.6 (54.6 to 98.1) | 14.3 (1.8 to 42.8) | | | |
| SD Biosensor ‐ STANDARD Q COVID‐19 Ag | 2; 272 (18) | 61.1 (37.9 to 80.2) | 99.6 (97.3 to 99.9) | 1; 127 (13) | 69.2 (38.6 to 90.9) | 99.1 (95.2 to 100) |

Ag: antigen; CI: confidence interval; N: nucleoprotein; N/A: not applicable

aSeparate pooling of sensitivity or specificity.
b2x2 tables combined prior to calculating estimates.

Table 5. Molecular tests: summary of sensitivity and specificity analyses

| Test or subgroup | Evaluations | Samples | Cases | Average sensitivity, % (95% CI) | Average specificity, % (95% CI) |
| --- | --- | --- | --- | --- | --- |
| **Overall analysis** | | | | | |
| Evaluations reporting both sensitivity and specificity | 29 | 4351 | 1787 | 95.1 (90.5 to 97.6) | 98.8 (98.3 to 99.2) |
| Evaluations reporting sensitivity dataa | 32 | 4537 | 1973 | 95.5 (91.5 to 97.7) | N/A |
| **Subgroup analyses (with sensitivity analyses restricting to direct comparisons)** | | | | | |
| *Viral load (sensitivity only)* | | | | | |
| High viral load (≤ 30 Ct) | 6 | 204 | 204 | 100 (98.2 to 100)a,b | N/A |
| Low viral load (> 30 Ct) | 6 | 149 | 149 | 95.6 (55.7 to 99.7) | N/A |
| *By study design* | | | | | |
| Single group – sensitivity and specificity | 18 | 2899 | 976 | 93.2 (85.5 to 97.0) | 99.4 (98.4 to 99.8) |
| Two or more groups ‐ sensitivity and specificity | 9 | 1265 | 718 | 97.2 (90.7 to 99.2) | 99.3 (96.5 to 99.8) |
| Difference | | | | 4.0 (‐2.2 to 10.1); P = 0.211 | ‐0.2 (‐1.3 to 1.0); P = 0.771 |
| Unclear designs | 2 | 187 | 93 | 93.2 (71.0 to 98.7)a | 100 (96.2 to 100)a,b |
| *Test brand* | | | | | |
| Abbott – ID NOW | 12 | 1853 | 634 | 78.6 (73.7 to 82.8) | 99.8 (99.2 to 99.9) |
| Cepheid – Xpert Xpress | 13 | 1691 | 911 | 99.1 (97.7 to 99.7) | 97.9 (94.6 to 99.2) |
| Difference | | | | 19.8 (14.9 to 24.7); P < 0.0001 | ‐1.9 (‐3.8 to ‐0.1); P = 0.036 |
| Abbott – ID NOW (including sensitivity‐only cohort) | 13 | 1949 | 730 | 81.5 (75.2 to 86.5)a | N/A |
| Cepheid – Xpert Xpress (including sensitivity‐only cohorts) | 15 | 1781 | 1001 | 99.1 (97.8 to 99.6)a | N/A |
| DNANudge – COVID Nudge | 1 | 386 | 71 | 94.4 (86.2 to 98.4) | 100 (98.8 to 100) |
| Diagnostics for the Real World – SAMBA II | 2 | 321 | 121 | 96.0 (81.1 to 99.3) | 97.0 (93.5 to 98.6) |
| Mesa Biotech – Accula | 1 | 100 | 50 | 68.0 (53.3 to 80.5) | 100 (92.9 to 100) |
| *Test brand (restricted to IFU‐compliant)* | | | | | |
| Abbott – ID NOW | 4 | 812 | 222 | 73.0 (66.8 to 78.4) | 99.7 (98.7 to 99.9) |
| Cepheid – Xpert Xpress | 2 | 100 | 29 | 100 (88.1 to 100)a | 97.2 (89.4 to 99.3)a |
| DRW – SAMBA II | 1 | 149 | 33 | 87.9 (71.8 to 96.6) | 97.4 (92.6 to 99.5) |
| DNANudge – COVID Nudge | 1 | 386 | 71 | 94.4 (86.2 to 98.4) | 100 (98.8 to 100) |
| *Discrepant analysis* | | | | | |
| Before discrepant analysis | 6 | 1533 | 623 | 97.9 (88.1 to 99.7) | 97.8 (96.6 to 98.6) |
| After discrepant analysis | 6 | 1533 | 632 | 99.2 (93.6 to 99.9) | 99.6 (98.8 to 99.8) |
| Difference | | | | 1.3 (‐2.8 to 5.4); P = 0.528 | 1.8 (0.7 to 2.8); P = 0.001 |

CI: confidence interval; Ct: cycle threshold; IFU: [manufacturers'] instructions for use; N/A: not applicable

aSeparate pooling of sensitivity or specificity.
b2x2 tables combined prior to calculating estimates.
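The viral‐load subgroup above stratifies PCR‐confirmed cases at a cycle threshold (Ct) of 30 and computes sensitivity separately in each stratum (higher Ct corresponds to lower viral load). A minimal sketch of that stratified calculation, using invented case‐level records:

```python
# Hypothetical case-level records: (ct_value, index_test_positive).
# The Ct 30 cutoff mirrors the subgroup split in the table above.
cases = [
    (18.2, True), (22.5, True), (26.0, True), (29.9, True),
    (31.4, True), (33.0, False), (36.8, False), (38.1, False),
]

def stratified_sensitivity(cases, ct_cutoff=30.0):
    """Sensitivity among high (Ct <= cutoff) and low (Ct > cutoff)
    viral-load cases; all records are reference-standard positive."""
    out = {}
    for label, selected in (
        ("high", [c for c in cases if c[0] <= ct_cutoff]),
        ("low", [c for c in cases if c[0] > ct_cutoff]),
    ):
        detected = sum(1 for _, positive in selected if positive)
        out[label] = (detected, len(selected), detected / len(selected))
    return out

for label, (detected, n, sens) in stratified_sensitivity(cases).items():
    print(f"{label} viral load: {detected}/{n} detected, sensitivity {sens:.0%}")
```

This reproduces only the per‐stratum proportions; the pooled subgroup estimates and their intervals in the table come from meta‐analysis across evaluations, not from a single combined calculation like this.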

Table Tests. Data tables by test

| Test | No. of studies | No. of participants |
| --- | --- | --- |
| 1 Antigen tests ‐ All | 58 | 23143 |
| 2 Antigen tests ‐ symptomatic | 42 | 16346 |
| 3 Antigen tests ‐ asymptomatic | 13 | 1596 |
| 4 Antigen tests ‐ mixed symptoms or not reported | 20 | 5447 |
| 5 Antigen tests ‐ Ct values < or <=25 | 36 | 3827 |
| 6 Antigen tests ‐ Ct values >25 | 36 | 2632 |
| 7 Antigen tests ‐ Ct values < or <=32/33 | 15 | 2127 |
| 8 Antigen tests ‐ Ct values >32/33 | 15 | 346 |
| 9 Antigen tests ‐ other Ct thresholds for 'higher' viral load | 13 | 1760 |
| 10 Antigen tests ‐ other Ct thresholds for 'lower' viral load | 13 | 739 |
| 11 Antigen tests ‐ week 1 after symptom onset | 26 | 5769 |
| 12 Antigen tests ‐ week 2 after symptom onset | 22 | 935 |
| 13 Molecular tests ‐ all | 32 | 4537 |
| 14 Molecular tests ‐ all (before discrepant analysis) | 6 | 1533 |
| 15 Molecular tests ‐ all (after discrepant analysis) | 6 | 1533 |
| 16 Molecular tests ‐ Ct values < or <=30 | 6 | 204 |
| 17 Molecular tests ‐ Ct values >30 | 6 | 149 |
| 18 Molecular tests ‐ other Ct thresholds for 'higher' viral load | 4 | 75 |
| 19 Molecular tests ‐ other Ct thresholds for 'lower' viral load | 4 | 168 |
| 20 Molecular tests ‐ other sites | 3 | 316 |
| 21 Antigen tests ‐ direct comparisons | 11 | 3631 |
| 22 AAZ ‐ COVID‐VIRO (CGIA) | 2 | 880 |
| 23 Abbott ‐ Panbio Covid‐19 Ag (CGIA) | 11 | 5691 |
| 24 Becton Dickinson ‐ BD Veritor (LFA – method not specified) | 3 | 727 |
| 25 BIONOTE ‐ NowCheck COVID‐19 Ag (LFA – method not specified) | 1 | 400 |
| 26 Biosynex ‐ Biosynex COVID‐19 Ag BSS (CGIA) | 1 | 634 |
| 27 Coris Bioconcept ‐ COVID‐19 Ag Respi‐Strip (CGIA) | 7 | 1781 |
| 28 E25Bio ‐ DART (NP) (CGIA) | 1 | 190 |
| 29 Fujirebio ‐ ESPLINE SARS‐CoV‐2 [LFA(ALP)] | 2 | 265 |
| 30 Inhouse (Bioeasy co‐author) ‐ n/a (FIA) | 1 | 239 |
| 31 Innova Medical Group ‐ Innova SARS‐CoV‐2 Ag (CGIA) | 6 | 3904 |
| 32 Liming Bio‐Products ‐ StrongStep® COVID‐19 Ag (CGIA) | 1 | 19 |
| 33 Quidel Corporation ‐ SOFIA SARS Antigen (FIA) | 1 | 64 |
| 34 RapiGEN ‐ BIOCREDIT COVID‐19 Ag (CGIA) | 6 | 2170 |
| 35 Roche ‐ SARS‐CoV‐2 (LFA – method not specified) | 1 | 73 |
| 36 Savant Biotech ‐ Huaketai SARS‐CoV‐2 N Protein (LFA – method not specified) | 1 | 109 |
| 37 SD Biosensor ‐ STANDARD F COVID‐19 Ag (FIA) | 4 | 1552 |
| 38 SD Biosensor ‐ STANDARD Q COVID‐19 Ag (CGIA) | 6 | 3480 |
| 39 Shenzhen Bioeasy Biotech ‐ 2019‐nCoV Ag (FIA) | 3 | 965 |
| 40 Abbott ‐ ID NOW (Isothermal PCR) | 13 | 1949 |
| 41 Cepheid ‐ Xpert Xpress (Automated RT‐PCR) | 15 | 1781 |
| 42 DNANudge – COVID Nudge (Automated RT‐PCR) | 1 | 386 |
| 43 DRW ‐ SAMBA II (Automated RT‐PCR) | 2 | 321 |
| 44 Mesa Biotech ‐ Accula (other molecular) | 1 | 100 |
| 45 Antigen test evaluations ‐ Single group design | 29 | 15336 |
| 46 Antigen test evaluations ‐ Two group design | 20 | 5729 |
| 47 Antigen test evaluations ‐ Unclear design | 2 | 549 |
| 48 Molecular test evaluations ‐ Single group design | 18 | 2899 |
| 49 Molecular test evaluations ‐ Two group design | 9 | 1265 |
| 50 Molecular test evaluations ‐ Unclear design | 2 | 187 |
