Characteristics of included in vitro SRs/MAs
Among the 244 in vitro SRs/MAs included in the analysis, 150 articles (60.7%) employed reporting guidelines for SRs/MAs. Of these, 146 articles used the PRISMA checklist, whereas only one study each followed the Quality of Reporting of Meta-analyses (QUOROM) checklist, the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement, and the Oral Health Assessment Tool (OHAT) (Table 1). Only 100 of the articles that followed SR/MA guidelines reported their QA results. The list of the 244 included articles and the QA tools used in these in vitro SRs/MAs is provided in Table S4.
Table 1
Principal characteristics of the included in vitro SRs/MAs and the QA tools they used
Year of publication | 2007–2014 | 24 (9.8%) |
2015–2020 | 220 (90.2%) |
Region | Europe | 99 (40.6%) |
South America | 64 (26.2%) |
Asia | 33 (13.5%) |
Middle East | 26 (10.7%) |
North America | 14 (5.7%) |
Australia | 5 (2%) |
Africa | 3 (1.2%) |
Study topic | Dentistry | 125 (51.2%) |
Bioactivity | 53 (21.7%) |
Biology | 31 (12.7%) |
Methodology | 13 (5.3%) |
Materials | 9 (3.7%) |
Pharmacology | 5 (2%) |
Diagnosis | 4 (1.6%) |
Toxicity | 4 (1.6%) |
Reporting guideline used | PRISMA | 143 (58.6%) |
None | 93 (38.1%) |
PRISMA and AMSTAR | 3 (1.3%) |
Cochrane Handbook for Systematic Reviews of Interventions | 2 (0.8%) |
STROBE | 1 (0.4%) |
QUOROM | 1 (0.4%) |
OHAT | 1 (0.4%) |
QA performed | Yes | 126 (51.6%) |
No | 118 (48.4%) |
Meta-analysis conducted | Yes | 71 (29.1%) |
No | 173 (70.9%) |
QA tool used | Not reported | 120 (49.2%) |
Following previous description of Onofre et al. | 29 (11.9%) |
Developed by authors | 28 (11.5%) |
Cochrane Risk of Bias tool | 12 (4.9%) |
CONSORT | 8 (3.3%) |
ToxRTool | 5 (2%) |
OHAT | 4 (1.6%) |
Joanna Briggs Institute Critical Appraisal Checklist | 4 (1.6%) |
MINORS | 4 (1.6%) |
QUADAS-2 | 4 (1.6%) |
GRADE | 3 (1.2%) |
NOS | 3 (1.2%) |
Following previous description of Onofre et al. and Montagner et al. | 2 (0.8%) |
STROBE | 2 (0.8%) |
Following the previous description of Bader et al. | 1 (0.4%) |
Following the previous description of Sackett et al. | 1 (0.4%) |
JADAD | 1 (0.4%) |
SciRAP method | 1 (0.4%) |
CASP and MINORS | 1 (0.4%) |
Timmer’s Analysis Tool | 1 (0.4%) |
ARRIVE | 1 (0.4%) |
QUADAS | 1 (0.4%) |
Modified Quality Assessment Tool for Studies with Diverse Designs (QATSDD) | 1 (0.4%) |
Referencing CRH and the EBM Evidence Pyramid | 1 (0.4%) |
Nature Publication Quality Improvement Project (NPQIP) study | 1 (0.4%) |
Standard Quality Assessment Criteria for Evaluating Primary Research Papers from a Variety of Fields | 1 (0.4%) |
Following previous description of Samuel et al. | 1 (0.4%) |
SYRCLE | 1 (0.4%) |
World Cancer Research Fund/University of Bristol tool for cell lines | 1 (0.4%) |
CRIS guidelines | 1 (0.4%) |
Following the Joanna Briggs Institute Critical Appraisal Checklist for Experimental Studies | 1 (0.4%) |
PRISMA | 1 (0.4%) |
Downs and Black | 1 (0.4%) |
Among the 244 included studies, 126 articles (51.6%) performed QA. Only 26 of these 126 articles developed their own QA tools while conducting their reviews, whereas 100 articles employed available tools; of the latter, 34 studies followed QA checklists previously described by other authors, and the remainder assessed the risk of bias using pre-structured QA tools.
Regarding the distribution of the included studies by continent, Europe had the largest representation with 99 studies (40.7%), while 65 (26.6%), 14 (5.7%), 27 (11.1%), and 31 (12.7%) were from South America, North America, the Middle East, and Asia, respectively. Three studies (1.2%) were from Africa, and five studies (2%) were from Australia. Table S3 provides the characteristics of all included SRs/MAs.
The number of published in vitro SRs/MAs increased slowly from 2007 to 2014 and then rose rapidly in the following years until 2020. Of the 244 included articles, 126 (51.6%) conducted methodological QA. Although no SR/MA assessed QA in 2007 or 2008, the proportion of included in vitro SRs/MAs performing QA steadily increased over the search period.
We identified 51 different available QA tools. Of these, 48 tools were retrieved from within the included studies in the first phase, and three tools were found through the Google search engine in the second phase: a tool for in vitro diagnostics (IVD), a tool for artificial rumen systems, and OHAT. We found that 26 of the tools (51%), all from the first phase, were developed by the review authors themselves [20–45], while the other 25 tools were pre-structured, comprising 22 from the first phase and the three from the second phase, and accounted for the remaining 49%. Among the 26 author-developed QA tools, 20 (76.9%) specialized in dentistry studies, whereas two (7.7%) were applied in methodology, two (7.7%) in bioactivity studies, and two (7.7%) in biology studies. Of the author-developed tools, 17 (65.38%) [20, 21, 23, 26–34, 39, 40, 42, 44, 45] contained items that could only be used in specific fields (mainly dentistry), while nine (34.62%) [22, 24, 25, 35–38, 41, 43] provided criteria for general reviews, as shown in Table 3. Tools designed for a specific type of study often contained unique factors directly related to the test materials and outcomes in the reviews, for example teeth free of caries, specimen preparation, specimen dimensions, the enamel antagonist, specimen shape, enzyme concentration, sample storage conditions, or the devices used. The tool authors were also highly concerned with methodological bias that could affect the reliability of outcomes, namely sample size calculation, randomization of samples, blinding of the examiner, and the appropriate form of statistical analysis. In contrast, tools intended for general SRs/MAs either evaluated the reliability of the methodology for reporting results in general terms [43] or consisted of items assessing each step of the study (objective, sequence generation, blinding, selection bias, detection bias, performance bias, reporting bias). The majority of the author-developed tools (11 tools, 42.3%) were structured as simple checklists: they contained only questions requiring an answer of "yes", "no", or "not reported", and the overall risk of bias was decided by the number of "yes" or "no" answers. Seven checklists with judgment (26.9%) contained multiple items that required the authors to provide a detailed assessment and to compare it between studies. Finally, eight scale tools (30.8%) rated the quality of each item by assigning points, for instance reported = 1 point and not reported = 0 points; some tools rated the quality of each domain on a multi-level scale (0–4 points). The summary score of each study then determined whether it was judged to have a high, low, or unclear risk of bias.
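To illustrate how a scale-type tool of this kind aggregates item ratings into an overall judgment, a minimal sketch is given below; the item names and the score cut-offs are hypothetical and are not taken from any particular published tool.

```python
# Minimal sketch of a scale-type QA tool: each item scores 1 point if reported
# and 0 if not, and the summary score is mapped to an overall judgment.
# Item names and cut-offs are hypothetical, for illustration only.

SCALE_ITEMS = [
    "sample_size_calculation",
    "randomization_of_samples",
    "blinding_of_examiner",
    "appropriate_statistical_analysis",
]

def score_study(reported_items):
    """Return (summary score, overall judgment) for one primary study."""
    score = sum(1 for item in SCALE_ITEMS if item in reported_items)
    if score == len(SCALE_ITEMS):
        judgment = "low risk of bias"
    elif score >= len(SCALE_ITEMS) // 2:
        judgment = "unclear risk of bias"
    else:
        judgment = "high risk of bias"
    return score, judgment

# Example: a study reporting only randomization and examiner blinding.
print(score_study({"randomization_of_samples", "blinding_of_examiner"}))
# -> (2, 'unclear risk of bias')
```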
Table 3
Summary results comparing the identified tools by type
 | Tools developed by the authors | Pre-structured tools |
Purpose |
Items used in specific fields | 17 (65.38%) | 5 (20%) |
Items used for general systematic reviews | 9 (34.62%) | 20 (80%) |
Characteristics |
Simple checklist | 11 (42.3%) | 4 (16%) |
Checklist with judgment | 7 (26.9%) | 6 (24%) |
Scale | 8 (30.8%) | 15 (60%) |
Total (number, 100%) | 26, 100% | 25, 100% |
In contrast, of the 25 pre-structured tools, approximately 20 (80%) were used for general SRs/MAs. The exceptions were the QA tool for IVD [46], a tool for in vitro studies using an artificial rumen [47], a tool specialized in studies on cell lines, and two tools for the evaluation of toxicological/ecotoxicological data [48, 49]. In general, there were four simple checklists, six checklists with judgment, and 15 scales (Table 3). The IVD and artificial rumen tools are checklists with judgment; the assessments entirely required the examiners to reach their conclusions on the basis of the available criteria. The tool for IVDs suggested validations relating to their technical characteristics, namely technical specifications suitable for registry purposes, the format of the technical file, the manufacturers, proper distribution, and cost-effectiveness.
Similarly, the validation established for experiments with an artificial rumen focused on specific criteria, namely the assessment of microorganisms, dividing protozoa, incubation periods, digestion, and the interaction between the chemicals used. Meanwhile, the tool specializing in cell lines (World Cancer Research Fund/University of Bristol) also highlighted the cell line characteristics, the number of experimental repetitions, and the selective reporting of outcomes. The first tool for toxicological/ecotoxicological information was developed by Klimisch et al. [49]. Its criteria focused entirely on factors affecting the results, namely the test substances (their purity/origin/composition and their concentrations/doses), the test systems (their suitability, the physical and chemical characteristics of the medium, and negative/positive controls), and the method used to measure the results (an appropriate statistical method). These authors suggested four levels of reliability: reliable without restriction, reliable with restriction, not reliable, and not assignable. However, this approach did not provide specific guidance for the quality evaluation. In 2009, Schneider et al. [48] developed a more detailed tool named ToxRTool, based on Klimisch et al.'s suggestion [49], to address this flaw. The ToxRTool for in vitro studies includes 18 questions evaluating the test substance, the test system, the description of the study design, the documentation of the study results, and the plausibility of the study design and data. For each criterion reported, the study receives one point, and the summary score initially determines its level of quality. However, Schneider et al. [48] indicated that certain critical criteria downgrade the overall level if the study does not report them. The evaluators give their decision after considering both the summary score and the answers to the critical questions.
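A minimal sketch of this two-step logic (summary score plus critical-question override) follows; the criterion names, the number of criteria, and the cut-offs are simplified assumptions for illustration and do not reproduce the official ToxRTool scoring instructions.

```python
# Sketch of a ToxRTool-like evaluation: every reported criterion scores 1 point;
# the summary score suggests a reliability category, but any unreported
# "critical" criterion downgrades the study regardless of the total.
# Criterion names, the number of criteria, and cut-offs are illustrative only.

CRITERIA = {
    # criterion name: True if it is treated as a critical criterion
    "test_substance_identified": True,
    "test_system_described": True,
    "study_design_described": False,
    "results_documented": True,
    "statistical_method_appropriate": False,
}

def evaluate(reported):
    score = sum(1 for c in CRITERIA if c in reported)
    missing_critical = [c for c, critical in CRITERIA.items()
                        if critical and c not in reported]
    # Provisional category from the summary score (illustrative cut-offs).
    if score >= 4:
        category = "reliable without restriction"
    elif score >= 3:
        category = "reliable with restriction"
    else:
        category = "not reliable"
    # An unreported critical criterion overrides the score-based category.
    if missing_critical:
        category = "not reliable"
    return category

# Example: four of five criteria reported, all critical criteria present.
print(evaluate({"test_substance_identified", "test_system_described",
                "study_design_described", "results_documented"}))
# -> 'reliable without restriction'
```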
The 20 pre-structured tools for general reviews emphasized bias arising from the detection or selection of samples, the balance of baseline characteristics, the completeness of reported outcomes, and sequence generation. Two of these tools (the EBM Evidence Pyramid and the GRADE tool) were wrongly used as assessment tools for methodological quality or risk of bias: Xiao et al. [58] used the EBM Evidence Pyramid to evaluate methodological quality, while Pavan et al. [59] used the GRADE tool to assess the risk of bias of their included studies. However, we still included them in our research as exceptional cases of QA tools applied by the authors of SRs/MAs. Among these 20 pre-structured tools, the QA tool referring to CRH and the EBM Evidence Pyramid [50] might be classified as the most straightforward checklist. It has four levels and defines the grade of quality based on the study design (SRs/MAs of in vitro studies = A) and the baseline characteristics (comparable baseline = B, unknown baseline = C, no similar baseline = D). However, it is inappropriate to evaluate the methodology of an SR/MA only on the basis of the baseline characteristics. The GRADE tool [51] grades the quality of a body of evidence (from high to very low quality) and uses six domains (study design, inconsistency, indirectness, limitations, imprecision, and publication bias) to adjust this initial assessment of quality downward or upward. Therefore, the GRADE tool instructs authors to define the critical outcomes and evaluate the quality of the evidence for those outcomes rather than to assess an individual study's risk of bias.
For the remaining 18 tools, although they included both checklists with preset questions requiring yes/no answers and lists of domains requiring the assessor's own judgment, their items fall into the following domains: the rationale of the study, samples, randomization, blinding, procedures, reported outcomes, discussion evaluation, and other bias (Table 4). The criteria were highly varied. The most popular criterion, the use of an appropriate analysis, was mentioned in the Cochrane Collaboration tool [60], the Joanna Briggs Institute Critical Appraisal Checklist for Experimental Studies [61], Timmer's Analysis Tool [52], and OHAT [53]. Other principal criteria were the description of data collection, the blinding of samples and of investigators/assessors, the choice of an appropriate method, the reporting of all outcomes mentioned in the methods, and the reporting of missing data; each of these criteria was mentioned by three tools in Table 4. The less highlighted criteria included a reasonable sample size, an appropriate method of data collection, representative samples, balanced baseline characteristics between intervention groups, detailed sample data, randomization of the allocation sequence, assurance that the samples received the proper procedure, appropriate control/reference standards, and adjustment for confounders. Finally, the criteria rated by only one tool were the rationale of the study, the description of the sample collection tool, the description of the control/reference standard for the sample, adequate randomization, blinding of the allocation sequence, a full description of the procedures, an identical procedure between groups, an appropriate control/reference standard for the procedure, the ability for replication, the justification of the method of analysis, an identical analysis between groups, the reporting of complete data, no selective reporting of results, the reporting of intermediate results, and the requirement that the findings be reflected in a clinical trial.
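To make this cross-tool comparison concrete, the short sketch below encodes a few rows of Table 4 as a criterion-to-tools mapping and tallies how many of the five tools rate each criterion; the tool abbreviations are ours, and only a subset of rows is shown.

```python
# A few rows of Table 4 encoded as a mapping from criterion to the set of tools
# that rate it. Tool abbreviations: CC = Cochrane Collaboration, JBI = Joanna
# Briggs Institute checklist, QUADAS, TIM = Timmer's Analysis Tool, OHAT.
CRITERIA_BY_TOOL = {
    "appropriate analysis": {"CC", "JBI", "TIM", "OHAT"},
    "description of data collection": {"JBI", "QUADAS", "TIM"},
    "blinding of investigators/assessors": {"CC", "TIM", "OHAT"},
    "missing data reported": {"CC", "QUADAS", "TIM"},
    "rationale of study": {"JBI"},
}

# Tally how many of the five tools cover each criterion, most widely rated first.
for criterion, tools in sorted(CRITERIA_BY_TOOL.items(), key=lambda kv: -len(kv[1])):
    print(f"{criterion}: rated by {len(tools)} of 5 tools")
```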
Table 4
The criteria rated by five tools (Cochrane Collaboration, Joanna Briggs Institute Critical Appraisal Checklist for Experimental Studies, QUADAS tool, Timmer's Analysis Tool, OHAT)
Domain | Criterion | Cochrane Collaboration | Joanna Briggs Institute | QUADAS | Timmer's Analysis Tool | OHAT |
Rationale of study | Rationale of study | – | + | – | – | – |
Sample |
| Reasonable sample size | – | + | – | + | – |
Description of data collection | – | + | + | + | – |
Appropriate method of data collection | – | + | – | + | – |
Sample collection tool | – | + | – | – | – |
Representative/appropriate samples | – | – | + | + | – |
The balanced baseline characteristics between intervention groups | + | – | – | – | + |
Detailed sample data | – | + | – | + | – |
Description of control/reference standard | – | – | + | – | – |
Appropriate control/reference | – | – | – | + | – |
Randomization |
| Randomization of allocation sequence | + | – | – | + | – |
Adequate randomization | – | – | – | – | + |
Blinding |
| Allocation sequence | + | – | – | – | – |
Sample/Participants | + | – | – | + | + |
Investigators/Assessors | + | – | – | + | + |
Procedure |
| Full description of procedures | – | + | – | – | – |
Samples received proper procedure | + | – | + | – | – |
Identical procedure between groups | – | – | – | – | + |
Choice of appropriate method | – | – | + | + | + |
Appropriate control/reference standard | – | – | + | – | – |
The ability for replication | – | – | + | – | – |
Appropriate analysis | + | + | – | + | + |
Justification of method analysis | – | + | – | – | – |
Identical analysis between groups | + | – | – | – | – |
Adjustment for confounders | – | – | – | + | + |
Reporting outcomes |
| Complete reported results | + | – | – | + | + |
Complete data | + | – | – | – | – |
No selection of reported results | + | – | – | – | – |
Intermediate results reported | – | – | + | – | – |
Missing data reported | + | – | + | + | – |
Clinical practice reflection | – | – | + | – | – |