For preclinical studies, outcomes considered will include clinically relevant outcomes such as mortality, morbidity and adverse events. Structural outcomes considered include results of histological analyses, including grading of pathology according to the Osteoarthritis Research Society International (OARSI) histopathology initiative guidelines for specific animal models [33–37], and other commonly used measures such as the Histological-Histochemical Grading System (HHGS)/Mankin score and its modifications, and the O’Driscoll and Pineda scores [38]. Outcomes of noninvasive imaging, including cartilage thickness [39] and the presence of osteosclerotic lesions or intraosseous cysts [40] visible on MRI, will also be included for analysis. Functional outcomes considered for analysis will include behavioural and mechanical measures of nociception and gait analysis, such as hind paw weight, as appropriate to specific species [25].
For clinical studies, outcomes considered for analysis will include clinically relevant outcomes, such as mortality, morbidity and adverse events, classified as per the US Department of Health and Human Services Common Terminology Criteria for Adverse Events [41]. Structural outcomes will include results of arthroscopic evaluation, specifically ratings of severity such as the International Cartilage Repair Society (ICRS) clinical cartilage injury classification system [42, 43] and the Oswestry Arthroscopic Score (OAS) [44]. Also considered will be the results of medical imaging, including ratings of X-rays such as the Kellgren-Lawrence (KL) Classification of Osteoarthritis [45] and ratings of pathology via magnetic resonance imaging such as the OMERACT Knee Inflammation MRI Scoring System (KIMRISS) [46], the Boston Leeds Osteoarthritis Knee Score (BLOKS) [47], the MRI Osteoarthritis Knee Score (MOAKS) [48] and the Whole-Organ Magnetic Resonance Imaging Score (WORMS) [49]. Results of histological analyses considered for analysis include grading systems such as the HHGS [50] and the OARSI Cartilage Histopathology Assessment System [51]. Patient-reported outcomes considered for review include validated measures of treatment response [52, 53], including measures of knee function, pain, quality of life and patient satisfaction, such as the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) [54], the Knee injury and Osteoarthritis Outcome Score (KOOS) [55], the Knee Pain Scale (KPS) [56] and visual analogue scales (VAS). Objective functional outcomes including strength, range of motion, locomotion, gait and proprioception will also be examined if reported in included studies.
Risk of bias
The Systematic Review Centre for Laboratory Animal Experimentation (SYRCLE) risk of bias tool will be applied to preclinical (animal) studies [57]. This assessment tool is adapted from the Cochrane risk of bias tool for randomized controlled trials with human participants [58], and the two tools overlap substantially. Risk of bias for included studies will be scored independently by two reviewers, with consensus reached by discussion. The ROBINS-I (‘Risk of Bias In Non-randomized Studies - of Interventions’) tool [59] will be used to assess the observational studies eligible for inclusion. Potential risks will be assessed over seven bias domains: baseline confounding, participant selection, classification of intervention, deviations from intended intervention, missing data, outcome measurement and reporting [59, 60]. For any randomized trials, the RoB 2.0 tool will be used to rate risk of bias [61]. An overall risk of bias judgement will be determined as either low, moderate, serious or critical risk of bias or no information for each specified outcome. Where more than one outcome of an included study is to be assessed, the risk of bias assessment across the seven domains will be repeated for each key outcome, and a risk of bias judgement will be reported for all outcomes.
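As an illustration of how the seven domain-level ratings might be combined into one overall judgement per outcome, the following is a minimal Python sketch. It assumes a "worst domain" rule in the spirit of the ROBINS-I guidance: a serious or critical rating in any domain determines the overall judgement, and otherwise any domain lacking information leads to an overall rating of no information. The function name `overall_judgement` and the dictionary layout are hypothetical and are not part of the protocol.

```python
# Hypothetical sketch: combine seven domain ratings into an overall judgement.
# The aggregation rule is an assumption modelled on ROBINS-I guidance,
# not a procedure specified in this protocol.

ROBINS_I_DOMAINS = (
    "baseline confounding",
    "participant selection",
    "classification of intervention",
    "deviations from intended intervention",
    "missing data",
    "outcome measurement",
    "reporting",
)

# Ordered severity of the substantive ratings.
SEVERITY = {"low": 0, "moderate": 1, "serious": 2, "critical": 3}
NO_INFORMATION = "no information"


def overall_judgement(domain_judgements):
    """Return the overall rating for one outcome of one study."""
    missing = [d for d in ROBINS_I_DOMAINS if d not in domain_judgements]
    if missing:
        raise ValueError(f"unrated domains: {missing}")

    values = [domain_judgements[d] for d in ROBINS_I_DOMAINS]
    for value in values:
        if value not in SEVERITY and value != NO_INFORMATION:
            raise ValueError(f"unknown rating: {value!r}")

    rated = [v for v in values if v != NO_INFORMATION]
    worst = max(rated, key=SEVERITY.__getitem__, default=None)

    # A serious or critical domain decides the overall rating outright.
    if worst in ("serious", "critical"):
        return worst
    # Otherwise, any domain without information leaves the outcome unrated.
    if len(rated) < len(values):
        return NO_INFORMATION
    return worst
```

Calling the function once per key outcome mirrors the per-outcome repetition described above; disagreements between the two reviewers would still be resolved by discussion before the ratings are entered.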
Confidence in cumulative evidence
The revised and validated methodological index for non-randomized studies (MINORS) criteria [63] will be used to assess the strength of non-randomized studies included in the review. The MINORS tool applies a scoring system across 12 items to assess the methodological and scientific value of studies, with the first 8 items relating to non-comparative studies and all 12 items relevant for comparative studies. Each item will be scored from 0 to 2, with 0 indicating that the item is not reported, 1 indicating inadequate reporting and 2 indicating adequate reporting of the item in the evaluated study, giving maximum scores of 16 for non-comparative studies and 24 for comparative studies. The MINORS score for non-randomized studies will be categorized as follows: 0 < MINORS score < 6 indicates very low quality of evidence, 6 ≤ MINORS score < 10 indicates low quality of evidence, 10 ≤ MINORS score < 14 indicates fair quality of evidence and MINORS score > 15 indicates good quality of evidence. Where randomized controlled trials are included, in the context of a primary comparison between alternative interventions with respect to the review outcomes, the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system will be utilized to assess study quality [58]. For preclinical evidence, the methods proposed by Hooijmans et al. [64] will be used to rate the quality of evidence against the Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines for animal research [65].
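The MINORS arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `minors_total` and `minors_category` are hypothetical, and the bands are coded exactly as stated in the text. Totals that fall outside the stated bands (0, 14 and 15) are returned as "unclassified" rather than being assigned to a neighbouring band, since the text does not say how they should be handled.

```python
# Hypothetical sketch of MINORS scoring and categorization.
# Bands follow the text literally; uncovered totals are left unclassified.

def minors_total(item_scores, comparative):
    """Sum item scores: 8 items for non-comparative, 12 for comparative studies."""
    expected_items = 12 if comparative else 8
    if len(item_scores) != expected_items:
        raise ValueError(f"expected {expected_items} items, got {len(item_scores)}")
    if any(score not in (0, 1, 2) for score in item_scores):
        raise ValueError("each item must be scored 0, 1 or 2")
    return sum(item_scores)


def minors_category(total):
    """Map a MINORS total to the quality band stated in the text."""
    if 0 < total < 6:
        return "very low"
    if 6 <= total < 10:
        return "low"
    if 10 <= total < 14:
        return "fair"
    if total > 15:
        return "good"
    # Totals of 0, 14 and 15 are not covered by the stated bands.
    return "unclassified"
```

For example, a non-comparative study scoring 2 on five items and 1 on the remaining three would total 13 and fall in the fair band.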