Data source and cohort identification
The main data source for the present study was the administrative data repository of the Veneto Region, North East Italy. All healthcare contacts involving the Region’s approximately 5 million inhabitants are recorded to report expenditures to the central government. To complement this infrastructure, a regional Health Information Exchange (rHIE) system has been implemented for the real-time sharing of healthcare documents [
18], including laboratory reports. This was a retrospective, observational study involving the entire Veneto region. The initial subject pool comprised all Italian citizens resident in the Region who, according to Veneto’s register of healthcare beneficiaries [
19] had been eligible beneficiaries for at least 1 year between January 1st, 2011 and September 30th, 2018, or time of death. For each subject, we collected all available information, including exemptions from co-payment, and all administrative claims concerning prescriptions, refills, and hospitalizations (procedures and post-discharge diagnosis codes). In the absence of a centralized diabetes registry, we applied a validated claims-based algorithm with 97.6% precision, 95.7% recall, and 87.9% specificity [
20] in identifying citizens affected by diabetes. Among these, we selected all new initiators of GLP-1RA (exenatide, liraglutide, lixisenatide, dulaglutide) or DPP-4i (sitagliptin, vildagliptin, alogliptin, linagliptin, saxagliptin) who had started their therapies within the observation window but had not been treated with fast-acting insulin or the other drug. This exclusion criterion was applied because, in Italy, the combination of fast-acting insulin and GLP-1RA or DPP-4i was not reimbursed; in addition, even spot use of fast-acting insulin is considered a proxy of disease severity or intercurrent illness. The distinction between ongoing and newly initiated therapies was based on the presence (or absence, respectively) of prescriptions of each drug within 7 months of the first prescription of an A10-class drug in the patient’s claims. We defined the date of first appearance of either a GLP-1RA (ATC A10BJ) or a DPP-4i (A10BH, A10BD07-13, A10BD19, A10BD21, or A10BD24-25) after this 7-month period as the patient’s index date. The 7-month delay was chosen based on a sensitivity analysis comparing prescription with refill rates, showing that the vast majority of prescriptions are refilled within 7 months. In our primary, “as treated” (AT) analysis, we followed each subject from the index date until therapy discontinuation or the last available observation. In a sensitivity analysis, we followed an “intention to treat” (ITT) approach, disregarding therapy discontinuation as a censoring criterion.
Since prescription of cardioprotective drugs can reflect the perception of an imminent cardiovascular event or a planned cardiovascular intervention, we ignored all events occurring within 2 months of the index date to limit this reverse causality. This delay also allows hospitalization administrative claims to appear in the repository.
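The ATC-based definition of drug class and index date given above can be illustrated with a minimal Python sketch. The function names, the representation of claims as (month, ATC code) pairs, and the reading of the 7-month rule (any incretin claim within 7 months of the first A10 claim marks an ongoing therapy, with the boundary month included) are assumptions made for illustration; they are not taken from the study code.

```python
def classify_atc(code):
    """Map an ATC code to 'GLP-1RA', 'DPP-4i', or None (codes as listed in the text)."""
    code = code.strip().upper()
    if code.startswith("A10BJ"):
        return "GLP-1RA"
    if code.startswith("A10BH"):
        return "DPP-4i"
    # DPP-4i fixed combinations: A10BD07-13, A10BD19, A10BD21, A10BD24-25
    if code.startswith("A10BD") and code[5:7].isdigit():
        n = int(code[5:7])
        if 7 <= n <= 13 or n in (19, 21, 24, 25):
            return "DPP-4i"
    return None


def find_index(claims, washout_months=7):
    """Return (index_month, drug_class) for a new initiator, else None.

    claims: list of (month, atc_code) pairs, months counted from an arbitrary origin.
    Assumption: an incretin claim within the washout window marks an ongoing therapy.
    """
    a10_months = [m for m, c in claims if c.strip().upper().startswith("A10")]
    if not a10_months:
        return None
    first_a10 = min(a10_months)
    incretin = sorted((m, classify_atc(c)) for m, c in claims if classify_atc(c))
    if not incretin:
        return None
    first_month, first_class = incretin[0]
    if first_month - first_a10 <= washout_months:
        return None  # treated as ongoing (prevalent) therapy
    return first_month, first_class
```

For example, a patient whose first A10 claim is metformin at month 0 and whose first GLP-1RA claim falls at month 10 would receive month 10 as the index date under these assumptions.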
Outcome definition
The primary outcome was a modified definition of the 3-point major adverse cardiovascular event (3P-MACE), i.e., a combination of myocardial infarction, stroke, or all-cause death. Due to the unavailability of causes of death, all-cause death was used in place of the traditional cardiovascular death within the 3P-MACE. This modification was considered acceptable because about 70% of deaths in people with diabetes are caused by cardiovascular disease [
21]. Secondary endpoints were: individual components of the 3P-MACE and hospitalization for heart failure (HHF). Operatively, the presence of the following ICD-9-CM diagnosis codes in a patient’s claims denoted the occurrence of the corresponding endpoint: 410-414 myocardial infarction, 431-436 stroke, 428 hospitalization for heart failure. Due to the time resolution of anonymized dates of death, all event times were expressed in months.
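As a minimal sketch of the operational definition above, the following Python function maps an ICD-9-CM diagnosis code to the corresponding endpoint using the code ranges stated in the text. The function names and the assumption that codes may be stored with or without a decimal point are illustrative only.

```python
def endpoint_from_icd9(code):
    """Return the endpoint denoted by an ICD-9-CM diagnosis code, or None."""
    prefix = code.strip().split(".")[0][:3]  # three-digit category
    if not prefix.isdigit():
        return None  # e.g. V or E codes
    n = int(prefix)
    if 410 <= n <= 414:
        return "myocardial infarction"
    if 431 <= n <= 436:
        return "stroke"
    if n == 428:
        return "hospitalization for heart failure"
    return None


def is_3p_mace(codes, died):
    """Modified 3P-MACE: myocardial infarction, stroke, or all-cause death."""
    hits = {endpoint_from_icd9(c) for c in codes}
    return died or "myocardial infarction" in hits or "stroke" in hits
```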
Propensity score matching and statistical analysis
We balanced GLP-1RA and DPP-4i initiators via propensity score matching (PSM), using the nearest neighbor method and the logit distance, with maximum caliper set to 0.06% of the propensity score (PS) standard deviation. The estimated PS were the output of a logistic regression model trained on patients’ characteristics, i.e., age at index date, sex, claims-based history length (months between the first available claim and the index date), claims-based diabetes duration (months between the first diabetes-related claim and the index date); pre-existing conditions, i.e., hypertension, dyslipidemia, peripheral circulatory complications, myocardial infarction, ischemic heart disease, stroke or TIA, heart failure, cardiovascular disease, neurological complications, ocular complications, renal complications, chronic kidney disease, severe hypoglycemia, chronic pulmonary disease, systemic inflammatory disease, cancer, Charlson comorbidity index [
22,
23]; glucose lowering medications in the entire patient’s history, i.e., number of different A10B-class drugs (“blood glucose lowering drugs, excluding insulins”) and insulin therapy; use of glucose lowering medications in the year before the index date, including long-acting insulin, metformin, sulfonylureas, SGLT-2i, pioglitazone; and use of other drugs in the year before the index date, including ACE inhibitors, diuretics, beta blockers, other antihypertensives, statins, fibrates or omega-3, PCSK9 inhibitors, ezetimibe, and platelet aggregation inhibitors. Additional file
1: Table S1 reports the definition of these variables via administrative claims.
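Nearest-neighbor matching on the logit of the propensity score can be sketched as follows. This is a simplified illustration, not the study implementation: it assumes greedy 1:1 matching without replacement (the text does not specify these details), takes the propensity scores as already estimated, and expects the caliper as an absolute distance on the logit scale supplied by the caller.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def match_nearest(ps_treated, ps_control, caliper):
    """Greedy 1:1 nearest-neighbor matching on the logit of the propensity score.

    Returns a list of (treated_index, control_index) pairs. A treated subject
    stays unmatched if no available control lies within the caliper.
    """
    lt = [logit(p) for p in ps_treated]
    lc = [logit(p) for p in ps_control]
    available = set(range(len(lc)))
    pairs = []
    for i, x in enumerate(lt):
        if not available:
            break
        # sorted() makes tie-breaking deterministic (lowest index wins)
        j = min(sorted(available), key=lambda k: abs(lc[k] - x))
        if abs(lc[j] - x) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs
```

Greedy matching depends on the order in which treated subjects are processed; optimal matching avoids this at a higher computational cost.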
We tested the balance obtained by PSM using the chi-square test for dichotomous variables, and Mann–Whitney’s U test for age at index date, claims-based history length, claims-based diabetes duration, Charlson index, and number of A10B-class drugs. We defined the two cohorts to be well-balanced if all associated p-values were greater than 0.05 or the effect sizes were sufficiently small (standardized mean difference between −0.10 and 0.10). Laboratory data were available for a limited subset of subjects. Hence, following a previously published approach [
24], we verified whether good balance in administrative claims would translate into good balance in the laboratory data closest to the index date. The criteria for this balance assessment were the same as in the previous evaluation (p > 0.05 or absolute SMD < 0.10). Laboratory variables were fasting glucose, HbA1c, total cholesterol, HDL cholesterol, LDL cholesterol, triglycerides, eGFR (CKD-EPI formula [
25]). Systolic blood pressure and diastolic blood pressure were also recorded, when available.
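The balance criterion (p > 0.05 or absolute SMD < 0.10) can be illustrated with a short Python sketch. The SMD below uses the common pooled-standard-deviation form, which is an assumption since the text does not give the exact formula; the p-value is taken as an input computed elsewhere.

```python
import math
import statistics


def standardized_mean_difference(a, b):
    """SMD between two samples, using the pooled standard deviation."""
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2.0)
    if pooled_sd == 0:
        return 0.0
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def well_balanced(p_value, smd, alpha=0.05, smd_limit=0.10):
    """A variable is balanced if p > alpha or |SMD| < smd_limit."""
    return p_value > alpha or abs(smd) < smd_limit
```

In large matched cohorts, small and clinically irrelevant differences can still yield p < 0.05, which is why the SMD criterion is useful as an alternative.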
In our primary analysis, we followed the AT approach and compared hazard ratios (HRs) for GLP-1RA and DPP-4i initiators in terms of 3P-MACE, its components, and hospitalization for heart failure. We also performed an ITT sensitivity analysis within the same framework.
Additionally, we implemented the following supplementary and exploratory analyses: (1) comparison of HRs for all cardiovascular endpoints in subgroups stratified by pre-existing CVD; (2) comparison of HRs for the primary outcome (3P-MACE) in subjects who were female vs. male, aged 65 or older vs. 64 or younger, with claims-based diabetes duration above or below the median (91 months), treated vs. untreated with long-acting insulin, treated vs. untreated with sulfonylureas, treated vs. untreated with statins, treated vs. untreated with ACE inhibitors or sartans; (3) comparison of DPP-4i versus human-based (liraglutide, dulaglutide) or exendin-based (exenatide, lixisenatide) GLP-1RA.
For all analyses, we used Cox regression to estimate hazard ratios and tested statistical significance at the 0.05 level.
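To make the hazard ratio estimate concrete, the sketch below fits a Cox model with a single binary covariate (treatment group) by Newton's method on the partial likelihood, handling tied event times with the Breslow approximation. It is a didactic simplification: the handling of ties, the software, and any additional covariates used in the study are not stated in the text, and all names are hypothetical.

```python
import math


def cox_hr_binary(times, events, treated, n_iter=25, tol=1e-10):
    """Hazard ratio (treated vs. untreated) from a one-covariate Cox model.

    times: follow-up times (e.g. months); events: 1 if the event occurred,
    0 if censored; treated: 1 or 0. Ties use the Breslow approximation.
    """
    rows = list(zip(times, events, treated))
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for t_i, e_i, x_i in rows:
            if not e_i:
                continue
            # Risk set: subjects still under observation at time t_i
            n1 = sum(1 for t, _, x in rows if t >= t_i and x == 1)
            n0 = sum(1 for t, _, x in rows if t >= t_i and x == 0)
            w = n1 * math.exp(beta)
            p = w / (n0 + w)  # weighted share of treated subjects in the risk set
            score += x_i - p
            info += p * (1.0 - p)
        if info == 0:
            break
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return math.exp(beta)
```

A hazard ratio below 1 indicates a lower event rate in the treated group; with two groups that have identical follow-up and events, the estimate is 1.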