Strengths
Computerisation provides relatively inexpensive and easy access to large volumes of data. The resulting databases are population based and can quickly produce large samples of patients. Because of the longitudinal nature of the data, many research questions can be answered in a more time- and cost-effective manner than with other study designs (e.g. newly set up cohort studies).
The most frequently used design is the retrospective cohort design, which enables researchers to analyze, within a cohort framework, large amounts of data that were collected and carefully stored in the past. The information was collected in tempore non suspecto, without any knowledge of the objectives of the actual study, which avoids the main risks of bias in prospective study designs, especially selection and recall bias. Thus, the retrospective design guarantees that the measurement of predictor variables was not biased by knowledge of which subjects had the outcome of interest. In a prospective design, knowledge of exposure status may bias classification of the outcome. Also, in a prospective design, being in the study may alter participants' behavior. Since data in a registry are routinely collected, this participation effect does not apply to retrospective study designs. Compared to a case–control design, an advantage of the retrospective cohort design is that all of the subjects who developed the outcome (cases) and all those who did not (controls) come from the same population.
A major strength of these types of cohort studies in general is the possibility to study multiple exposures and multiple outcomes in one cohort. Even rare exposures can be studied, and the combined effect of multiple exposures on disease risk can be determined. Hypothesis generation is another benefit: cohort studies can be used to pick up associations between many exposures and outcomes. Yet because of the lack of randomization, cohort studies do not permit conclusions about causality. Instead, several underlying etiological hypotheses can be generated, to be tested in other confirmatory studies.
Deckers et al. formulated minimal criteria for a primary care network [25]. Computerisation provides relatively inexpensive and easy access to large volumes of data. A sufficient sample size is advised to be about 1% of the population, which allows the study of common diseases. Increasing the sample size much further will increase the workload, but is not expected to yield additional information. Intego covers more than 2% of the Flemish population and is highly representative with respect to age and gender. Age is recorded on a continuous scale and can be divided into as many age groups as needed. Data from the previous registration year are collected in the first half of the year. Although data are not collected on a weekly basis, they can be reconstructed in retrospect to obtain weekly, monthly or even daily information.
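The retrospective reconstruction of daily, weekly and monthly information can be illustrated with a minimal sketch. It assumes that each registered event carries a calendar date; the function name `counts_by_period` and the example dates are hypothetical and do not describe the actual Intego extraction procedure.

```python
from collections import Counter
from datetime import date


def counts_by_period(event_dates):
    """Aggregate dated events into daily, ISO-weekly and monthly counts.

    event_dates: iterable of datetime.date objects, one per registered event
    (an assumed input format, chosen for illustration only).
    """
    daily = Counter()
    weekly = Counter()
    monthly = Counter()
    for d in event_dates:
        iso = d.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        daily[d] += 1
        weekly[(iso[0], iso[1])] += 1
        monthly[(d.year, d.month)] += 1
    return daily, weekly, monthly


# Hypothetical diagnosis dates for one condition
example_dates = [date(2012, 1, 2), date(2012, 1, 3), date(2012, 1, 9), date(2012, 2, 1)]
daily, weekly, monthly = counts_by_period(example_dates)
```

Keying the weekly counts on the ISO year and week number avoids mixing the last days of December with the first week of the following year.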
Because Intego is based on routinely collected data and the study population is selected with broader inclusion criteria and fewer exclusion criteria than in an RCT, its results may be more generalizable to clinical practice.
Weaknesses
Some variables that may be important confounders in public health research are not measured (occupation, employment, and socioeconomic status), are measured imprecisely, or are only partially registered (such as smoking or mortality). Information from specialists as well as events that occur in hospital may not be fully captured in the electronic health record. Over-the-counter medications and treatments given in hospital are not readily available. Exposure status (e.g. onset of treatment, exact year of diagnosis) may be missed because the exposure occurred prior to the start of the registration.
The most ill patients may die, and others may be lost to follow-up because they moved. This will bias the results when follow-up rates differ between the index and reference groups. Loss to follow-up can be related to the future outcome. The Healthy Survivor Effect (HSE) can be described as a continuing selection process in the cohort due to the survival or retention of the healthiest individuals, where the survival or retention process may differ between the selected groups (e.g. patients with or without diabetes). The study will include only patients remaining in the system, i.e. a healthier survivor population. HSE can lead to lower than expected outcome rates and can distort exposure-outcome associations between groups (e.g. the statin effect in a longitudinal design in elderly diabetic patients with and without CKD). Even when there is no exposure-disease relationship, higher cumulative exposure can appear protective of health. The effects of the healthy survivor effect may vary by gender, race/ethnicity, social class, work status, age at inclusion, length of follow-up or cause of mortality/morbidity.
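How differential loss to follow-up can create a spurious protective association may be shown with a small worked calculation. This is a deliberately simplified sketch: it assumes each group consists of a frail and a robust stratum with fixed outcome risks, and that patients are lost before the outcome is ascertained. The function name and all numbers are hypothetical and are not Intego results.

```python
def observed_risk(n_frail, n_robust, risk_frail, risk_robust,
                  retained_frail, retained_robust):
    """Outcome risk observed among the patients who remain in follow-up.

    retained_* are the fractions of each stratum still in the registry
    (illustrative parameters, not estimated from any real data).
    """
    kept_frail = n_frail * retained_frail
    kept_robust = n_robust * retained_robust
    expected_events = kept_frail * risk_frail + kept_robust * risk_robust
    return expected_events / (kept_frail + kept_robust)


# Both groups have the same composition and the same true risk of 0.25.
# In the index group half of the frail patients are lost to follow-up;
# in the reference group nobody is lost.
risk_index = observed_risk(500, 500, 0.4, 0.1, retained_frail=0.5, retained_robust=1.0)
risk_reference = observed_risk(500, 500, 0.4, 0.1, retained_frail=1.0, retained_robust=1.0)
observed_risk_ratio = risk_index / risk_reference
```

In this example the observed risk in the index group is about 0.20 against 0.25 in the reference group, so the observed risk ratio is about 0.8 although the true risk ratio is 1.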
Differential misclassification can lead to an overestimation or underestimation of the effect between exposure and outcome. Classification of individuals (exposure or outcome status) can also be affected by changes in diagnostic procedures.
Intego is based on routinely collected data, which raises two questions. First, does the general practitioner make a correct diagnosis or assessment? This refers to the discussion about diagnosis (cough, cold, ...) in primary care, but also to the problem of episode registration and changing diagnoses (e.g. Mycoplasma pneumoniae infection). Criteria-based diagnoses are more accurate than symptom-based diagnoses. Second, has the diagnosis or assessment been correctly coded in the electronic health record? This question refers to the sensitivity (or completeness) and the positive predictive value (or correctness) of electronic health record-based data [26].
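Sensitivity (completeness) and positive predictive value (correctness) can be computed from a comparison of coded diagnoses with a reference standard. The following is a minimal sketch assuming such a reference standard (for example a chart review) is available; the function name and the counts are hypothetical.

```python
def sensitivity_and_ppv(true_pos, false_pos, false_neg):
    """Return (sensitivity, positive predictive value) of coded diagnoses.

    true_pos:  patients coded with the diagnosis who truly have it
    false_pos: patients coded with the diagnosis who do not have it
    false_neg: patients who have the diagnosis but are not coded
    """
    truly_diseased = true_pos + false_neg
    coded = true_pos + false_pos
    if truly_diseased == 0 or coded == 0:
        raise ValueError("sensitivity and PPV are undefined for empty denominators")
    sensitivity = true_pos / truly_diseased  # completeness of registration
    ppv = true_pos / coded                   # correctness of registration
    return sensitivity, ppv


# Hypothetical validation sample: 90 correct codes, 10 incorrect codes,
# 30 true cases that were never coded.
sens, ppv = sensitivity_and_ppv(true_pos=90, false_pos=10, false_neg=30)
```

With these illustrative counts the sensitivity is 0.75 and the positive predictive value is 0.90, i.e. the coded diagnoses are mostly correct but a quarter of the true cases are missing from the record.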
Although the patient population is representative of the Flemish population, the registering general practitioners are not representative of the general practitioner population. They are a selected group of high-quality registering practitioners who use a specific electronic health record. This selection bias among general practitioners could have an influence on some process parameters in the follow-up of patients.
Difficulties arise with tracers when some drugs are also prescribed for other conditions. For example, anti-epileptics are also prescribed for chronic pain. A general practitioner might also have a very specific reason for prescribing a specific drug to a specific person. The lower associations of dementia, migraine and Parkinson's disease with their tracer medications were probably caused by this lack of specificity.
Future
In the near future, large databases with all kinds of information will be widely available. Some of these registries will contain data of almost the entire population of a country or a region. This is a challenging situation for the existing databases collecting data from a smaller sample population. We will have to consider redefining our strategic goals, methods and procedures.
The option of collecting either blood or buccal smear samples of patients in Intego is being explored [27]. Together with background characteristics and a full inventory of recorded morbidity, laboratory results and prescribed drugs, this would enable research on gene-environment and pharmaco-genetic interactions. This can only be performed after strict informed consent and with the agreement of the ethical review board, the national privacy authorities, and the Flemish authorities supervising the Flemish Biobank activities.
We will continue to explicitly structure quality procedures with training and coaching, introduce criteria for diagnosis (for example GOLD criteria for COPD classification) and list missing data for possible confounders, such as smoking and SES. We will check the validity of our 3 existing quality criteria and continue to work out and implement internal and external validation procedures. We are convinced that this approach will earn a quality label that enables small registries like Intego to serve as a gold standard for large databases whose data are often extracted from 'bad' registrars.
Finally, registries like Intego, with a stable group of practising general practitioners, also offer possibilities for targeted, experimental, randomized trials.