Register-based birth cohorts ultimately make it possible to study the effects of prenatal and childhood exposures throughout the life course and to identify sensitive or critical time windows, as is already clear from studies carried out using the Finnish Birth Register [17, 19, 20]. In the current work, our special focus is on the early postnatal period, with lifelong follow-up remaining a readily available option. The linkage of multiple health registers supports comprehensive case identification and follow-up. The current baseline cohort size of 1.75 million children gives high study power, sufficient to study even rare endpoints such as birth malformations and childhood cancer.
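To illustrate how cohort size translates into study power for a rare endpoint, the following minimal sketch approximates the power of a two-sided comparison of two proportions using the normal approximation. Only the cohort size of 1.75 million is taken from the text; the exposure prevalence, baseline risk and relative risk are hypothetical values chosen for illustration, and the function name `power_two_proportions` is our own.

```python
from statistics import NormalDist


def power_two_proportions(n_exposed, n_unexposed, risk_unexposed,
                          relative_risk, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two risks.

    Uses the normal approximation and ignores the negligible
    contribution of the opposite tail.
    """
    p0 = risk_unexposed
    p1 = risk_unexposed * relative_risk
    n_total = n_exposed + n_unexposed
    # Pooled risk, used for the standard error under the null hypothesis
    p_bar = (n_exposed * p1 + n_unexposed * p0) / n_total
    se_null = (p_bar * (1 - p_bar) * (1 / n_exposed + 1 / n_unexposed)) ** 0.5
    # Standard error under the alternative hypothesis
    se_alt = (p1 * (1 - p1) / n_exposed + p0 * (1 - p0) / n_unexposed) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


COHORT_SIZE = 1_750_000        # baseline cohort size stated in the text
EXPOSURE_PREVALENCE = 0.15     # hypothetical share of exposed children
BASELINE_RISK = 0.002          # hypothetical risk of a rare endpoint
RELATIVE_RISK = 1.2            # hypothetical effect size

n_exp = COHORT_SIZE * EXPOSURE_PREVALENCE
n_unexp = COHORT_SIZE - n_exp

power_full = power_two_proportions(n_exp, n_unexp, BASELINE_RISK, RELATIVE_RISK)
# Same assumptions in a cohort one tenth of the size, for comparison
power_tenth = power_two_proportions(n_exp / 10, n_unexp / 10,
                                    BASELINE_RISK, RELATIVE_RISK)

print(f"power, full cohort:  {power_full:.3f}")
print(f"power, tenth cohort: {power_tenth:.3f}")
```

Under these assumed inputs the calculation shows how a modest relative risk for a rare outcome can be detectable in the full cohort but not in a much smaller one. A real power analysis would use outcome-specific incidence and exposure data.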
Scientific knowledge gaps and potential of the MATEX birth cohort
Maternal smoking is an established cause of low birth weight and preterm birth [21]. Nevertheless, there is controversy about the association between maternal smoking and childhood cancers or specific congenital anomalies. These outcomes have low incidence rates, so large studies are needed to reach sufficient study power. A case-control design is often applied to capture most cases without requiring the size of a cohort study [45]. However, the case-control design is retrospective, which may lead to recall bias for exposures during pregnancy that may have occurred decades earlier. Additionally, both case-control and cohort designs may be biased in the recruitment of the controls or the study population. A register-based approach minimises the potential for both biases [5]. The health registers cover virtually the whole population, so there is virtually no risk that some population groups are over- or under-represented. The use of exposure databases minimises the risk of recall bias.
The fully register-based design is limited to the available data, which, in the case of the Finnish Birth Register, have been collected over the 29-year recruitment period. It should be noted that some important confounding variables, such as maternal height and weight and the socioeconomic group, are missing for certain years. Additionally, some possibly important confounding factors are not available at all, for example paternal smoking, alcohol consumption during pregnancy and physical activity. Furthermore, nicotine replacement therapy is potentially important in the light of the fetotoxicity of nicotine found in animal studies (for a recent review, see [46]). However, it has been used only for a short period and the data have not been systematically collected in the birth register. Overall, except for maternal smoking, no lifestyle information is available. Due to the large cohort size, it is not feasible to collect additional data via questionnaires or interviews. For variables that are available only for a restricted period, the cohort can be analysed in two ways (once with adjustment for the variable and once without) to investigate the magnitude of the confounding.
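The comparison of an adjusted and an unadjusted analysis can be sketched as follows. The text does not specify an estimation method; as an assumption, this example contrasts a crude odds ratio with a Mantel-Haenszel odds ratio stratified by one confounder. All counts are hypothetical and the function names are our own.

```python
def crude_odds_ratio(strata):
    """Odds ratio from the 2x2 table obtained by collapsing all strata."""
    a = sum(s[0] for s in strata)  # exposed cases
    b = sum(s[1] for s in strata)  # exposed non-cases
    c = sum(s[2] for s in strata)  # unexposed cases
    d = sum(s[3] for s in strata)  # unexposed non-cases
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel odds ratio, adjusted for the stratifying variable."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts per confounder level, in the order
# (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases)
strata = [
    (10, 90, 20, 180),   # confounder level 1
    (60, 140, 30, 70),   # confounder level 2
]

or_crude = crude_odds_ratio(strata)
or_adjusted = mantel_haenszel_odds_ratio(strata)
# Relative change in the estimate caused by adjustment
change_percent = 100 * (or_crude - or_adjusted) / or_adjusted

print(f"crude OR:    {or_crude:.3f}")
print(f"adjusted OR: {or_adjusted:.3f}")
print(f"change due to adjustment: {change_percent:.1f}%")
```

In this constructed example the odds ratio is 1 within each stratum, so the entire crude association is attributable to the confounder. In the cohort, the same comparison restricted to the years in which the variable was recorded would indicate how much bias to expect in the years without it.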
Harmful effects of prenatal exposure to cigarette smoke on the second generation, for example through germ cell mutations following maternal smoking during pregnancy or paternal preconceptional smoking [47], have been implicated but are not well established, while the corresponding effects of air pollution have not been studied at all. The MATEX cohort has been recruited over a long enough period that we can identify pregnancies of women who were themselves included in the cohort at birth. The oldest members of the cohort are currently 29 years old and the mean maternal age at pregnancy is 29 years. Later on, the current focus on maternal smoking and air pollution may be widened to other prenatal exposures. The Finnish Maternity Cohort, which contains first-trimester serum samples from 2 million pregnant women since 1983 (national coverage 95% of all pregnancies), as well as other Finnish blood and serum banks, can be used to assess exposure to chemicals [42]. The availability of address history makes it possible to investigate exposures emitted from stationary sources, such as (nuclear) power plants, industrially contaminated sites, and high-voltage power lines or transformer stations (extremely low frequency magnetic fields).
Nordic health registers are to a great extent similar, which opens the possibility for Nordic collaboration to increase the cohort size even further [48]. This would increase the study power enough to investigate rare outcomes associated with low-prevalence exposures, such as the use of illegal drugs.
Register-based epidemiology is restricted to data that are routinely collected in registers. Most registers, however, have been designed not for research purposes per se, but rather for statistical purposes. Hence, some information that is important and interesting for research is missing. Information on lifestyle is not available, except for maternal smoking in the MBR. No data about paternal smoking or the use of nicotine products, such as chewing gum and skin patches, are available. Additionally, data on alcohol consumption, physical activity, eating habits and other exposures are missing. This limited data availability restricts the possibility to adjust for confounders.
Ethical and legal considerations
The routine collection of data for health registers means more work for the health professionals who collect the data, as well as costs to collect the data and maintain the registers. These costs are covered by public funding. Utilizing data that are collected and stored anyway is cost-efficient. Society has an ethical duty to use the available data to improve public health and the health services. Utilization of the data not only for statistics, but also for research, justifies the increased workload and the costs of maintaining the registers. Because individuals barely have a chance to voice their opinion on whether they want their data to be collected or used in research, the research community should do its best to make use of the data in a responsible way, not only for science, but especially to improve public health and health services for the individuals. Additionally, the scientific community has the ethical responsibility to disseminate the results and conclusions both within the scientific community and to the general public (e.g. [49]). This means that the results should be published not only in scientific journals, but also in general newspapers and responsible social media in plain language, ensuring that the public benefits, too.
Finnish and European legislation (Act on the Openness of Government Activities (621/1999) [50]; Section 8.4, Personal Data Act (523/1999) [51]; European-level Regulation (EU) 2016/679 (adopted 27 April 2016, applicable from 25 May 2018) [52]) regulates the use of personal data for various purposes, including research. Generally (Declaration of Helsinki [53]), and according to Finnish law (488/1999) [54], research must be based on the consent of research subjects, unless obtaining consent is unduly difficult and the research cannot be carried out without using the data. In this case, the prerequisites set out in the law must be satisfied for an exemption from the need for informed consent. The fully register-based study design qualifies for such an exemption. Besides, the majority of the Finnish public considers the benefit for public health more important than the individual right to privacy [55]. However, in the study by Eloranta and Auvinen [55], information about ongoing and new register-based research was deemed inadequate, and register-based research was in general seen as an unfamiliar topic. In the MATEX study, only coded information without PINs and names is used, which prevents the direct identification of individuals. Additionally, the statistical analyses do not require us to work with the data of individuals, but only with the data as a set. The publication plan includes this study protocol, which makes it possible to inform society how the data will be used. Additionally, the results of the analyses will be published not only in scientific articles, but also in newspaper articles aimed at the general public.
Data protection is crucial when health data are used, because it is important to protect privacy and prevent misuse of the data (e.g. [49, 56]). Individuals may be identified based on their characteristics and health history, even in data without direct identifiers. In human biomedical studies, a positive statement from an ethics committee is required by law in Finland (488/1999) before the request for data is filed with the register holder. Among the crucial aspects are that data protection is sufficient and that a plan exists for what will be done with the data once the study is finished [5]. For register-based studies, no ethics committee statement is required in Finland.