Background
Non-alcoholic fatty liver disease (NAFLD) is rapidly becoming the most common cause of chronic liver disease worldwide [1]. NAFLD is a spectrum of diseases that encompasses uncomplicated steatosis, non-alcoholic steatohepatitis (NASH) and fibrosis, which in a small proportion can lead to complications including cirrhosis, liver failure and hepatocellular carcinoma [2]. NAFLD is a multisystem disease with a multidirectional relationship with the metabolic syndrome [3–5]. NAFLD is associated with increased risk of cardiovascular disease [5–7] and cancer [8]. Among other high-risk groups [9], people with diabetes and NAFLD are at increased risk of micro- and macrovascular complications [10, 11] and these patients have a twofold increased risk of all-cause mortality [12].
The estimated point prevalence of NAFLD in the general Western population is 20–30%, largely based on cohort studies with heterogeneous inclusion criteria and research methods [13]. The prevalence of NAFLD rises to 40–70% among patients with type 2 diabetes and up to 90% among patients with morbid obesity [14–16]. Moreover, as the rates of diabetes and obesity rise worldwide, it is expected that NAFLD will become even more common. NAFLD-related cirrhosis is currently the third most common indication for liver transplantation in the USA and is anticipated to become the leading indication within the next one to two decades [17].
There is much debate about whether screening programmes in the general population or in at-risk groups, such as people with diabetes [9], should be implemented [18, 19]. This debate is based on our current understanding of the epidemiology and natural history of NAFLD, which, in turn, derives from cohort or cross-sectional studies [13]. These are often highly selected studies of individuals with metabolic risk factors, or they involve extensive phenotyping that would be unrealistic in routine practice.
A pragmatic approach is to focus on real-world patients for whom the diagnosis of NAFLD has been made during routine clinical care. A diagnosis of NAFLD is often made following abnormal imaging of the liver or elevated serum liver enzymes (so-called liver function tests) and involves exclusion of other causes of liver injury, such as excess alcohol consumption and viral hepatitis. Although routinely collected data can represent only the visible part of the clinical iceberg, there is a growing body of literature that has used well-curated electronic health records (EHRs) to study disease characteristics and epidemiology in large numbers of people [20–22].
In many European countries where health care is largely state funded and there are low or absent primary-care co-payments, the population has unrestricted access to health care with primary-care physicians acting as gatekeepers (including referral to secondary care) [23]. Healthy people register with primary-care centres when they move to an area so that they can access health care when it is needed, and so primary-care EHRs represent data that are as close to a general population as possible, with near-universal coverage of the population in the region where the data are collected. Recording of a diagnosis in European primary-care databases is not driven by reimbursement and the patient population is relatively stable compared to other types of EHRs, such as US claims databases. Primary-care databases hold comprehensive medical records, which include diagnoses, prescriptions, laboratory values, lifestyle and health measures, and demographic information for a large and representative sample of patients. Concerns around the degree of data completeness are now largely historic as the vast majority of practices are paper-free and, therefore, these data represent the only clinical record for care, administration and reimbursement. Thus, within the areas that utilise these databases, coverage is near universal. If a practice joins the database, all the patients of that practice are registered in the database. Although there is an option for individual patients to opt out, uptake of this option is minimal (<1%).
In this study, we harmonised health-care records for 17.7 million adults from four large European primary-health-care databases to estimate the prevalence and incidence of recorded diagnoses of NAFLD and, where available, NASH, in patients in primary care and to compare these with estimates from cohort studies. We sought to ascertain the changes in prevalence and incidence of recorded diagnoses of NAFLD from 2007 to 2015, and the effect of age and sex. We compared the characteristics of patients with an NAFLD diagnosis in the different databases and reported, where possible, the proportion of patients with markers of advanced disease in the diagnosed population.
Discussion
In the largest real-world study of its kind to date, we report the incidence and prevalence of recorded NAFLD diagnoses among 17.7 million adults in four different European countries.
The databases used have been validated, are broadly representative of the population of each country and have been extensively used for pharmaco-epidemiology research [17, 20] (Additional file 1: Table S1). Despite a rise in incidence, our study found a large shortfall in Europe between the expected number of patients with NAFLD and NASH and the number with recorded diagnoses. Although others have suggested that this might be the case at a local level or in small questionnaire-based exercises [32], this study has identified the scale of that diagnostic gap across four European territories. Under-recording of NAFLD in primary care may reflect (i) missed opportunities to make the diagnosis by investigating abnormal liver enzyme values or imaging findings, (ii) a lack of confidence to make the diagnosis even if liver enzymes are in the reference range or (iii) under-recognition of the diagnosis in secondary care. Furthermore, many patients who do have the diagnosis have not had the investigations required for appropriate risk-stratification and, therefore, specialist care may not be offered to those in greatest need. The current study represents a departure from existing population-level study designs of NAFLD. Notwithstanding the limitations discussed below, by using real-world data, we have gained insight into current practice and attitudes to NAFLD and into the changing face of NAFLD in primary care.
We used UMLS semantic harmonisation to extract primary-care EHR data and identify 176,114 patients with a recorded diagnosis of NAFLD. Despite variations in coding systems, in the characteristics of the populations and in the health-care systems in each country, the results from all four territories are broadly consistent. They show rising incidence and prevalence of NAFLD; however, the levels of recorded NAFLD in primary-care EHR databases are many-fold lower than those anticipated from prior observational studies, which estimated the prevalence of NAFLD in the general European population to be 20–30% [33]. The characteristics of patients in that study were comparable with those with NAFLD in a recent systematic review of the literature and meta-analysis that included 101 studies [13]. The meta-analysis reported the European prevalence of NAFLD diagnosed by imaging to be 24% (95% CI: 16–34%) and diagnosed by blood tests to be 13% (95% CI: 4–33%). Thus, our pooled prevalence in European EHR databases of 1.9% is at best ~1/6 and more likely only ~1/12 of the estimates based on cohort data. Our estimates of incidence in 2015 ranged from 1.1 to 4.1 per 1000 and are approximately 10 times lower than expected based on cohort studies: 28 (95% CI: 19–41) per 1000 person-years in Israel and 52 (95% CI: 28–97) per 1000 in Asia [13].
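The size of the gap between recorded and expected prevalence can be checked with simple arithmetic. The sketch below uses only the percentages quoted in the paragraph above; the function and variable names are illustrative and are not part of the study's analysis code.

```python
# Figures quoted in the text above (percent of adults).
pooled_ehr_prevalence = 1.9       # recorded NAFLD, pooled across the four EHR databases
cohort_prevalence = {
    "blood tests": 13.0,          # meta-analysis estimate (95% CI: 4-33%)
    "imaging": 24.0,              # meta-analysis estimate (95% CI: 16-34%)
}


def fold_difference(expected, observed):
    """Return how many times larger the expected figure is than the observed one."""
    return expected / observed


for method, expected in cohort_prevalence.items():
    ratio = fold_difference(expected, pooled_ehr_prevalence)
    print(f"{method}: recorded prevalence is about 1/{ratio:.1f} of the cohort estimate")
```

The two ratios come to roughly 6.8 and 12.6, in line with the rounded fractions of ~1/6 and ~1/12 given in the text.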
The prevalence of NAFLD diagnosis has trebled and incidence has doubled over the period of this study. The rising rates of co-morbid conditions such as diabetes and obesity may be responsible for this. Other probable factors include increased awareness among primary-care and non-liver physicians, improved communication of the diagnosis from secondary to primary care, and the increased use of blood tests and imaging to investigate common complaints such as abdominal pain or to monitor long-term conditions. Our data do not allow us to test these hypotheses further; however, studies from other groups also suggest that the total number of people developing NAFLD is rising, as is the number of people with NAFLD who develop life-threatening complications [13].
Despite the consistency in overall findings, the differences between the databases are indicative of differing practices. SIDIAP had a relatively large proportion of patients with a history of alcohol abuse (14.1%), although all databases included at least some NAFLD patients with recorded alcohol abuse. This reflects uncertainty in the community as to whether an individual can have fatty liver disease associated with metabolic syndrome even if they drink alcohol in excess of recommended limits, or indeed have any other cause of chronic liver injury such as viral hepatitis. While clinical trials make very precise distinctions between alcoholic and non-alcoholic fatty liver disease, the reality is that an obese, diabetic and hypertensive patient can consume alcohol in excess of recommended limits and have liver injury. There is no way to distinguish which aetiology is the dominant cause, and so clinicians are quite comfortable with co-existing diagnoses. Indeed, some authors now refer to BAFLD – both alcoholic and fatty liver disease. An alternative explanation may be that specialists making the diagnosis of fatty liver are unaware of the high alcohol use, either because of under-reporting by patients or poor communication from GP practices.
In HSD, prevalence increased over time whereas incidence has decreased in recent years. This can be explained by a relatively stable population in which nearly all patients were enrolled in 2000 (see Additional file 1: Figure S3) and remained in the database until December 2015.
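To illustrate why prevalence can keep rising in a stable population while incidence falls, the following minimal sketch tracks cumulative cases in a closed cohort. The population size and yearly rates are hypothetical and are not taken from HSD, and the model deliberately ignores deaths, departures and resolution of the diagnosis.

```python
def closed_cohort(population, yearly_incidence_per_1000):
    """Track recorded cases in a cohort that nobody joins or leaves.

    Returns one (incidence per 1000 at risk, prevalence in %) pair per year.
    """
    cases = 0.0
    rows = []
    for rate in yearly_incidence_per_1000:
        at_risk = population - cases           # people without a diagnosis so far
        cases += at_risk * rate / 1000         # new diagnoses add to the existing pool
        rows.append((rate, 100 * cases / population))
    return rows


# Hypothetical, steadily falling incidence rates (not HSD data).
for rate, prevalence in closed_cohort(100_000, [4.0, 3.5, 3.0, 2.5, 2.0]):
    print(f"incidence {rate:.1f} per 1000, prevalence {prevalence:.2f}%")
```

Because diagnosed patients stay in the cohort, every year with any new diagnoses pushes prevalence up, even when fewer diagnoses are made than in the year before.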
Text-mining in IPCI increased the number of NAFLD diagnoses more than eightfold. This suggests that while the diagnosis of NAFLD is being made, GPs are not recording it, despite there being a code for liver steatosis in IPCI. IPCI had the lowest level of ALT recording. A recent survey of Dutch GPs explored attitudes to the importance of NAFLD [34]. Only 47% of doctors used liver tests in patients with NAFLD and non-invasive scores were never used by 73% of respondents (we were able to calculate FIB-4 scores in only 27% in IPCI).
The UK THIN database appears to be an outlier in several ways. The prevalence of recorded NAFLD in THIN (0.2%) is much lower than in the other databases and markedly lower than that found in a primary-care EHR study of almost 700,000 adults in London, UK (0.9%) [35]. Higher rates of alcohol recording in the UK alone are unlikely to account for all this difference. The median ALT was highest in THIN. This may suggest that the diagnosis of NAFLD is more likely to be made in the UK by investigating abnormal liver enzymes than in other territories. However, the data required to calculate FIB-4 were available in only 11% of patients in THIN (Additional file 1: Table S3). NAFLD patients in THIN had the highest mean BMI. Moreover, THIN had the highest proportion of NAFLD patients with diabetes or impaired fasting glucose and the highest proportion of NAFLD patients with high-risk FIB-4 scores. Large-scale liver-biopsy-based cross-sectional studies or replication of the current study in cohorts with systematic ascertainment of the components of FIB-4 would be needed to confirm that patients are diagnosed with NAFLD at more advanced stages in the UK compared to other European countries.
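The low proportions of patients with a calculable FIB-4 score follow from the fact that the index needs four inputs at once: age, AST, ALT and platelet count. The sketch below uses the standard published definition of FIB-4 (age × AST divided by platelet count × √ALT); the text here does not restate the formula, and the function name and example records are invented for illustration.

```python
import math
from typing import Optional


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l) -> Optional[float]:
    """Return the FIB-4 index, or None if any component is missing or non-positive.

    Units: age in years, AST and ALT in U/L, platelets in 10^9/L.
    """
    values = (age_years, ast_u_l, alt_u_l, platelets_10e9_l)
    if any(v is None or v <= 0 for v in values):
        return None
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


# Invented example records: (age, AST, ALT, platelets).
records = [
    (60, 40, 36, 200),
    (52, None, 30, 250),   # AST never measured
    (47, 35, 49, None),    # platelet count never measured
]
scores = [fib4(*record) for record in records]
computable = [score for score in scores if score is not None]
print(f"FIB-4 computable in {len(computable)} of {len(records)} records")
```

A single missing laboratory value is enough to make the score unavailable, which is why incomplete ALT, AST or platelet recording limits risk stratification.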
Limitations of the study
When interpreting the data, it is important to consider the following issues. In IPCI, a diagnostic code for NAFLD was not available; we therefore devised an algorithm based on the diagnostic code ‘liver steatosis’ with exclusion of excess alcohol consumption. We did not do this for all databases because the IPCI terminology contains only 1073 clinical terms and, therefore, general practitioners often utilise the free text to record information with greater precision, whereas the other coding systems contain many more such concepts: ICD9CM contains 40,855 terms, ICD10 contains 13,505 terms and Read Codes contains 347,568 terms [36].
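The IPCI case definition described above (a liver steatosis code, excluding excess alcohol consumption) can be sketched as a simple rule. This is only an illustration of the logic: the code labels below are placeholders rather than real IPCI codes, the data structure is hypothetical, and the free-text mining used in the study is not reproduced.

```python
from dataclasses import dataclass, field

# Placeholder labels, NOT real IPCI codes.
STEATOSIS_CODES = {"LIVER_STEATOSIS"}
EXCESS_ALCOHOL_CODES = {"EXCESS_ALCOHOL"}


@dataclass
class Patient:
    patient_id: str
    codes: set = field(default_factory=set)   # all diagnostic codes on the record


def has_recorded_nafld(patient):
    """Steatosis code present and no record of excess alcohol consumption."""
    has_steatosis = bool(patient.codes & STEATOSIS_CODES)
    has_excess_alcohol = bool(patient.codes & EXCESS_ALCOHOL_CODES)
    return has_steatosis and not has_excess_alcohol


patients = [
    Patient("a", {"LIVER_STEATOSIS"}),
    Patient("b", {"LIVER_STEATOSIS", "EXCESS_ALCOHOL"}),
    Patient("c", {"HYPERTENSION"}),
]
nafld_ids = [p.patient_id for p in patients if has_recorded_nafld(p)]
print(nafld_ids)
```

Only the first hypothetical patient meets the definition: the second is excluded by the alcohol record and the third has no steatosis code.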
The number of cases of recorded NASH is too small to make meaningful estimates of incidence and prevalence: NASH was recorded in only 2–4% of patients with NAFLD in THIN and SIDIAP, the databases in which NASH was coded. This is far short of the 12.2% estimated from a US biopsy-based study [37]. This shortfall between coded NASH and the true burden of disease is probably due to the same factors that result in under-recording of NAFLD diagnosis: recognition, referral and coding in primary care, and under-diagnosis or poor communication in secondary care.
It is not possible to verify the accuracy or origin of recorded diagnoses, although the characteristics of the patients derived from the four databases are in keeping with the population one would expect with a NAFLD diagnosis. Some individuals not in this study may have undiagnosed NAFLD. Therefore, our results do not represent the true disease burden in the epidemiological sense; rather, they tell us what is actually happening with people who currently have a diagnosis of NAFLD, and they can inform the arguments for or against greater action in this area. While we cannot exclude the possibility (however unlikely) that all the other millions of expected NAFLD patients exist in other databases, we do not draw any conclusions about people outside this dataset. Although primary-care data contain a large body of information, this does not diminish the value of well-phenotyped cohort studies in which NAFLD can be ascertained systematically using standardised screening methods (e.g. measuring liver enzymes or performing ultrasound in all patients). That said, the databases included in this study have been extensively used for research and have been validated for diagnoses other than NAFLD [24, 27, 38].
Conclusions
Clinical practice is evolving in this emerging field and as yet there are no recommendations to screen formally for NAFLD, even in high-risk groups [39, 40]. One school of thought is that if the only available intervention for NAFLD or NASH is lifestyle change, then doctors are already giving such advice to their patients, although the extent to which patients take up such advice varies. However, hepatic steatosis is an independent predictor of diabetes [41, 42] and could, therefore, identify patients who stand to benefit from lifestyle changes to prevent diabetes and hepatic complications. Furthermore, the emerging data suggesting that hepatic steatosis is an independent cardiovascular risk factor may be an additional incentive for physicians to increase their awareness of the early stages of NAFLD. At the more severe end of the scale, novel therapies targeted at NASH and fibrosis are already in phase III clinical trials and are expected to be available in the next few years. These may change the treatment paradigm. Therefore, the scale of the health-care challenge posed by NAFLD and its sequelae cannot simply be side-stepped by dismissing NAFLD as pre-disease. Further research is required to quantify the associations of NAFLD with outcomes and to determine whether Wilson’s criteria for effective screening can be fulfilled [43], thereby informing the screening debate.
Acknowledgements
EMIF is a collaboration between industry and academic partners that aims to develop common technical and governance solutions to facilitate access to diverse electronic medical and research data sources. These analyses were supported by the Innovative Medicines Initiative Joint Undertaking under EMIF grant agreement 115372, whose resources include financial contributions from the European Union’s Seventh Framework Programme (FP7/2007-2013) and in-kind contributions from European Federation of Pharmaceutical Industries and Associations companies. The authors would like to acknowledge Nicholas Galwey for his advice on the statistical methods, Alba Jene for her administrative support and support during submission to ethical review boards, and Derek Nunez for support during the early protocol design stage.