Background
Hemorrhagic fever with renal syndrome (HFRS), a rodent-borne disease caused by hantaviruses (family Bunyaviridae), is characterized by fever, acute renal dysfunction, and hemorrhagic manifestations [1–3]. HFRS was first recognized in northeastern China in 1931 and has been prevalent in many other parts of China since 1955. At present, it is highly endemic in mainland China, which accounts for 90 % of the total cases reported worldwide [4–7]. In response to the spread of HFRS in China, the Chinese Center for Disease Control and Prevention (CDC) established the National Notifiable Disease Surveillance System in 2004, which made the surveillance data for HFRS more accurate and comprehensive. A better understanding of the spatial distribution patterns and sociodemographic characteristics of HFRS would help to identify areas and populations at high risk. Early warnings are also essential for controlling or reducing the risk of outbreaks [8], and epidemic modeling and forecasting can be essential tools for preventing and controlling HFRS [9]. In epidemiology, autoregressive integrated moving average (ARIMA) models have been successfully applied to predict the incidence of infectious diseases such as HIV [10], influenza [11], malaria [12], and others [13–15].
This study aimed to describe the current situation of endemic HFRS in Yiyuan and to characterize its spatio-temporal and demographic distribution. Furthermore, we fitted ARIMA models and predicted the HFRS epidemic trend using SAS version 9.2 (SAS Institute, Cary, NC, USA). Our study was based on HFRS epidemic data from Yiyuan County, China, and could provide a basis for HFRS prevention and control there.
Methods
Study area
The study site is located in Yiyuan County (latitude 35°55′ ~ 36°23′ N and longitude 117°54′ ~ 118°31′ E), in the central part of Shandong province. Monthly HFRS cases reported during 2005–2014 were provided by Zibo CDC, and we were permitted to use the data. In China, HFRS is a nationally notifiable disease and hospital physicians must report every case of HFRS to the local health authority within 24 h.
Ethics statement
Ethical approval was given by the Ethics Review Committee of the Zibo Center for Disease Control and Prevention, and the study was conducted in compliance with the principles of the Declaration of Helsinki. Written informed consent for the use of their clinical samples was obtained from the patients, and all analyzed data were anonymized. In addition, consent for participants under 16 years of age was obtained from their parents or guardians.
Demographic distribution analysis
The demographic characteristics of HFRS cases in Yiyuan County from 2005 to 2014, including age, sex, and occupation, were analyzed using surveillance data. All HFRS cases were geo-coded and matched to the town-level polygon and point layers by administrative code using ArcGIS 9.3 (ESRI Inc., Redlands, CA, USA). To reduce the variation in incidence that arises in small populations and areas, the annualized average incidence of HFRS per 100 000 was calculated for each town over the 10-year period. The annualized average incidence and the proportion of monthly average incidence for each town were then mapped in gradient colors and pie charts, respectively. To approximately distinguish the dominant hantaviruses, we divided the year into three periods according to the seasonal distribution of HFRS cases: March to June (spring and early summer), July to August (summer), and September to February (autumn and winter).
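The two town-level quantities described above can be illustrated with a minimal sketch. It assumes the annualized average incidence is the mean yearly case count divided by a single reference population, which the text does not spell out; the function names, the population figure, and all case counts are hypothetical and are not surveillance data.

```python
# Illustrative sketch only: names and numbers are hypothetical.

# The three periods used in the text to separate the seasonal peaks.
PERIODS = {
    "Mar-Jun": (3, 4, 5, 6),
    "Jul-Aug": (7, 8),
    "Sep-Feb": (9, 10, 11, 12, 1, 2),
}


def annualized_incidence(cases_by_year, population, per=100_000):
    """Mean yearly case count divided by population, scaled to `per` persons.

    Assumes one reference population for the whole period.
    """
    if not cases_by_year or population <= 0:
        raise ValueError("need at least one year and a positive population")
    return sum(cases_by_year) / len(cases_by_year) / population * per


def seasonal_shares(cases_by_month):
    """Share of all cases falling in each period; keys are months 1..12."""
    total = sum(cases_by_month.values())
    if total == 0:
        return {name: 0.0 for name in PERIODS}
    return {
        name: sum(cases_by_month.get(m, 0) for m in months) / total
        for name, months in PERIODS.items()
    }


# One hypothetical town: ten yearly counts and a population of 50 000.
rate = annualized_incidence([3, 5, 2, 4, 6, 1, 0, 2, 3, 4], population=50_000)

# Hypothetical pooled monthly counts for the same town.
shares = seasonal_shares({1: 2, 3: 1, 7: 1, 10: 3, 11: 3})

print(f"annualized incidence per 100 000: {rate:.2f}")
print({name: round(value, 2) for name, value in shares.items()})
```

Averaging over the ten years before dividing by the population is what dampens the year-to-year swings that a town with only a handful of cases would otherwise show.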
Time-series analysis
ARIMA models are the most commonly used time series prediction models [10]. We constructed ARIMA models for monthly HFRS incidence in Yiyuan from 2005 to 2014. The seasonal form of ARIMA was designed to deal with highly seasonal data [16]. An ARIMA (p, d, q) model comprises three types of parameters [9, 16, 17]: the autoregressive order (p), the number of differencing passes (d), and the moving average order (q). The multiplicative seasonal ARIMA (p, d, q) × (P, D, Q)s model is an extension of the ARIMA method to time series in which a pattern repeats seasonally over time [15, 16, 18]. Analogous to the simple ARIMA parameters, the seasonal parameters are the seasonal autoregressive order (P), the number of seasonal differencing passes (D), and the seasonal moving average order (Q). The length of the seasonal period is represented by s. Because monthly HFRS incidence follows an annual cycle, s = 12 in the present study.
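For reference, the multiplicative seasonal model can be written compactly in the usual Box-Jenkins backshift notation. The symbols below (B, the polynomial names, and the noise term) are conventional textbook choices and are assumptions here, since the text does not define them; sign conventions for the moving average polynomials vary between software packages.

```latex
% x_t: monthly incidence; B: backshift operator, B x_t = x_{t-1};
% \varepsilon_t: white-noise error; s = 12 for monthly data.
\[
  \phi_p(B)\,\Phi_P(B^{s})\,(1 - B)^{d}\,(1 - B^{s})^{D}\,x_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t
\]
\[
  \phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}, \qquad
  \theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^{q}
\]
\[
  \Phi_P(B^{s}) = 1 - \Phi_1 B^{s} - \dots - \Phi_P B^{Ps}, \qquad
  \Theta_Q(B^{s}) = 1 - \Theta_1 B^{s} - \dots - \Theta_Q B^{Qs}
\]
```

With d = 1, D = 1, and s = 12, the differencing factor (1 − B)(1 − B^12) removes both a linear trend and a stable annual pattern before the autoregressive and moving average terms are fitted.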
We used the Box-Jenkins strategy to construct the models. The procedure consists of three iterative steps [15, 17, 18]: identification, estimation, and diagnostic checking. Before an ARIMA model is fitted, the series is usually differenced as needed to make it stationary. Identification is the process of determining the seasonal and non-seasonal orders using the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the transformed data. After identification, the parameters of the candidate ARIMA model(s) are estimated with the conditional least squares method. At the diagnostic stage, the adequacy of the fitted model is verified with white noise tests, which check whether the residuals are independent and normally distributed. Several ARIMA models may pass these steps, so an optimum model must be selected. Selection is usually based on the Akaike Information Criterion (AIC) and the Schwarz Bayesian Criterion (SBC). Smaller AIC values indicate a better model; the SBC, which builds on the AIC, also accounts for the residual error. The model with the lowest SBC value and a P value less than 0.05 was considered the best [19]. In addition, to check the accuracy of each model, the root mean square error (RMSE) between the observed and fitted numbers of HFRS infections from 2005 to 2014 was calculated. A lower RMSE indicates a better fit to the data. Finally, the fitted ARIMA model was used for short-term forecasting of monthly HFRS incidence between January and December 2014. All analyses were performed using SAS 9.2 with a significance level of p < 0.05.
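Several of the building blocks named above can be sketched in plain Python: differencing, the sample ACF, a white-noise statistic, and the RMSE. This is an illustration, not the SAS procedure used in the study. It takes the Ljung-Box Q statistic as the white-noise check, which is a common choice but an assumption here; it does not perform conditional least squares estimation; and the monthly series is synthetic.

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag]; the result is `lag` values shorter."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("a constant series has no defined autocorrelation")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic for residual autocorrelation up to max_lag.

    Q is compared with a chi-square quantile; a large Q suggests the
    residuals are not white noise.
    """
    n = len(residuals)
    r = acf(residuals, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))


def rmse(observed, fitted):
    """Root mean square error between observed and fitted values."""
    if not observed or len(observed) != len(fitted):
        raise ValueError("observed and fitted must be non-empty and equal in length")
    squared = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return math.sqrt(squared / len(observed))


# Synthetic monthly series: a repeating 12-month pattern plus a linear trend.
pattern = [5, 4, 3, 2, 1, 1, 1, 2, 3, 6, 9, 7]
series = [pattern[t % 12] + 0.1 * t for t in range(120)]

# D = 1 with s = 12 removes the annual pattern; d = 1 then removes the trend.
seasonal = difference(series, lag=12)
stationary = difference(seasonal, lag=1)

# Autocorrelation of the raw series at the seasonal lag.
lag12 = acf(series, 12)[-1]
print(len(stationary), round(lag12, 3))
```

In practice the ACF and PACF of the differenced series would guide the choice of p, q, P, and Q, and the white-noise statistic and RMSE would be computed on the residuals of each fitted candidate model.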
Discussion
In this study, our results showed that HFRS incidence increased from 2009 to 2014 in Yiyuan County. The proportion of monthly average incidence for each town showed that HFRS in Yiyuan was mainly caused by Hantaan virus (HTNV). We applied multiplicative seasonal ARIMA (p, d, q) × (P, D, Q)s models to analyze the surveillance data of HFRS in Yiyuan, China. According to the results above, the ARIMA (2, 1, 1) × (0, 1, 1)12 model is reliable and valid, and it can be used to predict the following year's HFRS incidence in Yiyuan.
Yiyuan is mountainous and hilly, with numerous Apodemus agrarius, and farmers are more likely to be exposed to the contaminated urine and feces of infected rodents. Agricultural activities such as sleeping in the fields, irrigating, and working on farmland during the autumn harvest season might have played a significant role in the occurrence of HFRS [20]. Previous studies reported that the transmission of HTNV through Apodemus agrarius peaked in the winter, while Rattus norvegicus-associated Seoul virus (SEOV) infections mainly occurred in the spring [21–24]. Thus, the human infections in Yiyuan in the fall and winter reflect the characteristic seasonal pattern of HTNV transmission. The forecast results suggest that HFRS incidence in Yiyuan will grow slightly over the next year. A rise in HFRS incidence may also result from an increase in the number and size of natural foci [25] and from climate change, especially rising mean temperature [20, 23]. Therefore, knowledge of HFRS forecasts is necessary to prompt health departments to strengthen surveillance systems and reallocate resources in anticipation of increasing HFRS incidence.
Epidemiological surveillance of communicable diseases is one of the most traditional health-related activities. Time-series analysis of the incidence of various infections is extremely useful for developing hypotheses to explain and anticipate the dynamics of the observed phenomena, and subsequently for establishing a quality control system and reallocating resources [26]. A number of methods are used for time series analysis, including the ARIMA model [8–13, 16], maximum entropy method (MEM) spectral analysis [27], and the autoregressive conditional heteroscedastic (ARCH) model [28]. The ARIMA model has several advantages in time-series analysis. The secular trend, seasonal variation, and autocorrelation can all be controlled through differencing, autoregressive, moving average, and seasonal terms, without complicated transformations or extra surrogate variables [18]. Once a satisfactory model has been obtained, it can be used to forecast the expected number of cases for a given number of future time intervals [29].
In addition, the application of GIS together with time-series analysis in the present study provides ways to quantify HFRS explicitly and to further identify environmental factors responsible for the increasing disease risk. Although the analyses are still preliminary, the findings can help generate hypotheses for further investigation. For example, based on the prediction results, the government can invest more health resources during high-risk periods and fewer during low-risk periods to improve the cost-effectiveness of interventions and the scheduling of resources. The model can also be used to evaluate the effectiveness of public health interventions under varying assumptions by comparing actual HFRS incidence with expected incidence.
However, the limitations of the present study should also be considered. The RMSE value was 3.56, and the model's predictions did not match the actual data perfectly. Because time series data on rodent population densities and other influencing factors were lacking, it is difficult to further uncover the probable causes of HFRS and shifts in its characteristics. Future research is warranted to focus on risk factors of HFRS in Yiyuan, such as rodent population densities, human activities, farming patterns, and various socio-economic and environmental factors, and to incorporate them into the ARIMA model.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
TW extracted the data, conducted the statistical analysis, and drafted the manuscript. JL and YPZ conceived the project and helped to interpret the results and revise the manuscript. FC helped to interpret the results. ZSH extracted the data. SYZ and LW conceived the project, assisted with the data interpretation, and helped write the manuscript. All of the authors have read and approved the final manuscript.