Time series analysis methods fall into two groups: frequency domain methods and time domain methods. Seasonal ARIMA models, which belong to the time domain methods, have been regarded as among the most useful models for seasonal time series prediction [25]. They require no extra surrogate variables [26]: the analysis can usually be carried out on the outcome variable series alone, without considering the factors that affect the outcome variable. This makes the method practical, because time series data for all of the influencing factors are rarely available. Before model identification, the time series must be made stationary through data transformation and differencing. In general, the more differences are applied, the more observations are lost. In this study, only a first-order difference and a seasonal difference were needed. The ARIMA (2,1,0) × (0,1,1)₁₂ model was eventually chosen as the optimal model according to the value of CAIC. The seasonal ARIMA model accurately captured the seasonal fluctuation of human brucellosis cases in mainland China. However, its forecasting accuracy in the test set was not satisfactory: the MAPE of the seasonal ARIMA model reached 0.236 in the test set. The most likely reason was that the time series of human brucellosis cases in mainland China was not linear. As shown in Fig. 5, although a long-term upward trend could be observed in the trend component, some curvature remained after the seasonal and irregular components had been extracted. The results of the BDS test also supported the conclusion that the time series of human brucellosis in mainland China from 2004 to 2016 was not linear.
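The data loss caused by differencing can be illustrated with a minimal sketch. The series below is synthetic (a hypothetical linear trend plus a fixed 12-month pattern over 156 months), not the brucellosis data, and the helper `difference` is an illustrative name rather than part of the original analysis. It shows that one first-order difference followed by one seasonal difference at lag 12 removes 1 + 12 = 13 observations.

```python
import numpy as np

def difference(series, lag=1):
    """Return series[t] - series[t - lag]; the first `lag` points are lost."""
    series = np.asarray(series, dtype=float)
    return series[lag:] - series[:-lag]

# Hypothetical monthly series (assumption): 13 years x 12 months = 156 points,
# built from a linear trend plus a fixed 12-month seasonal pattern.
months = np.arange(156)
synthetic = 100 + 2.0 * months + 30 * np.sin(2 * np.pi * months / 12)

# First-order difference (d = 1) removes the linear trend.
first_diff = difference(synthetic, lag=1)

# Seasonal difference (D = 1, period 12) removes the repeating yearly pattern.
seasonal_diff = difference(first_diff, lag=12)

observations_lost = len(synthetic) - len(seasonal_diff)
print(len(synthetic), len(first_diff), len(seasonal_diff), observations_lost)
```

For this idealized series the doubly differenced values are zero up to floating-point error, because nothing remains besides the trend and the seasonal pattern. Real surveillance data would leave a residual series for the autoregressive and moving average terms to model.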
There are two main approaches to nonlinear time series forecasting [27]. The first is model-based parametric nonlinear methods, such as the smooth transition autoregressive (STAR) model, the threshold autoregressive (TAR) model, the nonlinear autoregressive (NAR) model, and the nonlinear moving average (NMA) model. In theory, these parametric nonlinear methods are superior to the traditional ARIMA model in capturing nonlinear relationships in the data. In practice, however, there are too many possible nonlinear patterns, which restricts the usefulness of these models. The second approach is nonparametric data-driven methods, of which neural networks are the most widely used. Neural networks are inspired by the structure of biological nervous systems. They can capture the patterns and hidden functional relationships in a given set of data even when these relationships are unknown or hard to identify [28].

Recurrent neural networks contain hidden states that are distributed across time. This characteristic allows them to store much information about the past efficiently, which gives them an advantage in dealing with time series data. Elman and Jordan neural networks are two widely used recurrent neural networks. Elman neural networks have been used in many practical applications, such as the price prediction of crude oil futures [29], weather forecasting [30], water quality forecasting [31], and financial time series prediction [28]. Jordan neural networks have been used in wind speed forecasting [32] and stock market volatility monitoring [33]. All of these applications achieved good forecasting performance.

In this study, we applied these two neural network models to predict human brucellosis cases in mainland China. In the training set, the MAPEs of the Elman and Jordan neural networks were 0.115 and 0.113, respectively, almost the same as the MAPE of the seasonal ARIMA model (0.112), while the RMSE and MAE of the Elman and Jordan neural networks were lower than those of the ARIMA model. The Elman neural network had the lowest RMSE and MAE but the highest MAPE in the training set. The most likely reason is that, in this study, the Elman neural network fitted large values more accurately but small values less accurately: RMSE and MAE are dominated by absolute errors on large values, whereas MAPE weights each error relative to the observed value. Importantly, the Elman and Jordan neural networks achieved much higher forecasting accuracy in the test set, where their RMSE, MAE, and MAPE were far lower than those of the seasonal ARIMA model. Therefore, Elman and Jordan recurrent neural networks are more appropriate than the seasonal ARIMA model for forecasting nonlinear time series data such as human brucellosis cases.

However, neural network models still have some limitations. First, they are black boxes; that is, we cannot know how much each input variable influences the output variables. Second, there are no fixed rules for determining the structure and parameters of neural network models; the choice depends on the experience of the researchers. Third, training neural network models is computationally expensive and time consuming, and processors with parallel processing power are required to accelerate the training process. Some researchers have built hybrid models combining ARIMA models and neural network models to analyze time series data and have achieved good results. We will try hybrid models for human brucellosis in the future.

There were still some limitations to this study.
First, the NHFPC of China reported data only from 2004 to 2017; a longer time series of brucellosis cases could improve the accuracy of forecasting models. Second, the present study is an ecological study, so the ecological fallacy cannot be avoided. Third, factors that affect the occurrence of human brucellosis, such as pathogens, hosts, the natural environment, vaccines, and socioeconomic variations, were not considered when we constructed the models.
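The earlier observation that a model can have the lowest RMSE and MAE yet the highest MAPE can be illustrated with a small numerical sketch. All values below are hypothetical and chosen only for illustration; they are not results from this study, and the names `model_a` and `model_b` do not correspond to any of the fitted models. MAPE is expressed as a fraction, as elsewhere in the text.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mae(actual, predicted):
    """Mean absolute error."""
    return float(np.mean(np.abs(actual - predicted)))

def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (0.1 means 10%)."""
    return float(np.mean(np.abs((actual - predicted) / actual)))

# Hypothetical observed counts (assumption): two small and two large values.
actual = np.array([100.0, 200.0, 4000.0, 5000.0])

# model_a fits the large values closely but misses the small ones.
model_a = np.array([150.0, 280.0, 4050.0, 4950.0])

# model_b fits the small values closely but misses the large ones.
model_b = np.array([105.0, 210.0, 4400.0, 4500.0])

for name, pred in [("model_a", model_a), ("model_b", model_b)]:
    print(name, rmse(actual, pred), mae(actual, pred), mape(actual, pred))
```

Under these assumed numbers, `model_a` has the lower RMSE and MAE because its absolute errors on the large values are small, but the higher MAPE because its errors on the small values are large relative to those values.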