Vector-borne Infectious Diseases

Predictive performance of LASSO-SARIMAX model for the incidence of hemorrhagic fever with renal syndrome in Guangzhou, China

  • Department of Chronic and Non-communicable Diseases Prevention and Control/Department of Immunization Planning/Department of Parasitic Diseases and Endemic Diseases Prevention and Control, Guangzhou Center for Disease Control and Prevention, Guangzhou, Guangdong 510440, China
Qi Juan, female, physician-in-charge, engaged in research on disease prediction and early warning. E-mail: qijuan717@126.com

Received date: 2023-07-05

  Online published: 2024-03-05

Supported by

Guangzhou Health Science and Technology Project (No. 20221A011067)


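Cite this article

Qi Juan, Kang Yan, Chen Haiyan, Xu Conghui, Wei Yuehong. Predictive performance of LASSO-SARIMAX model for the incidence of hemorrhagic fever with renal syndrome in Guangzhou, China[J]. Chinese Journal of Vector Biology and Control, 2024, 35(1): 49-55. DOI: 10.11853/j.issn.1003.8280.2024.01.009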

Abstract

Objective To compare the performance of three time series models in predicting the incidence of hemorrhagic fever with renal syndrome (HFRS), and to explore the predictive performance of a seasonal autoregressive integrated moving average model with exogenous variables (SARIMAX) whose independent variables are selected by least absolute shrinkage and selection operator (LASSO) regression. Methods Data on HFRS case counts, rodent density, meteorological factors, and socio-economic factors in Guangzhou, China from 2006 to 2022 were systematically collected. Exponential smoothing (ETS), SARIMAX, and LASSO-SARIMAX models were constructed to predict the incidence of HFRS. The autocorrelation function (ACF), mean percentage error (MPE), and mean absolute percentage error (MAPE) were used to evaluate the models, and MAPE was used to compare the three models across different prediction horizons. Results The mean annual incidence rate of HFRS in Guangzhou from 2006 to 2022 was 0.06 per 100 000. The training-set MAPE was 45.066 for the ETS model, 51.403 for the SARIMA model, and 39.466 for the LASSO-SARIMAX model. The LASSO-SARIMAX model had the lowest MAPE on both the training data set and the 12-month prediction; the exception was the 24-month prediction, where it performed worse than the ETS model. Conclusion The LASSO-SARIMAX model performs well in short- and medium-term prediction of HFRS incidence in Guangzhou.
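The abstract scores the models with MPE and MAPE but does not spell out the formulas. As a minimal sketch, assuming the common convention in which the error is the observed value minus the predicted value and each percentage is taken relative to the observed value, the two metrics could be computed as follows. The function names, the choice to skip months with zero observed cases (where a percentage error is undefined), and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def _scorable_pairs(actual: Sequence[float],
                    predicted: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair observations with predictions, dropping zero observations.

    Assumption: points with an observed value of zero are skipped,
    because a percentage error relative to zero is undefined.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no nonzero observations to score")
    return pairs


def mpe(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean percentage error in percent; the sign indicates forecast bias."""
    pairs = _scorable_pairs(actual, predicted)
    return 100.0 * sum((a - p) / a for a, p in pairs) / len(pairs)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; always non-negative."""
    pairs = _scorable_pairs(actual, predicted)
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Hypothetical monthly case counts and forecasts (not data from the study).
observed = [10, 20, 40, 50]
forecast = [12, 18, 40, 45]

print(f"MPE  = {mpe(observed, forecast):.3f}")
print(f"MAPE = {mape(observed, forecast):.3f}")
```

In this toy series the over- and under-predictions cancel, so the MPE is close to 0 while the MAPE is about 10. This is one reason to report both: MPE reflects the direction of bias, while MAPE reflects the typical size of the error.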
