A model for assessing cross-regional transmission risk of respiratory infectious diseases based on COVID-19 epidemic data: a modelling analysis

    Abstract:
    Objective To construct a model for assessing the cross-regional transmission risk of respiratory infectious diseases, and to provide a theoretical basis, practical experience, and technical support for evaluating that risk.
    Methods Publicly released official data on the COVID-19 epidemic in China from January 2020 to December 2022 were collected from the National Health Commission, the Baidu Migration platform, and provincial statistical bureaus. Nine epidemic events were selected: Wuhan, Hubei province (2020); Shijiazhuang, Hebei province (2021); Xi'an, Shaanxi province (2021); Shanghai (2022); Beihai, Guangxi Zhuang Autonomous Region (2022); Sanya, Hainan province (2022); Hohhot, Inner Mongolia Autonomous Region (2022); Zhengzhou, Henan province (2022); and Shijiazhuang, Hebei province (2022). Principal component analysis (PCA) was used to derive a formula for a comprehensive input risk index for cities, which quantifies the cross-regional risk of epidemic importation. The principal components with higher contribution rates served as input features for a support vector machine (SVM) model, with the actual occurrence of intercity epidemic transmission as the target variable. The SVM was used for model construction and prediction, and the SHapley Additive exPlanations (SHAP) method was used to interpret the model.
    Results PCA extracted five factors: a local socioeconomic activity factor (F1), an epidemic source socioeconomic activity factor (F2), an urban policy implementation factor (F3), an urban population inflow and geographical distance factor (F4), and an urban epidemic transmission index factor (F5), with a cumulative variance contribution rate of 82.32%. Based on the principal component score coefficients and variance contribution rates, the comprehensive input risk index score was: F = 0.2178×F1 + 0.1841×F2 + 0.1556×F3 + 0.1419×F4 + 0.1238×F5. The SVM model achieved an accuracy of 84.88%, a precision of 75.00%, a recall of 57.14%, and an F1-score (the harmonic mean of precision and recall, not to be confused with factor F1) of 64.86%. In the SHAP analysis, the SHAP values of F1, F2, F3, F4, and F5 were 0.15, 0.09, 0.08, 0.08, and 0.12, respectively, corresponding to contributions of 29.23%, 16.72%, 15.26%, 16.09%, and 22.70% to the assessment of transmission risk.
    Conclusions The model constructed in this study from COVID-19 epidemic data shows good applicability and accuracy in quantifying the risk of cross-regional importation of respiratory infectious diseases and in predicting their transmission. It can provide theoretical support and a practical reference for the scientific formulation of prevention and control strategies for respiratory infectious diseases.
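
The comprehensive input risk index reported in the Results is a weighted sum of the five factor scores, and it can be sketched in a few lines of Python. The weights below are the ones given in the abstract. The function name `composite_input_risk` and the factor scores for `example_city` are hypothetical and purely illustrative; they are not taken from the study. The weights sum to 0.8232, the same figure as the reported cumulative variance contribution rate, which suggests (though the abstract does not say so explicitly) that each weight is the variance contribution rate of its component.

```python
# Weights reported in the abstract for the five principal-component factors.
WEIGHTS = {"F1": 0.2178, "F2": 0.1841, "F3": 0.1556, "F4": 0.1419, "F5": 0.1238}


def composite_input_risk(factor_scores):
    """Return F = sum over the five factors of weight_k * score_k."""
    missing = set(WEIGHTS) - set(factor_scores)
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS)


# Hypothetical standardized factor scores for one destination city
# (illustrative values only, not data from the study).
example_city = {"F1": 1.2, "F2": 0.4, "F3": -0.3, "F4": 0.8, "F5": 0.1}

risk = composite_input_risk(example_city)
weight_total = sum(WEIGHTS.values())

print(f"composite input risk index: {risk:.4f}")
print(f"sum of weights: {weight_total:.4f}")
```

Because the weights do not sum to 1, the index is best read as a relative score for ranking cities against one another rather than as a probability.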

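The reported classification metrics and SHAP contributions can be cross-checked with a short sketch. The F1-score is the harmonic mean of precision and recall, so the reported 75.00% and 57.14% should reproduce the reported 64.86%. For the SHAP contributions, the code assumes that each percentage is a factor's SHAP value divided by the sum over all five factors. This normalization is a common convention and an assumption here; the abstract does not state how the percentages were computed. The helper names `f1_score` and `shares` are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def shares(values):
    """Normalize non-negative values so that they sum to 1."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return {name: value / total for name, value in values.items()}


# Precision and recall as reported in the abstract.
f1 = f1_score(0.7500, 0.5714)
print(f"F1-score from reported precision and recall: {f1:.2%}")

# SHAP values as reported in the abstract (rounded to two decimals there).
shap_values = {"F1": 0.15, "F2": 0.09, "F3": 0.08, "F4": 0.08, "F5": 0.12}

# ASSUMPTION: contribution = SHAP value / sum of SHAP values.
shap_shares = shares(shap_values)
for name, share in shap_shares.items():
    print(f"{name}: {share:.2%}")
```

Under this assumption the two-decimal SHAP values reproduce the reported percentages only approximately. F3 and F4, for example, share the rounded value 0.08 but are reported as 15.26% and 16.09%, so the published percentages were presumably computed from unrounded values.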