
Dengue and influenza-like illness (ILI) are two of the leading causes of viral infection in the world, and it is estimated that more than half the world's population is at risk of developing these infections. It is therefore important to develop accurate methods for forecasting dengue and ILI incidences. Data from multiple sources (such as dengue and ILI case counts, electronic health records, and the frequency of multiple internet search terms from Google Trends) can improve forecasts, but standard time series analysis methods cannot estimate all the parameter values from the limited amount of data available when multiple sources are used. In this paper, we use a computationally efficient implementation of the known variable selection method that we call the Autoregressive Likelihood Ratio (ARLR) method. This method combines sparse representation of time series data, electronic health records data (for ILI), and Google Trends data to forecast dengue and ILI incidences. This sparse representation method uses an algorithm that maximizes an appropriate likelihood ratio at every step. Using numerical experiments, we demonstrate that our method recovers the underlying sparse model much more accurately than the lasso method. We apply our method to dengue case count data from five countries/states (Brazil, Mexico, Singapore, Taiwan, and Thailand) and to ILI case count data from the United States. Numerical experiments show that our method outperforms existing time series forecasting methods in forecasting the dengue and ILI case counts. In particular, our method gives an 18 percent forecast error reduction over a leading method that also uses data from multiple sources. It also performs better than other methods in predicting the peak value of the case count and the peak time.

Citation: Rangarajan P, Mody SK, Marathe M (2019) Forecasting dengue and influenza incidences using a sparse representation of Google trends, electronic health records, and time series data.

Los Alamos National Laboratory, UNITED STATES

Received: Janu; Accepted: Octo; Published: November 21, 2019

Copyright: © 2019 Rangarajan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Replication data for Dengue are available at. Replication data for ILI are available at.

Funding: PR was supported by the Kishore Vaigyanik Protsahan Yojana Fellowship, Government of India. SKM was supported by a Tata Trusts Grant. MM's research was partially supported by DTRA CNIMS (contract HDTRA1-11-D-0016-0001), NSF DIBBS Grant ACI-1443054, NSF EAGER Grant CMMI-1745207, and NSF BIG DATA Grant IIS-1633028. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Dengue is a mosquito-borne viral disease that affects a large fraction of the world.
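The ARLR method is described above only as a sparse representation built by an algorithm that maximizes a likelihood ratio at every step. As a rough illustration of that general idea, and not the authors' implementation, the sketch below runs a generic greedy forward selection under an assumed Gaussian linear model, where the likelihood ratio statistic for adding one regressor is n · log(RSS_old / RSS_new). The function names `lag_matrix`, `rss`, and `forward_select_lr`, the stopping threshold, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def lag_matrix(series, max_lag):
    """Return (X, y): column k-1 of X holds the series lagged by k steps."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    X = np.column_stack(
        [series[max_lag - k : n - k] for k in range(1, max_lag + 1)]
    )
    return X, series[max_lag:]


def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit on an intercept plus `cols`."""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def forward_select_lr(X, y, threshold=10.83):
    """Greedy forward selection driven by a Gaussian likelihood ratio.

    At each step, add the column whose inclusion maximizes the likelihood
    ratio (equivalently, minimizes the new RSS). Stop when the statistic
    n * log(RSS_old / RSS_new) falls below `threshold`, which is an
    illustrative tuning value (assumption), not one taken from the paper.
    """
    n, p = X.shape
    selected = []
    current = rss(X, y, selected)
    while len(selected) < p:
        candidates = [j for j in range(p) if j not in selected]
        scores = {j: rss(X, y, selected + [j]) for j in candidates}
        best = min(scores, key=scores.get)
        lr_stat = n * np.log(current / scores[best])
        if lr_stat < threshold:
            break
        selected.append(best)
        current = scores[best]
    return selected


# Synthetic check (assumed data): only columns 1 and 4 drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + 0.5 * rng.normal(size=200)
selected = forward_select_lr(X, y)
print(sorted(selected))

# Lagged regressors for an autoregressive fit can be built the same way.
X_lags, y_lags = lag_matrix(np.arange(5.0), max_lag=2)
```

In a forecasting setting, the columns of `X` would be lagged case counts from `lag_matrix` stacked next to external regressors such as search-term frequencies, so that the selection step chooses among all sources at once. Unlike the lasso, this greedy procedure does not shrink coefficients; it only decides which regressors enter the model.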
