A New Method for Generalizing Burr and Related Distributions
2022, Chakraborty, Tanujit, Das, Suchismita, Chattopadhyay, Swarup
A new method is proposed to generalize the Burr-XII distribution, also called the Burr distribution, by adding an extra parameter to an existing Burr distribution for more flexibility. In this method, the exponent of the Burr distribution is modeled using a nonlinear function of the data and one additional parameter. The models of this newly introduced generalized Burr family can significantly increase the flexibility of the original Burr distribution with respect to the density and hazard rate shapes. Families expanded using the proposed method are heavy-tailed and belong to the maximum domain of attraction of the Fréchet distribution. The method is further applied to yield three-parameter classical Pareto and generalized exponentiated distributions, which shows the broader applicability of the proposed idea of generalization. A relevant model of the new generalized Burr family is considered in detail, with particular emphasis on the hazard functions, stochastic orders, estimation procedures, and testing methods. Finally, as empirical evidence, the new distribution is applied to the analysis of large-scale heavy-tailed network data and compared with other commonly used distributions for fitting degree distributions of networks. Experimental results suggest that the proposed Burr distribution with nonlinear exponent fits large-scale heavy-tailed networks better than the popularly used Marshall-Olkin generalization of Burr and exponentiated Burr distributions.
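As a point of reference for the density and hazard rate shapes discussed above, the following minimal sketch evaluates the standard two-parameter Burr XII distribution, with survival function S(x) = (1 + x^c)^(-k). It covers only the baseline distribution: the nonlinear-exponent generalization is not specified in the abstract and is not reproduced here. The function names and parameter values are illustrative assumptions.

```python
import numpy as np

def burr_sf(x, c, k):
    """Survival function of the standard Burr XII: S(x) = (1 + x**c) ** (-k)."""
    x = np.asarray(x, dtype=float)
    return (1.0 + x**c) ** (-k)

def burr_pdf(x, c, k):
    """Density of the standard Burr XII: f(x) = c*k*x**(c-1) * (1 + x**c) ** (-k-1)."""
    x = np.asarray(x, dtype=float)
    return c * k * x ** (c - 1.0) * (1.0 + x**c) ** (-k - 1.0)

def burr_hazard(x, c, k):
    """Hazard rate h(x) = f(x) / S(x), which simplifies to c*k*x**(c-1) / (1 + x**c)."""
    return burr_pdf(x, c, k) / burr_sf(x, c, k)

# Illustrative (assumed) parameter values.
c, k = 2.0, 1.5
grid = np.array([0.5, 1.0, 2.0, 10.0])
hazard_on_grid = burr_hazard(grid, c, k)

# Heavy tail: S(x) decays like x**(-c*k), so log S(x) / log x approaches -c*k.
tail_exponent = np.log(burr_sf(1e6, c, k)) / np.log(1e6)
```

For c > 1 the baseline hazard rises and then falls, while for c <= 1 it is decreasing; the generalized family described above is said to widen this range of shapes.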
Epicasting: An Ensemble Wavelet Neural Network for forecasting epidemics
2023, Panja, Madhurima, Chakraborty, Tanujit, Kumar, Uttam, Liu, Nan
Infectious diseases remain among the top contributors to human illness and death worldwide, and many of them produce epidemic waves of infection. The lack of specific drugs and ready-to-use vaccines to prevent most of these epidemics worsens the situation. This forces public health officials and policymakers to rely on early warning systems generated by accurate and reliable epidemic forecasters. Accurate forecasts of epidemics can assist stakeholders in tailoring countermeasures, such as vaccination campaigns, staff scheduling, and resource allocation, to the situation at hand, which could translate to reductions in the impact of a disease. Unfortunately, most past epidemics exhibit nonlinear and non-stationary characteristics owing to seasonally dependent fluctuations in their spread and the nature of the epidemics themselves. We analyze various epidemic time series datasets using a maximal overlap discrete wavelet transform (MODWT) based autoregressive neural network, which we call the Ensemble Wavelet Neural Network (EWNet) model. MODWT techniques effectively characterize non-stationary behavior and seasonal dependencies in the epidemic time series and improve the nonlinear forecasting scheme of the autoregressive neural network in the proposed ensemble wavelet network framework. From a nonlinear time series viewpoint, we explore the asymptotic stationarity of the proposed EWNet model to show the asymptotic behavior of the associated Markov chain. We also theoretically investigate the effect of learning stability and the choice of hidden neurons in the proposal. From a practical perspective, we compare our proposed EWNet framework with twenty-two statistical, machine learning, and deep learning models on fifteen real-world epidemic datasets with three test horizons using four key performance indicators. Experimental results show that the proposed EWNet is highly competitive with state-of-the-art epidemic forecasting methods.
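The decomposition step described above can be illustrated with a minimal sketch of a Haar-filter MODWT with circular boundary handling, followed by the lag embedding that an autoregressive model would consume. The choice of the Haar filter, the synthetic series, and the helper names `haar_modwt` and `lag_matrix` are assumptions made for illustration; the abstract does not state which wavelet filter or network configuration EWNet uses, and no neural network is trained here.

```python
import numpy as np

def haar_modwt(x, levels):
    """Haar MODWT with circular boundaries.

    Returns a list of detail (wavelet) series, one per level, and the final
    smooth (scaling) series. All outputs have the same length as x.
    """
    v = np.asarray(x, dtype=float).copy()
    details = []
    for j in range(1, levels + 1):
        shifted = np.roll(v, 2 ** (j - 1))   # shifted[t] = v[t - 2**(j-1)], circular
        details.append((v - shifted) / 2.0)  # wavelet coefficients at level j
        v = (v + shifted) / 2.0              # scaling coefficients at level j
    return details, v

def lag_matrix(series, p):
    """Build (X, y) where row t of X holds the p values preceding y[t], newest first."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    X = np.column_stack([series[p - i - 1 : n - i - 1] for i in range(p)])
    y = series[p:]
    return X, y

# Synthetic (assumed) series with a trend and a seasonal cycle.
t = np.arange(64)
cases = 50 + 10 * np.sin(2 * np.pi * t / 16) + 0.5 * t

details, smooth = haar_modwt(cases, levels=3)

# For the Haar filter the components add back to the original series.
reconstructed = np.sum(details, axis=0) + smooth

# Each component would be embedded like this before fitting its own autoregressive model.
X, y = lag_matrix(details[0], p=4)
```

Because the components sum to the original series, forecasts produced separately for each component can be added to obtain a forecast of the series itself, which is the general idea behind a wavelet-based ensemble.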
Searching for Heavy-Tailed Probability Distributions for Modeling Real-World Complex Networks
2022, Chakraborty, Tanujit, Chattopadhyay, Swarup, Das, Suchismita, Kumar, Uttam, Senthilnath, J.
Perhaps the most controversial recent topic in network science research is whether real-world complex networks are scale-free or not. Recently, Broido and Clauset [A.D. Broido, A. Clauset, Nature Communications, 10, 1017 (2019)] asserted that the degree distributions of real-world networks are rarely power law under statistical tests. Such complex networks, including social, biological, information, temporal, and brain networks, are often heavy-tailed, and the assumption that real-world heavy-tailed networks are scale-free becomes insignificant as the complex system evolves over time. The failure of the power law distribution in fitting the degree distribution data is mainly due to the presence of an identifiable non-linearity within the entire degree distribution on a log-log scale of a complex heavy-tailed network. In this study, we attempt to address this issue by proposing a new class of heavy-tailed probability distributions for modeling the entire degree distributions of complex networks. We introduce a new family of generalized Lomax models (GLM) to capture the non-linearity of these heavy-tailed networks. These newly introduced GLM-type distributions provide better fitting and greater flexibility for the entire node degree distribution of complex networks. Several statistical properties of the proposed model, such as extreme value and inferential statistical properties, are derived in this context. Interestingly, the GLM family belongs to the basin of attraction of the Fréchet distribution, a heavy-tailed extreme value distribution. Rigorous experimental analysis showcases the excellent performance of the proposed family of distributions in fitting heavy-tailed real-world complex networks over fifty real-world datasets in comparison with benchmark probability models. Our results show that GLM-type distributions are not rare: they model almost 90% of the tested networks accurately in comparison with benchmark probability models.
INDEX TERMS Complex networks, heavy-tailed networks, degree distribution, Lomax distribution, extreme value properties.
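The non-linearity on a log-log scale mentioned above can be seen already in the ordinary two-parameter Lomax distribution, whose survival function is S(x) = (1 + x/sigma)^(-alpha). The sketch below is a minimal illustration under that assumption; it does not implement the generalized Lomax family, whose form is not given in the abstract, and the function names and parameter values are hypothetical.

```python
import numpy as np

def lomax_sf(x, alpha, sigma):
    """Survival function of the Lomax distribution: S(x) = (1 + x/sigma) ** (-alpha)."""
    return (1.0 + np.asarray(x, dtype=float) / sigma) ** (-alpha)

def lomax_loglog_slope(x, alpha, sigma):
    """Slope d log S / d log x = -alpha * x / (sigma + x)."""
    x = np.asarray(x, dtype=float)
    return -alpha * x / (sigma + x)

# Illustrative (assumed) parameter values.
alpha, sigma = 2.0, 5.0
degrees = np.array([0.1, 1.0, 5.0, 50.0, 1e8])
slopes = lomax_loglog_slope(degrees, alpha, sigma)

# Numerical check of the slope at one point via a central difference in log x.
x0, h = 3.0, 1e-5
numeric_slope = (
    np.log(lomax_sf(x0 * np.exp(h), alpha, sigma))
    - np.log(lomax_sf(x0 * np.exp(-h), alpha, sigma))
) / (2 * h)
```

A pure power law has a constant log-log slope, whereas the Lomax slope moves from near 0 at small degrees to -alpha in the tail, so a single straight line cannot describe the whole degree distribution.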
An ensemble neural network approach to forecast Dengue outbreak based on climatic condition
2023, Chakraborty, Tanujit, Panja, Madhurima, Nadim, Sk Shahid, Ghosh, Indrajit, Kumar, Uttam, Liu, Nan
Dengue fever is a virulent disease spreading across more than 100 tropical and subtropical countries in Africa, the Americas, and Asia. This arboviral disease affects around 400 million people globally, severely straining healthcare systems. The unavailability of a specific drug and a ready-to-use vaccine makes the situation worse. Hence, policymakers must rely on early warning systems to guide intervention-related decisions. Forecasts routinely provide critical information for dangerous epidemic events. However, the available forecasting models (e.g., weather-driven mechanistic, statistical time series, and machine learning models) lack a clear understanding of the different components needed to improve prediction accuracy and often provide unstable and unreliable forecasts. This study proposes an ensemble wavelet neural network with exogenous factor(s) (XEWNet) model that can produce reliable estimates for dengue outbreak prediction for three geographical regions, namely San Juan, Iquitos, and Ahmedabad. The proposed XEWNet model is flexible and can easily incorporate exogenous climate variable(s), confirmed by statistical causality tests, in its scalable framework. The proposed model is an integrated approach that incorporates wavelet transformation into an ensemble neural network framework, which helps in generating more reliable long-term forecasts. The proposed XEWNet allows complex non-linear relationships between dengue incidence cases and rainfall, while remaining mathematically interpretable, fast in execution, and easily comprehensible. The proposal's competitiveness is measured using computational experiments based on various statistical metrics and several statistical comparison tests. In comparison with statistical, machine learning, and deep learning methods, our proposed XEWNet performs better in 75% of the cases for short-term and long-term forecasting of dengue incidence.
Knowledge-based Deep Learning for Modeling Chaotic Systems
2022, Elabid, Zakaria, Chakraborty, Tanujit, Hadid, Abdenour
Deep learning has received increased attention due to its unbeatable success in many fields, such as computer vision, natural language processing, recommendation systems, and, most recently, simulating multiphysics problems and predicting nonlinear dynamical systems. However, modeling and forecasting the dynamics of chaotic systems remains an open research problem, since training deep learning models requires big data, which is often unavailable. Such deep learners can be trained using additional information obtained from simulated results and by enforcing the physical laws of the chaotic systems. This paper considers extreme events and their dynamics and proposes elegant models based on deep neural networks, called knowledge-based deep learning (KDL). Our proposed KDL can learn the complex patterns governing chaotic systems by jointly training on real and simulated data directly from the dynamics and their differential equations. This knowledge is transferred to model and forecast real-world chaotic events exhibiting extreme behavior. We validate the efficiency of our model by assessing it on three real-world benchmark datasets: El Niño sea surface temperature, San Juan Dengue viral infection, and Bjørnøya daily precipitation, all governed by extreme events' dynamics. Using prior knowledge of extreme events and physics-based loss functions to guide the neural network learning, we ensure physically consistent, generalizable, and accurate forecasting, even in a small data regime.
Index Terms: Chaotic systems, long short-term memory, deep learning, extreme event modeling.
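To make the idea of a physics-based loss concrete, the following minimal sketch adds a penalty for violating a known differential equation to an ordinary data-fit term. It is only an illustration of the general principle: the toy decay equation dx/dt = -x, the finite-difference residual, the weight `lam`, and the function name `physics_informed_loss` are all assumptions, and the sketch does not reproduce the KDL architecture or its training on simulated data.

```python
import numpy as np

def physics_informed_loss(pred, obs, dt, rhs, lam=1.0):
    """Data-fit MSE plus a penalty on the residual of dx/dt = rhs(x).

    pred, obs : 1-D arrays sampled on a uniform time grid with spacing dt
    rhs       : callable giving the right-hand side of the governing ODE
    lam       : weight of the physics term (an assumed hyperparameter)
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    data_term = np.mean((pred - obs) ** 2)
    dxdt = (pred[2:] - pred[:-2]) / (2.0 * dt)   # central difference at interior points
    residual = dxdt - rhs(pred[1:-1])            # how far pred is from obeying the ODE
    physics_term = np.mean(residual ** 2)
    return data_term + lam * physics_term

# Toy (assumed) system: exponential decay, dx/dt = -x, with exact solution exp(-t).
dt = 0.001
t = np.arange(0.0, 2.0, dt)
obs = np.exp(-t)

def decay(x):
    return -x

loss_consistent = physics_informed_loss(obs, obs, dt, decay)
loss_violating = physics_informed_loss(obs + 0.1 * t, obs, dt, decay)
```

A prediction that matches the data and obeys the equation gives a loss near zero, while a prediction that drifts away from the dynamics is penalized by both terms. In a neural network setting the same quantity would be minimized with respect to the network weights.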
Stochastic forecasting of COVID-19 daily new cases across countries with a novel hybrid time series model
2022, Chakraborty, Tanujit, Rai, Shesh N., Bhattacharyya, Arinjita
An unprecedented outbreak of the novel coronavirus (COVID-19), presenting as an unusual pneumonia, has spread globally since its first case in Wuhan, China, in December 2019. Soon after, the number of infected cases and deaths increased rapidly. The future of the pandemic's progress was uncertain, and thus, predicting it became crucial for public health researchers. These predictions support the effective allocation of health-care resources and stockpiling, and aid strategic planning by clinicians, government authorities, and public health policymakers once the extent of the effect is understood. The main objective of this paper is to develop a hybrid forecasting model that can generate real-time out-of-sample forecasts of COVID-19 outbreaks for five profoundly affected countries, namely the USA, Brazil, India, the UK, and Canada. A novel hybrid approach based on the Theta method and the autoregressive neural network (ARNN) model, named the Theta-ARNN (TARNN) model, is developed. Daily new cases of COVID-19 are nonlinear, non-stationary, and volatile; thus, a single specific model cannot be ideal for future prediction of the pandemic. However, the newly introduced hybrid forecasting model, with an acceptable prediction error rate, can help healthcare providers and governments with effective planning and resource allocation. The proposed method outperforms traditional univariate and hybrid forecasting models on the test datasets on average.
Bayesian neural tree models for nonparametric regression
2023, Chakraborty, Tanujit, Kamat, Gauri, Chakraborty, Ashis Kumar
Frequentist and Bayesian methods differ in many aspects but share some basic optimal properties. In real-life prediction problems, situations exist in which a model based on one of the above paradigms is preferable depending on some subjective criteria. Nonparametric classification and regression techniques, such as decision trees and neural networks, have both frequentist (classification and regression trees (CARTs) and artificial neural networks) and Bayesian (Bayesian CART and Bayesian neural networks) approaches to learning from data. In this paper, we present two hybrid models combining the Bayesian and frequentist versions of CART and neural networks, which we call the Bayesian neural tree (BNT) models. BNT models can simultaneously perform feature selection and prediction, are highly flexible, and generalise well in settings with limited training observations. We study the statistical consistency of the proposed approaches and derive the optimal value of a vital model parameter. The excellent performance of the newly proposed BNT models is shown using simulation studies. We also provide some illustrative examples using a wide variety of standard regression datasets from a publicly available machine learning repository to show the superiority of the proposed models in comparison to popularly used Bayesian CART and Bayesian neural network models.
Optimized ensemble deep learning framework for scalable forecasting of dynamics containing extreme events
2021, Ray, Arnob, Chakraborty, Tanujit, Ghosh, Dibakar
The remarkable flexibility and adaptability of both deep learning models and ensemble methods have led to the proliferation of their applications in understanding many physical phenomena. Traditionally, these two techniques have largely been treated as independent methodologies in practical applications. This study develops an optimized ensemble deep learning framework wherein these two machine learning techniques are jointly used to achieve synergistic improvements in model accuracy, stability, scalability, and reproducibility, prompting a new wave of applications in the forecasting of dynamics. Unpredictability is considered one of the key features of chaotic dynamics; therefore, forecasting such dynamics of nonlinear systems is a relevant issue in the scientific community. It becomes more challenging when the prediction of extreme events is the focus. In this circumstance, the proposed optimized ensemble deep learning (OEDL) model, based on the best convex combination of feed-forward neural networks, reservoir computing, and long short-term memory, can play a key role in advancing predictions of dynamics consisting of extreme events. The combined framework can generate better out-of-sample performance than the individual deep learners and a standard ensemble framework for both numerically simulated and real-world data sets. We exhibit the outstanding performance of the OEDL framework for forecasting extreme events generated from a Liénard-type system, prediction of COVID-19 cases in Brazil, dengue cases in San Juan, and sea surface temperature in the Niño 3.4 region.
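The notion of a best convex combination of forecasters can be sketched as a search for non-negative weights that sum to one and minimize the validation error of the blended forecast. The grid search below is a simple stand-in chosen for clarity; the abstract does not say how the OEDL weights are optimized, and the function name `best_convex_weights`, the step size, and the toy forecasts are assumptions.

```python
import numpy as np
from itertools import product

def best_convex_weights(forecasts, y, step=0.05):
    """Grid-search the weight simplex for the convex combination with lowest MSE.

    forecasts : array of shape (n_samples, n_models), one column per model
    y         : array of shape (n_samples,), the observed values
    step      : grid resolution; weights are multiples of this value
    """
    forecasts = np.asarray(forecasts, dtype=float)
    y = np.asarray(y, dtype=float)
    n_models = forecasts.shape[1]
    n_steps = int(round(1.0 / step))
    best_w, best_mse = None, np.inf
    # Enumerate the first n_models - 1 weights; the last one takes the remainder.
    for combo in product(range(n_steps + 1), repeat=n_models - 1):
        if sum(combo) > n_steps:
            continue
        w = np.array(list(combo) + [n_steps - sum(combo)], dtype=float) / n_steps
        mse = np.mean((forecasts @ w - y) ** 2)
        if mse < best_mse:
            best_w, best_mse = w, mse
    return best_w, best_mse

# Toy (assumed) validation data: two models with opposite errors and one biased model.
y_val = np.array([3.0, 4.0, 6.0, 5.0, 7.0])
err = np.array([1.0, -2.0, 0.5, 1.5, -1.0])
model_forecasts = np.column_stack([y_val + err, y_val - err, y_val + 5.0])

weights, mse = best_convex_weights(model_forecasts, y_val)
individual_mse = np.mean((model_forecasts - y_val[:, None]) ** 2, axis=0)
```

In this toy case the errors of the first two models cancel, so the search assigns them equal weight and ignores the biased third model. For more than a handful of models a constrained optimizer would scale better than an exhaustive grid.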
W-Transformers: A Wavelet-based Transformer Framework for Univariate Time Series Forecasting
2022, Sasal, Lena, Chakraborty, Tanujit, Hadid, Abdenour
Deep learning utilizing transformers has recently achieved a lot of success in many vital areas such as natural language processing, computer vision, anomaly detection, and recommendation systems, among many others. Among the several merits of transformers, the ability to capture long-range temporal dependencies and interactions is desirable for time series forecasting, leading to their progress in various time series applications. In this paper, we build a transformer model for non-stationary time series. The problem is challenging yet crucially important. We present a novel framework for univariate time series representation learning based on a wavelet-based transformer encoder architecture and call it W-Transformer. The proposed W-Transformers apply a maximal overlap discrete wavelet transformation (MODWT) to the time series data and build local transformers on the decomposed datasets to vividly capture the non-stationarity and long-range nonlinear dependencies in the time series. Evaluating our framework on several publicly available benchmark time series datasets from various domains and with diverse characteristics, we demonstrate that it performs, on average, significantly better than the baseline forecasters for short-term and long-term forecasting, even for datasets that consist of only a few hundred training samples.