A New Method for Generalizing Burr and Related Distributions
2022, Chakraborty, Tanujit, Das, Suchismita, Chattopadhyay, Swarup
A new method is proposed to generalize the Burr-XII distribution, also called the Burr distribution, by adding an extra parameter to an existing Burr distribution for more flexibility. In this method, the exponent of the Burr distribution is modeled using a nonlinear function of the data and one additional parameter. The models of this newly introduced generalized Burr family can significantly increase the flexibility of the original Burr distribution with respect to the density and hazard rate shapes. Families expanded using the proposed method are heavy-tailed and belong to the maximum domain of attraction of the Fréchet distribution. The method is further applied to yield three-parameter classical Pareto and generalized exponentiated distributions, which shows the broader applicability of the proposed idea of generalization. A relevant model of the new generalized Burr family is considered in detail, with particular emphasis on its hazard functions and stochastic orders; estimation procedures and testing methods are also derived. Finally, as empirical evidence, the new distribution is applied to the analysis of large-scale heavy-tailed network data and compared with other commonly used distributions for fitting degree distributions of networks. Experimental results suggest that the proposed Burr distribution with nonlinear exponent fits large-scale heavy-tailed networks better than the popularly used Marshall-Olkin generalization of Burr and exponentiated Burr distributions.
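For orientation, the baseline that this abstract generalizes can be written down directly. The sketch below evaluates the CDF, density, and hazard rate of the standard two-parameter Burr XII distribution, F(x) = 1 - (1 + x^c)^(-k) for x > 0. The function and parameter names (c, k) are illustrative choices, and the nonlinear-exponent generalization itself is not reproduced because the abstract does not give its functional form.

```python
def burr_cdf(x: float, c: float, k: float) -> float:
    """CDF of the standard Burr XII distribution: 1 - (1 + x^c)^(-k)."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + x ** c) ** (-k)


def burr_pdf(x: float, c: float, k: float) -> float:
    """Density: c * k * x^(c-1) * (1 + x^c)^(-k-1)."""
    if x <= 0:
        return 0.0
    return c * k * x ** (c - 1) * (1.0 + x ** c) ** (-k - 1)


def burr_hazard(x: float, c: float, k: float) -> float:
    """Hazard rate = pdf / survival = c * k * x^(c-1) / (1 + x^c)."""
    if x <= 0:
        raise ValueError("the hazard rate is defined here only for x > 0")
    return c * k * x ** (c - 1) / (1.0 + x ** c)
```

In this baseline the hazard rate is decreasing for c <= 1 and rises to a single peak before decaying for c > 1, which is the limited set of shapes that a more flexible exponent is meant to widen.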
Optimized ensemble deep learning framework for scalable forecasting of dynamics containing extreme events
2021, Ray, Arnob, Chakraborty, Tanujit, Ghosh, Dibakar
The remarkable flexibility and adaptability of both deep learning models and ensemble methods have led to the proliferation of their applications in understanding many physical phenomena. Traditionally, these two techniques have largely been treated as independent methodologies in practical applications. This study develops an optimized ensemble deep learning framework wherein these two machine learning techniques are jointly used to achieve synergistic improvements in model accuracy, stability, scalability, and reproducibility, prompting a new wave of applications in the forecasting of dynamics. Unpredictability is considered one of the key features of chaotic dynamics; therefore, forecasting such dynamics of nonlinear systems is a relevant issue in the scientific community. It becomes more challenging when the prediction of extreme events is the focus. In this circumstance, the proposed optimized ensemble deep learning (OEDL) model, based on a best convex combination of feed-forward neural networks, reservoir computing, and long short-term memory, can play a key role in advancing predictions of dynamics consisting of extreme events. The combined framework can generate better out-of-sample performance than the individual deep learners and the standard ensemble framework for both numerically simulated and real-world data sets. We exhibit the outstanding performance of the OEDL framework for forecasting extreme events generated from a Liénard-type system, prediction of COVID-19 cases in Brazil, dengue cases in San Juan, and sea surface temperature in the Niño 3.4 region.
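The idea of a best convex combination can be illustrated with a small sketch: given validation forecasts from three models, search for non-negative weights that sum to one and minimize the validation error. The grid search, the mean squared error criterion, and all names below are assumptions made for illustration; the abstract does not state which optimizer or loss the OEDL framework uses.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def best_convex_weights(y_true, forecasts, steps=20):
    """Grid-search weights (w1, w2, w3), each >= 0 and summing to 1,
    that minimize the MSE of the combined forecast.

    forecasts: list of three forecast sequences, each as long as y_true.
    steps: grid resolution; weights are multiples of 1 / steps.
    """
    best_w, best_err = None, float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j  # third weight is fixed by the sum-to-one constraint
            w = (i / steps, j / steps, k / steps)
            combined = [
                sum(wm * f[t] for wm, f in zip(w, forecasts))
                for t in range(len(y_true))
            ]
            err = mse(y_true, combined)
            if err < best_err:
                best_w, best_err = w, err
    return best_w, best_err


# Toy validation data: three forecasters with different constant biases.
y_val = [1.0, 2.0, 3.0, 4.0]
f_a = [1.5, 2.5, 3.5, 4.5]
f_b = [0.5, 1.5, 2.5, 3.5]
f_c = [2.0, 3.0, 4.0, 5.0]
weights, err = best_convex_weights(y_val, [f_a, f_b, f_c])
```

Because the weights are constrained to the simplex, the combined forecast never leaves the range spanned by the individual forecasts, and its validation error is never worse than that of the best single model on the same data.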
Stochastic forecasting of COVID-19 daily new cases across countries with a novel hybrid time series model
2022, Chakraborty, Tanujit, Rai, Shesh N., Bhattacharyya, Arinjita
An unprecedented outbreak of the novel coronavirus (COVID-19) in the form of peculiar pneumonia has spread globally since its first case in Wuhan, China, in December 2019. Soon after, the infected cases and mortality increased rapidly. The future of the pandemic’s progress was uncertain, and thus, predicting it became crucial for public health researchers. These predictions support the effective allocation and stockpiling of healthcare resources and help clinicians, government authorities, and public health policymakers plan strategically once the extent of the effect is understood. The main objective of this paper is to develop a hybrid forecasting model that can generate real-time out-of-sample forecasts of COVID-19 outbreaks for five profoundly affected countries, namely the USA, Brazil, India, the UK, and Canada. A novel hybrid approach based on the Theta method and autoregressive neural network (ARNN) model, named the Theta-ARNN (TARNN) model, is developed. Daily new cases of COVID-19 are nonlinear, non-stationary, and volatile; thus, a single specific model cannot be ideal for future prediction of the pandemic. However, the newly introduced hybrid forecasting model, with an acceptable prediction error rate, can help healthcare and government bodies with effective planning and resource allocation. The proposed method outperforms traditional univariate and hybrid forecasting models on the test datasets on average.
An ensemble neural network approach to forecast Dengue outbreak based on climatic condition
2023, Chakraborty, Tanujit, Panja, Madhurima, Nadim, Sk Shahid, Ghosh, Indrajit, Kumar, Uttam, Liu, Nan
Dengue fever is a virulent disease spreading across more than 100 tropical and subtropical countries in Africa, the Americas, and Asia. This arboviral disease affects around 400 million people globally, severely distressing healthcare systems. The unavailability of a specific drug and a ready-to-use vaccine makes the situation worse. Hence, policymakers must rely on early warning systems to control intervention-related decisions. Forecasts routinely provide critical information for dangerous epidemic events. However, the available forecasting models (e.g., weather-driven mechanistic, statistical time series, and machine learning models) lack a clear understanding of different components to improve prediction accuracy and often provide unstable and unreliable forecasts. This study proposes an ensemble wavelet neural network with exogenous factor(s) (XEWNet) model that can produce reliable estimates for dengue outbreak prediction for three geographical regions, namely San Juan, Iquitos, and Ahmedabad. The proposed XEWNet model is flexible and can easily incorporate exogenous climate variable(s) confirmed by statistical causality tests in its scalable framework. The proposed model is an integrated approach that incorporates wavelet transformation into an ensemble neural network framework, which helps in generating more reliable long-term forecasts. The proposed XEWNet allows complex non-linear relationships between dengue incidence cases and rainfall, yet it is mathematically interpretable, fast in execution, and easily comprehensible. The proposal's competitiveness is measured using computational experiments based on various statistical metrics and several statistical comparison tests. In comparison with statistical, machine learning, and deep learning methods, our proposed XEWNet performs better in 75% of the cases for short-term and long-term forecasting of dengue incidence.
PARNN: A Probabilistic Autoregressive Neural Network Framework for Accurate Forecasting
2022, Hadid, Abdenour, Chakraborty, Tanujit, Panja, Madhurima, Kumar, Uttam
Forecasting time series data represents an emerging field of research in data science and knowledge discovery with vast applications ranging from stock price and energy demand prediction to the early prediction of epidemics. Numerous statistical and machine learning methods have been proposed in the last five decades in response to the demand for high-quality and reliable forecasts. However, in real-life prediction problems, situations exist in which a model based on one of the above paradigms is preferable. Therefore, hybrid solutions are needed to bridge the gap between classical forecasting methods and modern neural network models. In this context, we introduce a Probabilistic AutoRegressive Neural Network (PARNN) model that can handle a wide variety of complex time series data (e.g., nonlinearity, non-seasonality, long-range dependence, and non-stationarity). The proposed PARNN model is built by creating a fusion of an integrated moving average and autoregressive neural network to preserve the explainability, scalability, and "white-box-like" prediction behavior of the individual models. Sufficient conditions for asymptotic stationarity and geometric ergodicity are obtained by considering the asymptotic behavior of the associated Markov chain. Unlike for advanced deep learning tools, uncertainty quantification for the PARNN model is obtained through prediction intervals. During computational experiments, PARNN outperforms standard statistical, machine learning, and deep learning models (e.g., Transformers, NBeats, DeepAR, etc.) on a diverse collection of real-world datasets from macroeconomics, tourism, energy, epidemiology, and others for short-term, medium-term, and long-term forecasting. Multiple comparisons with the best method are carried out to showcase the superiority of the proposal in comparison with the state-of-the-art forecasters over different forecast horizons.
W-Transformers: A Wavelet-based Transformer Framework for Univariate Time Series Forecasting
2022, Sasal, Lena, Chakraborty, Tanujit, Hadid, Abdenour
Deep learning utilizing transformers has recently achieved a lot of success in many vital areas such as natural language processing, computer vision, anomaly detection, and recommendation systems, among many others. Among the several merits of transformers, the ability to capture long-range temporal dependencies and interactions is desirable for time series forecasting, leading to progress in various time series applications. In this paper, we build a transformer model for non-stationary time series. The problem is challenging yet crucially important. We present a novel framework for univariate time series representation learning based on the wavelet-based transformer encoder architecture and call it W-Transformer. The proposed W-Transformers apply a maximal overlap discrete wavelet transformation (MODWT) to the time series data and build local transformers on the decomposed datasets to vividly capture the nonstationarity and long-range nonlinear dependencies in the time series. Evaluating our framework on several publicly available benchmark time series datasets from various domains and with diverse characteristics, we demonstrate that it performs, on average, significantly better than the baseline forecasters for short-term and long-term forecasting, even for datasets that consist of only a few hundred training samples.
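The MODWT step mentioned in this abstract can be sketched in its simplest form. The code below computes a one-level MODWT with the Haar filter and circular boundary handling; the choice of the Haar filter, the single decomposition level, and the function name are assumptions made for illustration, since the abstract does not specify the wavelet or the number of levels.

```python
def haar_modwt_level1(x):
    """One-level Haar MODWT with circular boundary handling.

    Returns (wavelet_coeffs, scaling_coeffs), each with the same length as x.
    The MODWT Haar filters are (1/2, -1/2) for the wavelet coefficients and
    (1/2, 1/2) for the scaling coefficients.
    """
    n = len(x)
    # x[t - 1] with t = 0 wraps to the last sample, which gives the circular boundary.
    wavelet = [0.5 * (x[t] - x[t - 1]) for t in range(n)]
    scaling = [0.5 * (x[t] + x[t - 1]) for t in range(n)]
    return wavelet, scaling


series = [2.0, 4.0, 6.0, 8.0]
w1, v1 = haar_modwt_level1(series)
```

Unlike the ordinary discrete wavelet transform, the MODWT does not downsample, so every component stays aligned with the original time index. That alignment is what makes it possible to train a separate forecaster on each decomposed series and recombine the outputs.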
Knowledge-based Deep Learning for Modeling Chaotic Systems
2022, Elabid, Zakaria, Chakraborty, Tanujit, Hadid, Abdenour
Deep learning has received increased attention due to its unbeatable success in many fields, such as computer vision, natural language processing, recommendation systems, and most recently in simulating multiphysics problems and predicting nonlinear dynamical systems. However, modeling and forecasting the dynamics of chaotic systems remains an open research problem since training deep learning models requires big data, which is not always available. Such deep learners can be trained from additional information obtained from simulated results and by enforcing the physical laws of the chaotic systems. This paper considers extreme events and their dynamics and proposes elegant models based on deep neural networks, called knowledge-based deep learning (KDL). Our proposed KDL can learn the complex patterns governing chaotic systems by jointly training on real and simulated data directly from the dynamics and their differential equations. This knowledge is transferred to model and forecast real-world chaotic events exhibiting extreme behavior. We validate the efficiency of our model by assessing it on three real-world benchmark datasets: El Niño sea surface temperature, San Juan Dengue viral infection, and Bjørnøya daily precipitation, all governed by extreme events' dynamics. Using prior knowledge of extreme events and physics-based loss functions to lead the neural network learning, we ensure physically consistent, generalizable, and accurate forecasting, even in a small data regime.
Index Terms: Chaotic systems, long short-term memory, deep learning, extreme event modeling.
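A physics-based loss of the kind this abstract refers to typically adds a penalty for violating a governing differential equation to the usual data-fitting term. The sketch below shows that structure on a toy problem. The ODE dx/dt = -x, the central-difference approximation, the weighting parameter lam, and all function names are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def data_loss(pred, obs):
    """Mean squared error between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred)


def physics_loss(pred, dt, rhs):
    """Mean squared residual of dx/dt = rhs(x) along a predicted trajectory,
    with dx/dt approximated by central differences on interior points."""
    if len(pred) < 3:
        raise ValueError("need at least three points for central differences")
    residuals = []
    for i in range(1, len(pred) - 1):
        dxdt = (pred[i + 1] - pred[i - 1]) / (2.0 * dt)
        residuals.append((dxdt - rhs(pred[i])) ** 2)
    return sum(residuals) / len(residuals)


def total_loss(pred, obs, dt, rhs, lam=1.0):
    """Data term plus lam times the physics residual term."""
    return data_loss(pred, obs) + lam * physics_loss(pred, dt, rhs)


# Toy check: a trajectory that obeys dx/dt = -x versus one that does not.
dt = 0.01
times = [i * dt for i in range(101)]
consistent = [math.exp(-t) for t in times]   # exact solution of the toy ODE
inconsistent = [1.0 - t for t in times]      # straight line, violates the ODE
decay = lambda x: -x
```

In a training loop, `pred` would be the network output and `total_loss` the quantity being minimized; the physics term keeps predictions close to the assumed dynamics even where observations are sparse.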
Searching for Heavy-Tailed Probability Distributions for Modeling Real-World Complex Networks
2022, Chakraborty, Tanujit, Chattopadhyay, Swarup, Das, Suchismita, Kumar, Uttam, Senthilnath, J.
Perhaps the most controversial recent topic in network science research is whether real-world complex networks are scale-free. Recently, Broido and Clauset [A.D. Broido, A. Clauset, Nature Communications, 10, 1017 (2019)] asserted that the degree distributions of real-world networks are rarely power law under statistical tests. Such complex networks, including social, biological, information, temporal, and brain networks, are often heavy-tailed, and the assumption of the scale-free nature of real-world heavy-tailed networks becomes insignificant as the complex system evolves over time. The failure of the power law distribution in fitting the degree distribution data is mainly due to the presence of an identifiable non-linearity within the entire degree distribution in a log-log scale of a complex heavy-tailed network. In this study, we attempt to address this issue by proposing a new class of heavy-tailed probability distributions for modeling the entire degree distributions of complex networks. We introduce a new family of generalized Lomax models (GLM) to capture the non-linearity of these heavy-tailed networks. These newly introduced GLM-type distributions provide better fitting and greater flexibility to the entire node degree distribution of complex networks. Several statistical properties of the proposed model, such as extreme value and inferential statistical properties, are derived in this context. Interestingly, the GLM family belongs to the basin of attraction of the Fréchet distribution, a heavy-tailed extreme value distribution. Rigorous experimental analysis showcases the excellent performance of the proposed family of distributions while fitting heavy-tailed real-world complex networks over fifty real-world datasets in comparison with benchmark probability models. Our results show that GLM-type distributions are not rare and are able to model almost 90% of the tested networks accurately compared to benchmark probability models.
INDEX TERMS: Complex networks, heavy-tailed networks, degree distribution, Lomax distribution, extreme value properties.
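The log-log non-linearity discussed in this abstract is already visible in the ordinary Lomax (Pareto type II) distribution that the GLM family extends. The sketch below evaluates the Lomax survival function S(x) = (1 + x/sigma)^(-alpha) and its local slope on a log-log plot, d log S / d log x = -alpha * x / (sigma + x). The parameter names alpha and sigma are notation chosen here, and the generalized family itself is not reproduced because the abstract does not give its form.

```python
import math


def lomax_survival(x: float, alpha: float, sigma: float) -> float:
    """Survival function of the Lomax (Pareto type II) distribution, x >= 0."""
    return (1.0 + x / sigma) ** (-alpha)


def loglog_slope(x: float, alpha: float, sigma: float) -> float:
    """Local slope of log S(x) against log x: -alpha * x / (sigma + x)."""
    return -alpha * x / (sigma + x)


def numeric_loglog_slope(x: float, alpha: float, sigma: float, eps: float = 1e-6) -> float:
    """Finite-difference check of the slope, taken in log-log coordinates."""
    lo, hi = x * math.exp(-eps), x * math.exp(eps)
    return (
        math.log(lomax_survival(hi, alpha, sigma))
        - math.log(lomax_survival(lo, alpha, sigma))
    ) / (2.0 * eps)
```

A pure power law would have a constant slope. Here the slope is close to zero for small x, equals -alpha / 2 at x = sigma, and approaches -alpha only in the far tail, so the curve bends on a log-log plot while remaining heavy-tailed.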