  • Publication
    A New Method for Generalizing Burr and Related Distributions
    (2022) ;
    Das, Suchismita
    ;
    Chattopadhyay, Swarup
    A new method is proposed to generalize the Burr-XII distribution, also called the Burr distribution, by adding an extra parameter to an existing Burr distribution for more flexibility. In this method, the exponent of the Burr distribution is modeled using a nonlinear function of the data and one additional parameter. The models of this newly introduced generalized Burr family can significantly increase the flexibility of the former Burr distribution with respect to the density and hazard rate shapes. Families expanded using the method proposed here are heavy-tailed and belong to the maximum domain of attraction of the Fréchet distribution. The method is further applied to yield three-parameter classical Pareto and generalized exponentiated distributions, which shows the broader applicability of the proposed idea of generalization. A relevant model of the new generalized Burr family is considered in detail, with particular emphasis on the hazard functions, stochastic orders, estimation procedures, and testing methods. Finally, as empirical evidence, the new distribution is applied to the analysis of large-scale heavy-tailed network data and compared with other commonly used distributions for fitting degree distributions of networks. Experimental results suggest that the proposed Burr distribution with nonlinear exponent fits large-scale heavy-tailed networks better than the popularly used Marshall-Olkin generalization of Burr and exponentiated Burr distributions.
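For orientation, the baseline that the abstract above generalizes can be written out. The block below gives the standard two-parameter Burr-XII form, with shape parameters c and k as assumed notation; it does not reproduce the paper's nonlinear-exponent family, which the abstract does not specify.

```latex
% Standard Burr-XII distribution (assumed notation: shape parameters c, k > 0)
\begin{align*}
F(x) &= 1 - \left(1 + x^{c}\right)^{-k}, \qquad x > 0, \\
f(x) &= c\,k\,x^{c-1}\left(1 + x^{c}\right)^{-(k+1)}, \\
h(x) &= \frac{f(x)}{1 - F(x)} = \frac{c\,k\,x^{c-1}}{1 + x^{c}}, \\
1 - F(x) &\sim x^{-ck} \quad \text{as } x \to \infty .
\end{align*}
```

The last line says the survival function is regularly varying with index -ck, which is the standard condition for membership in the maximum domain of attraction of the Fréchet distribution.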
  • Publication
    An ensemble neural network approach to forecast Dengue outbreak based on climatic condition
    (2023)
    Panja, Madhurima
    ;
    ;
    Nadim, Sk Shahid
    ;
    Ghosh, Indrajit
    ;
    Kumar, Uttam
    ;
    Liu, Nan
    Dengue fever is a virulent disease spreading over 100 tropical and subtropical countries in Africa, the Americas, and Asia. This arboviral disease affects around 400 million people globally, severely distressing the healthcare systems. The unavailability of a specific drug and ready-to-use vaccine makes the situation worse. Hence, policymakers must rely on early warning systems to control intervention-related decisions. Forecasts routinely provide critical information for dangerous epidemic events. However, the available forecasting models (e.g., weather-driven mechanistic, statistical time series, and machine learning models) lack a clear understanding of different components to improve prediction accuracy and often provide unstable and unreliable forecasts. This study proposes an ensemble wavelet neural network with exogenous factor(s) (XEWNet) model that can produce reliable estimates for dengue outbreak prediction for three geographical regions, namely San Juan, Iquitos, and Ahmedabad. The proposed XEWNet model is flexible and can easily incorporate exogenous climate variable(s) confirmed by statistical causality tests in its scalable framework. The proposed model is an integrated approach that incorporates wavelet transformation into an ensemble neural network framework, which helps in generating more reliable long-term forecasts. The proposed XEWNet allows complex non-linear relationships between the dengue incidence cases and rainfall while remaining mathematically interpretable, fast in execution, and easily comprehensible. The proposal's competitiveness is measured using computational experiments based on various statistical metrics and several statistical comparison tests. In comparison with statistical, machine learning, and deep learning methods, our proposed XEWNet performs better in 75% of the cases for short-term and long-term forecasting of dengue incidence.
  • Publication
    Bayesian neural tree models for nonparametric regression
    (2023) ;
    Kamat, Gauri
    ;
    Chakraborty, Ashis Kumar
    Frequentist and Bayesian methods differ in many aspects but share some basic optimal properties. In real-life prediction problems, situations exist in which a model based on one of the above paradigms is preferable depending on some subjective criteria. Nonparametric classification and regression techniques, such as decision trees and neural networks, have both frequentist (classification and regression trees (CARTs) and artificial neural networks) and Bayesian (Bayesian CART and Bayesian neural networks) approaches to learning from data. In this paper, we present two hybrid models combining the Bayesian and frequentist versions of CART and neural networks, which we call the Bayesian neural tree (BNT) models. BNT models can simultaneously perform feature selection and prediction, are highly flexible, and generalise well in settings with limited training observations. We study the statistical consistency of the proposed approaches and derive the optimal value of a vital model parameter. The excellent performance of the newly proposed BNT models is shown using simulation studies. We also provide some illustrative examples using a wide variety of standard regression datasets from a publicly available machine learning repository to show the superiority of the proposed models in comparison to popularly used Bayesian CART and Bayesian neural network models.
  • Publication
    Epicasting: An Ensemble Wavelet Neural Network for forecasting epidemics
    (2023)
    Panja, Madhurima
    ;
    ;
    Kumar, Uttam
    ;
    Liu, Nan
    Infectious diseases remain among the top contributors to human illness and death worldwide, and many of them produce epidemic waves of infection. The lack of specific drugs and ready-to-use vaccines to prevent most of these epidemics worsens the situation. This forces public health officials and policymakers to rely on early warning systems generated by accurate and reliable epidemic forecasters. Accurate forecasts of epidemics can assist stakeholders in tailoring countermeasures, such as vaccination campaigns, staff scheduling, and resource allocation, to the situation at hand, which could translate to reductions in the impact of a disease. Unfortunately, most of these past epidemics exhibit nonlinear and non-stationary characteristics due to their spreading fluctuations based on seasonal-dependent variability and the nature of these epidemics. We analyze various epidemic time series datasets using a maximal overlap discrete wavelet transform (MODWT) based autoregressive neural network and call it the Ensemble Wavelet Neural Network (EWNet) model. MODWT techniques effectively characterize non-stationary behavior and seasonal dependencies in the epidemic time series and improve the nonlinear forecasting scheme of the autoregressive neural network in the proposed ensemble wavelet network framework. From a nonlinear time series viewpoint, we explore the asymptotic stationarity of the proposed EWNet model to show the asymptotic behavior of the associated Markov Chain. We also theoretically investigate the effect of learning stability and the choice of hidden neurons in the proposal. From a practical perspective, we compare our proposed EWNet framework with twenty-two statistical, machine learning, and deep learning models for fifteen real-world epidemic datasets with three test horizons using four key performance indicators. Experimental results show that the proposed EWNet is highly competitive compared to the state-of-the-art epidemic forecasting methods.
  • Publication
    Knowledge-based Deep Learning for Modeling Chaotic Systems
    (2022)
    Elabid, Zakaria
    ;
    ;
    Deep Learning has received increased attention due to its unbeatable success in many fields, such as computer vision, natural language processing, recommendation systems, and most recently in simulating multiphysics problems and predicting nonlinear dynamical systems. However, modeling and forecasting the dynamics of chaotic systems remains an open research problem since training deep learning models requires big data, which is not always available in many cases. Such deep learners can be trained from additional information obtained from simulated results and by enforcing the physical laws of the chaotic systems. This paper considers extreme events and their dynamics and proposes elegant models based on deep neural networks, called knowledge-based deep learning (KDL). Our proposed KDL can learn the complex patterns governing chaotic systems by jointly training on real and simulated data directly from the dynamics and their differential equations. This knowledge is transferred to model and forecast real-world chaotic events exhibiting extreme behavior. We validate the efficiency of our model by assessing it on three real-world benchmark datasets: El Niño sea surface temperature, San Juan Dengue viral infection, and Bjørnøya daily precipitation, all governed by extreme events' dynamics. Using prior knowledge of extreme events and physics-based loss functions to lead the neural network learning, we ensure physically consistent, generalizable, and accurate forecasting, even in a small data regime.
    Index Terms: Chaotic systems, long short-term memory, deep learning, extreme event modeling.
  • Publication
    Optimized ensemble deep learning framework for scalable forecasting of dynamics containing extreme events
    (2021)
    Ray, Arnob
    ;
    ;
    Ghosh, Dibakar
    The remarkable flexibility and adaptability of both deep learning models and ensemble methods have led to the proliferation of their applications in understanding many physical phenomena. Traditionally, these two techniques have largely been treated as independent methodologies in practical applications. This study develops an optimized ensemble deep learning framework wherein these two machine learning techniques are jointly used to achieve synergistic improvements in model accuracy, stability, scalability, and reproducibility, prompting a new wave of applications in the forecasting of dynamics. Unpredictability is considered one of the key features of chaotic dynamics; therefore, forecasting such dynamics of nonlinear systems is a relevant issue in the scientific community. It becomes more challenging when the prediction of extreme events is the focus. In this circumstance, the proposed optimized ensemble deep learning (OEDL) model, based on the best convex combination of feed-forward neural networks, reservoir computing, and long short-term memory, can play a key role in advancing predictions of dynamics consisting of extreme events. The combined framework generates better out-of-sample performance than the individual deep learners and the standard ensemble framework for both numerically simulated and real-world data sets. We exhibit the outstanding performance of the OEDL framework for forecasting extreme events generated from a Liénard-type system, prediction of COVID-19 cases in Brazil, dengue cases in San Juan, and sea surface temperature in the Niño 3.4 region.
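The "best convex combination" idea in the abstract above can be illustrated with a minimal sketch. It assumes that three base forecasters have already produced validation forecasts and that the weights are chosen by a plain grid search over the probability simplex to minimize mean squared error. The function names, the toy numbers, and the grid search itself are illustrative assumptions, not the optimization procedure used in the paper.

```python
from itertools import product


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def combine(forecasts, weights):
    """Weighted sum of the base forecasts at every time step."""
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(horizon)]


def best_convex_weights(actual, forecasts, steps=20):
    """Grid-search non-negative weights summing to 1 that minimize MSE.

    Hypothetical helper: a coarse search chosen for clarity, not speed.
    """
    k = len(forecasts)
    best_w, best_err = None, float("inf")
    # Enumerate the first k-1 weights on a grid; the last one is the remainder.
    for combo in product(range(steps + 1), repeat=k - 1):
        if sum(combo) > steps:
            continue  # would make the last weight negative
        w = [c / steps for c in combo] + [(steps - sum(combo)) / steps]
        err = mse(actual, combine(forecasts, w))
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


# Toy validation data (made up for illustration only).
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
forecasts = [
    [1.2, 2.1, 3.3, 4.2, 5.4],  # stand-in for a model that over-predicts
    [0.8, 1.9, 2.6, 3.9, 4.5],  # stand-in for a model that under-predicts
    [1.0, 2.5, 2.5, 4.5, 4.5],  # stand-in for a noisier model
]
weights, error = best_convex_weights(actual, forecasts)
```

Because the grid contains the corners of the simplex (all weight on one model), the selected combination can never have a higher validation error than the best single forecaster on the same data.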
  • Publication
    Prediction of transportation index for urban patterns in small and medium-sized Indian cities using hybrid RidgeGAN model
    (2023)
    Thottolil, Rahisha
    ;
    Kumar, Uttam
    ;
    The rapid urbanization trend in most developing countries including India is creating a plethora of civic concerns such as loss of green space, degradation of environmental health, scarcity of clean water, rise in air pollution, and exacerbated traffic congestion resulting in significant delays in vehicular transportation. To address the intricate nature of transportation issues, many researchers and planners have analyzed the complexities of urban and regional road systems using transportation models by employing transportation indices such as road length, network density, accessibility, and connectivity metrics. This study addresses the complexities of predicting road network density for small and medium-sized Indian cities that come under the Integrated Development of Small and Medium Towns (IDSMT) project at a national level. A hybrid framework based on Kernel Ridge Regression (KRR) and the CityGAN model is introduced to predict network density using spatial indicators of human settlements. The major goal of this study is to generate hyper-realistic urban patterns of small and medium-sized Indian cities using an unsupervised CityGAN model and to study the causal relationship between human settlement indices (HSIs) and transportation index (network density) using supervised KRR for the real cities. The synthetic urban universes mimic Indian urban patterns, and evaluating their landscape structures through the settlement indices can aid in comprehending the urban landscape, thereby enhancing sustainable urban planning. We analyzed 503 real cities to find the actual relationship between the urban settlements and their road density. The nonlinear KRR model may help urban planners in deriving the network density for GAN-generated futuristic urban patterns through the settlement indicators. The proposed hybrid process, termed the RidgeGAN model, can gauge the sustainability of urban sprawl tied to infrastructure and transportation systems in sprawling cities. Analysis results clearly demonstrate the utility of RidgeGAN in predicting network density for different kinds of human settlements, particularly for small and medium Indian cities. By predicting future urban patterns, this study can help in the creation of more livable and sustainable areas, particularly by improving transportation infrastructure in developing cities.
  • Publication
    Probabilistic AutoRegressive Neural Networks for Accurate Long-Range Forecasting
    (2023)
    Panja, Madhurima
    ;
    ;
    Kumar, Uttam
    ;
    Forecasting time series data is a critical area of research with applications spanning from stock prices to early epidemic prediction. While numerous statistical and machine learning methods have been proposed, real-life prediction problems often require hybrid solutions that bridge classical forecasting approaches and modern neural network models. In this study, we introduce a Probabilistic AutoRegressive Neural Network (PARNN), capable of handling complex time series data exhibiting non-stationarity, nonlinearity, non-seasonality, long-range dependence, and chaotic patterns. PARNN is constructed by improving autoregressive neural networks (ARNN) using autoregressive integrated moving average (ARIMA) feedback error. Notably, the PARNN model provides uncertainty quantification through prediction intervals and conformal predictions, setting it apart from advanced deep learning tools. Through comprehensive computational experiments, we evaluate the performance of PARNN against standard statistical, machine learning, and deep learning models. Diverse real-world datasets from macroeconomics, tourism, epidemiology, and other domains are employed for short-term, medium-term, and long-term forecasting evaluations. Our results demonstrate the superiority of PARNN across various forecast horizons, surpassing the state-of-the-art forecasters. The proposed PARNN model offers a valuable hybrid solution for accurate long-range forecasting. The ability to quantify uncertainty through prediction intervals further enhances the model’s usefulness in various decision-making processes.
  • Publication
    Searching for Heavy-Tailed Probability Distributions for Modeling Real-World Complex Networks
    (2022) ;
    Chattopadhyay, Swarup
    ;
    Das, Suchismita
    ;
    Kumar, Uttam
    ;
    Senthilnath, J.
    Perhaps the most recent controversial topic in network science research is to determine whether real-world complex networks are scale-free or not. Recently, Broido and Clauset [A.D. Broido, A. Clauset, Nature Communications, 10, 1017 (2019)] asserted that the degree distributions of real-world networks are rarely power law under statistical tests. Such complex networks, including social, biological, information, temporal, and brain networks, are often heavy-tailed, where the assumption on the scale-free nature of real-world heavy-tailed networks becomes insignificant as the complex system evolves over time. The failure of the power law distribution in fitting the degree distribution data is mainly due to the presence of an identifiable non-linearity within the entire degree distribution in a log-log scale of a complex heavy-tailed network. In this study, we attempt to address this issue by proposing a new class of heavy-tailed probability distributions for modeling the entire degree distributions of complex networks. We introduce a new family of generalized Lomax models (GLM) to capture the non-linearity of these heavy-tailed networks. These newly introduced GLM-type distributions provide better fitting and greater flexibility to the entire node degree distribution of complex networks. Several statistical properties of the proposed model, such as extreme value and inferential statistical properties, are derived in this context. Interestingly, the GLM family belongs to the basin of attraction of the Fréchet distribution, a heavy-tailed extreme value distribution. Rigorous experimental analysis showcases the excellent performance of the proposed family of distributions while fitting the heavy-tailed real-world complex networks over fifty real-world datasets in comparison with benchmark probability models. Our results show that GLM-type distributions are not rare: they model almost 90% of the tested networks accurately compared to benchmark probability models.
    Index Terms: Complex networks, heavy-tailed networks, degree distribution, Lomax distribution, extreme value properties.
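The log-log non-linearity mentioned in the abstract above can be seen even in the ordinary two-parameter Lomax (Pareto type II) distribution, used here as a simpler stand-in for the paper's generalized Lomax family. The sketch below numerically estimates the slope of log survival against log x at several points; the parameter values and function names are assumptions chosen for illustration.

```python
import math


def lomax_survival(x, alpha, scale):
    """P(X > x) for a Lomax (Pareto type II) distribution."""
    return (1.0 + x / scale) ** (-alpha)


def loglog_slope(x, alpha, scale, h=1e-4):
    """Central-difference slope of log S(x) with respect to log x."""
    lo, hi = x * math.exp(-h), x * math.exp(h)
    rise = math.log(lomax_survival(hi, alpha, scale)) - math.log(
        lomax_survival(lo, alpha, scale)
    )
    return rise / (2.0 * h)


# Illustrative parameter values (assumed, not taken from the paper).
alpha, scale = 2.5, 10.0

# A pure power law would give the same slope everywhere; the Lomax does not.
slopes = {x: loglog_slope(x, alpha, scale) for x in (1, 10, 100, 10_000)}
for x, s in slopes.items():
    print(f"x = {x:>6}: local log-log slope = {s:.4f}")
```

Analytically the local slope is -alpha * x / (scale + x), so the curve bends for small x and only approaches the straight power-law slope -alpha in the far tail. That curvature is what a single power law fitted to the whole degree distribution cannot capture.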
  • Publication
    Stochastic forecasting of COVID-19 daily new cases across countries with a novel hybrid time series model
    (2022) ;
    Rai, Shesh N.
    ;
    Bhattacharyya, Arinjita
    An unprecedented outbreak of the novel coronavirus (COVID-19) in the form of peculiar pneumonia has spread globally since its first case in Wuhan, China, in December 2019. Soon after, the infected cases and mortality increased rapidly. The future of the pandemic’s progress was uncertain, and thus, predicting it became crucial for public health researchers. These predictions support the effective allocation and stockpiling of health-care resources and help clinicians, government authorities, and public health policymakers with strategic planning once the extent of the effect is understood. The main objective of this paper is to develop a hybrid forecasting model that can generate real-time out-of-sample forecasts of COVID-19 outbreaks for five profoundly affected countries, namely the USA, Brazil, India, the UK, and Canada. A novel hybrid approach based on the Theta method and autoregressive neural network (ARNN) model, named the Theta-ARNN (TARNN) model, is developed. Daily new cases of COVID-19 are nonlinear, non-stationary, and volatile; thus, a single specific model cannot be ideal for future prediction of the pandemic. However, the newly introduced hybrid forecasting model with an acceptable prediction error rate can help healthcare and government authorities with effective planning and resource allocation. The proposed method outperforms traditional univariate and hybrid forecasting models for the test datasets on average.