Statistics and Actuarial Science

Permanent URI for this collection: https://uwspace.uwaterloo.ca/handle/10012/9934

This is the collection for the University of Waterloo's Department of Statistics and Actuarial Science.

Research outputs are organized by type (e.g., Master's Thesis, Article, Conference Paper).

Waterloo faculty, students, and staff can contact us or visit the UWSpace guide to learn more about depositing their research.

Recent Submissions

Now showing 1 - 20 of 393
  • Item
    Statistical Analyses of Lumber Strength Properties and a Likelihood-Free Method using Empirical Likelihood
    (University of Waterloo, 2025-04-29) Yang, Yunfeng
    Wood materials should meet expected strength and reliability standards for safe and stable construction. The strength of lumber and wood products may degrade over time due to sustained applied stresses, a phenomenon known as the duration-of-load (DOL) effect. The inherent variability of lumber, combined with DOL, makes structural reliability analyses particularly challenging. This thesis develops statistical methodologies to address these challenges, focusing on reliability analysis, wood strength modeling, and likelihood-free inference. Chapter 2 evaluates the reliability of lumber, accounting for the DOL effect under different load profiles based on a multimodel Bayesian framework. Three individual DOL models previously used for reliability assessment are considered: the US model, the Canadian model, and the Gamma process model. Procedures for stochastic generation of residential, snow, and wind loads are also described. We propose Bayesian model-averaging (BMA) as a method for combining the reliability estimates of individual models under a given load profile that coherently accounts for statistical uncertainty in the choice of model and parameter values. The method is applied to the analysis of a Hemlock experimental dataset, where the BMA results are illustrated via estimated reliability indices together with 95% interval bands. Chapter 3 explores proof-loading experiments, another industrial procedure for ensuring lumber reliability and quality, besides the DOL experiment from Chapter 2. In proof-loading, a pre-determined load is applied to remove weak specimens, but this may also weaken the surviving specimens (survivors) — a phenomenon we term the damage effect. To capture and assess this effect, we propose a statistical framework that includes a damage model and a likelihood ratio test, offering advantages over existing methods by directly quantifying the damage effect. When applied to experimental data, the proposed framework successfully detects and measures the damage effect while showing good model fit. The framework also provides correlation estimates between strength properties, potentially reducing monitoring costs in industry. Chapter 4 investigates statistical models with intractable likelihoods, such as the Canadian model discussed in Chapter 2. To address the challenge they pose to parameter inference, various likelihood-free methods have been developed, including a recently proposed synthetic empirical likelihood (SEL) approach. We introduce a new SEL estimator based on the reparametrization trick, which greatly reduces the computational burden. The asymptotic property of our SEL estimator is derived for the situation where the number of parameters equals the number of summary statistics, leading to a method that is not only faster, but also yields more accurate uncertainty quantification than conventional MCMC. The SEL approach is further extended by incorporating exponential tilting, which empirically improves performance when summary statistics outnumber parameters. Simulation studies validate the robustness and efficiency of our approach across various scenarios.
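    As a rough illustration of the model-averaging step described above, the sketch below combines per-model reliability estimates using posterior model probabilities; the three model names, the posterior draws, and the probabilities are hypothetical placeholders, not values from the Hemlock analysis or the thesis's implementation.

```python
import numpy as np

def bma_reliability(post_model_probs, reliability_draws, n_mix=10000, seed=0):
    """Combine per-model reliability estimates by Bayesian model averaging (BMA).

    post_model_probs : dict, model name -> posterior model probability
    reliability_draws: dict, model name -> 1-D array of posterior draws of the
                       reliability index under that model
    Returns the BMA point estimate and an approximate 95% interval, obtained by
    resampling draws with probability proportional to the model probabilities.
    """
    rng = np.random.default_rng(seed)
    names = list(post_model_probs)
    probs = np.array([post_model_probs[m] for m in names], dtype=float)
    probs /= probs.sum()                                   # normalize, just in case

    # BMA point estimate: probability-weighted average of per-model means
    point = sum(p * reliability_draws[m].mean() for p, m in zip(probs, names))

    # Approximate the BMA posterior by mixing draws across models
    picks = rng.choice(len(names), size=n_mix, p=probs)    # choose a model per draw
    mixed = np.array([rng.choice(reliability_draws[names[k]]) for k in picks])
    lower, upper = np.percentile(mixed, [2.5, 97.5])
    return point, (lower, upper)

# Hypothetical posterior draws of a reliability index under three DOL models
rng = np.random.default_rng(1)
draws = {"US": rng.normal(2.8, 0.15, 5000),
         "Canadian": rng.normal(3.0, 0.20, 5000),
         "Gamma": rng.normal(2.9, 0.10, 5000)}
print(bma_reliability({"US": 0.2, "Canadian": 0.5, "Gamma": 0.3}, draws))
```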
  • Item
    Deep Learning Frameworks for Anomaly Detection in Time Series and Graphs with Limited Labels
    (University of Waterloo, 2025-04-29) Chen, Jiazhen
    Anomaly detection involves identifying patterns or behaviors that substantially differ from normal instances in a dataset. It has a wide range of applications in diverse fields such as cybersecurity, manufacturing, finance, and e-commerce. However, real-world anomaly detection often grapples with two main challenges: label scarcity, as anomalies are rare and hard to label, and the complexity of data structures, which can involve intricate dependencies that require careful analysis. In this thesis, we develop deep learning frameworks designed to work effectively in label-free or extremely limited labeling scenarios, with a focus on time series anomaly detection (TSAD) and graph anomaly detection (GAD). To overcome the issue of label scarcity, our initial work investigates unsupervised TSAD methods that extract meaningful patterns from abundant unlabeled data. Building on recent advances in contrastive learning from NLP and computer vision, we introduce the Contrastive Neural Transformation (CNT) framework. This approach integrates temporal contrastive learning with neural transformations to capture context-aware and discriminative patterns effectively. Moreover, the dual-loss formulation prevents representation collapse by avoiding reliance on negative samples, a common challenge in anomaly detection, where the majority of instances represent normal behavior. While capturing temporal context is essential, understanding inter-series relationships is equally important in multivariate TSAD. Anomalies may seem normal in isolation but reveal abnormal patterns when compared to other series. To address this, we introduce DyGraphAD, a dynamic graph-driven dual forecasting framework that models both intra- and inter-series dependencies through a combination of graph and time series forecasting tasks. This allows anomalies to be detected via significant forecasting errors in both the channel-wise and time-wise dimensions. To further enhance computational efficiency, we propose an alternative framework, termed Prospective Multi-Graph Cohesion (PMGC). PMGC leverages graph structure learning to model inter-series relationships in a task-specific manner, reducing computational load compared to the manual sequential graph construction in DyGraphAD. Furthermore, it introduces a multi-graph cohesion mechanism to adaptively learn both long-term dependencies and diverse short-term relationships. A prospective graphing strategy is also introduced to encourage the model to capture concurrent inter-series relationships, reducing sole reliance on historical data. Beyond TSAD, GAD is also critical due to its prevalence in numerous applications. Graphs provide structural information alongside node and edge attributes, and understanding the interplay between graph structure and attributes is essential for uncovering subtle anomalies not apparent when examining nodes or edges alone. Given that obtaining labeled data is relatively more feasible in graphs than in time series data for experimental purposes, we focus on GAD settings with limited labeling, which better reflect practical real-world scenarios. Specifically, we make the first attempt to address GAD in cross-domain few-shot settings, aiming to detect anomalies in a sparsely labeled target graph by leveraging a related but distinct source graph. 
To handle domain shifts, our CDFS-GAD framework incorporates domain-adaptive graph contrastive learning and domain-specific prompt tuning, aiming to align features across the two domains while preserving the unique characteristics of each domain. A domain-adaptive hypersphere classification loss and a self-training phase are introduced to further refine predictions in the target domain by exploiting the limited labeling information. In addition to static graphs, many real-world applications involve dynamic graph data, where both the structure and attributes evolve over time. This adds complexity to anomaly detection, as both temporal and structural variations must be accounted for. Moreover, obtaining sufficient labeled data remains challenging, and related-domain labeled data may not be available in certain scenarios. To tackle these two practical issues, we propose the EL$^2$-DGAD framework, specifically designed for detecting anomalies in dynamic graphs under extremely limited labeling conditions. This framework enhances model robustness through a transformer-based temporal graph encoder that captures evolving patterns from local and global perspectives. An ego-context hypersphere classification loss is further introduced to adjust the anomaly detection boundary contextually under limited supervision, supplemented by an ego-context contrasting module to improve generalization with unlabeled data. Overall, this thesis tackles anomaly detection for two commonly used data types, addressing unsupervised, semi-supervised, and cross-domain few-shot scenarios to meet the demands of real-world applications. Our extensive experiments show that the proposed frameworks perform well on various benchmark datasets against competitive anomaly detection baselines.
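    The sketch below is a generic, illustrative take on the idea of combining learnable transformations with a contrastive objective that needs no negative samples from other windows; it is not the CNT architecture or loss from the thesis, and the encoder embeddings, dimensions, and loss terms are assumptions made purely for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransformContrastLoss(nn.Module):
    """Toy loss: K learnable linear transformations produce K views of a window
    embedding; each view is pulled toward the context embedding of the same
    window (alignment) while the K views are kept mutually distinct
    (diversity), so no negative samples from other windows are required."""

    def __init__(self, dim, k=4):
        super().__init__()
        self.transforms = nn.ModuleList([nn.Linear(dim, dim) for _ in range(k)])

    def forward(self, z_context, z_window):
        # z_context, z_window: (batch, dim) embeddings from some upstream encoder
        views = torch.stack([F.normalize(t(z_window), dim=-1)
                             for t in self.transforms], dim=1)        # (batch, K, dim)
        ctx = F.normalize(z_context, dim=-1).unsqueeze(1)              # (batch, 1, dim)
        align = (1.0 - (views * ctx).sum(-1)).mean()                   # cosine alignment
        gram = views @ views.transpose(1, 2)                           # (batch, K, K)
        off_diag = gram - torch.diag_embed(torch.diagonal(gram, dim1=1, dim2=2))
        diversity = off_diag.pow(2).mean()                             # discourage collapse
        return align + diversity

# Random embeddings just to show the call signature
loss_fn = TransformContrastLoss(dim=16)
z_c, z_w = torch.randn(8, 16), torch.randn(8, 16)
print(loss_fn(z_c, z_w).item())
```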
  • Item
    Contributions to Change Point and Functional Data Analysis
    (University of Waterloo, 2025-04-29) VanderDoes, Jeremy
    The advent and progression of computers have led to the consideration of data previously considered too unwieldy. So-called high-dimensional, or big, data can be large in both the size and the number of observations. In this thesis, we consider such data, which may be infinite-dimensional and are often collected over some dimension, such as time. Methodology for detecting changes in, and exploring, this information-rich data is developed. Chapter 1 provides a review of concepts and notation used throughout the thesis. Topics related to time series, functional data, and change point analysis are of particular interest and form the foundation of the thesis. The chapter concludes with an overview of the main contributions contained in the thesis. An empirical characteristic functional-based method for detecting distributional change points in functional time series is presented in Chapter 2. Although various methods exist to detect changes in functional time series, they typically require projection or are tuned to specific changes. The characteristic functional-based approach is fully functional and sensitive to general changes in the distribution of functional time series. Integrated- and supremum-type test statistics are proposed. Theoretical considerations for the test statistics are examined, including asymptotic distributions and the measure used to integrate the test statistic over the function space. Simulation, permutation, and approximation approaches to calibrate detection thresholds for the test statistics are investigated. Comparisons to existing methods are conducted via simulation experiments. The proposed methods are applied to continuous electricity prices and high-frequency asset returns. Chapter 3 is devoted to graph-based change point detection. Graph-based approaches provide another method for detecting distributional changes in functional time series. Four test statistics and their theoretical properties are discussed. Extensive simulations provide context for graph-based tuning parameter choices and compare the approaches to other functional change point detection methods. The efficacy of graph-based change point detection is demonstrated on multi-year pedestrian counts, high-resolution stock returns, and continuous electricity prices. Despite increased interest in functional time series, software implementations are largely missing. Practical considerations for applying functional change point detection are covered in Chapter 4. We present fChange, a functional time series package in R. The package combines and expands functional time series and change point methods into an easy-to-use format. The package provides functionality to store and process data, summarize and validate assumptions, characterize and perform inference of change points, and provide visualizations. The data are stored as discretely observed functions, promoting usability and accuracy. Applications to continuous electricity prices, cancer mortality, and long-term treasury rates are shown. In Chapter 5, we propose novel methodology for analyzing tumor microenvironments (TMEs) in cancer research. TMEs contain vast amounts of information on a patient's cancer through their cellular composition and the spatial distribution of tumor cells and immune cell populations. We present an approach to explore variation in TMEs, and determine the extent to which this information can predict outcomes such as patient survival or treatment success. 
Our approach can identify specific interactions which are useful in such predictions. We use spatial $K$ functions to summarize interactions, and then apply a functional random forest-based model. This approach is shown to be effective in simulation experiments at identifying important spatial interactions while also controlling the false discovery rate. We use the proposed approach to interrogate two real data sets of Multiplexed Ion Beam Images of TMEs in triple negative breast cancer and lung cancer patients. The publicly available companion R package funkycells is discussed. The random coefficient autoregressive model of order 1, RCA(1), is a model well-suited for volatile time series. Detection of changes between stable and explosive regimes of scalar data modeled with the RCA(1) is explored in Chapter 6. We derive a (maximally selected) likelihood ratio statistic and show that it has power against breaks occurring even as close as O(log log N) periods from the beginning or end of the sample. Moreover, the use of quasi maximum likelihood-based estimates yields better power properties, with the added bonus of being nuisance-free. Our test statistic has the same distribution, of the Darling–Erdős type, irrespective of whether the data are stationary or not, and can therefore be applied with no prior knowledge of stationarity. Our simulations show that the test has very good power and, when applying a suitable correction to the asymptotic critical values, the correct size. We illustrate the usefulness and generality of our approach through applications to economic and epidemiological time series. Chapter 7 provides summaries and discussions on each chapter. Directions for future work are considered. These directions, with the accompanying commentary, extend the scope of the models and may benefit practitioners and researchers alike.
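    For readers unfamiliar with fully functional change point statistics, the sketch below computes a basic integrated CUSUM statistic for a mean change in discretely observed curves; it is far simpler than the characteristic-functional and graph-based tests studied in the thesis, and the grid, shift, and lack of a calibrated threshold are illustrative assumptions.

```python
import numpy as np

def cusum_mean_change(X):
    """CUSUM-type scan for a single change in the mean function.

    X : array of shape (N, d) -- N curves observed on a common grid of d points.
    Returns a supremum-over-time, integrated-over-grid statistic and the
    estimated change location.  In practice the detection threshold would be
    calibrated by simulation or permutation.
    """
    N, d = X.shape
    total = X.sum(axis=0)
    stats = []
    for k in range(1, N):
        partial = X[:k].sum(axis=0)
        diff = partial - (k / N) * total            # CUSUM process at time k
        stats.append(np.sum(diff ** 2) / (N * d))   # integrated squared CUSUM
    k_hat = int(np.argmax(stats)) + 1
    return max(stats), k_hat

# Example: mean-function shift after curve 60
rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 50)
curves = rng.normal(size=(100, 50))
curves[60:] += 0.8 * np.sin(np.pi * grid)
print(cusum_mean_change(curves))
```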
  • Item
    Robust Methods and Model Selection for Causal Inference under Missingness of the Study Exposure
    (University of Waterloo, 2025-04-28) Shi, Yuliang
    The goal of this thesis is to develop new robust methods for the estimation of the causal effect and to propose a model selection algorithm when the exposure variable is not fully observed. We mainly discuss methods using propensity score (PS) and imputation approaches to address both missingness and confounding issues in observational datasets. How to deal with missing data in observational studies is a common concern for causal inference. However, if the exposure is missing at random (MAR), few approaches are available, and careful adjustments for both missingness and confounding are required to ensure a consistent estimate of the true causal effect on the response. In Chapter 2, a new inverse probability weighting (IPW) estimator based on weighted estimating equations (WEE) is proposed to incorporate weights from both the missingness and PS models, which can reduce the joint effect of extreme weights in finite samples. Additionally, we develop a triply robust (TR) estimator via WEE to protect against misspecification of the missingness model. The asymptotic properties of the WEE estimators are shown using properties of estimating equations. Based on simulation studies, WEE methods outperform others, including imputation-based approaches, in terms of bias and variability. Additionally, properly selecting the PS model is a popular topic and has been widely investigated in observational studies. However, there are very few studies investigating the model selection issue for estimating the causal effect when the exposure is MAR. In Chapter 3, we discuss how to select both the imputation and PS models so as to achieve the smallest root mean squared error (RMSE) of the estimated causal effect. Then, we provide a new criterion, called the “rank score”, for evaluating the overall performance of both models. The simulation studies show that combining the full imputation model with the outcome-related PS model leads to the smallest RMSE, and that the newly proposed criterion is able to pick the best models. Compared to the MAR assumption, the missing not at random (MNAR) assumption allows for association between the missingness and the exposure variable, making it a weaker and more reasonable assumption in some applied studies. Even though many researchers have discussed how to deal with missing outcomes or confounders in observational studies, very little discussion focuses on the missing exposure under the MNAR assumption. In Chapter 4, we propose IPW estimators using joint modelling, called “IPW-Joint”, to estimate the causal effect when the exposure is MNAR, which combine estimated weights from the missingness and PS models. Furthermore, to address the problem of model selection in the high-dimensional setting, we apply outcome-adaptive LASSO with weighted absolute mean difference (OAL-WAMD) as a new algorithm to select the outcome-related covariates when the exposure is MNAR. The simulation studies show that IPW-Joint has better robustness properties and smaller variance than the traditional IPW approach. In addition, OAL-WAMD outperforms the traditional LASSO in terms of higher true positive rates (TPR) and smaller variance in both low- and high-dimensional settings. In Chapter 5, we summarize the major findings and discuss the potential extensions. The proposed methodology can be applied to several areas, including new drug development, complex missing data problems and finding real-world evidence in observational data.
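    The following sketch shows the general shape of an inverse-probability-weighted estimate that multiplies a missingness weight by a propensity weight when the exposure is MAR; it is a plain plug-in estimator with hypothetical simulated data, not the WEE-based or triply robust estimators developed in the thesis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_missing_exposure(X, A, R, Y):
    """Sketch of an IPW estimate of the average causal effect E[Y(1)] - E[Y(0)]
    when the binary exposure A is missing at random (R = 1 if A is observed).
    Weights multiply the inverse probability of observing A by the inverse
    propensity of exposure, both fit by logistic regression on confounders X.
    """
    pi = LogisticRegression().fit(X, R).predict_proba(X)[:, 1]           # P(R=1 | X)
    obs = R == 1
    ps = LogisticRegression().fit(X[obs], A[obs].astype(int)).predict_proba(X)[:, 1]  # P(A=1 | X)

    Ao, Yo, pio, pso = A[obs], Y[obs], pi[obs], ps[obs]
    w1 = Ao / (pio * pso)                     # weights for exposed, observed subjects
    w0 = (1 - Ao) / (pio * (1 - pso))         # weights for unexposed, observed subjects
    return np.sum(w1 * Yo) / np.sum(w1) - np.sum(w0 * Yo) / np.sum(w0)

# Hypothetical simulated data with MAR exposure
rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 2))
A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])), n).astype(float)
Y = 1.5 * A + X[:, 1] + rng.normal(size=n)
R = rng.binomial(1, 1 / (1 + np.exp(-0.5 - 0.5 * X[:, 1])), n)           # MAR given X
A[R == 0] = np.nan
print(ipw_missing_exposure(X, A, R, Y))       # should be near the true effect 1.5
```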
  • Item
    Efficiency and Equilibria in Centralized and Decentralized Insurance Markets
    (University of Waterloo, 2025-04-28) Zhu, Michael Boyuan
    This thesis studies Pareto efficiency and market equilibrium in the context of insurance markets. Given a specific model of insurance markets, it is of great practical interest to identify those risk allocations that are deemed desirable by each agent in the market. Equally important are market equilibria, the allocations and prices that result from agents’ decisions given the structure of the market. The research presented in this thesis studies the relationship between these two concepts in both centralized and decentralized insurance markets. Throughout, we provide various characterization results for Pareto-efficient contracts and market equilibria in a variety of settings. These results are illustrated with numerical examples, including an in-depth application to markets for flood risk insurance.
  • Item
    High-Dimensional Scaling Limits of Online Stochastic Gradient Descent in Single-Index Models
    (University of Waterloo, 2025-04-25) Rangriz, Parsa
    We analyze the scaling limits of stochastic gradient descent (SGD) with a constant step size in the high-dimensional regime in single-index models. Specifically, we prove limit theorems for the trajectories of finite-dimensional summary statistics of SGD as the dimension tends to infinity. These scaling limits enable the analysis of both ballistic dynamics, described by a system of ordinary differential equations (ODEs), and diffusive dynamics, captured by a system of stochastic differential equations (SDEs). Additionally, we analyze a critical step-size scaling regime where, below this threshold, the effective ballistic dynamics align with the gradient flow of the population loss. In contrast, a new diffusive correction term appears at the threshold due to fluctuations around the fixed points. Furthermore, we discuss nearly sharp thresholds for the number of samples required for consistent estimation, which depend solely on an intrinsic property of the activation function known as the information exponent. Our main contribution is demonstrating that if a single-index model has an information exponent greater than two, the deterministic scaling limit, corresponding to the ballistic phase, or so-called dynamical mean-field theory in statistical physics, fails to achieve consistent estimation in high-dimensional inference problems. This shows the necessity of diffusive correction terms to accurately describe the dynamics of online SGD in single-index models via SDEs such as an Ornstein-Uhlenbeck process.
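    The toy simulation below illustrates the kind of object being studied: online SGD with a constant step size in a single-index model, with the low-dimensional overlap summary statistic recorded along the trajectory. The link function, dimensions, and step size are arbitrary choices for illustration, and the code does not reproduce the thesis's scaling-limit analysis.

```python
import numpy as np

# Illustrative only: online SGD for a single-index model
#   y = phi(<theta*, x> / sqrt(d)) + noise,
# tracking the finite-dimensional summary statistic m_t = <theta_t, theta*> / d.
# In the high-dimensional scaling limits discussed above, the trajectory of m_t
# is described by an ODE (ballistic phase) plus SDE-type diffusive corrections.
rng = np.random.default_rng(0)
d, n_steps, lr = 1000, 30000, 0.2
phi = np.tanh
dphi = lambda u: 1.0 - np.tanh(u) ** 2

theta_star = np.ones(d)
theta = rng.normal(size=d) * 0.1              # near-random initialization

overlaps = []
for t in range(n_steps):
    x = rng.normal(size=d)
    u_star = x @ theta_star / np.sqrt(d)
    u = x @ theta / np.sqrt(d)
    y = phi(u_star) + 0.1 * rng.normal()
    grad = (phi(u) - y) * dphi(u) * x / np.sqrt(d)   # gradient of 0.5*(phi(u) - y)^2
    theta -= lr * grad                                # constant step size
    if t % 1000 == 0:
        overlaps.append(theta @ theta_star / d)

print(np.round(overlaps, 3))   # overlap m_t drifting from near 0 toward its fixed point
```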
  • Item
    Statistical developments for network meta-analysis and methane emissions quantification
    (University of Waterloo, 2025-04-22) Wigle, Augustine
    This thesis provides statistical contributions to solve challenges in Network Meta-Analysis (NMA) and the quantification of methane emissions from the oil and gas industry. NMA is an extension of pairwise meta-analysis which facilitates the simultaneous comparison of multiple treatments using data from randomized controlled trials. Some treatments may involve combinations of components, such as one or more drugs given in different combinations. Component NMA (CNMA) is an extension of NMA which allows the estimation of the relative effects of components. In Chapter 2, we compare the popular Bayesian and frequentist approaches to additive CNMA and show that there is an important difference in the assumptions underlying these commonly used models. We prove that the most popular Bayesian CNMA model is prone to misspecification, while the frequentist approach makes a less restrictive assumption. We develop novel Bayesian CNMA models which avoid the restrictive assumption and are robust, and demonstrate in a simulation study that the proposed Bayesian models have favourable statistical properties compared to the existing Bayesian model. The use of all CNMA approaches is demonstrated on a published network. A commonly reported item in an NMA is a list of treatments ranked from most to least preferred, also known as a treatment hierarchy. In Chapter 3, we present the Precision Of Treatment Hierarchy (POTH), a metric which quantifies the level of certainty in a treatment hierarchy from Bayesian or frequentist NMA. POTH summarises the level of certainty into a single number between 0 and 1, making it simple to interpret regardless of the number of treatments in the network. We propose modifications of POTH which can be used to investigate the role of individual treatments or subsets of treatments in the level of certainty in the hierarchy. We calculate POTH for a database of published NMAs to investigate its distribution and relationships with network characteristics. We also provide an in-depth worked example to demonstrate the methods on a real dataset. In the second part of the thesis, we focus on some problems in the quantification of methane emissions from the oil and gas industry. Measurement-based methane inventories, which involve surveying oil and gas facilities and compiling data to estimate methane emissions, are becoming the gold standard for quantifying emissions. However, there is a current lack of statistical guidance for the design and analysis of such surveys. In Chapter 4, we propose the novel application of multi-stage survey sampling techniques to analyse measurement-based methane survey data, providing estimators of total and stratum-level emissions and an interpretable variance decomposition. We also suggest a potentially more efficient approach involving the Hajek estimator, and outline a simple Monte Carlo approach which can be combined with the multi-stage approach to incorporate measurement error. We investigate the performance of the multi-stage estimators in a simulation study and apply the methods to aerial survey data of oil and gas facilities in British Columbia, Canada, to estimate the methane emissions in the province. In Chapter 5, we introduce a Bayesian model for measurements from a methane quantification technology given a true emission rate. The models are fit using data collected in controlled releases (CR) of methane for six different technology types. 
We use a weighted bootstrap algorithm to provide the distribution of the true emission rate given a new measurement, which synthesizes the new measurement data with the CR data and external information about the possible true emission rate. We present results for the measurement uncertainty of six quantification technologies. Finally, we demonstrate the use of the weighted bootstrap algorithm with different priors and data.
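    As a simplified picture of the multi-stage design-based idea, the sketch below computes a Horvitz-Thompson-style estimate of total emissions from a stratified two-stage sample (facilities within strata, components within facilities); the stage structure, inclusion probabilities, and numbers are hypothetical, and the thesis's estimators, Hajek alternative, variance decomposition, and measurement-error adjustment are not reproduced here.

```python
import numpy as np

def two_stage_total(strata):
    """Horvitz-Thompson-style estimate of total emissions from a stratified
    two-stage design: within each stratum, n of N facilities are sampled at
    random, and at each sampled facility m of M emitting components are
    measured.  `strata` is a list of dicts with keys:
      N, n        -- facilities in the stratum / facilities sampled
      facilities  -- list of dicts with keys M, m, y (array of m measured rates)
    """
    total = 0.0
    for s in strata:
        fac_weight = s["N"] / s["n"]                  # first-stage expansion
        for f in s["facilities"]:
            comp_weight = f["M"] / f["m"]             # second-stage expansion
            total += fac_weight * comp_weight * np.sum(f["y"])
    return total

# Hypothetical toy survey: one stratum, 3 of 50 facilities visited
stratum = {"N": 50, "n": 3, "facilities": [
    {"M": 20, "m": 5, "y": np.array([0.4, 0.0, 1.2, 0.1, 0.0])},
    {"M": 35, "m": 7, "y": np.array([2.0, 0.3, 0.0, 0.0, 0.5, 0.1, 0.0])},
    {"M": 10, "m": 4, "y": np.array([0.0, 0.0, 0.2, 0.9])},
]}
print(two_stage_total([stratum]))   # estimated total emission rate (same units as y)
```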
  • Item
    Applications of Lévy Semistationary Processes to Storable Commodities
    (University of Waterloo, 2025-04-16) Lacoste-Bouchet, Simon
    Volatility Modulated Lévy-driven Volterra (VMLV) processes have been applied by Barndorff-Nielsen, Benth and Veraart (2013) to construct a new framework for modelling spot prices of non-storable commodities, namely energy. In this thesis, we extend this framework to storable commodities by showing that successful classical models belong to the framework albeit under some parameter restrictions (a result which to our knowledge is new). Additionally, we propose a new model for spot prices of storable commodities which is built on the VMLV processes and their important subclass of so-called Lévy semi-stationary (LSS) processes. The main feature of the framework exploited in the model proposed in this thesis is the memory of the VMLV processes which is used judiciously to account for cumulative changes in inventory over time and the corresponding expected changes in prices and volatility. To the best of our knowledge, this is the first study which uses the LSS processes to investigate pricing in storable (as opposed to non-storable) commodity markets to account for the impact of inventory on pricing. To complement the theoretical development of the new model, we also provide in this thesis a companion set of calibration and empirical analyses to shed light on the new model’s performance compared to previously established models in the literature.
  • Item
    Advances in the Analysis of Irregular Longitudinal Data Using Inverse Intensity Weighting
    (University of Waterloo, 2025-04-14) Tompkins, Grace
    The analysis of irregular longitudinal data can be complicated by the fact that the times at which individuals are observed in the data are related to the longitudinal outcome. For example, this can occur when patients are more likely to visit a clinician when their symptoms are worse. In such settings, the observation process is referred to as informative, and any analysis that ignores the observation process can be biased. Inverse intensity weighting (IIW) is a method that has been developed to handle specific cases of informative observation processes. IIW weights observations by the inverse probability of being observed at any given time, and creates a pseudopopulation in which the observation process is subsequently ignorable. IIW can also be easily combined with inverse probability of treatment weighting (IPTW) to handle non-ignorable treatment assignment processes. While IIW is relatively intuitive and easy to implement compared to other existing methods, there are few peer-reviewed papers examining IIW and its underlying assumptions. In this thesis, we begin by evaluating a flexible weighting method, FIPTIW, which combines IIW and IPTW through multiplication to handle informative observation processes and non-randomized treatment assignment processes. We show that the FIPTIW weighting method is sensitive to violations of the noninformative censoring assumption and that a previously proposed extension fails under such violations. We also show that variables confounding the observation and outcome processes should always be included in the observation intensity model. Finally, we show scenarios where weight trimming should and should not be used, and highlight sensitivities of the FIPTIW method to extreme weights. We also include an application of the methodology to a real data set to examine the impacts of household water sources on malaria diagnoses of children in Uganda. Next, we investigate the impact of missing data on the estimation of IIW weights, and evaluate the performance of existing missing data methods through empirical simulation. We show that there is no "one-size-fits-all" approach to handling missing data in the IIW model, and that the results are highly dependent on the type of covariates that are missing in the observation times model. We then apply the missing data methods to a real data set to estimate the association between sex assigned at birth and malaria diagnoses in children living in Uganda. Finally, we provide an in-depth evaluation of the assumptions made about IIW across various peer-reviewed papers published in the literature. For each set of assumptions, we construct directed acyclic graphs (DAGs) to visualize the assumptions made on the observation and censoring processes, which we use to highlight inconsistencies and potential ambiguity among the assumptions presented in existing works involving IIW. We also discuss when causal estimates of the marginal outcome model can be obtained, and propose a general set of assumptions for IIW.
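    A minimal sketch of the weighting idea follows: the visit intensity is approximated by a Poisson rate regression on a single covariate, and each observation is weighted by the inverse of its fitted intensity. The covariate, data-generating mechanism, and rate model are illustrative assumptions; in practice a proportional-intensity (Andersen-Gill-type) model with stabilized weights would typically be used.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def iiw_weights(df):
    """Sketch of inverse intensity weights for irregularly observed visits.
    df has one row per observed visit with columns:
      'gap' -- time since the previous visit (exposure time),
      'z1'  -- a covariate thought to drive the visit process.
    The visit intensity is approximated by a Poisson rate model with log(gap)
    as offset; the IIW weight is the inverse of the fitted rate.
    """
    X = sm.add_constant(df[["z1"]])
    fit = sm.GLM(np.ones(len(df)), X, family=sm.families.Poisson(),
                 offset=np.log(df["gap"])).fit()
    rate = np.exp(X.values @ fit.params.values)     # estimated visit intensity
    return 1.0 / rate                                # stabilization often added in practice

# Hypothetical data: "sicker" patients (larger z1) visit more often
rng = np.random.default_rng(3)
n = 500
z1 = rng.normal(size=n)
gap = rng.exponential(1.0 / np.exp(0.2 + 0.8 * z1))  # shorter gaps when z1 is large
df = pd.DataFrame({"z1": z1, "gap": gap})
w = iiw_weights(df)
print(w[:5])   # weights would then enter a weighted outcome (and treatment) model
```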
  • Item
    A Study of Statistical Methods for Modelling Longevity and Climate Risks
    (University of Waterloo, 2025-03-27) Guo, Yiping
    In recent years, two pivotal risks have emerged and taken a significant position in modern actuarial science: longevity risk and climate risk. Longevity risk, or the risk of individuals living longer than expected, poses a severe challenge to both private insurance companies and public pension systems, potentially destabilizing financial structures built on assumptions of life expectancy. On the other hand, climate risk, associated with fluctuations and extreme conditions in weather, has substantial implications for various sectors such as agriculture, energy, and insurance, particularly in the era of increasing climate change impacts. The Society of Actuaries (SOA) has recognized the growing importance of these risks, advocating for innovative research and solutions to manage them effectively. Furthermore, statistical modelling plays an indispensable role in understanding, quantifying, and managing these risks. The development of sophisticated and robust statistical methods enables practitioners and researchers to capture complex risk patterns and make reliable predictions, thereby informing risk management strategies. This thesis, composed of four distinct projects, explores statistical methods for modelling longevity and weather risk, contributing valuable insights to these fields. The first part of this thesis studies the statistical methods for modelling longevity risk, and in particular, modelling mortality rates. In the first chapter, we study parameter estimation of the Lee-Carter model and its multi-population extensions. Although the impact of outliers on stochastic mortality modelling has been examined, previous studies on this topic focus on how outliers in the estimated time-varying indexes may be detected and/or modelled, with little attention being paid to the adverse effects of outliers on estimation robustness, particularly that pertaining to age-specific parameters. In this chapter, we propose a robust estimation method for the Lee-Carter model, through a reformulation of the model into a probabilistic principal component analysis with multivariate t-distributions and an efficient expectation-maximization algorithm for implementation. The proposed method yields significantly more robust parameter estimates, while preserving the fundamental interpretation for the bilinear term in the model as the first principal component and the flexibility of pairing the estimated time-varying parameters with any appropriate time-series process. We also extend the proposed method for use with multi-population generalizations of the Lee-Carter model, allowing for a wider range of applications such as quantification of population basis risk in index-based longevity hedges. Using a combination of real and pseudo datasets, we demonstrate the superiority of the proposed method relative to conventional estimation approaches such as singular value decomposition and maximum likelihood. Next, we move on to parameter estimation of the Renshaw-Haberman model, a cohort-based extension to the Lee-Carter model. In mortality modelling, cohort effects are often taken into consideration as they add insights about variations in mortality across different generations. Statistically speaking, models such as the Renshaw-Haberman model may provide a better fit to historical data compared to their counterparts that incorporate no cohort effects. 
However, when such models are estimated using an iterative maximum likelihood method in which parameters are updated one at a time, convergence is typically slow and may not even be reached within a reasonably established maximum number of iterations. Among others, the slow convergence problem hinders the study of parameter uncertainty through bootstrapping methods. In this chapter, we propose an intuitive estimation method that minimizes the sum of squared errors between actual and fitted log central death rates. The complications arising from the incorporation of cohort effects are overcome by formulating part of the optimization as a principal component analysis with missing values. Using mortality data from various populations, we demonstrate that our proposed method produces satisfactory estimation results and is significantly more efficient compared to the traditional likelihood-based approach. The third part of this thesis continues our exploration of the efficient computational algorithm of the Renshaw-Haberman model. Existing software packages and estimation algorithms often rely on maximum likelihood estimation with iterative Newton-Raphson methods, which can be computationally intensive and prone to convergence issues. In this chapter, we present the R package RHals, offering an efficient alternative with an alternating least squares method for fitting a generalized class of Renshaw-Haberman models, including configurations with multiple age-period terms. We extend this method to multi-population settings, allowing for shared or population-specific age effects under various configurations. The full modelling workflow and functionalities of RHals are demonstrated using mortality data from England and Wales. Lastly, we turn to modelling climate risk in the final chapter of the thesis. The use of weather index insurances is subject to spatial basis risk, which arises from the fact that the location of the user's risk exposure is not the same as the location of any of the weather stations where an index can be measured. To gauge the effectiveness of weather index insurances, spatial interpolation techniques such as kriging can be adopted to estimate the relevant weather index from observations taken at nearby locations. In this chapter, we study the performance of various statistical methods, ranging from simple nearest neighbor to more advanced trans-Gaussian kriging, in spatial interpolations of daily precipitations with data obtained from the US National Oceanic and Atmospheric Administration. We also investigate how spatial interpolations should be implemented in practice when the insurance is linked to popular weather indexes including annual consecutive dry days (CDD) and maximum five-day precipitation in one month (MFP). It is found that although spatially interpolating the raw weather variables on a daily basis is more sophisticated and computationally demanding, it does not necessarily yield superior results compared to direct interpolations of CDD/MFP on a yearly/monthly basis. This intriguing outcome can be explained by the statistical properties of the weather indexes and the underlying weather variables.
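    For context, the sketch below shows the conventional (non-robust) Lee-Carter fit via singular value decomposition with the usual identification constraints; the robust probabilistic-PCA-with-t-distributions estimator and the Renshaw-Haberman extensions described above are not implemented here, and the synthetic mortality surface is purely illustrative.

```python
import numpy as np

def lee_carter_svd(log_m):
    """Classical Lee-Carter fit via singular value decomposition:
    log m_{x,t} = a_x + b_x k_t + e_{x,t}, with the usual identification
    constraints sum_x b_x = 1 and sum_t k_t = 0.
    log_m : array of shape (ages, years) of log central death rates.
    """
    a = log_m.mean(axis=1)                         # average age pattern
    centered = log_m - a[:, None]
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    b = U[:, 0] / U[:, 0].sum()                    # normalize so sum(b) = 1
    k = S[0] * Vt[0, :] * U[:, 0].sum()            # rescale so b_x k_t is unchanged
    return a, b, k

# Hypothetical synthetic rates: 60 ages, 40 years of improvement
ages, years = 60, 40
true_a = np.linspace(-8, -2, ages)
true_b = np.full(ages, 1 / ages)
true_k = np.linspace(20, -20, years)
noise = 0.02 * np.random.default_rng(7).normal(size=(ages, years))
log_m = true_a[:, None] + np.outer(true_b, true_k) + noise

a, b, k = lee_carter_svd(log_m)
print(np.allclose(b.sum(), 1.0), round(float(k.sum()), 6))   # constraints hold
```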
  • Item
    Online Likelihood-free Inference for Markov State-Space Models Using Sequential Monte Carlo
    (University of Waterloo, 2025-01-23) Verhaar, Tyler; Wong, Samuel
    Sequential Monte Carlo (SMC) methods (a.k.a. particle filters) refer to a class of algorithms used for filtering problems in non-linear state-space models (SSMs). SMC methods approximate posterior distributions over latent states by propagating and resampling particles, where each particle is associated with a weight representing its relative importance in approximating the posterior. Through iteratively updating particles and/or weights, SMC methods gradually refine the particle-based approximation to reflect the true posterior distribution. A key challenge in SMC methods arises when the likelihood, responsible for guiding particle weighting, is intractable. In such cases, Approximate Bayesian Computation (ABC) methods can approximate the likelihood, bypassing the need for closed-form expressions. The particle SMC method of interest in this thesis is the SMC2 framework of Chopin et al. [2013], which uses a nested SMC approach. An “outer” SMC sampler operates on the parameter space θ, and an “inner” particle filter estimates the likelihood for a fixed parameter θ, facilitating joint inference over the parameters and states. The framework proposed by Chopin required closed-form likelihoods and was intended for offline learning. This thesis proposes an ABC-SMC2 algorithm for online inference in SSMs with intractable likelihoods. Our method uses Approximate Bayesian Computation (ABC) in the inner particle filter to approximate likelihoods via an ABC kernel, thus enabling inference without closed-form observation likelihoods. To address the challenges of online learning, we introduce an adaptive ε-scheduler for dynamically selecting the ABC kernel’s tolerance levels and a likelihood recalibration mechanism that retroactively refines posterior estimates using previously observed data. We validate our approach on three case studies using compartment models governed by an ODE system: a toy linear ODE system, the non-linear Lotka–Volterra equations, and a high-dimensional SEIR model with real-world covariates. In these experiments, ABC-SMC2 outperforms fixed and adaptive ε-schedulers in terms of credible interval coverage, posterior accuracy, and RMSE.
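    The fragment below sketches only the inner-filter idea on a toy latent AR(1) model: the intractable observation density is replaced by a Gaussian ABC kernel applied to the distance between simulated and observed data. The model, kernel, tolerance, and particle count are assumptions for illustration; the outer SMC sampler over θ, the adaptive ε-scheduler, and the recalibration mechanism are not shown.

```python
import numpy as np

def abc_bootstrap_filter(y_obs, n_particles, theta, eps, rng):
    """Minimal bootstrap particle filter in which the observation density is
    replaced by an ABC kernel: particles are weighted by a Gaussian kernel
    K_eps applied to the distance between simulated and observed data.
    Toy latent AR(1) model: x_t = theta*x_{t-1} + N(0,1), y_t = x_t + N(0,0.5^2).
    Returns an ABC estimate of the log-likelihood of y_obs given theta.
    """
    x = rng.normal(size=n_particles)
    log_lik = 0.0
    for y in y_obs:
        x = theta * x + rng.normal(size=n_particles)          # propagate particles
        y_sim = x + 0.5 * rng.normal(size=n_particles)         # simulate pseudo-data
        w = np.exp(-0.5 * ((y_sim - y) / eps) ** 2)             # ABC kernel weights
        log_lik += np.log(w.mean() + 1e-300)                    # running likelihood estimate
        w /= w.sum()
        x = rng.choice(x, size=n_particles, p=w)                # multinomial resampling
    return log_lik

# Simulate observations with theta = 0.7 and compare ABC likelihoods
rng = np.random.default_rng(11)
true_x, ys = 0.0, []
for _ in range(100):
    true_x = 0.7 * true_x + rng.normal()
    ys.append(true_x + 0.5 * rng.normal())
for theta in (0.3, 0.7, 0.9):
    print(theta, round(abc_bootstrap_filter(np.array(ys), 500, theta, eps=0.5, rng=rng), 1))
```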
  • Item
    High-Dimensional Statistical Inference and False Discovery Rate Control with Covariates
    (University of Waterloo, 2025-01-17) Zheng, Liyuan; Qin, Yingli; Liang, Kun
    In this thesis, we focus on three statistical problems. First, we consider graph-based tests for differences of two high-dimensional distributions. Second, we investigate the estimation of multiple large covariance matrices and the application to high-dimensional quadratic discriminant analysis. Lastly, we focus on controlling the false discovery rate while incorporating complex auxiliary information. Testing whether two samples are from a common distribution is an important problem in statistics. Friedman & Rafsky (1979) proposed a non-parametric multivariate distribution test based on the minimal spanning tree (MST). Recently, this test has been extended under various scenarios. However, as demonstrated in Chapter 2, these extensions are not sensitive to sparse alternatives. To address this, we propose a two-step testing procedure, IM-MST. Specifically, IM-MST incorporates marginal screening while accounting for the dependence structure via energy distance, followed by MST-based tests. IM-MST combines the strength of both non-parametric screening and MST-based tests. Simulation studies and real data applications are conducted to evaluate the numerical performance of the two-step procedure, demonstrating that IM-MST exhibits substantial power gains. When estimating covariance matrices for data from two related categories, it is reasonable to assume that these covariance matrices share certain structural components. As a result, the precision matrix (the inverse of the covariance matrix) for each category can be decomposed into three parts: a common diagonal component, a common low-rank component, and a category-specific low-rank component. This decomposition can be motivated by a factor model, where some latent factors are common across two categories while others are specific to individual categories. In Chapter 3, we propose a consistent joint estimation method for two precision matrices building on the work of Wu (2017). Furthermore, these estimators are applied to formulate a high-dimensional quadratic discriminant analysis (QDA) rule, for which we derive the convergence rate for the classification error. In many genetic multiple testing applications, the signs of the test statistics provide important directional information. For example, in RNA-seq data analysis, a negative sign could suggest that the expression of the corresponding gene is potentially suppressed, while a positive sign could indicate a potentially elevated expression level. However, most existing procedures that control the false discovery rate (FDR) ignore such valuable information. In Chapter 4, we extend the covariate and direction adaptive knockoff procedure (Tian 2020) by implementing powerful predictive functions. Through simulation studies and real data analysis, we show that our procedures are competitive to existing covariate-adaptive methods. The companion R package Codak is available.
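    For reference, the sketch below implements the classical Friedman & Rafsky (1979) minimal-spanning-tree two-sample test with a permutation p-value; the marginal screening step of IM-MST and the thesis's other procedures are not included, and the simulated data and permutation count are arbitrary.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import minimum_spanning_tree

def friedman_rafsky(X, Y, n_perm=500, rng=None):
    """Friedman-Rafsky two-sample test: build the minimal spanning tree on the
    pooled sample and count edges joining points from different samples; few
    cross-sample edges suggest different distributions.  A permutation p-value
    is used here for simplicity."""
    rng = rng or np.random.default_rng()
    pooled = np.vstack([X, Y])
    labels = np.r_[np.zeros(len(X)), np.ones(len(Y))]
    mst = minimum_spanning_tree(cdist(pooled, pooled)).tocoo()
    edges = list(zip(mst.row, mst.col))

    def cross_count(lab):
        return sum(lab[i] != lab[j] for i, j in edges)

    observed = cross_count(labels)
    perms = [cross_count(rng.permutation(labels)) for _ in range(n_perm)]
    p_value = np.mean([c <= observed for c in perms])    # small counts are extreme
    return observed, p_value

rng = np.random.default_rng(5)
X = rng.normal(size=(60, 10))
Y = rng.normal(size=(60, 10)) + 0.6          # mean shift in every coordinate
print(friedman_rafsky(X, Y, rng=rng))
```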
  • Item
    Assessing Climate Change Impacts and Associated Risks: Applications in Finance and Insurance
    (University of Waterloo, 2024-12-17) Zhang, Jiayue; Wirjanto, Tony; Porth, Lysa
    The focus of this thesis is to investigate the application of sustainable and green investments in the finance and insurance industries, specifically addressing their relevance to climate change and resiliency. To develop a portfolio optimized for both financial returns and sustainability factors, Environmental, Social, and Governance (ESG) scores are integrated into the reward function of a conventional Reinforcement Learning model (a branch of Machine Learning) in Chapter 2. This approach enables a more robust analysis of the impact of various ESG ratings published by major rating agencies on the coherence of investment strategies. The model addresses the widespread confusion arising from the high heterogeneity in published ESG ratings by treating it as a source of ambiguity (or Knightian uncertainty) and proposes four ESG ensemble strategies catering to investors with different risk and (smooth) ambiguity preference profiles. Additionally, a Double-Mean-Variance model is constructed to combine financial return and ESG score objectives, three investor types are defined based on their ambiguity preferences, and novel ESG-modified Capital Asset Pricing Models are developed to evaluate the resulting optimized portfolio performance. In Chapter 3, the ESG score evaluation methods for large companies are expanded to assess the sustainability of individual farmers' production in the context of climate change. We propose integrating agricultural sustainability factors into classical personal credit evaluation systems to create a sustainable credit score, called the Environmental, Social, Economics (ESE) score. This ESE score is then incorporated into theoretical joint liability models to derive optimal group size and individual-ESE score propositions. Moreover, the utility function of farmers is refined to a mean-variance form, accounting for the risk associated with expected profit. Simulation exercises are provided to examine the effects of different climatic conditions, offering a deeper understanding of the implications of incorporating ESE scores into the credit evaluation system. Chapter 4 investigates strategic investments needed to mitigate transition risks, particularly focusing on sectors significantly impacted by the shift to a low-carbon economy. It emphasizes the importance of tailored sector-specific strategies and the role of government interventions, such as carbon taxes and subsidies, in shaping corporate behavior. Using a multi-period framework, this chapter evaluates the economic and operational trade-offs companies face under four decarbonization scenarios: immediate, quick, slow, and no transition. The analysis provides practical insights for both policymakers and business leaders, demonstrating how regulatory frameworks and strategic investments can be aligned to effectively manage transition risks while optimizing long-term sustainability. The findings contribute to a deeper understanding of the economic impacts of regulatory policies and offer a comprehensive framework to navigate the complexities of transitioning to a low-carbon economy. Finally, Chapter 5 summarizes the thesis and outlines potential directions for further research.
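    As a toy illustration of folding ESG information into a reinforcement-learning reward, the function below mixes a one-period portfolio return, a portfolio-level ESG score, and a crude risk penalty; the weights, penalty, and scores are hypothetical, and the ambiguity-based ensemble strategies and ESG-modified CAPM of the thesis are not represented.

```python
import numpy as np

def esg_adjusted_reward(weights, returns, esg_scores, lam=0.1, gamma=2.0):
    """One-period reward combining financial performance with a portfolio-level
    ESG score, of the kind that can be plugged into a reinforcement-learning
    agent's reward function.  lam trades ESG against return; gamma penalizes a
    crude dispersion-based risk proxy."""
    port_return = weights @ returns
    port_esg = weights @ esg_scores
    risk_penalty = gamma * np.var(weights * returns)
    return port_return + lam * port_esg - risk_penalty

w = np.array([0.5, 0.3, 0.2])                 # portfolio weights (sum to 1)
r = np.array([0.01, -0.004, 0.02])            # hypothetical one-period returns
esg = np.array([0.8, 0.4, 0.6])               # normalized ESG scores in [0, 1]
print(esg_adjusted_reward(w, r, esg))
```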
  • Item
    Design and Analysis of Studies Assessing Exposure Effects in Complex Settings
    (University of Waterloo, 2024-12-12) Li, Kecheng; Cook, Richard J.
    Understanding and efficiently estimating the effects of exposures on health outcomes is a fundamental goal in public health research. The accuracy and efficiency of exposure effect estimates are heavily influenced by study design and the statistical methodologies employed for analyses. This thesis consists of three projects where statistical methods are developed to address unique challenges in the estimation of exposure effects in diverse and complex settings. The first project concerns causal inference regarding the effects of multiple exposure variables and their interactions which require modeling the joint distribution of exposures given pertinent confounding variables. This modeling can be carried out via a second-order regression model. Chapter 2 presents methodologies using regression adjustment and inverse weighting to investigate the asymptotic bias of estimators when the dependence model for the generalized propensity score incorrectly assumes conditional independence of exposures or is based on a naive dependence model which does not accommodate the effect of confounders on the conditional association of exposures. We also consider the problem of a semi-continuous bivariate exposure that arises when the population is made up of a sub-population of unexposed individuals and a sub-population of exposed individuals in whom the level of exposure can be quantified. We propose a two-stage estimation technique to study the effects of prenatal alcohol exposure, and specifically the effects of drinking frequency and intensity on childhood cognition in Chapter 2. The second project focuses on plasma donation, highlighting the importance of rigorous device safety evaluations for donors. Complications arise from the fact that outcomes on successive donations from the same donor are not independent, adverse event rates are extremely rare, and there is substantial heterogeneity in the propensity for donors to donate over time. Chapter 3 introduces a statistical framework for designing superiority and non-inferiority trials to assess the safety of a new donation device compared to the standard one. A unique feature is that the number of donations per donor varies substantially so some individuals contribute more information and others less. Historical data on the donation rate and variation in the donation rate (heterogeneity) across donors, the adverse event rate, and the serial dependence in adverse events provide the necessary information to plan for the duration of accrual and follow-up periods. Specifically the sample size formula is derived to ensure power requirements are met when analyses are based on generalized estimating equations and robust variance estimation. The complexity of recruiting donors from a heterogeneous and dynamic population of donors means that it is challenging to characterize the rate at which information is acquired on treatment effects over the course of the study. Strategies for interim monitoring based on group sequential designs using alpha spending functions are developed based on a robust covariance matrix for estimates of treatment effect over successive analyses. The design of a plasma donation study is illustrated in Chapter 3 aiming to investigate the safety of a new device with the outcome being serious hypotensive adverse events. Many chronic diseases can be naturally characterized using multistate models. 
Longitudinal cohorts and registry studies of chronic diseases typically recruit and follow individuals to record data on the nature and timing of disease progression. In many cases the exact transition times between disease states are not observed directly, but the state occupied at each clinic visit is known. Such studies also routinely collect and store serum samples at intermittent clinic visits. The final project explores the design and analysis of two-phase studies for evaluating the effect of a biomarker of interest. We consider the design of two-phase studies in Chapter 4 aimed at selecting individuals for biospecimen assays to measure biomarkers of interest and estimate their association with disease progression through intensity-based modeling. Likelihood-based and estimating function approaches are developed and the efficiency gains from score residual-dependent sampling strategies are investigated for joint models of the biomarker and disease progression processes. The efficiency of these different frameworks is investigated, and the methods are applied to a study investigating the association between the HLA-B27 marker and joint damage in psoriatic arthritis patients. The thesis concludes with some topics of future research in Chapter 5.
  • Item
    Methods for Improving Performance of Precision Health Prediction Models
    (University of Waterloo, 2024-09-20) Krikella, Tatiana; Dubin, Joel A.
    Prediction models for a specific index patient which are developed on a similar subpopulation have been shown to perform better than one-size-fits-all models. These models are often called personalized predictive models (PPMs), as they are tailored to specific individuals with unique characteristics. In this thesis, through a comprehensive set of simulation studies and data analyses, we investigate the relationship between the size of the similar subpopulation used to develop the PPMs and model performance. We propose an algorithm which fits a PPM using the size of a similar subpopulation that optimizes both model discrimination and calibration, since it is often noted that calibration is assessed less frequently than discrimination in predictive modelling. We do this by proposing a loss function for tuning the size of subpopulation, which is an extension of a Brier score decomposition consisting of separate terms corresponding to model discrimination and calibration, respectively. We allow flexibility through use of a mixture loss term to emphasize one performance measure over another. Through simulation study, we confirm previously investigated results and show that the relationship between the size of subpopulation and discrimination is, in general, negatively proportional: as the size of subpopulation increases, the discrimination of the model deteriorates. Further, we show that the relationship between the size of subpopulation and calibration is quadratic in nature, thus small and large sizes of subpopulation result in relatively well-calibrated models. We investigate the effect of patient weighting on performance, and conclude, as expected, that the choice of the size of subpopulation has a larger effect on the PPM's performance compared to the weight function applied. We apply these methods to a dataset from the eICU database to predict the mortality of patients with diseases of the circulatory system. We then extend the algorithm by proposing a more general loss function which allows further flexibility in choosing the measures of model discrimination and calibration to include in the function used to tune the size of subpopulation. We also recommend bounds on the grid of values used in tuning to reduce the computational burden of the algorithm. Prior to recommending bounds, we further investigate the relationship between the size of subpopulation and discrimination, as well as between the size of subpopulation and calibration, under 12 different simulated datasets, to determine if the results from the previous investigation were robust. We find that the relationship between the size of subpopulation and discrimination is always negatively proportional, and the relationship between the size of subpopulation and calibration, although not entirely consistent among the 12 cases, shows that a low size of subpopulation is good, if not optimal, in many of the cases we considered. Based on this study, we recommend the lower bound on the grid of values to be 20% of the entire training dataset, and the upper bound to be either 50% or 70% of the training dataset, depending on the interests of the study. We apply the methods proposed to both simulated and real data, specifically the same dataset from the eICU database, and show that the results previously seen are robust, and that the choice of measures for the general loss function has an effect on the optimal size of subpopulation chosen. 
Finally, we extend the algorithm to predict the longitudinal, continuous outcome trajectory of an index patient, rather than predicting a binary outcome. We investigate the relationship between the size of subpopulation and the mean absolute error, and find that the performance drastically improves up to a point, where it then stabilizes, leading to the model fit to the full training data being optimal, though only slightly better than a model fit to 60% of the subpopulation. As these results are counter-intuitive, we present three other simulation studies which show that these results stem from predicting the trajectory of a patient, rather than from predicting a continuous outcome. Although the answer to why this is the case is an open research question, we speculate that, since a personalized approach still leads to comparable performance to the full model, these results can be attributed to testing these methods on a small sample size. Due to the computational intensity of the methods, however, testing on a larger sample size to generalize these results is currently impractical. Areas of future work include improving the computational efficiency of these methods, which can lead to investigating these same relationships under more complex models, such as random forests or gradient boosting. Further investigation of personalized predictive model performance when predicting a trajectory should also be considered. The methods presented in this thesis will be implemented in an R package to allow for greater usability.
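    The sketch below gives one possible form of a mixture tuning loss, combining a discrimination term (1 - AUC) with a binned calibration-error term in the spirit of a Brier-score decomposition; the exact decomposition-based loss, the binning, and the example data are assumptions and differ from the loss proposed in the thesis.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def mixture_tuning_loss(y_true, y_prob, alpha=0.5, n_bins=10):
    """Illustrative loss for tuning the similar-subpopulation size: a convex
    mixture of a discrimination term (1 - AUC) and a binned squared
    calibration error.  alpha = 1 emphasizes discrimination only,
    alpha = 0 calibration only."""
    discrimination = 1.0 - roc_auc_score(y_true, y_prob)
    bins = np.clip((y_prob * n_bins).astype(int), 0, n_bins - 1)
    calibration = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            calibration += mask.mean() * (y_prob[mask].mean() - y_true[mask].mean()) ** 2
    return alpha * discrimination + (1 - alpha) * calibration

# Hypothetical predictions from a PPM fit on one candidate subpopulation size
rng = np.random.default_rng(2)
p = rng.uniform(0.05, 0.95, 300)
y = rng.binomial(1, p)                         # well calibrated by construction
print(round(mixture_tuning_loss(y, p, alpha=0.5), 4))
```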
  • Item
    Statistical Methods for Joint Modeling of Disease Processes under Intermittent Observation
    (University of Waterloo, 2024-09-20) Chen, Jianchu; Cook, Richard
    In studies of life history data, individuals often experience multiple events of interest that may be associated with one another. In such settings, joint models of event processes are essential for valid inferences. Data used for statistical inference are typically obtained through various sources, including observational data from registries or clinics and administrative records. These observation processes frequently result in incomplete histories of the event processes of interest. In settings where interest lies in the development of conditions or complications that are not self-evident, data become available only at periodic clinic visits. This thesis focuses on developing statistical methods for the joint analysis of disease processes involving incomplete data due to intermittent observation. Many disease processes involve recurrent adverse events and an event which terminates the process. Death, for example, terminates the event process of interest and precludes the occurrence of further events. In Chapter 2, we present a joint model for such processes which has appealing properties due to its construction using copula functions. Covariates have a multiplicative effect on the recurrent event intensity function given a random effect, which is in turn associated with the failure time through a copula function. This permits dependence modeling while retaining a marginal Cox model for the terminal event process. When these processes are subject to right-censoring, simultaneous and two-stage estimation strategies are developed based on the observed data likelihood, which can be implemented by direct maximization or via an expectation-maximization algorithm; the latter facilitates semi-parametric modeling for the terminal event process. Variance estimates are derived based on the missing information principle. Simulation studies demonstrate good finite sample performance of the proposed methods and high efficiency of the two-stage procedure. An application to a study of the effect of pamidronate on reducing skeletal complications in patients with skeletal metastases illustrates the use of this model. Interval-censored recurrent event data can occur when the events of interest are only evident through intermittent clinical examination. Chapter 3 addresses such scenarios and extends the copula-based joint model for recurrent and terminal events proposed in Chapter 2 to accommodate interval-censored recurrent event data resulting from intermittent observation. Conditional on a random effect, the intensity for the recurrent event process has a multiplicative form with a weakly parametric piecewise-constant baseline rate, and a Cox model is formulated for the terminal event process. The two processes are then linked via a copula function, which defines a joint model for the random effect and the terminal event. The observed data likelihood can be maximized directly or via an EM algorithm; the latter facilitates a semi-parametric terminal event process. A computationally convenient two-stage estimation procedure is also investigated. Variance estimates are derived and validated by simulation studies. We apply this method to investigate the association between a biomarker (HLA-B27) and joint damage in patients with psoriatic arthritis. Databases of electronic medical records offer an unprecedented opportunity to study chronic disease processes. In survival analysis, interest may lie in studying the effects of time-dependent biomarkers on a failure time through Cox regression models. 
    Often, however, it is too labour-intensive to collect and clean data on all covariates at all times, and in such settings it is common to select a single clinic visit at which variables are measured. In Chapter 4, we consider several cost-effective ad hoc strategies for inference, consisting of: (1) selecting either the last or the first visit at which the marker value is measured, and (2) using the measured value with or without left truncation. The asymptotic bias of estimators based on these strategies, which arises from misspecification of the Cox model, is investigated via a multistate model constructed for the joint modeling of the marker and failure processes. An alternative method for the efficient selection of individuals under a budgetary constraint is also discussed, and the corresponding observed data likelihood is derived. The asymptotic relative efficiency of the regression coefficient estimators, obtained from the Fisher information, is explored, and an optimal design is provided under this selection scheme.
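    A minimal simulation sketch of the joint structure described in Chapters 2 and 3 is given below: a subject-level gamma frailty scales the recurrent-event intensity, and the frailty is linked to the terminal event time through a copula. The Clayton copula, the exponential-baseline Cox margin for the terminal event, and all function names and parameter values are illustrative assumptions rather than the models fitted in the thesis.

        # Illustrative sketch (not the thesis code): a frailty-driven recurrent-event
        # process whose random effect is tied to the terminal event time by a copula.
        # The Clayton copula, exponential margins, and all parameters are assumptions.
        import numpy as np
        from scipy.stats import gamma as gamma_dist

        rng = np.random.default_rng(2024)

        def clayton_pair(theta, rng):
            """Draw (u1, u2) from a Clayton copula via the conditional method."""
            u1, w = rng.uniform(size=2)
            u2 = (u1 ** (-theta) * (w ** (-theta / (1 + theta)) - 1) + 1) ** (-1 / theta)
            return u1, u2

        def simulate_subject(x, beta=0.5, gamma=-0.3, sigma2=0.6, theta=2.0,
                             lam_rec=1.0, lam_term=0.2, tau=5.0, rng=rng):
            """One subject: copula-linked frailty, terminal time, and recurrent events."""
            u1, u2 = clayton_pair(theta, rng)
            # Gamma frailty with mean 1 and variance sigma2, obtained from u1
            z = gamma_dist.ppf(u1, a=1 / sigma2, scale=sigma2)
            # Terminal event time with an exponential-baseline Cox margin, from u2
            d = -np.log(1 - u2) / (lam_term * np.exp(gamma * x))
            stop = min(d, tau)                      # follow-up ends at death or censoring
            # Recurrent events: Poisson process with intensity z * lam_rec * exp(beta * x)
            rate = z * lam_rec * np.exp(beta * x)
            n_events = rng.poisson(rate * stop)
            event_times = np.sort(rng.uniform(0, stop, size=n_events))
            return {"frailty": z, "terminal": d, "censored_at": stop, "events": event_times}

        print(simulate_subject(x=1.0))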
  • Item
    The roughness exponent and its application in finance
    (University of Waterloo, 2024-08-30) Han, Xiyue; Schied, Alexander
    Rough phenomena and trajectories are prevalent across mathematics, engineering, physics, and the natural sciences. In quantitative finance, sparked by the observation that the historical realized volatility is rougher than a Brownian martingale, rough stochastic volatility models were recently proposed and studied extensively. Unlike classical diffusion volatility models, the volatility process in a rough stochastic volatility model is driven by an irregular process. The roughness of the volatility process plays a pivotal role in governing the behavior of such models. This thesis aims to explore the concept of roughness and to estimate the roughness of financial time series in a strictly pathwise manner. To this end, we introduce the notion of the roughness exponent, which is a pathwise measure of the degree of roughness. A substantial portion of this thesis focuses on the model-free estimation of this roughness exponent for both price and volatility processes. Towards the end of this thesis, we study the Wiener–Young Φ-variation of classical fractal functions, which can be regarded as a finer characterization than the roughness exponent. Chapter 2 introduces the roughness exponent and establishes a model-free estimator for the roughness exponent based on direct observations. We say that a continuous real-valued function x admits the roughness exponent R if the pth variation of x converges to zero for p > 1/R and to infinity for p < 1/R. The main result of this chapter provides a mild condition on the Faber–Schauder coefficients of x under which the roughness exponent exists and is given as the limit of the classical Gladyshev estimates. This result can be viewed as a strong consistency result for the Gladyshev estimator in an entirely model-free setting, because no assumption whatsoever is made on the possible dynamics of the function x. Nonetheless, we show that the condition of our main result is satisfied for the typical sample paths of fractional Brownian motion with drift, and we provide almost-sure convergence rates for the corresponding Gladyshev estimates. In this chapter, we also discuss the connections between our roughness exponent and both Besov regularity and weighted quadratic variation. Since the Gladyshev estimators are not scale-invariant, we construct several scale-invariant modifications of our estimator. Finally, we extend our results to the case where the pth variation of x is defined over a sequence of unequally spaced partitions. Chapter 3 considers the problem of estimating the roughness exponent of the volatility process in a stochastic volatility model in which the volatility arises as a nonlinear function of a fractional Brownian motion with drift. To this end, building on the Gladyshev estimator, we establish a new estimator of the roughness exponent of a continuous function x that is based on observations of its antiderivative y. We identify conditions on the underlying trajectory x under which our estimates converge in a strictly pathwise sense. Then, we verify that these conditions are satisfied by almost every sample path of fractional Brownian motion with drift. As a consequence, we obtain strong consistency of our estimator in the context of a large class of rough volatility models. Numerical simulations show that our estimation procedure performs well after passing to a scale-invariant modification of our estimator. Chapter 4 highlights the rationale behind constructing this estimator from the Gladyshev estimator.
    In this chapter, we study the problem of reconstructing the Faber–Schauder coefficients of a continuous function from discrete observations of its antiderivative. This problem arises in the task of estimating the roughness exponent of the volatility process of financial assets, but it is also of independent interest. Our approach starts by formulating this problem through piecewise quadratic spline interpolation. We then provide a closed-form solution and an in-depth analysis of the error between the actual and approximated Faber–Schauder coefficients. These results lead to some surprising observations, which also throw new light on the classical topic of quadratic spline interpolation itself: they show that the well-known instabilities of this method can be located exclusively within the final generation of estimated Faber–Schauder coefficients, which suffer from non-locality and strong dependence on the initial value and the given data. By contrast, all other Faber–Schauder coefficients depend only locally on the data, are independent of the initial value, and admit uniform error bounds. We thus conclude that a robust and well-behaved estimator for our problem can be obtained by simply dropping the final-generation coefficients from the estimated Faber–Schauder coefficients. Chapter 5 studies the Wiener–Young Φ-variation of classical fractal functions with a critical degree of roughness. In this case, the functions have vanishing pth variation for all p > q but infinite pth variation for p < q, for some q ≥ 1. We partially resolve this apparent puzzle by showing that these functions have finite, nonzero, and linear Wiener–Young Φ-variation along certain sequences of partitions. For instance, functions of bounded variation admit vanishing pth variation for any p > 1. On the other hand, Weierstrass and Takagi–van der Waerden functions have vanishing pth variation for p > 1 but are nowhere differentiable and hence not of bounded variation. As a result, power variation and the roughness exponent fail to distinguish the degrees of roughness of these functions. However, we can distinguish these functions by showing that the Weierstrass and Takagi–van der Waerden functions admit a nontrivial and linear Φ_q-variation along the sequence of b-adic partitions, where Φ_q(x) = x/(-log x)^{1/2}. Moreover, for q > 1, we develop a probabilistic approach to identify functions in the Takagi class that have linear and nontrivial Φ_q-variation for a prescribed Φ_q. Furthermore, for each fixed q > 1, the corresponding functions Φ_q form a wide class of increasing functions that are regularly varying at zero with index of regular variation q.
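    A minimal numerical sketch of the pathwise estimation idea in Chapter 2 is given below. It uses a simple dyadic-scaling estimate of the roughness exponent, R_hat_n = (1 - (1/n) log2(sum of squared dyadic increments))/2, applied to a simulated Brownian path whose roughness exponent is 1/2. This particular formula and the test path are illustrative assumptions and not the exact Gladyshev-type estimators analyzed in the thesis.

        # Illustrative sketch: a dyadic-scaling estimate of pathwise roughness.
        # The estimator below and the Brownian test path are assumptions chosen for
        # illustration; they are not the exact estimators studied in the thesis.
        import numpy as np

        def roughness_estimate(x, n):
            """Scaling estimate of the roughness exponent from 2**n dyadic increments.

            `x` holds samples of a path on a uniform grid of 2**N + 1 points, N >= n.
            """
            step = (len(x) - 1) // 2 ** n             # grid points per dyadic interval
            coarse = x[::step]                        # path sampled at k / 2**n
            sq_var = np.sum(np.diff(coarse) ** 2)     # sum of squared increments at level n
            # increments ~ 2**(-n R)  =>  sum of squares ~ 2**(n (1 - 2 R))
            return 0.5 * (1.0 - np.log2(sq_var) / n)

        rng = np.random.default_rng(0)
        N = 20
        # Standard Brownian motion on [0, 1]: roughness exponent 1/2
        bm = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, 2.0 ** (-N / 2), size=2 ** N))])

        for n in (8, 12, 16):
            print(f"level n={n:2d}  estimated roughness: {roughness_estimate(bm, n):.3f}")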
  • Item
    Projection Geometric Methods for Linear and Nonlinear Filtering Problems
    (University of Waterloo, 2024-08-28) Ahmed, Ashraf; Marriott, Paul
    In this thesis, we review the infinite-dimensional space containing the solutions of a broad range of stochastic filtering problems, and we outline the substantive differences between the foundations of finite-dimensional information geometry and Pistone's extension to infinite dimensions. These differences are characterized in terms of the geometric structures needed for projection theorems: a dually flat affine manifold preserving the affine and convex geometry of the set of all probability measures with the same support, the notion of orthogonal complement between the different tangent-space representations, which is key for the generalized Pythagorean theorem, and the notion of exponential and mixture parallel transport needed for projecting a point onto a submanifold. We also explore the projection method proposed by Brigo and Pistone for reducing infinite-dimensional, measure-valued evolution equations from the space in which they are naturally written, namely Pistone's infinite-dimensional statistical manifold, onto a finite-dimensional exponential subfamily, using a local generalized projection theorem that is a non-parametric analogue of the generalized projection theorem proposed by Amari. We further explore, using standard arguments, the projection idea on discrete state spaces, with a focus on building intuition and on computational examples that reveal properties of the projection method. We establish two novel results regarding the impact of the boundary and of choosing a subfamily that does not contain the initial condition of the problem. We demonstrate that, when the evolution process approaches the boundary of the space, the projection method fails completely because the tangent spaces vanish at the boundary. We also show that choosing a subfamily that does not contain the initial condition of the problem causes the projected approximation to deviate from the true value in certain directions, because one is then solving a different differential equation than if the dynamics were started from within the low-dimensional manifold. Finally, we use computational experiments to study the importance of having the sufficient statistics of the exponential subfamily lie in the span of the left eigenfunctions of the infinitesimal generator of the process we wish to project.
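    The projection idea on a discrete state space can be sketched as follows: for an exponential family p_theta(i) proportional to exp(theta · T(i)), the Fisher-metric projection of the forward equation dp/dt = Q^T p reduces to moment matching, G(theta) dtheta/dt = T^T Q^T p_theta, where G(theta) is the covariance matrix of the sufficient statistics T. The sketch below implements this for a small birth-death chain; the chain, the sufficient statistics, and the parameter values are illustrative assumptions rather than the computational examples of the thesis.

        # Illustrative sketch (not the thesis's examples): Fisher-metric projection of a
        # discrete-state master equation onto a two-parameter exponential family.
        import numpy as np

        S = 6                                        # states 0, ..., S-1
        states = np.arange(S)
        T = np.column_stack([states, states ** 2])   # sufficient statistics T(i) = (i, i^2)

        # A birth-death generator Q (rows sum to zero); the rates are illustrative.
        Q = np.zeros((S, S))
        for i in range(S):
            if i + 1 < S:
                Q[i, i + 1] = 1.0                    # birth rate
            if i - 1 >= 0:
                Q[i, i - 1] = 0.6 * i                # death rate
            Q[i, i] = -Q[i].sum()

        def family(theta):
            """Exponential-family probabilities p_theta on the discrete state space."""
            w = np.exp(T @ theta)
            return w / w.sum()

        def projected_rhs(theta):
            """dtheta/dt from moment matching: solve G(theta) dtheta = T^T (Q^T p_theta)."""
            p = family(theta)
            mean_T = T.T @ p
            G = (T - mean_T).T @ (p[:, None] * (T - mean_T))   # Fisher information = Cov(T)
            dmoments = T.T @ (Q.T @ p)                         # d/dt E[T] under the full dynamics
            return np.linalg.solve(G, dmoments)

        # Euler-integrate the projected dynamics and compare E[T] with the exact solution.
        theta = np.array([0.5, -0.3])
        p_exact = family(theta)                      # start exactly on the subfamily
        dt, n_steps = 0.01, 200
        for _ in range(n_steps):
            theta = theta + dt * projected_rhs(theta)
            p_exact = p_exact + dt * (Q.T @ p_exact)

        print("projected E[T]:", T.T @ family(theta))
        print("exact     E[T]:", T.T @ p_exact)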
  • Item
    Efficient Bayesian Computation with Applications in Neuroscience and Meteorology
    (University of Waterloo, 2024-08-23) Chen, Meixi; Lysy, Martin; Ramezan, Reza
    Hierarchical models are important tools for analyzing data in many disciplines. Efficiency and scalability in model inference have become increasingly important areas of research due to the rapid growth of data. Traditionally, parameters in a hierarchical model are estimated either by deriving closed-form estimators or by Monte Carlo sampling. Since the former approach is only possible for simpler models with conjugate priors, the latter, and Markov chain Monte Carlo (MCMC) methods in particular, has become the standard approach when no closed form is available. However, MCMC requires substantial computational resources when sampling from hierarchical models with complex structures, highlighting the need for more computationally efficient inference methods. In this thesis, we study the design of Bayesian inference methods with improved computational efficiency, with a focus on a class of hierarchical models known as \textit{latent Gaussian models}. The background of hierarchical modelling and Bayesian inference is introduced in Chapter 1. In Chapter 2, we present a fast and scalable approximate inference method for a widely used model in meteorological data analysis. The model features a likelihood layer of the generalized extreme value (GEV) distribution and a latent layer integrating spatial information via Gaussian process (GP) priors on the GEV parameters, hence the name GEV-GP model. The computational bottleneck is caused by the large number of spatial locations being studied, which corresponds to the dimensionality of the GPs. We present an inference procedure based on a Laplace approximation to the likelihood, followed by a Normal approximation to the posterior of interest. By combining this approach with a sparsity-inducing spatial covariance approximation technique, we demonstrate through simulations that the resulting method accurately estimates the Bayesian predictive distribution of extreme weather events, scales to several thousand spatial locations, and is several orders of magnitude faster than MCMC. We also present a case study in forecasting extreme snowfall across Canada. Building on the approximate inference scheme discussed in Chapter 2, Chapter 3 introduces a new modelling framework for capturing the correlation structure in high-dimensional neuronal data, known as \textit{spike trains}. We propose a novel continuous-time multi-neuron latent factor model based on the biological mechanism of spike generation, where the underlying neuronal activities are represented by a multivariate Markov process. To the best of our knowledge, this is the first multivariate spike-train model in a continuous-time setting to study interactions between neural spike trains. A computationally tractable Bayesian inference procedure is proposed to address the challenges in estimating high-dimensional latent parameters. We show that the proposed model and inference method can accurately recover underlying neuronal interactions when applied to a variety of simulated datasets. Application of our model to experimental data reveals that the correlation structure of spike trains in rats' orbitofrontal cortex predicts outcomes following different cues. While Chapter 3 restricts modelling to Markov processes for the latent dynamics of spike trains, Chapter 4 presents an efficient inference method for non-Markov stationary GPs with noisy observations.
    While computations for such models typically scale as $\mathcal{O}(n^3)$ in the number of observations $n$, our method utilizes a ``superfast'' Toeplitz system solver which reduces the computational complexity to $\mathcal{O}(n \log^2 n)$. We demonstrate that our method dramatically improves the scalability of Gaussian Process Factor Analysis (GPFA), which is commonly used for extracting low-dimensional representations of high-dimensional neural data. We extend GPFA to accommodate Poisson count observations and design a superfast MCMC inference algorithm for the extended GPFA. The accuracy and speed of our inference algorithms are illustrated through simulation studies.
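    A minimal sketch of the Laplace approximation idea used in this line of work is given below for a generic latent Gaussian model with Poisson count observations: Newton iterations locate the posterior mode, and the posterior is approximated by a Gaussian with covariance (K^{-1} + W)^{-1}, where W is the negative Hessian of the log-likelihood at the mode. The Poisson likelihood, the kernel, and all parameter values are illustrative assumptions; this is not the GEV-GP or GPFA implementation developed in the thesis.

        # Illustrative sketch: Laplace approximation for a latent Gaussian model with
        # Poisson counts, y | f ~ Poisson(exp(f)) and f ~ N(0, K).  A generic textbook
        # construction chosen for illustration, not the thesis's implementation.
        import numpy as np

        def laplace_poisson(y, K, n_iter=25):
            """Posterior mode and covariance of the Gaussian (Laplace) approximation."""
            n = len(y)
            f = np.zeros(n)
            for _ in range(n_iter):
                mu = np.exp(f)
                W = np.diag(mu)                      # negative Hessian of the Poisson log-likelihood
                # Newton step, written to avoid forming K^{-1}:
                # (K^{-1} + W) f_new = W f + (y - mu)  <=>  f_new = K (I + W K)^{-1} (W f + y - mu)
                f = K @ np.linalg.solve(np.eye(n) + W @ K, W @ f + (y - mu))
            W = np.diag(np.exp(f))
            cov = K @ np.linalg.solve(np.eye(n) + W @ K, np.eye(n))   # (K^{-1} + W)^{-1}
            return f, cov

        # Toy data: a smooth latent log-intensity observed as counts on a 1-D grid.
        rng = np.random.default_rng(1)
        x = np.linspace(0, 1, 40)
        K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.15 ** 2) + 1e-6 * np.eye(40)
        f_true = np.sin(2 * np.pi * x)
        y = rng.poisson(np.exp(f_true))

        f_hat, cov = laplace_poisson(y, K)
        print("max |f_hat - f_true| =", np.abs(f_hat - f_true).max())
        print("mean posterior sd    =", np.sqrt(np.diag(cov)).mean())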
  • Item
    Robust Decision-Making in Finance and Insurance
    (University of Waterloo, 2024-08-22) Zhang, Yuanyuan; Landriault, David; Li, Bin
    Traditional finance models assume that a decision maker (DM) has a single view on the stochastic dynamics governing the price process, but in practice the DM may be uncertain about the true probabilistic model that governs the occurrence of different states. When only risk is present, the DM fully relies on a single probabilistic model P. Under ambiguity, by contrast, the DM entertains several views on the precise distribution of the price dynamics; this type of model uncertainty arising from multiple probabilistic views is called ambiguity. In the presence of model ambiguity, Maccheroni et al. (2013) propose a novel robust mean-variance model, which is referred to as the mean-variance-variance (M-V-V) criterion in this thesis. The M-V-V model is an analogue of the Arrow-Pratt approximation to the well-known smooth ambiguity model, but it offers a more tractable structure while separating the modeling of ambiguity, ambiguity aversion, and risk aversion. In Chapters 3 and 4, we study the dynamic portfolio optimization problem and the dynamic reinsurance problem under the M-V-V criterion and derive the equilibrium strategies in light of the issue of time inconsistency. We find that the equilibrium strategies share many properties with those obtained under smooth ambiguity, but the time horizon enters the objective function of the M-V-V criterion in an inconsistent way, which in turn causes the equilibrium strategies to be non-monotonic with respect to risk aversion. To resolve this issue, we further propose a mean-variance-standard deviation (M-V-SD) criterion. The corresponding equilibrium investment strategy exhibits the appealing feature of limited stock market participation, a well-documented stylized fact in empirical studies. The corresponding equilibrium reinsurance strategy likewise displays restricted insurance retention. Chapter 5 analyzes optimal longevity risk transfers within a Stackelberg game framework, focusing on differing buyer and seller risk aversions. We compare static contracts, which offer long-term protection with fixed terms, to dynamic contracts, which provide short-term coverage with variable terms. Our numerical analysis with real-life mortality data shows that risk-averse buyers prefer static contracts, leading to higher welfare gains and flexible market conditions, while less risk-averse buyers favor dynamic contracts. Ambiguity, modeled as information asymmetry, reduces welfare gains and market flexibility but does not change contract preferences. These findings explain key empirical facts and offer insights into the longevity-linked capital market. As for the remaining chapters, Chapter 1 introduces the background literature and the main motivations of this thesis, and Chapter 2 covers the mathematical preliminaries for the subsequent chapters. The core analysis and findings are presented in Chapters 3 through 5, as summarized above. Finally, Chapter 6 concludes the thesis and suggests potential directions for future research.
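    A minimal numerical sketch of a robust mean-variance criterion is given below. It assumes a functional of the form C(w) = E_bar[X_w] - (lam/2) Var_bar(X_w) - (theta/2) Var_mu(E_P[X_w]), where the first two terms are computed under the prior-averaged (predictive) model and the last term is the variance, across candidate models P, of the model-conditional mean. This is one common presentation of such robust criteria; the precise M-V-V functional of the thesis may differ, and all candidate models and parameter values below are illustrative assumptions.

        # Illustrative sketch: a robust mean-variance objective under model ambiguity.
        # The criterion form, candidate models, and parameters are assumptions chosen
        # for illustration; they are not taken from the thesis.
        import numpy as np

        # Candidate models for the excess return of one risky asset: (mean, std dev),
        # with prior weights mu over the models.
        models = np.array([[0.10, 0.18],
                           [0.04, 0.20],
                           [-0.02, 0.22]])
        mu = np.array([0.3, 0.4, 0.3])

        def robust_mv(w, lam=3.0, theta=20.0):
            """Value of investing a fraction w of wealth in the risky asset."""
            means = w * models[:, 0]                      # model-conditional means of X_w
            variances = (w * models[:, 1]) ** 2           # model-conditional variances of X_w
            mean_bar = mu @ means                         # predictive mean
            between = mu @ (means - mean_bar) ** 2        # between-model variance of the mean
            var_bar = mu @ variances + between            # law of total variance
            return mean_bar - 0.5 * lam * var_bar - 0.5 * theta * between

        grid = np.linspace(0.0, 1.0, 201)
        w_robust = grid[np.argmax([robust_mv(w) for w in grid])]
        w_neutral = grid[np.argmax([robust_mv(w, theta=0.0) for w in grid])]
        print(f"risky weight with ambiguity aversion (theta=20): {w_robust:.3f}")
        print(f"risky weight ignoring ambiguity (theta=0):       {w_neutral:.3f}")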