Systems Design Engineering
Permanent URI for this collection: https://uwspace.uwaterloo.ca/handle/10012/9914
This is the collection for the University of Waterloo's Department of Systems Design Engineering.
Research outputs are organized by type (e.g., Master's Thesis, Article, Conference Paper).
Waterloo faculty, students, and staff can contact us or visit the UWSpace guide to learn more about depositing their research.
Recent Submissions
Item: New Attack Detection Methods for Connected and Automated Vehicles (University of Waterloo, 2025-05-08) by Bian, Shuhao

Ensuring the security of Connected and Automated Vehicles (CAVs) against adversarial threats remains a critical challenge in cyber-physical systems. This thesis investigates attack detection methodologies and presents novel dual-perspective detection frameworks to enhance CAV resilience. We first propose a vehicle dynamics-based attack detector that integrates the Unscented Kalman Filter (UKF) with machine learning techniques. This approach monitors physical system behaviour and identifies anomalies when sensor readings deviate from predicted states. Our enhanced model captures nonlinear vehicle dynamics while maintaining real-time performance, enabling the detection of sophisticated attacks that traditional linear models would miss. To address the limitations of purely physics-based detection, we develop a complementary trajectory-based framework that analyzes the rationality of driving behaviour. This system evaluates vehicle trajectories within their environmental context, incorporating road conditions, traffic signals, and surrounding vehicle data. By leveraging neural networks for trajectory prediction and evaluation, our approach can identify malicious interventions even when attackers manipulate vehicle behaviour within physically plausible limits. Integrating these two detection perspectives—one based on vehicle dynamics modelling and the other on trajectory rationality analysis—provides a comprehensive security framework that significantly improves detection accuracy while reducing false positives. Experimental results demonstrate our system's effectiveness against various attack vectors, including false data injection, adversarial control perturbations, and sensor spoofing attacks.
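The dynamics-based detector described above flags an anomaly when a sensor reading strays too far from the UKF-predicted state. A minimal sketch of such a residual (innovation) test follows; the function name, covariance, and chi-square threshold are illustrative, not taken from the thesis:

```python
import numpy as np

def residual_alarm(z, z_pred, S, threshold=9.21):
    """Flag an anomaly when the normalized innovation (Mahalanobis
    distance between measurement z and UKF prediction z_pred, under
    innovation covariance S) exceeds a chi-square threshold
    (9.21 is roughly the 99th percentile for 2 degrees of freedom)."""
    innovation = z - z_pred
    d2 = innovation @ np.linalg.solve(S, innovation)
    return d2 > threshold

S = np.eye(2) * 0.25
# Nominal reading close to the predicted state: no alarm.
assert not residual_alarm(np.array([1.0, 2.0]), np.array([1.1, 2.05]), S)
# Spoofed reading far from the prediction: alarm.
assert residual_alarm(np.array([5.0, 2.0]), np.array([1.1, 2.05]), S)
```

In practice the threshold trades detection rate against false alarms, which is why the thesis pairs this physics-based test with a trajectory-rationality check.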
Our research contributes to autonomous vehicle security by developing a holistic detection approach that considers both immediate physical anomalies and broader behavioural inconsistencies, enhancing system resilience against increasingly sophisticated cyber-physical threats.

Item: Towards Urban Digital Twins With Gaussian Splatting, Large-Language-Models, and Cloud Mapping Services (University of Waterloo, 2025-05-08) by Gao, Kyle

Keywords: Computer Vision; Remote Sensing; Gaussian Splatting; Point Cloud; 3D Modelling; Urban Digital Twin; GIS; Large Language Models

Item: Towards the development of an all-optical, non-contact, photon absorption remote sensing (PARS) endomicroscope for blood vasculature imaging (University of Waterloo, 2025-05-06) by Warren, Alkris

The need for high-resolution, label-free imaging techniques has spurred the development of advanced endoscopic technologies for real-time tissue characterization. This thesis presents the design, development, and validation of the first forward-viewing, non-contact, all-optical Photon Absorption Remote Sensing (PARS) endomicroscope for in vivo vascular imaging. The proposed system is designed to leverage the endogenous optical absorption of hemoglobin to achieve high-resolution contrast, without the use of exogenous labels or acoustic coupling, addressing longstanding limitations of conventional absorption-based and scattering-based imaging modalities. Two prototype designs were developed using image guide fiber (IGF) technology and achromatic graded-index (GRIN) lenses, with systematic de-risking experiments guiding their evolution. The first prototype (P1) achieved a resolution of ~1 µm and a signal-to-noise ratio (SNR) of 22 dB, demonstrating the feasibility of high-fidelity PARS imaging within a 1.6 mm outer diameter (OD) device footprint. A second design (P2) was introduced to address constraints in working distance and imaging depth for in vivo use, trading resolution for improved accessibility in biological tissues.
This work establishes a novel platform for PARS miniaturization and integration with widefield endoscopy, positioning the technology for future applications, including real-time, in situ virtual biopsies, blood oxygenation measurement, and surgical guidance within internal bodily cavities. The results represent a foundational advancement in the translation of PARS microscopy to clinical settings and lay the groundwork for real-time, high-resolution endoscopic diagnostics.

Item: Encoding FHIR Medical Data for Transformers (University of Waterloo, 2025-04-29) by Yu, Trevor

The open-source Fast Healthcare Interoperability Resources (FHIR) data standard is increasingly adopted as a format for representing and communicating medical data. FHIR represents various types of medical data as resources, which have a standardized JSON structure. FHIR offers the advantage of interoperability and can be used for electronic medical record storage and, more recently, machine learning analytics. A recent trend in machine learning has been the development of large foundation models trained on large volumes of unstructured data. Transformers are a deep neural network architecture for sequence modelling and have been used to build foundation models for natural language processing. Text is input to transformers as a sequence of tokens; tokenization algorithms break text into discrete chunks, called tokens. Using language tokenizers on FHIR JSON data is inefficient, producing several hundred text tokens per resource. Patient records may contain several thousand resources, which overall exceeds the total number of tokens that most text transformers can handle. Additionally, discrete encoding of numeric and time data may not be appropriate for these continuous quantities. In this thesis, I design a tokenization method that operates on data using the open-source Health Level 7 FHIR standard.
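The inefficiency of word-piece tokenizers motivates assigning one token per structured FHIR JSON chunk rather than per text fragment. A toy sketch of that idea follows; the chunking rule (one token per key and value type) and vocabulary handling are invented for illustration and are not the thesis's actual scheme:

```python
def tokenize_resource(resource, vocab):
    """Map a FHIR-style JSON resource to a short token sequence:
    one token per (key, value-type) chunk instead of hundreds of
    text word-pieces. New chunks are added to the vocabulary."""
    tokens = []
    for key, value in resource.items():
        # Keep the resourceType value itself; abstract other values to their type.
        chunk = (key, value if key == "resourceType" else type(value).__name__)
        tokens.append(vocab.setdefault(chunk, len(vocab)))
    return tokens

vocab = {}
obs = {"resourceType": "Observation", "status": "final",
       "valueQuantity": {"value": 7.2, "unit": "mmol/L"}}
print(tokenize_resource(obs, vocab))  # [0, 1, 2]: three tokens, not hundreds
```

A real encoder would also handle nested structures and continuous values, which is exactly what the thesis's continuous vector encodings address.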
This method takes JSON returned from a FHIR server query and assigns tokens to chunks of JSON based on FHIR data structures. The FHIR tokens can be used to train transformer models, and a methodology for training FHIR transformer models on sequence classification and masked language modelling tasks is presented. The performance of this method is validated on the open-source MIMIC-IV FHIR dataset for length-of-stay (LOS) prediction and mortality prediction (MP) tasks. In addition, I explore methods for encoding numerical and time-delta values using continuous vector encodings rather than assigning discrete tokens to values, and I explore compression methods to reduce the long sequence lengths. Previous works using MIMIC-IV have reported their performance on the LOS and MP tasks using XGBoost models, which use bespoke feature encodings. The results show that the FHIR transformer performs better than an XGBoost model on the LOS task but worse on the MP task. None of the continuous encoding methods perform significantly better than discrete encoding methods, but none perform worse either. Compression methods improve performance on long sequences in both accuracy and inference speed. Since performance is task dependent, future research should validate this method on other datasets and tasks. MIMIC-IV is too small to show the benefits of pre-training, but if a larger dataset can be obtained, the methodology developed in this work could be applied towards creating a large FHIR foundation model.

Item: Electrostatic MEMS Sensors: From Mechanism Discovery to Deployment in Liquid Media (University of Waterloo, 2025-04-28) by Shama, Yasser

This thesis presents a methodical investigation into the fundamental sensing mechanism of electrostatic MEMS sensors in gas and liquid media.
It provides new insights into electrostatic MEMS sensing mechanisms that can improve the sensor design process by combining mass sorption and permittivity change to enhance the sensitivity of gas and liquid sensors. First, it compares the responsivities of a set of MEMS isopropanol sensors. I found that functionalized static-mode sensors do not exhibit a measurable change in response due to added mass, whereas bare sensors showed a clear change in response to isopropanol vapor. Functionalized dynamic-mode sensors showed a measurable frequency shift due to the added mass of isopropanol vapor; the frequency shift increased threefold in the presence of strong electrostatic fields. These results show that the sensing mechanism is a combination of a weaker added-mass effect and a stronger permittivity effect, and that electrostatic MEMS gas sensors are independent of the direction of the gravitational field and are thus robust to changes in alignment. It is therefore erroneous to refer to them as 'gravimetric' sensors. Next, I investigated the repeatability of electrostatic MEMS sensors over prolonged excitations. The sensors were subjected to two test conditions: continuous frequency sweeps and long-term residence on a resonant branch beyond the cyclic-fold bifurcation. I found that prolonged high-amplitude oscillations undermine repeatability and cause significant shifts in the bifurcation location toward lower frequencies by building up plastic deformations that reduce the capacitive gap. Biased excitation waveforms were also found to cause charge buildup within dielectrics, exacerbating the drift in the bifurcation frequency. In comparison, stiffer in-plane sensors with no metallization operating under unbiased waveforms showed dramatically improved repeatability. With a view to deploying electrostatic MEMS sensors in liquid media, I studied the use of motion-induced current to detect their high-frequency vibrations.
While current and ground-truth (optical) measurements aligned well at lower-frequency resonances, current measurements showed valleys rather than peaks at high-frequency resonances. The root cause was found to be current behavior switching from capacitive to inductive as the frequency crossed a resonance in the measurement circuit. It was also found that the output current diminishes with increasing mode number. Finally, I found a measurable change beyond 10 MHz in the output current of a bare chip carrier when the analyte (mercury acetate) was introduced at a concentration of 100 ppm into deionized water, suggesting a potential for interference with inertial sensing. In the final phase of this work, the fundamental vibration mode of electrostatic MEMS sensors was used to detect 100 ppm of mercury acetate in deionized water. The sensors measured a consistent shift in the frequency and amplitude of the resonant peak. This demonstrates the viability of electrostatic MEMS sensors for underwater applications and the need for further work to improve their detection mechanisms.

Item: Enhancing Space Situational Awareness with AI and Optimization Techniques (University of Waterloo, 2025-04-24) by Kazemi, Sajjad

As space becomes increasingly congested and contested, ensuring the safe operation of satellites has emerged as a critical concern for both public- and private-sector stakeholders. The growing number of active satellites and space debris significantly increases the risk of collisions, making Space Situational Awareness (SSA) an essential capability for modern space operations. SSA aims to provide timely and accurate assessments of space objects' trajectories to prevent collisions and maintain the long-term sustainability of space activities.
Currently, SSA processes are heavily reliant on human operators who must analyze large volumes of data from multiple sources, identify high-priority risks, interpret and validate information, and ultimately make decisions regarding collision risks. While computational tools assist in these processes, the dependence on human judgment introduces limitations, including delays in decision-making and potential errors in critical assessments. Given the increasing complexity of the space environment, there is a pressing need for automated and data-driven approaches to enhance SSA capabilities. A fundamental challenge within SSA is orbit prediction—the ability to accurately forecast the future trajectories of space objects. However, precise trajectory estimation alone is not sufficient, as some scenarios require active collision avoidance maneuvers. In such cases, decision support systems must generate reliable and efficient maneuver plans to ensure satellites can safely adjust their orbits without unnecessary fuel expenditure or operational disruptions. This thesis addresses both orbit prediction and collision avoidance through a combination of machine learning and optimization techniques. First, a transformer-based deep learning model is trained using publicly available data to predict space object trajectories with high accuracy and computational efficiency. This approach leverages advances in sequence modeling to improve predictive performance in dynamic orbital environments. Next, Reinforcement Learning (RL) techniques are employed to develop an autonomous decision-making framework that generates optimized collision avoidance maneuvers for satellites. By learning from simulated interactions, the RL-based approach aims to provide adaptive and fuel-efficient avoidance strategies. 
Finally, a Sequential Convex Optimization (SCvx) approach is explored to solve the collision avoidance problem from a purely optimization-driven perspective, without relying on data-driven models. This method ensures mathematically rigorous maneuver planning based on physical constraints and operational requirements. This work contributes to the advancement of SSA by enhancing the accuracy of orbit prediction and the reliability of collision avoidance strategies. In addition, it has the potential to improve automation in space traffic management, reducing reliance on human operators and increasing the resilience of satellite operations.

Item: Towards Decision Support and Automation for Safety Critical Ultrasonic Nondestructive Evaluation Data Analysis (University of Waterloo, 2025-04-16) by Torenvliet, Nicholas

A set of machine learning techniques is proposed to provide decision support and automation for the analysis of data taken during ultrasonic non-destructive evaluation of Canada Deuterium Uranium reactor pressure tubes. Data analysis is carried out primarily to identify and characterize the geometry of flaws or defects on the pressure tube's inner-diameter surface. A baseline approach utilizing a variational auto-encoder ranks data by likelihood and performs analysis using Nominal Profiling (NPROF), a novel technique that characterizes the very likely nominal component of the dataset and determines variance from it. While effective, the baseline method exhibits limitations, including sensitivity to outliers, limited explainability, and the absence of a strong fault diagnosis and error remediation mechanism. To address these shortcomings, Diffusion Partition Consensus (DiffPaC), a novel method integrating conditional score-based diffusion with Savitzky-Golay filters, is proposed. The approach includes a mechanism for outlier removal during training that reliably improves model performance.
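Savitzky-Golay filtering, one ingredient of DiffPaC, smooths a signal by least-squares-fitting a low-order polynomial in a sliding window and evaluating it at the window centre. A minimal numpy stand-in on a synthetic 1-D profile follows; the data, window, and polynomial order are illustrative, not the thesis's settings:

```python
import numpy as np

def savgol_like(y, window=21, order=3):
    """Savitzky-Golay-style smoothing: fit a polynomial of the given
    order in each sliding window and take its value at the centre.
    A plain-numpy stand-in for scipy.signal.savgol_filter; edges are
    left unsmoothed for simplicity."""
    half = window // 2
    out = y.copy()
    t = np.arange(window) - half
    for i in range(half, len(y) - half):
        coeffs = np.polyfit(t, y[i - half:i + half + 1], order)
        out[i] = np.polyval(coeffs, 0.0)
    return out

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 200)
profile = np.sin(x) + rng.normal(scale=0.15, size=x.size)  # synthetic noisy trace
smoothed = savgol_like(profile)

# Smoothing brings the trace closer to the underlying clean signal.
print(np.abs(profile - np.sin(x)).mean(), np.abs(smoothed - np.sin(x)).mean())
```

The polynomial fit is what lets the filter suppress noise while preserving local peak geometry, which matters when the "peaks" are flaw signatures.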
DiffPaC also features strong explainability and, with a human in the loop, mechanisms for fault diagnosis and error correction. These features advance applicability in safety-critical contexts such as nuclear nondestructive evaluation. The methods are integrated and scaled to provide: (a) a principled probabilistic performance model; (b) enhanced explainability through interpretable outputs; (c) fault diagnosis and error correction with a human in the loop; (d) independence from dataset curation and out-of-distribution generalization; and (e) strong preliminary results that meet the accuracy requirements on dimensional estimates specified by the regulator [cog2008inspection]. Though not directly comparable, the integrated set of methods makes many qualitative improvements upon prior work, which is largely based on discriminative methods or heuristics and whose results rely on data annotation, pre-processing, parameter selection, and out-of-distribution generalization. In this regard, the integrated set of fully learned, data-driven methods may be considered state of the art for this niche context. The probabilistic model and corroborating results imply a principled basis underlying model behaviors and provide a means to interface with regulatory bodies seeking justification for the use of novel methods in safety-critical contexts. The process is largely autonomous but may include a human in the loop for fail-safe analysis. The integrated methods represent a significant step forward in applying machine learning in this safety-critical context.
They also provide a state-of-the-art proof of concept, or minimum viable product, upon which a new, fully refactored process for utility owner-operators may be developed.

Item: Dynamics of Golf Discs using Trajectory Experiments for Parameter Identification and Model Validation (University of Waterloo, 2025-04-15) by Turner, Adam

The trajectories of flying discs are heavily affected by their aerodynamics and can vary greatly. The growing sport of disc golf takes advantage of these variations, offering seemingly endless disc designs to use in a round. Despite the increasing popularity of disc golf, most manufacturers lack a scientific approach to disc design, instead relying on subjective assessments and inconsistent disc rating systems to characterize disc performance. This leaves more guesswork for players. This thesis addresses the issue by presenting a physics-based disc trajectory model optimized using experimental trajectory data, and by exploring the possibility of a standardized disc rating system. A novel stereo-camera-based methodology was developed to capture three-dimensional initial conditions and trajectories of disc golf throws. These data were used to identify the aerodynamic coefficients of physics-based models. The models included six aerodynamic coefficients that depended on five independent variables. Disc wobble was included as a variable affecting the aerodynamic coefficients for the first time, and its effect on model performance was compared to simpler models that excluded it. The models used various coefficient estimation methods for parameter identification, including polynomial functions and a recently proposed deep-learning approach. The deep-learning approach modelled some relationships with a neural network, which allowed the model to form the most appropriate relationships without relying on functional approximations.
Polynomial functions were also used to augment a model that used coefficients previously determined from computational fluid dynamics. These approaches were validated using experimental trajectory data. The model using a mix of computational fluid dynamics data and polynomial functions showed significant improvement over the baseline computational fluid dynamics model. The purely polynomial approaches resulted in the best-performing models and showed good agreement with the validation data. The neural network approaches mostly performed well but could not beat the pure polynomial approaches. Incorporating disc wobble as a variable affecting the aerodynamic coefficients showed negligible improvement over the models that disregarded it. Further model improvement is unlikely without first addressing measurement errors in data collection, particularly those pertaining to disc attitude, the disc plane's orientation relative to the global coordinate system. The possibility of a trajectory-based test standard for discs was also explored, highlighting the need to carefully choose standardized initial conditions to evaluate disc trajectories with a wide range of flight characteristics. Possible approaches for quantifying flight numbers were also discussed. Disc mass, initial spin ratio, and air density were highlighted as factors that affect disc flight and have implications for a testing standard. This research contributes to the growing body of work surrounding disc golf by proposing a capture method for three-dimensional disc golf trajectories and validated physics-based disc trajectory models, and by exploring a standardized disc rating system.
This work advances the understanding of disc behaviour for manufacturers and players alike, and propels disc golf towards a more scientifically informed future.

Item: Toward Enhanced Sea Ice Parameter Estimation: Fusing Ice Surface Temperature with the AI4Arctic Dataset using Convolutional Neural Networks (University of Waterloo, 2025-04-14) by de Loe, Lily

Arctic sea ice mapping is essential for supporting several key applications, including facilitating safe marine navigation, providing accurate data for climate monitoring, and assisting efforts by remote northern communities to adapt to variable ice conditions. Automated mapping approaches can leverage an abundance of freely accessible satellite data, with the potential to supplement navigational ice charts, improve operational forecasting, and produce high-resolution estimates of sea ice parameters. However, current approaches rely on synthetic aperture radar (SAR) and passive microwave (PM) data, which can struggle to distinguish ice features due to ambiguous textures, atmospheric effects, and sensor limitations. This thesis explores the potential for thermal-infrared data to improve estimates of sea ice concentration, stage of development, and floe size produced by multi-task deep learning architectures. The work builds on the recent AI4Arctic dataset, which combines Sentinel-1 SAR, AMSR2 brightness temperature, ERA5 reanalysis data, and ice charts to enhance deep learning-based mapping approaches. VIIRS ice surface temperature (IST) is investigated for its potential to improve predictions in regions where SAR and PM measurements are challenging to interpret. A VIIRS-AI4Arctic dataset consisting of 84 scenes is developed, demonstrating overlap between VIIRS, Sentinel-1, and AMSR2 products. Three variations on the U-Net architecture are introduced, incorporating IST features at the input and feature levels.
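Input-level fusion of the kind the U-Net variants use amounts to stacking the thermal channel alongside the radar and microwave channels before the first convolution. A minimal sketch follows; the channel counts and image size are illustrative, not the thesis's actual configuration:

```python
import numpy as np

# Hypothetical channel-first input tensors (shapes invented for illustration).
sar_pm = np.zeros((4, 256, 256), dtype=np.float32)  # e.g. 2 SAR + 2 PM channels
ist = np.zeros((1, 256, 256), dtype=np.float32)     # VIIRS ice surface temperature

# Input-level fusion: concatenate IST as an extra channel so the first
# U-Net convolution sees thermal context alongside SAR/PM textures.
fused = np.concatenate([sar_pm, ist], axis=0)
print(fused.shape)  # (5, 256, 256)
```

Feature-level fusion, as in the DEU-Net-V variant described below in the abstract's own terms, would instead encode `ist` through a separate branch before merging learned features rather than raw channels.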
These models are evaluated against the winning AI4EO AutoICE Challenge architecture, which acts as an AI4Arctic baseline, and a SIC accuracy metric is introduced to provide an additional assessment of model performance. Results demonstrate that models incorporating IST consistently reduce classification errors across all three tasks, particularly when identifying open water under low incidence angles (SAR), high atmospheric moisture (PM), and wind roughening (SAR and PM). A single shared decoder improves contextual awareness, although multi-decoder architectures effectively reconstruct task-specific features. The DEU-Net-V architecture, which learns IST features separately from the AI4Arctic channels, is most effective at mitigating ambiguity introduced by SAR and PM data. Finally, estimation of aleatoric uncertainty yields heightened variance in marginal ice zones, highlighting potential discrepancies between ice chart labels and pixel-level conditions and demonstrating the value of quantifying uncertainty from observation noise. IST ultimately enhances sea ice classification but is limited by cloud contamination and the resolution of current products. These findings support the continued development of deep learning approaches incorporating IST and highlight the potential for next-generation thermal-infrared instruments to further improve automated sea ice mapping.

Item: Advancing Photometric Odometry to Dense Volumetric Simultaneous Localization and Mapping (University of Waterloo, 2025-03-25) by Hu, Yan Song; Zelek, John

Navigating complex environments remains a fundamental challenge in robotics. At the core of this challenge is Simultaneous Localization and Mapping (SLAM), the process of creating a map of the environment while simultaneously using that map for navigation. SLAM is essential for mobile robotics because effective navigation is a prerequisite for nearly all real-world robotic applications.
Visual SLAM, which relies solely on the input of RGB cameras, is important because the accessibility of cameras makes it an ideal solution for widespread robotic deployment. Recent advances in graphics have driven innovation in the visual SLAM domain: techniques like Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) enable the rapid generation of dense volumetric scenes from RGB images. Researchers have integrated these radiance-field techniques into SLAM to address a key limitation of traditional systems: although traditional SLAM excels at localization, the generated maps are often unsuitable for broader robotics applications. By incorporating radiance fields, SLAM systems gain the potential for real-time creation of volumetric metric-semantic maps, offering substantial benefits for robotics. However, current radiance-field-based SLAM approaches face challenges, particularly in processing speed and map reconstruction quality. This work introduces a solution that addresses limitations in current radiance-field SLAM systems. Direct SLAM, a traditional SLAM technique, shares key operational similarities with radiance-field approaches that suggest potential synergies between the two: both methods rely on photometric loss optimization, where pixel differences between images guide the optimization process. This work demonstrates that the benefits of combining these complementary techniques extend beyond theory, through a novel system that combines 3DGS with direct SLAM and achieves a superior combination of quality, memory efficiency, and speed compared to existing approaches. The system, named MGSO, addresses a central challenge in current 3DGS SLAM systems: initializing 3D Gaussians while performing SLAM simultaneously. The proposed approach leverages direct SLAM to produce dense, structured point clouds for 3DGS initialization.
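The photometric loss shared by direct SLAM and radiance-field optimization is, at its simplest, a per-pixel intensity difference between a rendered (or warped) view and the observed image. A minimal sketch, with robust kernels and SSIM terms omitted for brevity:

```python
import numpy as np

def photometric_loss(rendered, observed):
    """Mean absolute per-pixel intensity difference: the quantity both
    direct SLAM (via image warping) and 3DGS/NeRF optimization (via
    differentiable rendering) drive toward zero."""
    diff = rendered.astype(np.float64) - observed.astype(np.float64)
    return np.abs(diff).mean()

a = np.full((4, 4), 100.0)
print(photometric_loss(a, a))         # identical views: loss is 0.0
print(photometric_loss(a, a + 10.0))  # uniform 10-level error: loss is 10.0
```

In both paradigms, gradients of this loss with respect to pose (direct SLAM) or scene parameters (3DGS) drive the optimization, which is the operational similarity the abstract exploits.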
Initializing from these point clouds results in faster optimization, memory compactness, and higher-quality maps, even on mobile hardware. These results demonstrate that traditional direct SLAM techniques can be effectively integrated with radiance-field representations, opening avenues for future research.

Item: Transformer-based Point Cloud Processing and Analysis for LiDAR Remote Sensing (University of Waterloo, 2025-03-24) by Lu, Dening; Li, Jonathan; Xu, Linlin

The processing and analysis of Light Detection and Ranging (LiDAR) point cloud data, a fundamental task in three-dimensional (3D) computer vision, is essential for a wide range of remote sensing applications. However, the disorder, sparsity, and uneven spatial distribution of LiDAR point clouds pose significant challenges to effective and efficient processing. In recent years, Transformers have demonstrated notable advantages over traditional deep learning methods in computer vision, yet designing Transformer-based frameworks tailored to point clouds remains an underexplored topic. This thesis investigates the potential of Transformer models for accurate and efficient LiDAR point cloud processing. First, a 3D Global-Local (GLocal) Transformer Network (3DGTN) is introduced to capture both local and global context, thereby enhancing model accuracy for LiDAR data. This design not only ensures a comprehensive understanding of point cloud characteristics but also establishes a foundation for subsequent efficient Transformer frameworks. Second, a fast point Transformer network with Dynamic Token Aggregation (DTA-Former) is proposed to improve model speed. By optimizing point sampling, grouping, and reconstruction, DTA-Former substantially reduces the time complexity of 3DGTN while retaining its strong accuracy. Finally, to further reduce time and space complexity, a 3D Learnable Supertoken Transformer (3DLST) is presented.
Building on DTA-Former, 3DLST employs a novel supertoken clustering strategy that lowers computational overhead and memory consumption, achieving state-of-the-art accuracy and efficiency across multi-source LiDAR point cloud tasks. These Transformer-based frameworks contribute to more robust and scalable LiDAR point cloud processing solutions, supporting diverse remote sensing applications such as urban planning, environmental monitoring, and autonomous navigation. By enabling efficient yet high-accuracy analysis of large-scale 3D data, this work fosters further research and innovation in LiDAR remote sensing.

Item: Multi-Object Tracking using Mamba and an Investigation into Data Association Strategies (University of Waterloo, 2025-03-19) by Khanna, Dheraj; Zelek, John

Multi-Object Tracking (MOT) is a critical component of computer vision, with applications spanning autonomous driving, video surveillance, sports analytics, and more. Despite significant advancements in tracking algorithms and computational power, challenges such as maintaining long-term identity associations, handling dynamic object counts, managing irregular movements, and mitigating occlusions persist, particularly in complex and dynamic environments. This research addresses these challenges by proposing a learning-based motion model that leverages past trajectories to improve motion prediction and object re-identification, and by investigating how data association can maximize tracker performance. Inspired by recent advancements in state-space models (SSMs), particularly Mamba, we propose a novel learning-based architecture for motion prediction that combines the strengths of Mamba and self-attention layers to effectively capture non-linear motion patterns within the tracking-by-detection (TBD) paradigm.
Mamba's input-dependent sequence modeling capabilities enable efficient and robust handling of long-range temporal dependencies, making it well suited for complex motion prediction tasks. Building on this foundation, we explore hybrid data association strategies to improve tracking robustness, particularly in scenarios with occlusions and identity switches. By integrating stronger cues such as Intersection over Union (IoU) for spatial consistency and re-identification (Re-ID) for appearance-based matching, we enhance the reliability of object associations across frames, reducing errors in long-term tracking. Fast motion and partial overlaps often lead to identity mismatches; spatial association traditionally relies on IoU, which can struggle in such scenarios. To address this, we enhance the cost matrix by incorporating height-based IoU to handle partial overlaps more effectively, and we extend the original bounding boxes with a buffer to account for fast motion, improving the robustness and accuracy of spatial association. We also study the impact of dynamically updating the Re-ID feature bank during the matching stage, culminating in a refined weighted cost matrix. To further address identity switching and trajectory consistency, we introduce the concept of virtual detections in overlapping scenarios and explore its effectiveness in mitigating ID switches. Developing a robust and accurate MOT tracker demands a critical interplay between accurate motion modeling and a sophisticated combination of stronger and weaker cues in data association. Through extensive experimental evaluations on challenging benchmarks such as DanceTrack and SportsMOT, the proposed approaches achieve significant performance gains, with HOTA scores of 63.16% and 77.26% respectively, surpassing multiple existing state-of-the-art methods.
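The bounding-box buffering idea described above can be sketched in a few lines: expand both boxes by a fraction of their size before computing IoU, so a fast-moving object that has "jumped" past its previous box still produces a nonzero overlap. The 0.3 buffer here is an illustrative choice, not the thesis's value:

```python
def iou(a, b):
    """Standard IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter) if inter else 0.0

def buffered_iou(a, b, scale=0.3):
    """IoU after expanding each box by `scale` of its width/height,
    recovering matches lost to fast inter-frame motion."""
    def grow(r):
        w, h = r[2] - r[0], r[3] - r[1]
        return (r[0] - scale * w, r[1] - scale * h,
                r[2] + scale * w, r[3] + scale * h)
    return iou(grow(a), grow(b))

det = (0, 0, 10, 10)
fast = (12, 0, 22, 10)                 # object moved past the original box
print(iou(det, fast))                  # 0.0: plain IoU loses the match
print(buffered_iou(det, fast) > 0.0)   # True: buffered boxes still overlap
```

In a tracker, values like these would populate the spatial term of the weighted cost matrix fed to the assignment step.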
Notably, our approach outperforms DiffMOT by 0.9% on DanceTrack and 0.06% on SportsMOT, while achieving 3-7% improvements over other learning-based motion models. This work contributes to advancing MOT systems capable of achieving high performance across diverse and demanding scenarios.

Item Deployment of Piezoelectric Disks in Sensing Applications (University of Waterloo, 2025-02-12) Abdelrahman, Mohamed; Abdel-rahman, Eihab; Yavuz, Mustafa

Micro-electromechanical Systems (MEMS) have revolutionized the way we approach sensing and actuation, offering benefits like low power usage, high sensitivity, and cost efficiency. These systems rely on various sensing mechanisms such as electrostatic, piezoresistive, thermal, electromagnetic, and piezoelectric principles. This thesis focuses on piezoelectric sensors, which stand out due to their ability to generate electrical signals without needing an external power source. Their compact size and remarkable sensitivity make them highly attractive. However, they are not without challenges: their performance can be affected by temperature changes, and they cannot measure static forces. These limitations call for advanced signal processing and compensation techniques. Piezoelectric sensors, which operate based on the direct and inverse piezoelectric effects, find use in a wide range of applications, from measuring force and acceleration to detecting gases. This research examines two key applications of piezoelectric sensors: force sensing and gas detection. For force sensing, the study focuses on developing smart shims that measure forces between mechanical components, which helps prevent structural failures. The experimental setup includes an electrodynamic shaker, a controller, and custom components like a glass wafer read-out circuit and a 3D-printed shim holder. During tests, the system underwent a frequency sweep from 10 Hz to 500 Hz, and a resonance was detected at about 360 Hz, matching the structural resonance.
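The resonance identification step above, locating the peak response in a 10-500 Hz sweep, can be illustrated with a minimal sketch. The sweep data here is synthetic (a single-degree-of-freedom resonator with an assumed damping ratio), not measured data from the thesis.

```python
import numpy as np

def find_resonance(freqs, amplitudes):
    # Resonance = frequency at which the measured response amplitude peaks.
    return freqs[int(np.argmax(amplitudes))]

# Synthetic sweep: a single-DOF resonator with natural frequency ~360 Hz
# (f_n and the damping ratio zeta are illustrative assumptions).
freqs = np.linspace(10, 500, 981)
f_n, zeta = 360.0, 0.02
r = freqs / f_n
amplitudes = 1.0 / np.sqrt((1 - r**2) ** 2 + (2 * zeta * r) ** 2)

print(find_resonance(freqs, amplitudes))  # peaks near 360 Hz
```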
Some inconsistencies in the sensor's output were traced back to uneven machining of the shim's holes and variations in circuit attachment. To address these issues, the study suggests improving the machining process and redesigning the shim holder for better circuit alignment. Future work will include testing for bending moments and shear forces, and introducing a universal joint in the design to study moment applications more effectively. On the gas sensing side, the research examines a piezoelectric disk with a Silver-Palladium electrode for detecting methane. Using the inverse piezoelectric effect, the sensor's natural frequency was found to be around 445 kHz. When coated with a sensitive material (PANI doped with ZnO), the disk exhibited a frequency shift of 2.538 kHz, indicating successful methane detection. The setup for this experiment included a gas chamber with precise control over gas flow and displacement measurements. Interestingly, after methane was replaced with nitrogen, the natural frequency returned to its original value, demonstrating the sensor's reversible detection capability. Future research will expand to test other gases and sensitive materials, broadening the scope of applications. In summary, this thesis pushes the boundaries of piezoelectric MEMS sensors by tackling key design and performance challenges. Through detailed experimental methods, results, and suggested improvements, it lays a solid foundation for further research aimed at enhancing the reliability and versatility of piezoelectric sensors in real-world applications.

Item Examining Computer-Generated Aeronautical English Accent Testing and Training (University of Waterloo, 2025-02-04) Seong, Hyun Su; Cao, Shi; Kearns, Suzanne

Objective: This thesis focused on the persisting problem of language-related issues in pilot-air traffic controller (ATC) communication, particularly foreign accents interfering with pilots' understanding.
It examined the effect of foreign accents embedded in human and computer voices (HV, CV), as well as demographic background, on participants' level of understanding. Background: Studies focusing on the impacts of foreign accents in Aviation English (AE) are scant. Accents have been identified as one of the main contributors to miscommunication in pilot-ATC radiotelephony communication in the air, thereby endangering flight safety. It is necessary to examine how to train ab initio and returning pilots to extract accurate meanings from accented instructions coming from ATCs. This thesis introduces a Text-to-Speech (TTS) system supported by artificial intelligence for such training. Method: A total of six studies were conducted: two literature reviews and four empirical studies. For the empirical studies, 50 participants from the University of Waterloo who had flight experience or experience listening to pilot-ATC communications were recruited. They were assigned to two Voice Groups (HV and CV), one of which heard only human voices and the other only TTS. They completed two rounds (Rounds 1 and 2) of listening tests that contained both Aviation Scripts (AS; aviation-related scripts read in foreign accents) and Neutral Scripts (NS; non-aviation scripts read in foreign accents with no contextual background). The foreign accents used in the listening tests, along with native-accented English, came from three of the ICAO's main languages: Arabic, Spanish, and French. Scores were analyzed according to Script Type (NS, AS), Accent (Arabic, Spanish, French), Round (1 and 2), and Demographic Profile (Age, Gender, Years of Speaking English, Flight Hours, Flight Ratings, Language Background, Familiarity with Arabic, Spanish, French, and Aviation English). Results: For the empirical studies, in the HV group, participants improved their scores from Round 1 to 2 in the AS portion of the tests.
In the CV group, participants improved their scores in NS. Examination of demographic information showed that non-native English speakers (NNES) tended to perform more poorly on average than native English speakers (NES). Familiarity with Aviation English was beneficial for completing the listening tests, as was holding a higher flight rating. Having more years of speaking English was only partially advantageous. Post-survey results were analyzed, and it was found that participants in the CV group found the speech mostly unnatural. Those in the HV group also expressed difficulty in understanding due to accents but mentioned that the speech was clear and the scripts were representative of real-life pilot-ATC communication. Participants expressed that foreign accents interfered with their process of logical deduction when choosing answers on the tests. Participants, regardless of whether they belonged to the HV or CV group, found NS difficult and challenging due to the lack of context when answering questions on the tests. For AS, participants were able to piece together information using contextual knowledge related to aviation. Conclusion: Accents do interfere with pilots' understanding in radiotelephony communication by making content extraction challenging, which in turn makes interpreting messages or instructions difficult. This is an important finding, as it will affect situational awareness to some extent when making decisions in flight. Pilots have to multi-task whenever possible to keep passengers safe and to find the best route to a destination that maximizes fuel efficiency while minimizing passenger wait times. Communication plays a large role in deciding the fate of an aircraft's journey. By this logic, accents can be said to be at the core of this overarching issue with language in the context of aviation.
Therefore, training with a new technology such as TTS, along with other educational resources, could provide valuable experience and exposure for pilots who are either beginning or re-starting their language training.

Item Toward Adaptive and User-Centered Intelligent Vehicles: AI Models with Granular Classifications for Risk Detection, Cognitive Workload, and User Preferences (University of Waterloo, 2025-01-29) Lee, Hyowon; Samuel, Siby

As artificial intelligence (AI) increasingly integrates into our transportation systems, intelligent vehicles have emerged as a prominent research topic. Many advancements aim to enhance both the safety and comfort of drivers and the reliability of intelligent vehicles. The main focus of my research is addressing and responding to the varying states and needs of drivers, which is essential for improving driver-vehicle interactions through user-centered design. To contribute to this evolving field, this thesis explores the use of physiological signals and eye-tracking data to decode user states, perceptions, and intentions. While existing studies mostly rely on binary classification models, these approaches are limited in capturing the full spectrum of user states and needs. Addressing this gap, my research focuses on developing AI-driven models with more granular classifications for cognitive workload, risk severity levels, and user preferences for self-driving behaviours. This thesis is structured into three core domains: collision risk detection, cognitive workload estimation, and perception of user preferences for self-driving behaviours. By integrating AI techniques with multi-modal physiological data, my studies develop Machine Learning (ML) models for these domains and demonstrate their high performance. Feature analysis techniques are employed to enhance model interpretability, providing a better understanding of the features and improving model performance.
These findings pave the way for a new paradigm of intelligent vehicles that are not only more adaptive but also more aligned with user needs and preferences. This research lays the groundwork for the future development of user-centered intelligent companion systems in vehicles, where adaptive, perceptive, and interactive vehicles can better meet the complex demands of their users.

Item Evaluating the Potential Environmental and Human Toxicity of Solvents Proposed for use in Post-Combustion Carbon Capture (University of Waterloo, 2025-01-28) Ghiasi, Fatima; Elkamel, Ali

Carbon dioxide emitted by industrial activities is a growing concern due to its effects on the global climate. For this reason, firms are being urged to lower their carbon footprint. Post-combustion carbon capture is being explored as a method for the power and materials industries to decarbonize. The most mature carbon capture technique is amine absorption, and different amines are being explored for potential use within post-combustion carbon capture units. Many biological molecules are amines, and amines that resemble them can disrupt biological processes, harming organisms. In addition, if an amine is soluble in lipids, it can persist within the food chain and cause long-term toxic effects that are not immediately visible. A total of 151 solvents were compared based on four properties: volatility, lipophilicity, mutagenicity, and neuroactivity. Machine learning models were trained to predict these values. Due to their hydrophilicity, amino acids were determined to have the lowest potential for causing environmental toxicity.

Item Investigating Technology Implementation in a Canadian Community Hospital (University of Waterloo, 2025-01-27) Allana, Sana; Burns, Catherine

The integration of technology into healthcare has witnessed significant advancements. However, the widespread adoption of such technologies may not be uniformly positive.
While the highest levels of adoption are typically found in densely populated urban areas, community healthcare facilities face challenges due to insufficient resources, such as infrastructure, funding, and specialized staff, exacerbated by their remote locations. This is cause for concern, as community hospitals account for 90% of all hospitals in Canada. This reveals a major opportunity to improve technology adoption and implementation at community hospitals, to ease their existing challenges, increase equity in healthcare, and improve the generalizability of healthcare technologies. This research aims to uncover the perceptions, expectations, cultural nuances, and barriers to technology adoption at a community-level hospital in Ontario, Canada. The study began with a contextual inquiry approach, incorporating semi-structured interviews and surveys. Data were collected from nine clinical and managerial staff members whose workflows were impacted by three pilot technology projects. The interviews explored staff expectations and experiences with how these pilot projects impacted their workflows, patient care, and the overall technology implementation process. The survey included demographic questions and items based on the Unified Theory of Acceptance and Use of Technology (UTAUT) model, designed to predict factors influencing technology acceptance. The pilot technologies included a discharge planning tool, a portable X-ray scanner, and a digital pathology tool. A thematic analysis of the qualitative data was conducted, followed by affinity mapping to identify overarching themes. The Functional Resonance Analysis Method (FRAM) was also used to understand and model the impact of integrating the pilot technologies into preexisting, variable workflows. Finally, survey results were analyzed using frequency distributions to identify trends and triangulate findings. Overall, most staff reported a high level of technology use in both their work and daily lives.
They also acknowledged that technology breakdowns at the workplace were inevitable, often resulting in time-consuming, manual workarounds. Moreover, for all pilot projects, staff felt overburdened by the additional workload required to manage the pilots alongside their regular duties. However, despite these challenges, all staff expressed an appreciation for innovation and a strong willingness to try new tools to improve their work. The discharge planning and X-ray scanner tools did not integrate well into existing workflows or provide additional value. Both tools performed inconsistently and failed to meet expectations for streamlining processes, leading to reluctance and distrust among staff. Additionally, change management planning was insufficient for both tools, with staff experiencing abrupt workflow changes, limited training, and a lack of clarity on project timelines or statuses. As a result, neither tool was requested for purchase following pilot testing. Conversely, staff decided to purchase the digital pathology tool, despite the disruptions to existing workflows, as the perceived benefits to both staff and patient care outweighed these challenges. Staff were excited about the tool's potential and engaged in close collaboration with the manufacturer and project team. Furthermore, change management was carefully planned, with a phased implementation approach. The pilot was also driven by strong advocacy from a pathologist, which ensured alignment with clinical needs. Based on these findings, several recommendations emerged to improve the technology implementation process. First, the challenges with change management highlight the need for better resource allocation.
This includes providing sufficient time for introducing new tools, clearly explaining the reasons for their selection, offering personalized training that covers tool usage, troubleshooting, and its impact on existing processes, and ensuring staff have the necessary bandwidth to manage change without disrupting daily operations. Second, communication channels should be improved. Startup companies should collaborate closely with the hospital during the development and testing phases to better understand staff needs and workflows, while also providing tailored support throughout the implementation process. Additionally, communication with hospital leadership must be strengthened to secure strong support, allocate resources effectively, and incorporate feedback on the challenges staff encounter, fostering a more collaborative environment that is better equipped to drive innovation. Finally, it is crucial to define and share specific success metrics for pilot projects. These metrics will help staff assess the technology's impact, make informed decisions about its use, evaluate the implementation process, identify lessons learned, and pinpoint areas for improvement, all of which can refine future technology adoption strategies. Overall, technology implementation and adoption are influenced by a variety of factors, which are further compounded by the high workload, staffing shortages, and unpredictable environments commonly found in community hospitals. By addressing these recommendations, health organizations can enhance the adoption and effectiveness of new technologies, ultimately improving staff workflows and patient care.

Item Multi-Wavelength in vivo Photon Absorption Remote Sensing: Towards Non-Contact Label-Free Functional Vascular Imaging (University of Waterloo, 2025-01-23) Werezak, Sarah; Haji Reza, Parsin

Blood oxygen saturation (SO2) is an important functional metric in the diagnosis and monitoring of blinding eye diseases and cancer.
Additionally, SO2 imaging has high value in illustrating changes in blood oxygenation within a vascular network, particularly when changes are demonstrated within the context of surrounding biological structures. This has promising potential to provide valuable information to researchers and clinicians on the mechanisms of disease progression and the efficacy of treatment. Various techniques have been explored for SO2 imaging; however, limitations such as measurement inaccuracy, required contact with the tissue, and reliance on exogenous labels have prevented the clinical adoption of these approaches. Photon absorption remote sensing (PARS) is a novel imaging technique that is label-free, non-contact, and absorption-based. When a photon is absorbed by a biomolecule, energy can be released through radiative or non-radiative relaxation. Most imaging modalities are limited to capturing one form of relaxation contrast; however, PARS is capable of capturing both simultaneously. This unique approach has promising potential as an SO2 imaging modality. This thesis furthers efforts towards accurate, non-contact, label-free SO2 imaging using PARS. First, system developments are implemented to demonstrate the first multi-wavelength in-vivo PARS system. The use of independent excitation paths, power compensation, and improved secondary excitation generation enables reliable and consistent in-vivo multi-wavelength PARS imaging of chicken embryo vasculature. The power compensation of incident excitation pulses is critical for quantitative SO2 measurements, ensuring that measured SO2 is not impacted by power variations in the excitation source. This is followed by the development of techniques for in-vitro phantom studies. A blood oxygenation and deoxygenation protocol is developed and tested, enabling the time-efficient and low-cost preparation of blood samples at various oxygenation levels.
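Multi-wavelength SO2 estimation generally rests on the distinct absorption spectra of oxy- and deoxyhemoglobin: measurements at two wavelengths yield a linear system for the two concentrations. A minimal sketch of that unmixing step is shown below; the extinction-matrix values are illustrative assumptions, not the thesis's calibration or the actual PARS signal model.

```python
import numpy as np

def estimate_so2(mu_a, eps):
    # mu_a: measured absorption at two wavelengths, shape (2,)
    # eps:  extinction matrix, rows = wavelengths, columns = [HbO2, Hb]
    #       (real values must come from calibration; see below)
    c_hbo2, c_hb = np.linalg.solve(eps, mu_a)
    return c_hbo2 / (c_hbo2 + c_hb)

# Illustrative extinction matrix for two excitation wavelengths.
eps = np.array([[1.5, 0.8],
                [0.9, 1.6]])

# Synthetic measurement generated from a known 75% SO2 blood sample.
true_c = np.array([0.75, 0.25])
mu_a = eps @ true_c
print(round(estimate_so2(mu_a, eps), 2))  # recovers 0.75
```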
Additionally, a flow phantom with a 50-micrometer channel is developed, successfully enabling PARS signals to be captured from blood in an in-vitro flow phantom. This experimental setup was unable to demonstrate a change in PARS signal across blood samples at differing oxygenation levels. Simulation is used to demonstrate that the blood preparation and samples are not the cause of this unsuccessful result, which is instead attributed to the flow phantom design. The knowledge gained through the iterative design process provides valuable insight to guide future flow phantom developments. Finally, in-vivo experimentation with the multi-wavelength PARS system successfully demonstrated the variation in blood oxygenation during the hypoxia and recovery of a chicken embryo. The hypoxia holder was designed to modulate the ambient oxygen inside the holder and induce states of hypoxia and recovery. This highlights the success of the multi-wavelength PARS system in demonstrating a relative change in SO2 in-vivo. The presented work furthers efforts towards accurate, non-contact, label-free PARS SO2 imaging through the development of the first multi-wavelength in-vivo PARS system, in-vitro blood and flow phantom developments, and the in-vivo demonstration of relative change in SO2 measured using PARS.

Item Towards Humanoids Operating Mobility Devices Designed for Humans (University of Waterloo, 2025-01-22) Rajendran, Vidyasagar; Mombaur, Katja

Humanoid robotics is advancing rapidly, with significant potential to address challenges in disaster recovery, manufacturing, and healthcare. Despite progress, current humanoid capabilities remain limited, particularly in terms of efficient mobility over long distances. Integrating humanoid robots with personal transporters (PTs) like Segways offers a promising solution, enabling them to operate more efficiently in human-centric environments such as factories, malls, and airports.
This approach not only preserves the humanoid's ability to navigate complex, uneven terrain with its legs but also enhances versatility, allowing for faster, more energy-efficient movement on flat surfaces. This thesis explores methods for enabling bipedal humanoids to operate PTs, focusing on the REEM-C humanoid riding a Segway x2 SE. The research begins by analyzing human interactions with Segways to reverse-engineer their internal controllers, leading to a high-fidelity simulation model. This model informs the development of control algorithms for the REEM-C, enabling successful simulation-based demonstrations of humanoid-driven Segway motions, including translational, rotational, and mixed maneuvers. Building on this, balance stabilization strategies are devised for actuated balance boards, addressing both frontal and sagittal plane control through an integration of admittance control strategies. A comprehensive analysis of bimanual manipulation is also conducted, emphasizing manipulability and stability within a constrained workspace. Using a combined manipulability-stability metric, collision-free bimanual trajectories are generated, demonstrating improved stability during dynamic tasks such as manipulating objects of varying shapes and masses. This analysis underpins the implementation of bimanual manipulation strategies needed for operating the Segway’s LeanSteer handlebar. The final contribution consolidates all findings, presenting a whole-body control strategy that enables the REEM-C to ride a Segway safely and effectively. A stack-of-tasks quadratic program is utilized to ensure stability, balance, and bimanual control in dynamic conditions. Experimental validation demonstrates the feasibility of this approach, showcasing the REEM-C’s ability to operate a Segway under real-world conditions. 
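The admittance control mentioned above maps a measured interaction force to a commanded motion through a virtual mass-damper. The discrete-time sketch below illustrates the idea only; the gains are illustrative assumptions, and a real balance controller like the one in this work would be tuned, include stiffness terms, and enforce safety limits.

```python
def admittance_step(force, vel, dt, m=5.0, b=20.0):
    # Virtual mass-damper admittance: m*a + b*v = F  =>  a = (F - b*v) / m
    # m (virtual mass) and b (virtual damping) are illustrative values.
    acc = (force - b * vel) / m
    return vel + acc * dt

# Under a constant 10 N push, the commanded velocity converges
# toward the steady state F/b = 0.5 m/s.
v = 0.0
for _ in range(2000):
    v = admittance_step(10.0, v, 0.01)
print(round(v, 3))  # approaches 0.5
```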
This research provides a step towards more versatile and adaptable humanoid mobility solutions for everyday human environments.

Item Broadcast is all you need: Robust Multiplayer Tracking in Ice Hockey using Monocular Videos (University of Waterloo, 2025-01-22) Prakash, Harish; Clausi, David; Zelek, John

Multi-Object Tracking (MOT) in ice hockey pursues the combined task of detecting and associating players across a given sequence to maintain their identities. Tracking players in sports using monocular broadcast videos is an important computer vision problem that enables several downstream analytics and enhances the viewership experience. However, existing tracking approaches encounter significant challenges in dealing with occlusions, blurs, camera pan-tilt-zoom effects, and dynamic player movements prevalent in telecast feeds. These challenges are further exacerbated in fast-paced sports such as ice hockey, where existing trackers struggle to maintain identity consistency due to players' sudden, non-linear motion patterns. In this thesis, acknowledging the fundamental role of quality datasets, we first present two hockey tracking datasets: our previously developed HTD-1 and a newly curated, open-source dataset called HTD-2, annotated from broadcast NHL games. Based on this new dataset, we establish a reference benchmark by evaluating six state-of-the-art (SOTA) tracking methods to enable performance comparisons in hockey MOT. A detailed study is conducted for each algorithm to understand its merits and drawbacks in tracking players. Next, to address these limitations, we propose a novel tracking model formulating MOT as a bipartite graph matching problem cued with homography inputs. Specifically, we disambiguate the positional representation of occluded players, as viewed through broadcast footage, by warping them onto a view-invariant overhead rink template and encoding their transformations into the graph message passing network.
This ensures reliable spatial context for identity-preserved track prediction. Experimental results demonstrate that our model achieves a tenfold reduction in identity switches (IDsw) and a 32.45% improvement in IDF1 score compared to the existing baseline on HTD-1, establishing a new SOTA. The proposed model also exhibits strong generalization capabilities, achieving 92.8% IDF1 and only 60 IDsw during cross-validation on HTD-2. Finally, ablation studies are presented to validate our performance and substantiate our approach.
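The bipartite matching at the core of track-to-detection association can be sketched as a minimum-cost assignment over a cost matrix. The brute-force solver below is illustrative only: the thesis uses a learned graph-matching network with homography cues, and practical trackers use the Hungarian algorithm; the cost values here are made up.

```python
from itertools import permutations

def min_cost_assignment(cost):
    # Brute-force minimum-cost bipartite matching over a square cost matrix
    # (fine for illustration; O(n!) in the number of tracks).
    n = len(cost)
    best, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best:
            best, best_perm = total, perm
    return list(best_perm)

# Rows = existing tracks, columns = new detections; entries could be
# 1 - IoU computed after warping positions onto a common rink template.
cost = [[0.1, 0.9, 0.8],
        [0.7, 0.2, 0.9],
        [0.8, 0.6, 0.3]]
print(min_cost_assignment(cost))  # track i -> detection i here: [0, 1, 2]
```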