Show simple item record

dc.contributor.author: Hryniowski, Andrew
dc.date.accessioned: 2024-04-25 13:49:51 (GMT)
dc.date.available: 2024-04-25 13:49:51 (GMT)
dc.date.issued: 2024-04-25
dc.date.submitted: 2024-04-11
dc.identifier.uri: http://hdl.handle.net/10012/20495
dc.description.abstract: Over the past decade, convolutional neural networks (CNNs) have become the de facto machine learning model for image processing due to their inherent ability to capitalize on modern data availability and computational resources. Much of a CNN's capability comes from its modularity and flexibility in model design. As such, practitioners have been able to successfully tackle applications not previously possible with other contemporary methods. The downside to this flexibility is that it makes designing a CNN and improving its performance an arduous task. Designing a CNN is not a straightforward process: model architecture, learning strategy, and data selection and processing must all be precisely tuned for a researcher to produce even a model that performs better than chance. Finding the balance needed to achieve state-of-the-art performance can be its own challenge, requiring months or years of effort. When building a new model, researchers rely on quantitative metrics to guide the development process. Typically, these metrics capture model performance characteristics (e.g., accuracy, recall, precision, robustness) and computational constraints (e.g., number of parameters, number of FLOPs), while the learned internal data-processing behaviour of the CNN is ignored. Some research investigating the internal behaviour of CNNs has been proposed and adopted by a niche group within the broader deep learning community. Because these methods operate on extremely high-dimensional latent embeddings (between one and three orders of magnitude larger than the input data), they are computationally expensive. In addition, many of the most common methods do not share a common root from which downstream metrics can be computed, making the use of multiple metrics prohibitive.

In this work we propose a novel analytic framework that offers a broad range of complementary metrics a researcher can use to study the internal behaviour of a CNN, and whose findings can guide model performance improvements. We call the proposed framework Representational Response Analysis (RRA). The RRA framework is built around a common kNN-based computational model of the latent embeddings of a dataset at each layer of a CNN (sketched after this record). Using the information contained within these kNNs, we propose three complementary metrics that extract targeted information and provide a researcher with the ability to investigate specific behaviours of a CNN across all of its layers. In this work we focus on classification CNNs and perform two styles of experiments using the proposed RRA framework. The first set of experiments revolves around better understanding RRA hyper-parameter selection and its impact on the downstream metrics with regard to observed characteristics of a CNN. From this first set of experiments we determine the effects of adjusting specific RRA hyper-parameters and propose general guidelines for selecting them. The second set of experiments investigates the impact of specific CNN design choices. More precisely, we use RRA to investigate the consequences for a CNN's latent representation of training with and without data augmentation, and to understand the latent embedding symmetries across different pooled spatial resolutions. In each of these experiments RRA provides novel insights into the internal workings of a CNN.

Using the insights from the pooled spatial resolution experiments, we propose a novel attention-based CNN building block that is specifically designed to take advantage of key latent properties of a ResNet. We call the proposed building block the Scale Transformed Attention Condenser (STAC) module (a generic sketch in the same spirit also follows this record). We demonstrate that the proposed STAC module not only improves a model's performance across a selection of model-dataset pairs, but does so with a better performance-to-computational-cost tradeoff than other CNN spatial attention modules of similar FLOPs or parameter count.
dc.language.iso: en
dc.publisher: University of Waterloo
dc.subject: convolutional neural network
dc.subject: manifold
dc.subject: latent space
dc.subject: representational response analysis
dc.subject: deep learning
dc.subject: spatial attention
dc.subject: augmentation
dc.title: A Representational Response Analysis Framework For Convolutional Neural Networks
dc.type: Doctoral Thesis
dc.pending: false
uws-etd.degree.department: Systems Design Engineering
uws-etd.degree.discipline: System Design Engineering
uws-etd.degree.grantor: University of Waterloo
uws-etd.degree: Doctor of Philosophy
uws-etd.embargo.terms: 0
uws.contributor.advisor: Wong, Alexander
uws.contributor.affiliation1: Faculty of Engineering
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.typeOfResource: Text
uws.peerReviewStatus: Unreviewed
uws.scholarLevel: Graduate
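
The abstract describes RRA only at a high level: a common kNN-based model of a dataset's latent embeddings is built at every layer of a CNN, and three metrics are then computed from it. The thesis itself defines the exact construction and metrics; the PyTorch sketch below is a minimal, hypothetical reading of the kNN step only. The helper names (collect_layer_embeddings, knn_indices), the use of forward hooks, and the choice of Euclidean distance are illustrative assumptions, not the thesis's implementation.

    import torch

    def collect_layer_embeddings(model, loader, layer_names, device="cpu"):
        # Run the dataset through the model once and record flattened
        # activations at each named layer via forward hooks.
        feats = {name: [] for name in layer_names}
        modules = dict(model.named_modules())
        hooks = []
        for name in layer_names:
            def make_hook(key):
                def hook(_module, _inputs, output):
                    feats[key].append(output.detach().flatten(1).cpu())
                return hook
            hooks.append(modules[name].register_forward_hook(make_hook(name)))
        model.eval().to(device)
        with torch.no_grad():
            for images, _labels in loader:
                model(images.to(device))
        for h in hooks:
            h.remove()
        return {name: torch.cat(chunks) for name, chunks in feats.items()}

    def knn_indices(embeddings, k):
        # For an (N, D) embedding matrix, return the (N, k) indices of each
        # sample's k nearest neighbours (self excluded). The full pairwise
        # distance matrix costs O(N^2) memory, so this plain version only
        # suits modest dataset sizes.
        distances = torch.cdist(embeddings, embeddings)
        distances.fill_diagonal_(float("inf"))
        return distances.topk(k, largest=False).indices

Given these per-layer neighbour sets, one can imagine a downstream metric that, for instance, measures how much a sample's neighbourhood overlaps between consecutive layers; the three metrics the abstract mentions are defined in the thesis and are not reproduced here.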
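
The STAC module is likewise introduced only at a high level: a spatial attention block in the attention-condenser style that works at a transformed scale, trading a small computational overhead for improved performance. The block below is a generic stand-in showing that pattern, not the published STAC design; the max-pool condensing step, the 1x1 bottleneck, the scale factor, and the sigmoid gating are all assumptions made for illustration.

    import torch
    import torch.nn as nn

    class CondenserSpatialAttention(nn.Module):
        # Generic condenser-style spatial attention (illustrative only):
        # condense the feature map spatially, compute an attention map cheaply
        # at the lower resolution, expand it back, and gate the input with it.
        def __init__(self, channels, reduction=4, scale=2):
            super().__init__()
            self.condense = nn.MaxPool2d(kernel_size=scale)
            self.embed = nn.Sequential(
                nn.Conv2d(channels, channels // reduction, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, kernel_size=1),
            )
            self.expand = nn.Upsample(scale_factor=scale, mode="nearest")
            self.gate = nn.Sigmoid()

        def forward(self, x):
            # x: (N, C, H, W) with H and W divisible by the scale factor.
            attention = self.gate(self.expand(self.embed(self.condense(x))))
            return x * attention

For example, CondenserSpatialAttention(channels=256) applied to a (N, 256, 14, 14) ResNet stage output returns a tensor of the same shape with attention-weighted features, and the attention itself is computed on a 7x7 map, which is what keeps the added FLOPs and parameters small in this style of block.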

