
Efficient Classifier Building Through Uncertainty Sampling

Electronic Theses and Dissertations

Item Details

title
Efficient Classifier Building Through Uncertainty Sampling
author
Johnson, Johe Houston
abstract
In a world where unlabeled data are abundant and labeling them is labor-intensive, automating most of the labeling process with predictions from a classifier trained on a smaller set of labeled samples is a natural and efficient strategy. However, randomly sampling the training data may not always be suitable; class imbalance, especially in conjunction with the goal of improving non-accuracy metrics such as sensitivity, motivates a cleverer approach to sampling that could yield better performance with less manual labeling. In this thesis, we explore uncertainty sampling, a popular active learning method in which we iteratively query the observation about which the classifier is most uncertain. We implement uncertainty sampling on a real-world dataset and compare its performance to that of random sampling across various metrics. We also conduct simulations with data of differing class imbalance and structure, for classifiers with varying levels of flexibility.
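To make the querying loop described in the abstract concrete, the sketch below illustrates pool-based uncertainty sampling with a least-confidence criterion. It is an illustration only, not the implementation used in the thesis: the synthetic imbalanced data, the logistic regression classifier, the seed size, and the query budget of 50 are all assumptions chosen for the example.

```python
# Minimal sketch of pool-based uncertainty sampling (least-confidence querying).
# Illustrative assumptions: synthetic imbalanced data, logistic regression,
# a 20-point labeled seed, and a budget of 50 queries.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic, imbalanced binary data standing in for a large unlabeled pool.
X, y = make_classification(n_samples=2000, n_features=10, weights=[0.9, 0.1],
                           random_state=0)

# Seed the labeled set with a few observations from each class so the initial
# fit is well defined; everything else forms the unlabeled pool.
seed = np.concatenate([rng.choice(np.where(y == c)[0], size=10, replace=False)
                       for c in (0, 1)])
labeled = list(seed)
pool = [i for i in range(len(X)) if i not in set(labeled)]

model = LogisticRegression(max_iter=1000)

for _ in range(50):  # query budget, chosen arbitrarily for the example
    model.fit(X[labeled], y[labeled])

    # Least-confidence criterion: query the pool observation whose largest
    # predicted class probability is smallest, i.e. the most uncertain one.
    probs = model.predict_proba(X[pool])
    uncertainty = 1.0 - probs.max(axis=1)
    query = pool.pop(int(np.argmax(uncertainty)))

    # A human annotator would supply the label here; in a simulation we
    # simply reveal the held-out label y[query].
    labeled.append(query)

# Refit on everything queried so far and evaluate on the remaining pool.
model.fit(X[labeled], y[labeled])
print(f"Labeled {len(labeled)} of {len(X)} observations; "
      f"accuracy on the remaining pool: {model.score(X[pool], y[pool]):.3f}")
```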
subject
active learning
case study
classification
machine learning
selective sampling
uncertainty sampling
contributor
Evans, Ciaran (advisor)
Berenhaut, Kenneth (committee member)
Lotspeich, Sarah (committee member)
date
2023-07-25T17:48:30Z (accessioned)
2023-07-25T17:48:30Z (available)
2023 (issued)
degree
Statistics (discipline)
identifier
http://hdl.handle.net/10339/102231 (uri)
language
en (iso)
publisher
Wake Forest University
type
Thesis
