Active Learning with semi-supervised Gaussian Mixture Model on Network Flow Detection

dc.contributor.advisor Li, Qi
dc.contributor.advisor Cai, Ying
dc.contributor.advisor Zhang, Wensheng
dc.contributor.author Song, Wenfei
dc.contributor.department Department of Computer Science
dc.date.accessioned 2023-06-20T22:17:19Z
dc.date.available 2023-06-20T22:17:19Z
dc.date.issued 2023-05
dc.date.updated 2023-06-20T22:17:19Z
dc.description.abstract One of the biggest challenges in network flow detection is that the data distribution is extremely unbalanced. As a result, traditional unsupervised learning cannot find data features effectively, while semi-supervised and supervised learning methods require a large number of labeled samples to learn from. Active learning can effectively reduce the cost of labeling through an iterative cycle of training, labeling the most valuable samples, and retraining. Existing work focuses on the application of pool-based active learning in related scenarios and mostly ignores the cold-start problem that arises when no labeled samples are available initially. This report better simulates real-world application scenarios through a special pool-stream input setting, focuses on solving the cold-start problem on extremely unbalanced datasets, and proposes a new active learning selection strategy that combines uncertainty and marginal information. The rest of the report is organized as follows. Chapter 1 briefly discusses the background of active learning in the field of network flow detection, and Chapter 2 discusses existing related work. Chapter 3 presents the preliminary work, and Chapter 4 provides a formal definition of the problem. Chapter 5 gives a concrete description of our work, and Chapter 6 presents our results. Finally, Chapter 7 concludes the report.
dc.format.mimetype PDF
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/ywAbmxZv
dc.language.iso en
dc.language.rfc3066 en
dc.subject.disciplines Computer science en_US
dc.subject.keywords active learning en_US
dc.subject.keywords Gaussian Mixture Model en_US
dc.subject.keywords imbalance en_US
dc.subject.keywords semi-supervised en_US
dc.title Active Learning with semi-supervised Gaussian Mixture Model on Network Flow Detection
dc.type thesis en_US
dc.type.genre thesis en_US
dspace.entity.type Publication
relation.isOrgUnitOfPublication f7be4eb9-d1d0-4081-859b-b15cee251456
thesis.degree.discipline Computer science en_US
thesis.degree.grantor Iowa State University en_US
thesis.degree.level thesis
thesis.degree.name Master of Science en_US
File
Original bundle
Name: Song_iastate_0097M_20697.pdf
Size: 812.15 KB
Format: Adobe Portable Document Format