Active Learning with semi-supervised Gaussian Mixture Model on Network Flow Detection
dc.contributor.advisor | Li, Qi | |
dc.contributor.advisor | Cai, Ying | |
dc.contributor.advisor | Zhang, Wensheng | |
dc.contributor.author | Song, Wenfei | |
dc.contributor.department | Department of Computer Science | |
dc.date.accessioned | 2023-06-20T22:17:19Z | |
dc.date.available | 2023-06-20T22:17:19Z | |
dc.date.issued | 2023-05 | |
dc.date.updated | 2023-06-20T22:17:19Z | |
dc.description.abstract | One of the biggest challenges in network flow detection is that the data distribution is extremely imbalanced. As a result, traditional unsupervised learning cannot find data features effectively, while semi-supervised and supervised learning methods require a large number of labeled samples to obtain enough information to learn from. Active learning can effectively reduce the cost of labeling through an iterative process of training, labeling the most valuable samples, and training again. Existing work focuses mainly on applying pool-based active learning in related scenarios and mostly ignores the startup problem that arises when no labeled samples are available initially. This report better simulates real-world application scenarios through a special pool-stream input setting, focuses on solving the cold-start problem on extremely imbalanced datasets, and proposes a new active learning selection strategy that combines uncertainty and marginal information. The rest of the report is organized as follows. Chapter 1 briefly discusses the background of active learning in the field of network flow detection, and Chapter 2 discusses existing related work. Chapter 3 presents the preliminary work, and Chapter 4 provides a formal definition of the problem. Chapter 5 gives a concrete description of our work, and Chapter 6 presents our results. Finally, Chapter 7 concludes the report. | |
dc.format.mimetype | ||
dc.identifier.uri | https://dr.lib.iastate.edu/handle/20.500.12876/ywAbmxZv | |
dc.language.iso | en | |
dc.language.rfc3066 | en | |
dc.subject.disciplines | Computer science | en_US |
dc.subject.keywords | active learning | en_US |
dc.subject.keywords | Gaussian Mixture Model | en_US |
dc.subject.keywords | imbalance | en_US |
dc.subject.keywords | semi-supervised | en_US |
dc.title | Active Learning with semi-supervised Gaussian Mixture Model on Network Flow Detection | |
dc.type | thesis | en_US |
dc.type.genre | thesis | en_US |
dspace.entity.type | Publication | |
relation.isOrgUnitOfPublication | f7be4eb9-d1d0-4081-859b-b15cee251456 | |
thesis.degree.discipline | Computer science | en_US |
thesis.degree.grantor | Iowa State University | en_US |
thesis.degree.level | thesis | |
thesis.degree.name | Master of Science | en_US |
File
Original bundle
- Name:
- Song_iastate_0097M_20697.pdf
- Size:
- 812.15 KB
- Format:
- Adobe Portable Document Format