Are Code Examples on an Online Q&A Forum Reliable?

Zhang, Tianyi
Upadhyaya, Ganesha
Rajan, Hridesh
Reinhardt, Anastasia
Rajan, Hridesh
Kim, Miryung
Major Professor
Committee Member
Journal Title
Journal ISSN
Volume Title
Research Projects
Organizational Units
Computer Science
Organizational Unit
Journal Issue
Computer Science

Programmers often consult an online Q&A forum such as Stack Overflow to learn new APIs. This paper presents an empirical study on the prevalence and severity of API misuse on Stack Overflow. To reduce manual assessment effort, we design ExampleCheck, an API usage mining framework that extracts patterns from over 380K Java repositories on GitHub and subsequently reports potential API usage violations in Stack Overflow posts. We analyze 217,818 Stack Overflow posts using ExampleCheck and find that 31% may have potential API usage violations that could produce unexpected behavior such as program crashes and resource leaks. Such API misuse is caused by three main reasons---missing control constructs, missing or incorrect order of API calls, and incorrect guard conditions. Even the posts that are accepted as correct answers or upvoted by other programmers are not necessarily more reliable than other posts in terms of API misuse. This study result calls for a new approach to augment Stack Overflow with alternative API usage details that are not typically shown in curated examples.


This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Zhang, Tianyi, Ganesha Upadhyaya, Anastasia Reinhardt, Hridesh Rajan, and Miryung Kim. "Are code examples on an online Q&A forum reliable?: a study of API misuse on stack overflow." In Proceedings of the 40th International Conference on Software Engineering, pp. 886-896. ACM, 2018. DOI: 10.1145/3180155.3180260. Posted with permission.