What Kinds of Contracts Do ML APIs Need?

dc.contributor.author Khairunnesa, Samantha Syeda
dc.contributor.author Ahmed, Shibbir
dc.contributor.author Imtiaz, Sayem Mohammad
dc.contributor.author Rajan, Hridesh
dc.contributor.author Leavens, Gary T.
dc.contributor.department Department of Computer Science
dc.date.accessioned 2023-08-03T14:53:50Z
dc.date.available 2023-08-03T14:53:50Z
dc.date.issued 2023-07-26
dc.description.abstract Recent work has shown that Machine Learning (ML) programs are error-prone and has called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to API users? We are especially interested in what kinds of contracts help API users catch errors at earlier stages in the ML pipeline. We describe an empirical study of Stack Overflow posts about the four most often-discussed ML libraries: TensorFlow, Scikit-learn, Keras, and PyTorch. For these libraries, our study extracted 413 informal (English) API specifications. We used these specifications to answer the following questions. What are the root causes and effects behind ML contract violations? Are there common patterns of ML contract violations? When does understanding ML contracts require an advanced level of ML software expertise? Could checking contracts at the API level help detect the violations in early ML pipeline stages? Our key finding is that the most commonly needed contracts for ML APIs check constraints either on single arguments of an API or on the order of API calls. The software engineering community could employ existing contract mining approaches to mine these contracts to promote an increased understanding of ML APIs. We also noted a need to combine behavioral and temporal contract mining approaches. We report on categories of required ML contracts, which may help designers of contract languages.
dc.description.comments This preprint is made available through arXiv at: https://arxiv.org/abs/2307.14465. Copyright 2023, The Authors. This work is available under a Creative Commons Attribution 4.0 International License https://creativecommons.org/licenses/by/4.0/.
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/gwW7JLGw
dc.language.iso en
dc.source.uri https://arxiv.org/abs/2307.14465
dc.subject.disciplines DegreeDisciplines::Physical Sciences and Mathematics::Computer Sciences::Software Engineering
dc.subject.disciplines DegreeDisciplines::Physical Sciences and Mathematics::Computer Sciences::Programming Languages and Compilers
dc.subject.keywords Machine Learning
dc.subject.keywords API contracts
dc.subject.keywords Empirical software engineering
dc.subject.keywords Software engineering for machine learning
dc.title What Kinds of Contracts Do ML APIs Need?
dc.type preprint
dc.type.genre preprint
dspace.entity.type Publication
relation.isAuthorOfPublication 4e3f4631-9a99-4a4d-ab81-491621e94031
relation.isOrgUnitOfPublication f7be4eb9-d1d0-4081-859b-b15cee251456
File
Original bundle
Name: 2023-Rajan-WhatKindsPreprint.pdf
Size: 1.56 MB
Format: Adobe Portable Document Format