Developing and validating a methodology for crowdsourcing L2 speech ratings in Amazon Mechanical Turk

Date
2019-01-21
Authors
Nagle, Charles
Department
World Languages and Cultures
Abstract

Researchers have increasingly turned to Amazon Mechanical Turk (AMT) to crowdsource speech data, predominantly in English. Although AMT and similar platforms are well positioned to enhance the state of the art in L2 research, it is unclear if crowdsourced L2 speech ratings are reliable, particularly in languages other than English. The present study describes the development and deployment of an AMT task to crowdsource comprehensibility, fluency, and accentedness ratings for L2 Spanish speech samples. Fifty-four AMT workers who were native Spanish speakers from 11 countries participated in the ratings. Intraclass correlation coefficients were used to estimate group-level interrater reliability, and Rasch analyses were undertaken to examine individual differences in rater severity and fit. Excellent reliability was observed for the comprehensibility and fluency ratings, but indices were slightly lower for accentedness, leading to recommendations to improve the task for future data collection.
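The abstract does not specify which ICC model was used, but a common choice for this design is a two-way random-effects, absolute-agreement, average-measures coefficient (Shrout & Fleiss ICC(2,k)). As a minimal sketch only, assuming a complete samples-by-raters matrix with no missing cells, group-level reliability could be estimated as follows; the function name and the toy ratings array are hypothetical, not taken from the study:

```python
import numpy as np

def icc_2k(ratings):
    """ICC(2,k): two-way random effects, absolute agreement, average measures.

    ratings: (n_targets, k_raters) array where every rater scored every target.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)   # one mean per speech sample (target)
    col_means = x.mean(axis=0)   # one mean per rater

    # Classical two-way ANOVA sums of squares
    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_total = np.sum((x - grand) ** 2)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)             # between-target mean square
    ms_cols = ss_cols / (k - 1)             # between-rater mean square
    ms_err = ss_err / ((n - 1) * (k - 1))   # residual mean square

    # Shrout & Fleiss (1979) average-measures, absolute-agreement form
    return (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)

# Toy example: 5 speech samples rated by 4 hypothetical raters on a 1-9 scale
ratings = np.array([
    [7, 8, 7, 6],
    [5, 5, 6, 5],
    [9, 8, 9, 9],
    [3, 4, 3, 4],
    [6, 7, 6, 6],
])
print(round(icc_2k(ratings), 3))
```

Values near or above .90, as reported here for comprehensibility and fluency, are conventionally interpreted as excellent group-level agreement; the individual-level rater severity and fit questions are the ones the Rasch analyses address.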

Comments

This accepted article is published as Nagle, C. L. V., Developing and validating a methodology for crowdsourcing L2 speech ratings in Amazon Mechanical Turk. Journal of Second Language Pronunciation. 2019. DOI: 10.1075/jslp.18016.nag. Posted with permission.
