A Gold Standard Methodology for Evaluating Accuracy in Data-To-Text Systems

Authors: Thomson, Craig Alexander; Reiter, Ehud
Dates: 2020-12-07; 2020-12
Citation: Thomson, C A & Reiter, E 2020, 'A Gold Standard Methodology for Evaluating Accuracy in Data-To-Text Systems', Paper presented at Proceedings of the 13th International Conference on Natural Language Generation, Dublin, Ireland, 15/12/20 - 18/12/20, pp. 158-168. https://doi.org/10.18653/v1/2020.inlg-1.22
Type: Conference paper
ORCID: /0000-0002-7548-9504/work/84977771
Handle: https://hdl.handle.net/2164/15469
Language: eng
Subject: QA75 Electronic computers. Computer science
Funder: Engineering and Physical Sciences Research Council (EPSRC), grant EP/R512412/1
URL: https://www.aclweb.org/anthology/2020.inlg-1.22/

Acknowledgements: Many thanks to the Mechanical Turk annotators who participated in our experiment, and also to David Reiter, Tim Daniels, Rodrigo de Oliveira, and Andrew Smith for serving as pilot annotators when we were developing the methodology described in this paper. We would also like to thank Moray Greig for being our basketball domain expert during development. We are also grateful for the very helpful comments on this paper from the anonymous reviewers, the Aberdeen CLAN group, David Howcroft, Clement Rebuffel, and Chris van der Lee. We would also like to thank Sam Wiseman, Ratish Puduppully, and Clement Rebuffel for providing the generated texts from their respective systems. The work presented here is partially funded by the Engineering and Physical Sciences Research Council (EPSRC), which funds Craig Thomson under a National Productivity Investment Fund Doctoral Studentship (EP/R512412/1).