MaChAmp: Multi-task Learning to the Rescue in Resource Scarce Scenarios - Slides
In Natural Language Processing (NLP), the Wall Street Journal section of the Penn Treebank has long been the main evaluation benchmark. This dataset contains well-edited English news texts from the 1980s and is thus not representative of most real-world language use. When we transfer to new domains and languages, current systems struggle, since our algorithms were not designed for these settings and training data is scarce. MaChAmp is a toolkit focused on multi-task learning, which can be used to bridge the performance gap to more interesting language varieties. In this talk, I will walk through the abilities of the toolkit, how it is made to be efficient, and how we used it to cheaply improve performance on a wide variety of tasks and languages.
Rob van der Goot received his PhD from the University of Groningen, where he worked on normalization of social media data and automatic syntactic analysis. Since then, he has been at the ITU, where he has broadened the scope of his research, which now focuses on multi-task, multi-lingual, and cross-domain natural language processing.