Paper title
TICO-19: the Translation Initiative for Covid-19
Paper authors
Abstract
The COVID-19 pandemic is the worst pandemic to strike the world in over a century. Crucial to stemming the tide of the SARS-CoV-2 virus is communicating to vulnerable populations the means by which they can protect themselves. To this end, the collaborators forming the Translation Initiative for COvid-19 (TICO-19) have made test and development data available to AI and MT researchers in 35 different languages, in order to foster the development of tools and resources for improving access to information about COVID-19 in these languages. In addition to 9 high-resourced "pivot" languages, the team is targeting 26 lesser-resourced languages, in particular languages of Africa, South Asia and South-East Asia, whose populations may be the most vulnerable to the spread of the virus. The same data is translated into all of the languages represented, meaning that testing or development can be done for any pairing of languages in the set. Further, the team is converting the test and development data into translation memories (TMXs) that can be used by localizers translating from and to any of the languages.