Paper title
WikiDoMiner: Wikipedia Domain-specific Miner
Paper authors
Paper abstract
We introduce WikiDoMiner, a tool for automatically generating domain-specific corpora by crawling Wikipedia. WikiDoMiner helps requirements engineers create an external knowledge resource that is specific to the underlying domain of a given requirements specification (RS). Being able to build such a resource is important since domain-specific datasets are scarce. WikiDoMiner generates a corpus by first extracting a set of domain-specific keywords from a given RS, and then querying Wikipedia for these keywords. The output of WikiDoMiner is a set of Wikipedia articles relevant to the domain of the input RS. Mining Wikipedia for domain-specific knowledge can be beneficial for multiple requirements engineering tasks, e.g., ambiguity handling, requirements classification, and question answering. WikiDoMiner is publicly available on Zenodo under an open-source license (DOI: 10.5281/zenodo.6671357).
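The two-step pipeline described in the abstract (extract domain keywords from an RS, then query Wikipedia for them) can be illustrated with a minimal sketch. This is not WikiDoMiner's implementation: the frequency-based keyword extraction is a simple stand-in for whatever method the tool uses, and the function names, the stopword list, and the `top_k` and `limit` parameters are all assumptions made for illustration. The only external interface used is the public MediaWiki search endpoint (`action=query`, `list=search`).

```python
import json
import re
import urllib.parse
import urllib.request
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real tool would use a fuller one).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "shall", "be", "in",
             "for", "is", "that", "with", "on", "by", "each", "from"}

API_URL = "https://en.wikipedia.org/w/api.php"


def extract_keywords(rs_text, top_k=5):
    """Return the top_k most frequent non-stopword tokens in the RS text.

    Plain term frequency is a stand-in here, not the tool's actual method.
    """
    tokens = re.findall(r"[a-z]+", rs_text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_k)]


def build_search_url(keyword, limit=3):
    """Build a MediaWiki full-text search URL for one keyword."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": keyword,
        "srlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def search_titles(keyword, limit=3):
    """Query Wikipedia and return matching article titles (needs network access)."""
    request = urllib.request.Request(
        build_search_url(keyword, limit),
        headers={"User-Agent": "corpus-sketch/0.1 (illustrative example)"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        data = json.load(response)
    return [hit["title"] for hit in data["query"]["search"]]


def build_corpus_titles(rs_text, top_k=5, limit=3):
    """Collect de-duplicated article titles for every extracted keyword."""
    titles = []
    for keyword in extract_keywords(rs_text, top_k):
        for title in search_titles(keyword, limit):
            if title not in titles:
                titles.append(title)
    return titles
```

Calling `build_corpus_titles(rs_text)` on the text of a requirements specification would return a list of candidate article titles; fetching the article bodies and filtering them for domain relevance are left out of this sketch.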