Paper Title
Assessing Resource-Performance Trade-off of Natural Language Models using Data Envelopment Analysis
Paper Authors
Paper Abstract
Natural language models are often summarized through a high-dimensional set of descriptive metrics including training corpus size, training time, the number of trainable parameters, inference times, and evaluation statistics that assess performance across tasks. The high-dimensional nature of these metrics makes it difficult to compare models objectively; in particular, it is challenging to assess the trade-off each model makes between performance and resources (compute time, memory, etc.). We apply Data Envelopment Analysis (DEA) to this problem of assessing the resource-performance trade-off. DEA is a nonparametric method that measures the productive efficiency of abstract units that consume one or more inputs and yield at least one output. We recast natural language models as units suitable for DEA, and we show that DEA can be used to create an effective framework for quantifying model performance and efficiency. A central feature of DEA is that it identifies a subset of models that lie on an efficient frontier of performance. DEA is also scalable, having been applied to problems with thousands of units. We report empirical results of DEA applied to 14 language models with a variety of architectures, and we show that DEA can be used to identify a subset of models that effectively balance resource demands against performance.
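The abstract does not specify which DEA formulation the paper uses, so the following is only an illustrative sketch of one standard variant: the input-oriented CCR model, solved as a linear program per unit. Here each "unit" is a language model, the inputs are resource metrics (e.g. parameter count, training time), and the output is a performance score; the function name, the toy data, and the use of `scipy.optimize.linprog` are all assumptions for illustration, not the authors' implementation. Units scoring 1.0 lie on the efficient frontier.

```python
import numpy as np
from scipy.optimize import linprog

def dea_ccr_efficiency(X, Y):
    """Input-oriented CCR (constant returns to scale) DEA scores.

    X: (n_units, n_inputs) resource consumption per unit.
    Y: (n_units, n_outputs) performance produced per unit.
    Returns an array of efficiency scores in (0, 1]; 1.0 means the
    unit is on the efficient frontier.
    """
    n, m = X.shape
    s = Y.shape[1]
    scores = []
    for o in range(n):
        # Decision variables: [theta, lambda_1, ..., lambda_n].
        c = np.zeros(n + 1)
        c[0] = 1.0  # minimize theta (radial input contraction factor)
        # Input constraints: sum_j lambda_j * x_ij <= theta * x_io
        A_in = np.hstack([-X[o].reshape(-1, 1), X.T])
        b_in = np.zeros(m)
        # Output constraints: sum_j lambda_j * y_rj >= y_ro
        A_out = np.hstack([np.zeros((s, 1)), -Y.T])
        b_out = -Y[o]
        res = linprog(c,
                      A_ub=np.vstack([A_in, A_out]),
                      b_ub=np.concatenate([b_in, b_out]),
                      bounds=[(0, None)] * (n + 1))
        scores.append(res.fun)
    return np.array(scores)

# Toy data (hypothetical): 3 models, inputs = (params, train time),
# output = a single accuracy-style score.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 3.0]])
Y = np.array([[1.0], [1.0], [1.0]])
print(dea_ccr_efficiency(X, Y))  # first two units are efficient (score 1.0)
```

The third unit produces the same output as the others but from strictly more resources, so a convex combination of the first two dominates it and its score falls below 1.0, illustrating how DEA separates frontier models from dominated ones.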