Using Shape Metrics to Describe 2D Data Points

Paper Title

Using Shape Metrics to Describe 2D Data Points

Author

Lamberti, William Franz

Abstract

Traditional machine learning (ML) algorithms, such as multiple regression, require human analysts to make decisions on how to treat the data. These decisions can make the model-building process subjective and difficult to replicate for those who did not build the model. Deep learning approaches benefit from letting the model learn which features are important once the human analyst has built the architecture. Thus, a method for automating certain human decisions in traditional ML modeling would help to improve reproducibility and remove subjective aspects of the model-building process. To that end, we propose using shape metrics to describe 2D data to help make analyses more explainable and interpretable. The proposed approach provides a foundation for automating various aspects of model building in an interpretable and explainable fashion. This is particularly important in applications in the medical community, where the 'right to explainability' is crucial. We provide various simulated data sets, including probability distributions, functions, and model quality control checks (such as QQ plots and residual analyses from ordinary least squares), to showcase the breadth of this approach.
