Paper Title
Search-based Software Testing Driven by Automatically Generated and Manually Defined Fitness Functions
Paper Authors
Abstract
Search-based software testing (SBST) typically relies on fitness functions to guide the search exploration toward software failures. There are two main techniques to define fitness functions: (a) automated fitness function computation from the specification of the system requirements, and (b) manual fitness function design. Both techniques have advantages. The former uses information from the system requirements to guide the search toward portions of the input domain more likely to contain failures. The latter uses the engineers' domain knowledge. We propose ATheNA, a novel SBST framework that combines fitness functions automatically generated from requirements specifications and those manually defined by engineers. We design and implement ATheNA-S, an instance of ATheNA that targets Simulink models. We evaluate ATheNA-S by considering a large set of models from different domains. Our results show that ATheNA-S generates more failure-revealing test cases than existing baseline tools and that the difference between the runtime performance of ATheNA-S and the baseline tools is not statistically significant. We also assess whether ATheNA-S could generate failure-revealing test cases when applied to two representative case studies: one from the automotive domain and one from the medical domain. Our results show that ATheNA-S successfully revealed a requirement violation in our case studies.
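The idea of driving a search with both an automatically derived and a manually defined fitness function can be illustrated with a small Python sketch. This is not ATheNA-S and does not use Simulink. The toy model `simulate`, the requirement "the output stays below 100", the engineer heuristic in `manual_fitness`, the convex combination in `combined_fitness`, and the hill-climbing loop are all assumptions made for illustration, since the abstract does not specify how the two functions are combined or which search algorithm is used.

```python
import random


def simulate(inputs, steps=10):
    """Toy stand-in for a model simulation (hypothetical, not Simulink)."""
    gain = inputs[0]
    return [1.5 * gain * t for t in range(steps + 1)]


def automatic_fitness(trace, threshold=100.0):
    """Fitness derived from the assumed requirement 'output < threshold'.

    Returns the smallest margin over the trace; a negative value means
    the requirement is violated.
    """
    return min(threshold - y for y in trace)


def manual_fitness(inputs, suspected=8.0):
    """Hypothetical engineer-defined fitness: distance of the input from a
    region the engineer suspects to be failure-prone (lower is better)."""
    return abs(inputs[0] - suspected)


def combined_fitness(f_auto, f_manual, weight=0.5):
    """Assumed combination: a convex combination of the two fitness values."""
    return weight * f_auto + (1.0 - weight) * f_manual


def evaluate(inputs, weight):
    """Return (combined fitness, automatic fitness) for one candidate input."""
    f_auto = automatic_fitness(simulate(inputs))
    return combined_fitness(f_auto, manual_fitness(inputs), weight), f_auto


def search(budget=200, seed=0, weight=0.5):
    """Simple hill climber that minimizes the combined fitness and stops as
    soon as the automatic fitness reports a requirement violation."""
    rng = random.Random(seed)
    current = [rng.uniform(0.0, 10.0)]
    score, f_auto = evaluate(current, weight)
    for _ in range(budget):
        if f_auto < 0:
            return current, f_auto  # failure-revealing test case found
        step = rng.gauss(0.0, 0.5)
        candidate = [min(10.0, max(0.0, current[0] + step))]
        cand_score, cand_auto = evaluate(candidate, weight)
        if cand_score < score:  # keep the candidate only if it improves
            current, score, f_auto = candidate, cand_score, cand_auto
    return (current, f_auto) if f_auto < 0 else (None, f_auto)


if __name__ == "__main__":
    test_input, margin = search()
    print("failure-revealing input:", test_input, "margin:", margin)
```

In this sketch the automatic fitness alone decides whether a test case reveals a failure, while the combined value only steers which candidates the search keeps. The `weight` parameter is a hypothetical knob for balancing the two sources of guidance.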