Paper Title

Nighthawk: Fully Automated Localizing UI Display Issues via Visual Understanding

Authors

Zhe Liu, Chunyang Chen, Junjie Wang, Yuekai Huang, Jun Hu, Qing Wang

Abstract

A graphical user interface (GUI) provides a visual bridge between a software application and its end users, through which they interact with each other. With the upgrading of mobile devices and the development of aesthetics, the visual effects of GUIs are increasingly attractive, and users pay more attention to the accessibility and usability of applications. However, such GUI complexity poses a great challenge to GUI implementation. According to our pilot study of crowdtesting bug reports, display issues such as text overlap, component occlusion, and missing images regularly occur during GUI rendering on different devices due to software or hardware compatibility. They negatively affect app usability, resulting in poor user experience. To detect these issues, we propose a fully automated approach, Nighthawk, based on deep learning for modelling the visual information of GUI screenshots. Nighthawk can detect GUIs with display issues and also locate the detailed region of the issue in a given GUI to guide developers in fixing the bug. At the same time, training the model needs a large number of labeled buggy screenshots, which require considerable manual effort to prepare. We therefore propose a heuristic-based training data auto-generation method to automatically generate the labeled training data. The evaluation demonstrates that Nighthawk achieves an average of 0.84 precision and 0.84 recall in detecting UI display issues, and an average of 0.59 AP and 0.60 AR in localizing these issues. We also evaluate Nighthawk on popular Android apps from Google Play and F-Droid, and successfully uncover 151 previously undetected UI display issues, 75 of which have been confirmed or fixed so far.
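
The abstract does not spell out the heuristics behind the training data auto-generation, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a screenshot can be modelled as a grid of grayscale pixels and that a "component occlusion" sample can be imitated by painting an opaque rectangle over a clean screenshot. The function name `inject_occlusion`, the 20-50% size range, and the box format are all hypothetical choices made for illustration.

```python
import random
from typing import List, Tuple

Image = List[List[int]]          # grayscale pixels, row-major (assumed representation)
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def inject_occlusion(clean: Image, rng: random.Random, fill: int = 0) -> Tuple[Image, Box]:
    """Paint an opaque rectangle over a copy of a clean screenshot.

    Returns the synthetic buggy screenshot together with the painted box,
    which can serve as the localization label for that sample.
    """
    height, width = len(clean), len(clean[0])
    # Hypothetical heuristic: the patch spans 20-50% of each dimension.
    box_w = max(1, int(width * rng.uniform(0.2, 0.5)))
    box_h = max(1, int(height * rng.uniform(0.2, 0.5)))
    left = rng.randint(0, width - box_w)
    top = rng.randint(0, height - box_h)
    box = (left, top, left + box_w, top + box_h)

    buggy = [row[:] for row in clean]  # copy so the clean sample stays usable
    for y in range(top, top + box_h):
        for x in range(left, left + box_w):
            buggy[y][x] = fill
    return buggy, box


# Usage: turn one clean 40x80 "screenshot" into a labeled buggy sample.
clean = [[255] * 40 for _ in range(80)]
buggy, box = inject_occlusion(clean, random.Random(0))
```

Because the label is produced by the same step that injects the defect, each synthetic sample comes with its bounding box for free, which is the kind of saving in manual labeling effort the abstract points to.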
