Paper title
Chimera: A Hybrid Machine Learning Driven Multi-Objective Design Space Exploration Tool for FPGA High-Level Synthesis
Authors
Abstract
In recent years, hardware accelerators based on field-programmable gate arrays (FPGAs) have been widely adopted, thanks to the extraordinary flexibility of FPGAs. However, this flexibility makes designs difficult to create and optimize. Conventionally, these accelerators are designed in low-level hardware description languages, which makes creating large designs with complex behavior extremely difficult. High-level synthesis (HLS) tools were therefore created to simplify hardware design for FPGAs. They enable users to create hardware designs in high-level languages and provide various optimization directives to help improve the performance of the synthesized hardware. However, applying these optimizations to achieve high performance is time-consuming and usually requires expert knowledge. To address this difficulty, we present Chimera, an automated design space exploration tool for applying HLS optimization directives, which significantly reduces the human effort and expertise needed to create high-performance HLS designs. It uses a novel multi-objective exploration method that seamlessly integrates active learning, evolutionary algorithms, and Thompson sampling, making it capable of finding a set of optimized designs on a Pareto curve while evaluating only a small number of design points during the exploration. In the experiments, this hybrid method found, in less than 24 hours, design points with the same or superior performance compared to the highly optimized hand-tuned designs created by expert HLS users from the Rosetta benchmark suite. In addition to discovering the extreme points, it also explores a Pareto frontier, where the elbow point can potentially save up to 26% of flip-flop resources with negligibly higher latency.
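To make the terms "Pareto frontier" and "elbow point" concrete, here is a minimal sketch of how they could be computed for two objectives that are both minimized (latency and flip-flop count). It is not Chimera's exploration method, which the abstract describes only at a high level. The function names `pareto_front` and `elbow_point`, the design points, and the elbow heuristic (largest normalized distance below the line joining the two extreme points) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical design point: (latency, flip_flops); both objectives are minimized.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing latency."""
    front: List[Point] = []
    best_ff = float("inf")
    # Sweep in order of latency; a point is kept only if it uses strictly
    # fewer flip-flops than every point with lower or equal latency.
    for lat, ff in sorted(set(points)):
        if ff < best_ff:
            front.append((lat, ff))
            best_ff = ff
    return front


def elbow_point(front: List[Point]) -> Point:
    """Pick an elbow using one common heuristic (an assumption, not the paper's).

    Both axes are normalized to [0, 1], so the extreme points map to (0, 1)
    and (1, 0). The elbow is the point farthest below the line x + y = 1.
    """
    if not front:
        raise ValueError("front must not be empty")
    if len(front) < 3:
        return front[0]
    lat_min, lat_max = front[0][0], front[-1][0]
    ff_min, ff_max = front[-1][1], front[0][1]

    def depth(p: Point) -> float:
        x = (p[0] - lat_min) / (lat_max - lat_min)
        y = (p[1] - ff_min) / (ff_max - ff_min)
        return 1.0 - x - y

    return max(front, key=depth)


# Made-up (latency, flip_flops) values, for illustration only.
points = [(10, 1000), (11, 800), (14, 760), (20, 750), (12, 900), (15, 850)]
front = pareto_front(points)
elbow = elbow_point(front)
```

In this made-up data set, the extreme points are the lowest-latency and lowest-flip-flop designs; the elbow is the design that gives up little latency for a large resource saving, which is the kind of trade-off the abstract refers to.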