Paper Title
Rethinking Reinforcement Learning based Logic Synthesis
Paper Authors
Paper Abstract
Recently, reinforcement learning has been used to address logic synthesis by formulating the operator sequence optimization problem as a Markov decision process. However, through extensive experiments, we find that the learned policy makes decisions independently of the circuit features (i.e., states) and yields an operator sequence that is, to some extent, invariant under permutation of its operators. Based on these findings, we develop a new RL-based method that automatically recognizes critical operators and generates common operator sequences that generalize to unseen circuits. Our algorithm is verified on the EPFL benchmark, a private dataset, and industrial-scale circuits. Experimental results demonstrate that it achieves a good balance among delay, area, and runtime, and is practical for industrial use.
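The abstract's MDP formulation can be illustrated with a minimal, hypothetical sketch: states carry circuit features, actions are synthesis operators (the names below follow common ABC commands, but the environment dynamics, reward, and class/variable names here are invented stand-ins, not the paper's actual method).

```python
import random

# Hypothetical synthesis operators; names echo ABC commands.
OPERATORS = ["rewrite", "refactor", "balance", "resub"]

class SynthesisEnv:
    """Toy MDP: applying an operator perturbs a stand-in quality
    metric (lower is better); reward is the improvement. A real
    environment would evaluate the circuit after each operator."""

    def __init__(self, seq_len=10, seed=0):
        self.seq_len = seq_len
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.step_idx = 0
        self.quality = 100.0  # stand-in for an area/delay cost
        return self._state()

    def _state(self):
        # Real circuit features would go here; we expose only the
        # step index and current quality as a minimal state.
        return (self.step_idx, self.quality)

    def step(self, action):
        assert action in OPERATORS
        # Placeholder dynamics: each operator yields a random gain.
        gain = self.rng.uniform(0.0, 5.0)
        self.quality -= gain
        self.step_idx += 1
        done = self.step_idx >= self.seq_len
        return self._state(), gain, done

env = SynthesisEnv()
state, done, sequence = env.reset(), False, []
while not done:
    op = random.choice(OPERATORS)  # a learned policy would choose here
    state, reward, done = env.step(op)
    sequence.append(op)
print(sequence)
```

An episode thus produces a fixed-length operator sequence; the paper's observation is that, for learned policies, the choice of `op` turns out to depend little on `state`.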