Research on FPGA-based hardware acceleration of deep reinforcement learning
2022,30(6):242-247
Abstract: Deep reinforcement learning (DRL) is an important branch of machine learning used to solve sequential decision-making problems, with broad application prospects in fields such as autonomous driving and the industrial Internet of Things. Because DRL is computationally intensive, it is difficult to deploy on embedded platforms with limited computing resources and strict power budgets. To address this limitation, a DRL-oriented FPGA accelerator is designed using hardware/software co-design, and a design space exploration method is proposed. The online decision-making task for the Cartpole application is completed on the ZYNQ7100 heterogeneous computing platform. Experimental results show that, when training a typical DRL algorithm, the design has clear advantages over CPU and GPU platforms in both computing speed and running power consumption: it achieves a speedup of 12.03 over the CPU and 28.08 over the GPU, with a running power consumption of only 7.748 W, which meets the requirements of online DRL decision-making tasks in embedded applications.
Key words: deep reinforcement learning; FPGA; heterogeneous computing; online decision-making; embedded field
Received: 2021-12-20
Funding:
