Paper Title
Multi-Objective reward generalization: Improving performance of Deep Reinforcement Learning for applications in single-asset trading
Paper Authors
Paper Abstract
We investigate the potential of Multi-Objective Deep Reinforcement Learning for stock and cryptocurrency single-asset trading: in particular, we consider a Multi-Objective algorithm which generalizes the reward function and discount factor (i.e., these components are not specified a priori, but incorporated into the learning process). Firstly, using several important assets (the cryptocurrency pairs BTCUSD, ETHUSDT, XRPUSDT, and the stock market symbols AAPL, SPY, NIFTY50), we verify the reward generalization property of the proposed Multi-Objective algorithm, and provide preliminary statistical evidence showing increased predictive stability over the corresponding Single-Objective strategy. Secondly, we show that the Multi-Objective algorithm has a clear edge over the corresponding Single-Objective strategy when the reward mechanism is sparse (i.e., when non-null feedback is infrequent over time). Finally, we discuss the generalization properties with respect to the discount factor. Our code is provided in full, in open-source format.
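To make the reward-generalization idea concrete, below is a minimal sketch of preference-conditioned multi-objective Q-learning in a toy tabular setting. This is not the paper's algorithm: the environment dynamics, dimensions, and all names (step, WEIGHT_GRID, the two reward components) are illustrative assumptions. The point it demonstrates is only the core mechanism the abstract describes: the reward scalarization is sampled during learning rather than fixed a priori, so a single learner covers a family of reward functions.

```python
import numpy as np

# Sketch of Multi-Objective Q-learning with reward generalization:
# the agent observes a *vector* of reward components and learns Q-values
# conditioned on a preference weight vector sampled during training,
# instead of committing to one scalar reward a priori.

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS = 10, 3          # toy dimensions (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1   # learning rate, discount, exploration

# One Q-table per discretized preference weight (crude conditioning).
WEIGHT_GRID = [np.array([w, 1.0 - w]) for w in (0.0, 0.25, 0.5, 0.75, 1.0)]
Q = np.zeros((len(WEIGHT_GRID), N_STATES, N_ACTIONS))

def step(state, action):
    """Toy environment: random walk with a 2-component reward vector
    (e.g., a 'profit' term and a 'risk' penalty); purely illustrative."""
    next_state = (state + rng.integers(-1, 2)) % N_STATES
    reward_vec = np.array([rng.normal(0.01 * action, 1.0),  # 'profit' term
                           -abs(action) * 0.05])            # 'risk' term
    return next_state, reward_vec

for episode in range(500):
    w_idx = rng.integers(len(WEIGHT_GRID))  # sample a reward scalarization
    w = WEIGHT_GRID[w_idx]                  # per episode: the reward function
    state = rng.integers(N_STATES)          # is not specified a priori
    for t in range(50):
        if rng.random() < EPS:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(Q[w_idx, state]))
        next_state, reward_vec = step(state, action)
        scalar_r = float(w @ reward_vec)    # scalarize with sampled weights
        td_target = scalar_r + GAMMA * Q[w_idx, next_state].max()
        Q[w_idx, state, action] += ALPHA * (td_target - Q[w_idx, state, action])
        state = next_state
```

Under the same assumptions, the discount-factor generalization mentioned in the abstract could be sketched analogously, by sampling GAMMA from a grid each episode and conditioning the Q-values on it as well.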