Paper title
Can Interpretable Reinforcement Learning Manage Prosperity Your Way?
Paper authors
Paper abstract
Personalisation of products and services is fast becoming the driver of success in banking and commerce. Machine learning holds the promise of gaining a deeper understanding of, and tailoring to, customers' needs and preferences. Whereas traditional solutions to financial decision problems frequently rely on model assumptions, reinforcement learning is able to exploit large amounts of data to improve customer modelling and decision-making in complex financial environments with fewer assumptions. Model explainability and interpretability present challenges from a regulatory perspective, which demands transparency for acceptance; they also offer the opportunity for improved insight into and understanding of customers. Post-hoc approaches are typically used for explaining pretrained reinforcement learning models. Based on our previous modelling of customer spending behaviour, we adapt our recent reinforcement learning algorithm, which intrinsically characterises desirable behaviours, and we transition to the problem of asset management. We train inherently interpretable reinforcement learning agents to give investment advice that is aligned with prototype financial personality traits which are combined to make a final recommendation. We observe that the trained agents' advice adheres to their intended characteristics, that they learn the value of compound growth and, without any explicit reference, the notion of risk, and that policy convergence improves.