Y Wang, R Savani, A Gu, C Mascioli, T Turocy, and MP Wellman
5th ACM International Conference on AI in Finance (ICAIF), pages 643–651, November 2024.
Abstract
In market making, a market maker (MM) can concurrently place many buy and sell limit orders at various prices and volumes, resulting in a vast action space. To handle this large action space, beta policies were introduced, which use a scaled beta distribution to concisely represent the volume distribution of an MM’s orders across price levels. In these policies, however, the parameters of the scaled beta distributions are either fixed or adjusted only according to predefined rules based on the MM’s inventory. As we show, this approach can limit the effectiveness of market-making policies and overlooks the significance of other market characteristics in a dynamic market. To address this limitation, we introduce a general adaptive MM based on beta policies, employing deep reinforcement learning (RL) to dynamically control the scaled beta distribution parameters and generate orders based on current market conditions. A sophisticated market simulator is used to evaluate a wide range of existing market-making policies and to train the RL policy in markets with varying levels of inventory risk, enabling a comprehensive assessment of their performance. By carefully designing the reward function and observation features, we demonstrate that our RL beta policy outperforms baseline policies across multiple metrics in different market settings. We emphasize the strong adaptability of the learned RL beta policy, underscoring its pivotal role in achieving superior performance relative to other market-making policies.
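To illustrate the core idea, the following Python sketch shows one way a scaled beta distribution can encode an MM's volume allocation across price levels. This is a minimal illustration under assumed conventions, not the authors' implementation: the function allocate_volumes, the parameter names, the number of price levels, and the volume budget are all hypothetical choices made for this example.

```python
# Minimal sketch of a beta-policy volume allocation (illustrative only).
# A Beta(a, b) density on [0, 1] is discretized into n_levels bins, one per
# price level, and the total volume is split in proportion to each bin's mass.
import numpy as np
from scipy.stats import beta as beta_dist

def allocate_volumes(a: float, b: float, n_levels: int, total_volume: int) -> np.ndarray:
    """Split total_volume across n_levels price levels according to Beta(a, b)."""
    edges = np.linspace(0.0, 1.0, n_levels + 1)           # bin edges on [0, 1]
    mass = np.diff(beta_dist.cdf(edges, a, b))            # probability mass per level
    volumes = np.floor(mass * total_volume).astype(int)   # integer order sizes
    volumes[np.argmax(mass)] += total_volume - volumes.sum()  # assign rounding remainder
    return volumes

# Hypothetical example: fixed (a, b) mimics a non-adaptive beta policy, while
# an RL beta policy would instead output (a, b) for each side of the book
# from current market observations. Which end of the grid corresponds to the
# touch is a convention choice.
buy_volumes = allocate_volumes(a=2.0, b=5.0, n_levels=10, total_volume=100)
sell_volumes = allocate_volumes(a=5.0, b=2.0, n_levels=10, total_volume=100)
print("buy: ", buy_volumes)   # mass skewed toward one end of the price grid
print("sell:", sell_volumes)  # mirror-skewed toward the other end
```

In the fixed or rule-based policies the abstract describes, (a, b) would be constants or simple functions of the MM's inventory; the RL approach instead has a policy network produce these parameters at each step, which is what lets the order placement adapt to changing market conditions.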