VC Theory, Policy Structure, and Deployable Reinforcement Learning for Inventory Management
Information Systems and Operations Management
Speaker: Will Ma (Columbia GSB)
Room: Bernard Ramanantsoa
Abstract:
We study data-driven inventory management, where demands over T time periods are drawn from unknown independent distributions, and we are given N samples from each distribution.
Existing work suggests that N needs to grow (rapidly) in T to learn a near-optimal policy.
We show that N need not grow with T at all, by taking a supervised learning approach to data-driven inventory management: we employ VC theory while still leveraging the "base stock" structure of the optimal inventory policy.
Motivated by our collaboration with Alibaba, we then study the same problem in a contextual setting, where high-dimensional features provide refined distributions for upcoming demands. We again take a supervised learning approach, using offline data and Deep Reinforcement Learning (DRL) to train Neural Networks that order inventory based on these high-dimensional features. But again, we leverage the structure of optimal inventory policies, which we show significantly accelerates our DRL training and also improves the final policy. This has enabled a 100% deployment of DRL for inventory on Alibaba's Tmall e-commerce platform, today managing over 1 million inventories.
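The "base stock" structure referenced above can be illustrated with a minimal sketch: under such a policy, one orders up to a target level S whenever the inventory position falls below it. (The function name, demand values, and level S below are hypothetical, for illustration only; they are not taken from the papers.)

```python
def base_stock_order(inventory_position: float, S: float) -> float:
    """Order up to the base-stock level S: if the current inventory
    position is below S, order the difference; otherwise order nothing."""
    return max(S - inventory_position, 0.0)

# Simulate a few periods with an illustrative base-stock level and demands,
# assuming zero replenishment lead time and backordered unmet demand.
S = 10.0
inventory = 4.0
for demand in [3.0, 7.0, 2.0]:
    inventory += base_stock_order(inventory, S)  # replenish up to S
    inventory -= demand                          # serve demand
```

The appeal of this structure is that the entire policy is summarized by a single number per period (the base-stock level), which is what makes a supervised learning treatment tractable.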
Based on two papers:
- VC Theory for Inventory Policies
- DeepStock: Reinforcement Learning with Policy Regularizations for Inventory Management
both of which are joint work with Yaqi Xie (Chicago Booth) and Linwei Xin (Cornell ORIE);
the latter is also joint work with Xinru Hao, Jiaxi Liu, Lei Cao, Yidong Zhang (Alibaba Taobao & Tmall Group).
Link to the first paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4794903