Hansheng Sun, Roy Kwon, Robust contextual bandit method for optimal loan offering
DOI: 10.23952/jano.7.2025.3.04
Volume 7, Issue 3, 1 December 2025, Pages 333-358
Abstract. This paper proposes a Group-DRO enhanced doubly-robust contextual bandit approach to designing optimal policies for loan product offerings. The approach is particularly suited to high-stakes decision-making such as lending, where one must leverage historical data (with inherent biases and uncertainties) to design future policies. By using doubly-robust estimation, we make efficient use of the data and mitigate bias from unknown logging propensities. By incorporating distributional robustness with group-based ambiguity sets, we ensure that the learned policy is insulated against worst-case shifts in each subgroup, thereby protecting overall performance from collapsing if, say, a change in economic conditions strongly impacts a minority group. By adding fairness constraints such as demographic parity or equal opportunity, we can align the policy with ethical and regulatory standards, ensuring that no group is left behind or unfairly treated by the automated decision process. We present empirical evidence on a small business credit card portfolio, demonstrating significant improvements over standard methods. The proposed framework is a step toward responsible AI in finance.
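To make the three ingredients named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a deterministic target policy, discrete actions, and already-estimated propensities and reward-model predictions. All function and variable names are hypothetical, and the worst-group mean stands in for a group-based ambiguity set in its simplest form (worst case over arbitrary mixtures of groups), which may differ from the sets used in the paper.

```python
import numpy as np

def dr_scores(actions, rewards, propensities, target_actions, q_hat):
    """Per-sample doubly-robust scores for a deterministic target policy.

    actions:        logged action index per sample, shape (n,)
    rewards:        observed reward per sample, shape (n,)
    propensities:   estimated probability of the logged action, shape (n,)
    target_actions: action the evaluated policy would choose, shape (n,)
    q_hat:          reward-model predictions, shape (n, n_actions)
    """
    idx = np.arange(len(actions))
    # Direct-method term: model prediction for the action the policy picks.
    direct = q_hat[idx, target_actions]
    # Importance-weighted residual, nonzero only where the logged action
    # agrees with the target policy's action.
    match = (actions == target_actions).astype(float)
    correction = match / propensities * (rewards - q_hat[idx, actions])
    return direct + correction

def worst_group_value(scores, groups):
    """Smallest per-group mean score (assumed simplest group-robust objective)."""
    return min(scores[groups == g].mean() for g in np.unique(groups))

def demographic_parity_gap(target_actions, groups, offer_action=1):
    """Largest difference across groups in the rate of choosing offer_action."""
    rates = [np.mean(target_actions[groups == g] == offer_action)
             for g in np.unique(groups)]
    return max(rates) - min(rates)
```

In a policy search built on such pieces, one would plausibly maximize `worst_group_value` subject to `demographic_parity_gap` staying below a chosen tolerance; the tolerance and the optimizer are left open here because the abstract does not specify them.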
How to Cite this Article:
H. Sun, R. Kwon, Robust contextual bandit method for optimal loan offering, J. Appl. Numer. Optim. 7 (2025), 333-358.
