Many industries have recently become interested in algorithms that vary prices dynamically based on product, customer, and other market-related features in order to drive profitability and revenue growth. However, in many real-world use cases historical prices show little or no variation, so typical dynamic pricing methods, which rely on estimating purchase probabilities, are not applicable. In this study, we develop practical approaches to dynamic pricing based on reinforcement learning. We propose a hybrid contextual bandit model that takes customer or product features into account. The pricing setting poses significant challenges for the multi-armed bandit framework because the arms are highly correlated. To capture correlation across arms, we consider a hybrid model with both arm-independent and arm-dependent features. We demonstrate through simulations that these methods can efficiently discover the optimal prices and provide a revenue lift of up to 7.6% compared with current practices.
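To make the hybrid idea concrete, the following is a minimal sketch of one way such a model could look, not the study's actual method. It assumes a LinUCB-style linear bandit in which each candidate price is an arm and the feature vector has a shared block (coefficients common to all arms, which is what lets correlated arms learn from each other) and an arm-specific block. The price grid `PRICES`, the `features` layout, the class `HybridLinUCB`, and the demand curve inside `simulate` are all hypothetical and chosen only for illustration.

```python
import math
import random

PRICES = [8.0, 10.0, 12.0]  # hypothetical price grid; each price is one arm


def features(context, arm):
    """Joint feature vector for a (context, arm) pair."""
    price = PRICES[arm]
    # Shared block: its coefficients are common to all arms, so feedback
    # observed at one price also informs the estimates for the others.
    shared = [price] + [price * c for c in context]
    # Arm-specific block: only the slot of the chosen arm is non-zero.
    specific = [0.0] * (len(PRICES) * len(context))
    for i, c in enumerate(context):
        specific[arm * len(context) + i] = c
    return shared + specific


class HybridLinUCB:
    """Ridge regression on the joint features plus an upper-confidence bonus."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha  # exploration strength (assumed value)
        # Inverse of A = I + sum(x x^T); starts as the identity matrix.
        self.A_inv = [[float(i == j) for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * x

    def _matvec(self, v):
        return [sum(r * vj for r, vj in zip(row, v)) for row in self.A_inv]

    def score(self, x):
        theta = self._matvec(self.b)  # current coefficient estimate
        mean = sum(t * xi for t, xi in zip(theta, x))
        var = sum(a * xi for a, xi in zip(self._matvec(x), x))
        return mean + self.alpha * math.sqrt(max(var, 0.0))

    def select(self, context):
        arms = range(len(PRICES))
        return max(arms, key=lambda a: self.score(features(context, a)))

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A_inv after adding x x^T to A.
        u = self._matvec(x)
        denom = 1.0 + sum(ui * xi for ui, xi in zip(u, x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= u[i] * u[j] / denom
            self.b[i] += reward * x[i]


def simulate(rounds=2000, seed=0):
    """Average revenue per round on synthetic demand (illustration only)."""
    rng = random.Random(seed)
    bandit = HybridLinUCB(dim=len(features([1.0, 0.0], 0)))
    revenue = 0.0
    for _ in range(rounds):
        context = [1.0, rng.random()]  # intercept + one made-up customer feature
        arm = bandit.select(context)
        price = PRICES[arm]
        # Hypothetical purchase probability, decreasing in price.
        p_buy = 1.0 / (1.0 + math.exp(0.8 * (price - 9.0 - 3.0 * context[1])))
        reward = price if rng.random() < p_buy else 0.0  # reward is revenue
        bandit.update(features(context, arm), reward)
        revenue += reward
    return revenue / rounds
```

Calling `simulate()` returns the average revenue per round under the synthetic demand curve. The reward here is realized revenue rather than a purchase indicator, which is one plausible choice when the goal is to find revenue-maximizing prices directly instead of estimating purchase probabilities first.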