Optimal Online Learning in Bidding for Sponsored Search Auction
The paper formulates learning of the optimal bidding policy for sponsored search auctions as a stochastic optimization problem. It models the auctions and builds a simulator based on a real-world dataset; the simulated sponsored search auctions show hourly variations in auction frequency, click propensity, bidding competition, and revenue originating from advertisements.
It presents several bidding policies that learn from bidding results and that can be easily implemented.
It also presents a knowledge gradient learning policy that guides bidding so as to generate the samples from which the bidding policies learn. The bidding policies can be trained with a small number of samples and still achieve strong performance in advertising profit.
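
To illustrate the idea behind a knowledge gradient policy, the following is a minimal sketch of the standard knowledge gradient computation for a finite set of candidate bids under independent normal beliefs. It is not necessarily the variant used in the paper. The function names, the discretized bid grid, and the assumption of a known, constant observation noise variance are all hypothetical choices made for illustration.

```python
import math


def _normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def knowledge_gradient(means, variances, noise_variance):
    """Knowledge gradient value of sampling each candidate once.

    Assumptions (not taken from the paper): beliefs about the expected
    profit of each candidate bid are independent normals with the given
    means and variances, and each observation has known noise variance.
    Requires at least two candidates and noise_variance > 0.
    """
    values = []
    for i, (mu, var) in enumerate(zip(means, variances)):
        # Standard deviation of the change in the posterior mean
        # caused by one more noisy observation of candidate i.
        sigma_tilde = var / math.sqrt(var + noise_variance)
        if sigma_tilde == 0.0:
            # Nothing left to learn about this candidate.
            values.append(0.0)
            continue
        best_other = max(m for j, m in enumerate(means) if j != i)
        zeta = -abs(mu - best_other) / sigma_tilde
        values.append(sigma_tilde * (zeta * _normal_cdf(zeta) + _normal_pdf(zeta)))
    return values


def choose_bid(bids, means, variances, noise_variance):
    """Return the candidate bid with the largest knowledge gradient value."""
    values = knowledge_gradient(means, variances, noise_variance)
    best_index = max(range(len(bids)), key=lambda i: values[i])
    return bids[best_index]
```

In this sketch, a candidate whose belief variance is zero has no value of information and is never chosen for sampling, while candidates that are both uncertain and close to the current best estimate are preferred. An online variant would typically weigh this value against the immediate expected profit of each bid.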