Targeting optimization for internet advertising by learning from logged bandit feedback.


Pay-per-click advertising includes various formats (eg, search, contextual, and social) with a total investment of more than 140 billion USD per year. An advertising campaign is composed of some subcampaigns-each with a different ad-and a cumulative daily budget. The allocation of the ads is ruled exploiting auction mechanisms. In this paper, we propose, for the first time to the best of our knowledge, an algorithm for the online joint bid/budget optimization of pay-per-click multi-channel advertising campaigns. We formulate the optimization problem as a combinatorial bandit problem, in which we use Gaussian Processes to estimate stochastic functions, Bayesian bandit techniques to address the exploration/exploitation problem, and a dynamic programming technique to solve a variation of the Multiple-Choice Knapsack problem. We experimentally evaluate our algorithm both in simulation-using a synthetic setting generated from real data from Yahoo!-and in a real-world application over an advertising period of two months.


Alessandro Nuara, Francesco Trovo, Nicola Gatti, Marcello Restelli


Proceedings of the AAAI Conference on Artificial Intelligence


April 26, 2018