A combinatorial-bandit algorithm for the online joint bid/budget optimization of pay-per-click advertising campaigns.


In the last two decades, online advertising has become the most effective way to sponsor a product or an event. The success of this advertising format is mainly due to the capability of the Internet channels to reach a broad audience and to target different groups of users with specific sponsored announces. This is of paramount importance for media agencies, companies whose primary goal is to design ad campaigns that target only those users who are interested in the sponsored product, thus avoiding unnecessary costs due to the display of ads to uninterested users. In the present work, we develop an automatic method to find the best user targets (a.k.a. contexts) that a media agency can use in a given Internet advertising campaign. More specifically, we formulate the problem of target optimization as a Learning from Logged Bandit Feedback (LLBF) problem, and we propose the TargOpt algorithm, which uses a tree expansion of the target space to learn the partition that efficiently maximizes the campaign revenue. Furthermore, since the problem of finding the optimal target is intrinsically exponential in the number of the features, we propose a tree-search method, called A-TargOpt, and two heuristics to drive the tree expansion, aiming at providing an anytime solution. Finally, we present empirical evidence, on both synthetically generated and real-world data, that our algorithms provide a practical solution to find effective targets for Internet advertising.


Margherita Gasparini, Alessandro Nuara, Francesco Trovò, Nicola Gatti, Marcello Restelli


2018 International Joint Conference on Neural Networks (IJCNN)


July 8, 2018