Estimating user interaction probability for non-guaranteed display advertising
Author
Date
2014
Permanent Link
http://hdl.handle.net/10092/14658
Thesis Discipline
Mathematics
Degree Grantor
University of Canterbury
Degree Level
Masters
Degree Name
Master of Science

Billions of advertisements are displayed to internet users every hour, a market worth approximately $110 billion in 2013. The process of displaying advertisements to internet users is managed by advertising exchanges, automated systems which match advertisements to users while balancing conflicting advertiser, publisher, and user objectives. Real-time bidding is a recent development in the online advertising industry that allows more than one exchange (or demand-side platform) to bid for the right to deliver an ad to a specific user while that user is loading a webpage, creating a liquid market for ad impressions. Real-time bidding accounted for around 10% of the German online advertising market in late 2013, a figure which is growing at an annual rate of around 40%. In this competitive market, accurately calculating the expected value of displaying an ad to a user is essential for profitability.
In this thesis, we develop a system that significantly improves the existing method for estimating the value of displaying an ad to a user in a German advertising exchange and demand-side platform. The most significant calculation in this system is estimating the probability of a user interacting with an ad in a given context. We first implement a hierarchical main-effects and latent factor model which is similar enough to the existing exchange system to allow a simple and robust upgrade path, while improving performance substantially. We then use regularized generalized linear models to estimate the probability of an ad interaction occurring following an individual user impression event. We build a system capable of training thousands of campaign models daily, handling over 300 million events per day, 18 million recurrent users, and thousands of model dimensions. Together, these systems improve on the log-likelihood of the existing method by over 10%.
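As a rough illustration of the regularized generalized linear model idea described above, the following sketch fits an L2-regularized logistic regression to synthetic impression data and compares its mean log-likelihood with a constant base-rate predictor. It is a minimal example under stated assumptions, not the thesis implementation: the feature layout, the synthetic data generator, the hyperparameters, and every function name (`fit_logistic_l2`, `mean_log_likelihood`, `predict`) are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mean_log_likelihood(probs, labels, eps=1e-12):
    # Average Bernoulli log-likelihood; higher (closer to 0) is better.
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(labels)


def fit_logistic_l2(rows, labels, l2=1e-3, lr=0.5, epochs=300):
    # Batch gradient descent on the L2-penalized negative log-likelihood.
    # The intercept b is left unpenalized.
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, rows):
    # Estimated interaction probability for each impression.
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in rows]


# Synthetic impressions: three binary context features (assumed, not from the thesis).
rng = random.Random(0)
true_w, true_b = [1.5, -1.0, 0.5], -2.0
rows = [[float(rng.random() < 0.5) for _ in true_w] for _ in range(2000)]
labels = [
    int(rng.random() < sigmoid(true_b + sum(wj * xj for wj, xj in zip(true_w, x))))
    for x in rows
]

w, b = fit_logistic_l2(rows, labels)
model_ll = mean_log_likelihood(predict(w, b, rows), labels)

# Baseline: predict the overall interaction rate for every impression.
base_rate = sum(labels) / len(labels)
baseline_ll = mean_log_likelihood([base_rate] * len(labels), labels)

print("model mean log-likelihood:   ", model_ll)
print("baseline mean log-likelihood:", baseline_ll)
```

A production system of the scale described here would more plausibly use sparse features and an online or distributed optimizer; plain batch gradient descent is used only to keep the sketch short and dependency-free.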
We also provide an overview of the microstructure of the German real-time bidding market in September and November 2013, and identify potential opportunities to exploit competitors’ behaviour, including building user features from real-time bid responses. Finally, for personal interest, we experiment with scalable k-nearest neighbour search algorithms, nonlinear dimension reduction, manifold regularization, graph clustering, and stochastic block model inference using the large datasets from the linear model.