Fraud has always been a widely discussed topic in the online advertising industry. Fake clicks on digital ads cost companies billions of dollars a year – we all know it.
In consequence, advertising businesses have found themselves looking for ways to defeat digital corruption. However, there is no simple solution to such a complex problem.
Advanced Anti-Fraud Technologies by Voluum
Voluum's team of experts joined the battle last year, and the Anti-Fraud Kit is the result of their efforts. The findings of the research are presented in the academic paper entitled “Click-Fraud Detection for Online Advertising”.
If you want to understand and accurately identify fraud, read the following blog post by Roman Wiatr (Voluum DSP), who played the main role in bringing our anti-fraud project to life.
How Does the Anti-Fraud Kit Work?
The Anti-Fraud Kit (AFK) works with the Real-Time part of TRK without slowing it down. The platform runs on its own technology, which processes billions of data requests in the blink of an eye.
Each event generated by the Traffic Servers is processed by the Real-Time System and stored in the Event Data Storage where it can be accessed as a time-series or an event log. The Events are also passed through the Anti-Fraud Kit where suspicious events are flagged and returned to the Real-Time System and then stored in the Event Data Storage.
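The event flow described above can be sketched in a few lines of Python. This is only an illustration: the class names (`RealTimeSystem`, `AntiFraudKit`, `EventDataStorage`) mirror the components named in the text, but their methods, the `Event` fields, and the single `data_center` check are assumptions made for this example, not Voluum's actual code. The sketch is also synchronous for brevity, whereas a production system would presumably run the fraud checks without blocking the real-time path.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    event_id: str
    payload: Dict[str, object]
    flags: List[str] = field(default_factory=list)


class EventDataStorage:
    """Hypothetical stand-in for the event log / time-series store."""

    def __init__(self) -> None:
        self.log: List[Event] = []

    def store(self, event: Event) -> None:
        self.log.append(event)


class AntiFraudKit:
    """Runs named checks and reports which ones an event triggers."""

    def __init__(self, checks: Dict[str, Callable[[Event], bool]]) -> None:
        self.checks = checks

    def inspect(self, event: Event) -> List[str]:
        return [name for name, check in self.checks.items() if check(event)]


class RealTimeSystem:
    def __init__(self, storage: EventDataStorage, afk: AntiFraudKit) -> None:
        self.storage = storage
        self.afk = afk

    def handle(self, event: Event) -> None:
        # 1. Every event from the traffic servers is stored.
        self.storage.store(event)
        # 2. The event is also passed through the Anti-Fraud Kit.
        flags = self.afk.inspect(event)
        # 3. Suspicious events come back flagged and are stored as well.
        if flags:
            self.storage.store(Event(event.event_id, event.payload, flags))


storage = EventDataStorage()
system = RealTimeSystem(
    storage,
    AntiFraudKit({"data_center": lambda e: bool(e.payload.get("data_center"))}),
)
system.handle(Event("a", {"data_center": True}))
system.handle(Event("b", {"data_center": False}))
```

After both calls the storage holds the two raw events plus one flagged copy of event `a`, so the flagged records can be queried next to the unflagged ones.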
This simple yet efficient architecture provides valuable insights into the quality of the traffic. Having this system in place also makes it possible to find out what events can be classified as fraud.
How Does Voluum Use Machine Learning to Flag the Invalid Traffic?
The main challenge was to choose the flags correctly. Flags represent characteristics that typically accompany invalid traffic. One example is the absence of any delay between viewing an advertisement and clicking on it.
Voluum’s AFK team created an extensive list of metrics to scan all the incoming traffic. When an event matches the criteria set in those metrics, it is flagged as suspicious.
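To make the idea of a flag concrete, here is a minimal sketch of two simple rules. The `Click` fields, the flag names, and the 0.1-second threshold are all assumptions chosen for illustration; the actual AFK metrics and thresholds are not described in this post.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Click:
    seconds_since_view: float  # delay between seeing the ad and clicking it
    ip_is_data_center: bool    # True if the IP belongs to a hosting provider


# Each flag is a named yes/no rule, so the reason for a flag is always readable.
# The 0.1 s threshold is a made-up value, not Voluum's setting.
FLAGS: Dict[str, Callable[[Click], bool]] = {
    "instant_click": lambda c: c.seconds_since_view < 0.1,
    "data_center": lambda c: c.ip_is_data_center,
}


def triggered_flags(click: Click) -> List[str]:
    """Return the names of all flags that the click matches."""
    return [name for name, rule in FLAGS.items() if rule(click)]


print(triggered_flags(Click(seconds_since_view=0.0, ip_is_data_center=False)))
print(triggered_flags(Click(seconds_since_view=3.2, ip_is_data_center=False)))
```

Because the result is a list of rule names rather than a score, anyone reading a report can see why a given click was marked as suspicious.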
Remember that not all bots are bad. Some of them are used to index web page content. That’s exactly why, when choosing the flags, we had two goals in mind: explainability and accuracy.
Explainability is fundamentally important to us. We want to provide feedback that is understandable. We could use a machine learning model that predicts what should be treated as fraud and what shouldn’t. Nonetheless, it is hard to explain why exactly such a model marks certain events as fraud.
The second problem is that fraud is sometimes hard to define. Of course, it is easy to say that an activity producing artificial clicks in a harmful way is fraudulent. But it’s not that easy to explain what this definition really means.
Microsoft (Kitts et al., 2015) suggests an alternative approach. Instead of building complex models, it is better to use a lot of simple flags. The thing is that the fraudster has to bypass all the flags in order to succeed, but it only takes one flag for you to find the malicious software! This brings us to the second point – accuracy.
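A quick back-of-the-envelope calculation shows why many simple flags add up. The sketch below assumes that each flag catches a given fraudster independently of the others, which is a simplification, and the detection rates are invented numbers rather than measurements.

```python
from math import prod
from typing import Iterable


def evasion_probability(detection_rates: Iterable[float]) -> float:
    """Chance of slipping past every flag, assuming the flags act independently."""
    return prod(1.0 - rate for rate in detection_rates)


# Hypothetical example: five weak flags, each catching only 30% of bad traffic.
five_weak_flags = [0.3] * 5
print(f"{evasion_probability(five_weak_flags):.2f}")
```

Under these assumptions, five flags that each catch fewer than a third of bad events still leave the fraudster with only about a 17% chance of passing all of them.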
How do we know that we have captured fraud? It depends on what kind of fraud we are dealing with. If we take click fraud into consideration, what are the implications?
One of two things: no difference between flagged and unflagged traffic OR a low conversion rate on flagged traffic – as we are dealing with traffic that can’t convert. But there is a twist. A low conversion rate can always appear purely by chance.
Let’s imagine that our campaign recorded a single click coming from a data center and it did not convert. This means that the data center flag found 100% of the traffic that does not convert. But is it true or is it accidental?
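One way to answer that question is a one-sided binomial test: if flagged traffic converted at the same rate as the rest, how likely would it be to see this few conversions? The sketch below is a generic textbook test written with the standard library only. It is not necessarily the test used in the paper, and the 2% baseline rate and the 0.05 threshold are assumptions.

```python
from math import comb


def prob_at_most(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def looks_non_converting(
    flagged_clicks: int,
    flagged_conversions: int,
    baseline_rate: float,
    alpha: float = 0.05,
) -> bool:
    """True if so few conversions would be unlikely at the baseline rate."""
    p_value = prob_at_most(flagged_conversions, flagged_clicks, baseline_rate)
    return p_value < alpha


# One flagged click, no conversion: entirely consistent with a 2% baseline.
print(looks_non_converting(1, 0, baseline_rate=0.02))
# A thousand flagged clicks, no conversion: very unlikely to be an accident.
print(looks_non_converting(1000, 0, baseline_rate=0.02))
```

The single click gives no evidence at all, while the same 0% conversion rate over a thousand clicks is a clear pattern. For large counts, a statistics library such as SciPy (`scipy.stats.binomtest`) is a more robust choice than this hand-rolled sum.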
Keep in mind that even a little error in detecting fraud can hugely impact the ROAS and produce extra cost for you.
Just imagine – you have billions of data points, and only half of them were marked accurately by the system. Would you trust such a fraud detection system? I bet you wouldn’t.
That’s why it’s important for both you and us to maximize the accuracy. For this reason, we rely on analytics and statistics.
The Call for Data Analysis
Having simple flags comes at a cost: they can be noisy and sometimes produce false positives. This is where analytics and statistics come into play.
First, we need to arrange the data in a way that helps us decide what is beneficial for us (this is our final goal 😎). How do we do it? Imagine we don’t buy traffic marked by the data center flag and see how many conversions we lose. As simple as that. If something doesn’t convert, don’t buy it. Of course, we need to apply a statistical test that removes inaccurate or unreliable data.
So we need to create a plot to compare flagged clicks with flagged conversions for each campaign. The color denotes the (accumulated) amount of traffic in the campaign(s). The outcome is very interesting:
- Plot a shows the data that we are sure of.
- Plot b shows the discarded data – the data where we are not sure if it is an accident or a pattern.
Let’s have a closer look. The blue arrow points to a set of campaigns where flagged traffic makes up about 18% of clicks and is responsible for 17% of conversions. Well, as sad as it is, this doesn’t give us any information. If the fraudster exists in this portion of the traffic, they are doing a good job hiding it (at least for this flag).
On the other hand, if you look at the green arrow, you will find that flagged traffic makes up 82% of clicks but is responsible for only 2% of conversions (there is always noise). For this particular case, it means that traffic coming from data centers is not converting as well as it should. This can be used as information for filtering traffic or even as the evidence required for a refund.
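The two cases above come down to comparing two shares per campaign: the share of clicks that were flagged and the share of conversions that came from flagged clicks. Here is a minimal sketch, where the `CampaignStats` fields and the raw counts are hypothetical values picked to reproduce the percentages mentioned in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CampaignStats:
    clicks: int
    conversions: int
    flagged_clicks: int
    flagged_conversions: int


def flagged_shares(stats: CampaignStats) -> Tuple[float, float]:
    """Return (share of clicks flagged, share of conversions from flagged clicks)."""
    click_share = stats.flagged_clicks / stats.clicks if stats.clicks else 0.0
    conversion_share = (
        stats.flagged_conversions / stats.conversions if stats.conversions else 0.0
    )
    return click_share, conversion_share


# Hypothetical counts matching the "blue arrow" and "green arrow" cases.
blue = CampaignStats(clicks=10_000, conversions=200, flagged_clicks=1_800, flagged_conversions=34)
green = CampaignStats(clicks=10_000, conversions=200, flagged_clicks=8_200, flagged_conversions=4)

for name, stats in (("blue", blue), ("green", green)):
    click_share, conversion_share = flagged_shares(stats)
    print(f"{name}: {click_share:.0%} of clicks flagged, {conversion_share:.0%} of conversions")
```

When the two shares are close, as in the blue case, dropping the flagged traffic costs roughly as many conversions as it saves clicks. When the click share is far above the conversion share, as in the green case, not buying the flagged traffic would save most of the spend while losing very few conversions.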
Hopefully, this extract will help you understand the basic idea behind Voluum’s anti-fraud project. We believe that with our solution – the Anti-Fraud Kit – advertisers can become more aware of the challenges around digital ad fraud.
We cannot win this battle without you, so take action: learn about fraud, dig into your stats, identify the anomalies, and block the fraudsters. Your actions are the key to limiting the damage affecting our exciting industry.
If you are interested in more insights into the flags, the statistics, and the technical details, check out the full publication “Click-Fraud Detection for Online Advertising”.
- Wiatr, Roman et al. “Click-Fraud Detection for Online Advertising.” PPAM (2019).
- Kitts, Brendan et al. “Click Fraud Detection: Adversarial Pattern Recognition over 5 Years at Microsoft.” (2015).