Predictive Online Digital Sales (PODS) and Marketing

About the Challenge

Digital advertisements for products and services are commonplace on almost every online platform used for distributing or selling content, goods, and services. We usually come across these paid digital ads while interacting with content when we visit a website, use an app, or watch TV (among other ad-delivery platforms). Vendors usually run digital campaigns to manage these ads and pay the online platforms that deliver them. In 2024, digital ad spending was about $750 billion worldwide. By 2028, it is expected to cross $1 trillion.

Running a digital marketing campaign efficiently involves collecting performance data about the ads, analyzing it, and adapting the campaign parameters (e.g., ad content, bids for ads) accordingly. When an ad shows up in front of the user on the screen, we say that the ad got an impression. If the ad is of interest to the user, she may click on it (on platforms that allow user interaction). The ratio of the total number of clicks to the total number of impressions is called the Click-Through Rate (CTR). The CTR of an ad is often used by the platform to measure the level of user interest in that ad and possibly in the product or service it is trying to promote. Therefore, predicting CTR is an important problem in digital campaign management. Conversion from ads is also important, particularly on e-commerce platforms where selling goods and services is one of the primary objectives of the vendors.
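As a small illustration of the CTR definition above, the following Python sketch computes the ratio of clicks to impressions. The function name and the choice to return 0.0 when an ad has no impressions are assumptions made for this example, not part of the challenge specification.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return clicks / impressions for one ad or keyword."""
    if clicks < 0 or impressions < 0:
        raise ValueError("clicks and impressions must be non-negative")
    if impressions == 0:
        # Assumption for this sketch: an ad that was never shown has CTR 0.
        return 0.0
    return clicks / impressions


# 25 clicks out of 1000 impressions gives a CTR of 2.5%.
print(click_through_rate(25, 1000))  # 0.025
```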

The objective of this Discovery Challenge is to optimize sponsored ad targeting on e-commerce platforms where ads show up in response to keyword-based searches by the users. The first task involves predicting the future CTR for a keyword based on campaign performance data containing, among other fields, the keyword bid and cost-per-click (CPC) for thousands of related keywords. The second task is to predict future ad conversion using the provided dataset. Participants will develop scalable algorithms that can be used for large-scale online campaign management. Agnik is releasing campaign management data for the first time to support this competition and advance machine learning research in this emerging field.

The Discovery Challenge Problem

Consider sponsored ads in online e-commerce platforms where ads show up in response to keyword-based searches by the users.


Dataset

Agnik is releasing campaign management data for the first time to support this competition and advance machine learning research in this emerging field:

  • It will contain 30 days of campaign management data sampled every few minutes round the clock.
  • It will contain the impression, click, spending, CPC, and conversion data among other items for various search keywords.

The task

There are three tasks:

  • Predict the future CTR as a function of the previous values of the CTR and bids allocated to the keywords.
  • Predict future conversion at least one hour in advance.
  • Optimize the scalability of the algorithm. The task is to keep the execution time growth manageable as the dataset size expands.
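One possible way to frame the first task as a supervised learning problem is sketched below in Python. It builds lagged features from a keyword's past CTR and bid values and pairs them with the CTR a fixed number of steps ahead, then applies a naive persistence baseline. The function names, the toy numbers, and the values of n_lags and horizon are all hypothetical; the actual sampling interval and column layout of the released data may differ.

```python
def make_lagged_dataset(ctr, bid, n_lags, horizon):
    """Pair the last n_lags CTR and bid values with the CTR `horizon` steps ahead."""
    if len(ctr) != len(bid):
        raise ValueError("ctr and bid must have the same length")
    features, targets = [], []
    # t is the index of the most recent observation available to the model.
    for t in range(n_lags - 1, len(ctr) - horizon):
        start = t - n_lags + 1
        features.append(list(ctr[start:t + 1]) + list(bid[start:t + 1]))
        targets.append(ctr[t + horizon])
    return features, targets


def persistence_forecast(features, n_lags):
    """Naive baseline: predict that the future CTR equals the latest observed CTR."""
    return [row[n_lags - 1] for row in features]


# Toy series for a single keyword (made-up values, not challenge data).
ctr = [0.010, 0.012, 0.011, 0.015, 0.014, 0.013]
bid = [1.0, 1.1, 1.1, 1.3, 1.2, 1.2]

X, y = make_lagged_dataset(ctr, bid, n_lags=2, horizon=2)
baseline = persistence_forecast(X, n_lags=2)
for row, target, guess in zip(X, y, baseline):
    print(row, "->", target, "(baseline:", guess, ")")
```

If the data were sampled every five minutes, for example, a one-hour-ahead target would correspond to horizon=12. Any regression model can then be trained on the feature rows in place of the persistence baseline.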

Communication Plan

  • Online platforms will be used for promoting the competition.
  • Agnik’s digital marketing team and user population worldwide will also be used to publicize the competition.

For any questions, please contact us at pods2025@agnik.com.

Technical Details

We will use Codabench for managing communications.

Here's the Codabench competition link: https://www.codabench.org/competitions/7588/

Evaluation

  • CTR Prediction: one hour into the future. The metric will be the RMSE averaged over all the provided keywords.
  • Conversion Prediction: one hour into the future. The metric will be the average RMSE.
  • Scalability: measured by tracking how the execution time grows as more keywords are added to the prediction algorithm.
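The criteria above can be approximated locally with a short Python sketch. It computes the RMSE for each keyword and averages the results, and it times a prediction function on growing subsets of keywords. This is only one reading of the metrics: the organizers' exact scoring procedure is not specified here, and the function names and sample values are hypothetical.

```python
import math
import time


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_keyword_rmse(actual_by_kw, predicted_by_kw):
    """Average the per-keyword RMSE over all keywords."""
    scores = [rmse(actual_by_kw[k], predicted_by_kw[k]) for k in actual_by_kw]
    return sum(scores) / len(scores)


def time_by_keyword_count(predict_fn, data_by_kw, counts):
    """Time predict_fn on the first n keywords for each n in counts."""
    keywords = list(data_by_kw)
    timings = []
    for n in counts:
        subset = {k: data_by_kw[k] for k in keywords[:n]}
        start = time.perf_counter()
        predict_fn(subset)
        timings.append((n, time.perf_counter() - start))
    return timings


# Made-up CTR values for two anonymized keywords.
actual = {"kw_a": [0.10, 0.20], "kw_b": [0.30, 0.30]}
predicted = {"kw_a": [0.10, 0.20], "kw_b": [0.20, 0.40]}

score = mean_keyword_rmse(actual, predicted)
print(round(score, 6))  # about 0.05: kw_a is exact, kw_b is off by 0.1

timings = time_by_keyword_count(lambda data: None, actual, [1, 2])
print(timings)
```

Plotting the timings against the number of keywords gives a rough picture of how execution time grows with the size of the keyword set.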

Date, Time, and Place

  • Start Date: April 21, 2025 at 05:30 UTC
  • End Date: June 30, 2025 at 05:30 UTC
  • Results will be published by July 8, 2025 at 05:30 UTC
  • Competition Report will be submitted by July 31, 2025 at 05:30 UTC
  • Place: Online

Prizes

  • The top three competitors will receive a time slot to present their work at the conference!
  • The first-place winner will receive free registration for ECML-PKDD 2025.
  • Moreover, we will offer prize money to the top three teams: the first-place team will receive 500€, the second-place team 300€, and the third-place team 200€!

Reporting Requirements

  • External data cannot be used.
  • Teams must be composed of at most five people.
  • Participants must release the code publicly to ensure compliance and verify the results.
  • A technical report no longer than 4 pages must be provided.

Privacy and Ethics

  • The data and the keywords will be anonymized using unique non-identifiable words.
  • No human subjects will be involved in the competition.
  • Participants will maintain professionalism towards other competitors and organizers.
  • Any form of cheating, plagiarism, or unethical behavior will result in disqualification.

Contact Us

Please reach us for any queries at pods2025@agnik.com.
