
Utilizing Ensemble Models for Marketing Campaigns

An insightful piece on ensemble modeling from Pawan Pathak, Gayan De Silva, David Hanzelka, and Sunpreet Singh Khanuja.

Companies are improving their machine learning (ML) processes to increase the performance of their marketing campaigns. They’re doing this because customer behavior data has become so complex that it’s difficult for humans to make sense of it on their own. As the relationships between data attributes become increasingly non-linear, we need a more intelligent approach to ML, one that not only highlights these relationships but also learns from them to guide business decisions. An effective method is to apply multiple advanced ML models at the same time and combine them into an ensemble model. In this blog, you’ll see an example of using an ensemble model for marketing campaigns and learn how to overcome common problems related to ensemble models.

What is ensemble modeling?

Ensemble modeling is a technique that combines multiple ML models to create a single final prediction. The prediction may take the form of classification (yes/no), estimation (for example, predicting future demand), or ranking (force-ranking an audience by its likelihood of performing a desired action).

Why use ensemble models for marketing?

Simply put, many models are better than one.

Different ML models learn different patterns, and combining them into an ensemble model generally improves the accuracy and stability of the predictions.

Five things to keep in mind when designing an ensemble model…

If you’re ready to transition from legacy model processes to ensemble models, there are five key things you need to keep in mind:

  1. Type of Prediction (Classification vs Estimation vs Ranking)
  2. ML algorithms (chosen based on the type of prediction)
  3. Architecture (including existing technology and deployment limitations)
  4. Model transparency (explainable AI)
  5. Performance measurement KPIs

An example—how Zeta implemented ensemble models

During the initial phase of implementation, our primary objective was setting up a semi-automated pipeline (we decided to leave the performance improvements and fine-tuning for later on).

We considered 5 ML algorithms to start:

  • Logistic regression
  • Random forest
  • XGBoost
  • Gradient boosting machines (GBM)
  • Neural networks

The simplified architecture for the pipeline is shown below.

The first step in modeling is the creation of the modeling dataset. This process is automated and uses historical conversion data from previous marketing campaigns to create the “dependent” variable. This step also includes appending “independent” variables to the modeling dataset (to be clear, Zeta owns many data sources containing demographic, affinity, location, behavioral, and other information for 300M+ permissioned email addresses).

Steps two and three are the most critical in any modeling process. At a high level, they involve cleaning the data, then transforming and reducing the features to retain only the most relevant (yet non-redundant) independent features for the model.
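To make the idea of "relevant yet non-redundant" features concrete, here is a minimal sketch of one common approach: rank features by their correlation with the dependent variable, then drop any feature that is almost a copy of one already kept. This is an illustration only, not Zeta's actual reduction logic; the function names, the 0.9 threshold, and the toy feature names are all assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def drop_redundant(features, target, threshold=0.9):
    """Greedy filter: visit features from most to least correlated with the
    target, keeping one only if it is not highly correlated with a kept one."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy data: the second feature is an exact multiple of the first.
features = {
    "emails_opened": [1, 2, 3, 4, 5, 6],
    "emails_opened_x2": [2, 4, 6, 8, 10, 12],
    "days_since_visit": [9, 3, 7, 1, 8, 2],
}
converted = [0, 0, 0, 1, 1, 1]  # the dependent variable

kept = drop_redundant(features, converted)
print(kept)
```

In this toy run only one of the two duplicated features survives, alongside days_since_visit. A production pipeline would likely use more robust relevance measures, but the keep-or-drop logic has the same shape.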

In the fourth step, the cleaned, transformed, and reduced modeling data is fed to multiple ML algorithms, each producing a model score and model object as an output.

Finally, these model scores are combined using soft voting to obtain the final ensemble model score and object.
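Soft voting simply averages the probabilities that each model assigns to a record, optionally with per-model weights. The sketch below shows the arithmetic under a few assumptions: the model names, the example scores, and the equal default weights are hypothetical and are not taken from Zeta's pipeline.

```python
def soft_vote(model_scores, weights=None):
    """Combine per-model probabilities into one ensemble score per record.

    model_scores maps a model name to a list of probabilities, one per record.
    weights optionally maps a model name to a non-negative weight
    (defaults to equal weights, i.e. a plain average).
    """
    names = list(model_scores)
    n_records = len(model_scores[names[0]])
    if any(len(model_scores[m]) != n_records for m in names):
        raise ValueError("all models must score the same records")
    if weights is None:
        weights = {m: 1.0 for m in names}
    total_weight = sum(weights[m] for m in names)
    return [
        sum(weights[m] * model_scores[m][i] for m in names) / total_weight
        for i in range(n_records)
    ]

# Hypothetical scores for three records from three of the candidate models.
scores = {
    "logistic_regression": [0.20, 0.70, 0.55],
    "random_forest": [0.10, 0.90, 0.40],
    "xgboost": [0.30, 0.80, 0.55],
}

ensemble = soft_vote(scores)
print([round(s, 2) for s in ensemble])  # [0.2, 0.8, 0.5]
```

Because the output is still a probability-like score, it can be used directly for ranking an audience or thresholded for a yes/no decision.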

Ensemble model scoring

Zeta’s ensemble modeling pipeline was developed in a Python environment. To achieve the scalability required for scoring 300M to 800M email addresses, Zeta uses scalable SQL tools.

Each individual model used to create the ensemble is extracted as a set of rules and constraints in the form of an SQL query. These SQL queries are then executed on our 300M+ universe to create a score from each model.
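As an illustration of what "extracting a model as an SQL query" can look like, the sketch below turns a fitted logistic regression (an intercept plus one coefficient per feature) into a SELECT statement that applies the sigmoid inside the database. The function name, table name, column names, and coefficient values are hypothetical, and the sketch assumes a SQL dialect that provides EXP; tree-based models would need a different translation, such as nested CASE expressions.

```python
def logistic_to_sql(intercept, coefficients, table, id_column="email_id"):
    """Render a logistic regression as a SQL scoring query.

    coefficients maps a column name to its fitted weight. Identifiers are
    interpolated as-is, so they must come from a trusted source.
    """
    terms = " + ".join(f"({weight!r} * {column})" for column, weight in coefficients.items())
    linear = f"{intercept!r} + {terms}"
    return (
        f"SELECT {id_column}, "
        f"1.0 / (1.0 + EXP(-({linear}))) AS lr_score "
        f"FROM {table}"
    )

# Hypothetical fitted parameters and table name.
query = logistic_to_sql(
    intercept=-1.5,
    coefficients={"emails_opened": 0.4, "days_since_visit": -0.05},
    table="universe",
)
print(query)
```

Running one such query per model yields one score column per model, which is exactly the input that the final soft-voting step needs.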

Finally, the scores from each individual model are combined using soft voting to create the final ensemble model score on the entire universe.

Model validation

The performance of a model is key in determining its business impact on a marketing campaign. Before rolling out ensemble models, we ran an extensive comparison against the business-as-usual modeling process. We used the following metrics to evaluate ensemble models versus business-as-usual models:

  • AUC
  • Accuracy
  • Precision
  • Recall
  • F-score
  • Confusion matrix
  • Lift charts
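Several of these metrics fall out of the same four confusion-matrix counts, and lift compares the conversion rate in the top-scored slice of the audience with the overall rate. Here is a minimal sketch of those calculations; the function names, the 0.5 threshold, the 20% cut-off, and the toy labels and scores are assumptions made for illustration, not Zeta's evaluation code.

```python
def classification_metrics(y_true, scores, threshold=0.5):
    """Confusion-matrix counts plus accuracy, precision, recall, and F1."""
    tp = fp = tn = fn = 0
    for actual, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 0:
            tn += 1
        else:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "confusion": {"tp": tp, "fp": fp, "tn": tn, "fn": fn},
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def lift_at(y_true, scores, fraction=0.1):
    """Conversion rate in the top-scored fraction divided by the overall rate."""
    ranked = sorted(zip(scores, y_true), reverse=True)
    k = max(1, int(len(ranked) * fraction))
    top_rate = sum(actual for _, actual in ranked[:k]) / k
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate if base_rate else 0.0

# Hypothetical labels (1 = converted) and ensemble scores for ten records.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1]

metrics = classification_metrics(y_true, scores)
top_lift = lift_at(y_true, scores, fraction=0.2)
print(metrics, top_lift)
```

Computing the same numbers for the business-as-usual model on the same records gives a like-for-like comparison; a lift chart repeats the lift calculation at several cut-offs.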

The results?

In general, we see a 20% improvement in new customer acquisition cost for our clients using ensemble models. This evaluation was carried out in a designed experiment to ensure statistical significance and confidence in the achieved lift.

Need more help with ensemble models for marketing—Talk to us!

If you’d like help learning how to use ensemble models to deliver more customer satisfaction, talk to us.
