Seminar

Model-Learning Bandits for Personalization


  • Location
    Erasmus University Rotterdam, Mandeville Building, Room T3-01
    Rotterdam
  • Date and time
    September 20, 2023
    13:00 - 14:00

Abstract

Personalization strategies often build on a large set of customer-specific and contextual features to select optimally among the available marketing actions. Contextual multi-armed bandit algorithms can help marketers adaptively select optimal personalized actions. However, conventional contextual bandit algorithms are not well suited to a large number of features, a common characteristic of many real-world problems. Exploration helps identify relevant features, yet with high-dimensional features, learning the impact of each one can lead to over-exploration and thus inefficiency. To address this challenge, an adaptive modeling approach is needed to support the exploration process and to resolve the uncertainty in feature importance effectively. We propose a new approach that uses variable selection techniques to learn both the optimal model specification and the action-selection strategy. We further enhance model interpretability via feature decomposition, which identifies both relevant and irrelevant features. Among relevant features, we distinguish two types: common features, which have the same influence on consumer behavior for all actions and hence do not impact the personalized policy, and action-specific features, whose impact differs across the possible actions and hence does affect the policy. Our method allows firms to run cost-efficient and interpretable bandit algorithms with high-dimensional contextual data.
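
To make the distinction between common and action-specific features concrete, the following is a minimal illustrative sketch and not the speakers' method. It assumes a linear reward model with one coefficient vector per action. The helper names (`classify_features`, `best_action`), the action names, and all coefficient values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

def classify_features(coefs: Dict[str, List[float]], tol: float = 1e-8) -> List[str]:
    """Label each feature from per-action coefficients (hypothetical helper)."""
    actions = list(coefs)
    n_features = len(coefs[actions[0]])
    labels = []
    for j in range(n_features):
        values = [coefs[a][j] for a in actions]
        if all(abs(v) <= tol for v in values):
            labels.append("irrelevant")       # no effect under any action
        elif max(values) - min(values) <= tol:
            labels.append("common")           # same effect under every action
        else:
            labels.append("action-specific")  # effect differs across actions
    return labels

def best_action(coefs: Dict[str, List[float]], x: List[float],
                keep: Optional[List[int]] = None) -> str:
    """Greedy action under the assumed linear model, using only features in `keep`."""
    idx = range(len(x)) if keep is None else keep
    scores = {a: sum(w[j] * x[j] for j in idx) for a, w in coefs.items()}
    return max(scores, key=scores.get)

# Made-up coefficients for two marketing actions and three features.
coefs = {
    "email":    [0.5, 0.0,  0.3],
    "discount": [0.5, 0.0, -0.2],
}
labels = classify_features(coefs)
specific = [j for j, lab in enumerate(labels) if lab == "action-specific"]

x = [2.0, 7.0, -1.0]  # one customer's context vector (made up)
full_choice = best_action(coefs, x)
reduced_choice = best_action(coefs, x, keep=specific)
print(labels, full_choice, reduced_choice)
```

In this toy setting, a common feature shifts every action's predicted reward by the same amount, so dropping it (along with the irrelevant feature) leaves the chosen action unchanged. That is why only action-specific features matter for the personalized policy.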