Xiaolin Li - On Campus - Room T020
Title: “Is Relevancy Everything? A Deep Learning Approach to Understand the Coupling of Image and Text”
Abstract: Firms increasingly combine images and text descriptions when displaying products or engaging consumers. Existing research has examined consumers' responses to text and images separately, but has yet to systematically consider the semantic relationship between them and its impact on consumers' choices. In this research, we adopt a multi-method approach to examine how the congruence between image- and text-based product representations affects consumer preference. First, to measure image-text congruence, we propose a state-of-the-art two-branch neural network model based on wide residual networks (WRN) and BERT. We apply this deep-learning method to individual-level consumption data from an online reading platform and discover a U-shaped effect of image-text congruence: consumers prefer a product when image-text congruence is either high or low, but not when it is at a medium level. We then conduct lab experiments to validate that this effect is causal and explore the underlying mechanisms with an online study. Our research contributes to the literature on consumer information processing both methodologically and substantively. It offers actionable managerial implications for marketing practitioners and online content creators on how to pair images and text.