Download pdf - Trajectory-based Mobile Advertisingfaculty.chicagobooth.edu/.../marketing/pdf/MobileTraj_Beibei2015.pdf · Trajectory-based Mobile Advertising . Anindya Ghose Beibei Li Siyuan Liu1

Trajectory-based Mobile Advertising

Anindya Ghose Beibei Li Siyuan Liu1

Stern School of Business New York University [email protected]

Heinz College Carnegie Mellon University [email protected]

Smeal College of Business Penn State University

[email protected]

(Last Updated December, 2015)

Abstract

Rapid improvements in the precision of mobile technologies make it possible for advertisers to go beyond using the real-time static location and contextual information about consumers. In this study, we propose a new mobile advertising strategy that leverages full information on consumers’ offline movement trajectories from different mobility dimensions. To analyze the effectiveness of this new trajectory-based mobile advertising strategy, we design a large-scale randomized field experiment in a large shopping mall that examines the redemption rates of mobile coupons. Our experiment results are validated based on 83,370 unique user responses for a 14-day period in June 2014. We find that trajectory-based mobile advertising can lead to higher redemption probability, faster redemption behavior, and higher transaction amount from customers compared to other kinds of advertising, all of which facilitate higher revenues for the focal store as well as the overall shopping mall. Interestingly, however trajectory-based mobile advertising becomes significantly less effective in boosting the revenues of the shopping mall during the weekend. This finding suggests that customers are likely to be in an “exploratory” shopping mode during the weekend and are likely to engage in “impulsive purchases.” However, closely targeted mobile advertising can have the inadvertent impact of reducing impulse purchase behavior suggesting that marketers need to be careful when implementing mobile advertising strategies. Finally, trajectory-based mobile advertising is especially effective in attracting high-income consumers, suggesting the potential of mobile advertising in approaching high-end customers to achieve better customer lifetime value. On a broader note, our work can be viewed as a first step to study the digitization of individual offline behavior and how it can be linked to a better understanding individual preferences and decision-making.

1 Author names are in alphabetic order.

mailto:[email protected]

1. Introduction

Smartphone usage is expected to exceed 6.1 billion users worldwide by 2020 (Ericsson 2014).

The proliferation of mobile and sensor technologies has contributed to the rise of location-based

mobile advertising. Such advertising can enable businesses to deliver information to mobile

users in real time about offers in close proximity to them. Recent studies using randomized field

experiments have causally shown that mobile advertisements based on static location and time

information can significantly increase consumers’ likelihood of redeeming a geo-targeted mobile

coupon (Molitor et al. 2014, Luo et al 2014, Fong et al. 2014), that mobile ads have a synergistic

relationship with PC ads (Ghose et al. 2013), that expiry length of mobile coupons influences

their redemption rates (Danaher et al. 2015), and that understanding consumers’ context, in

particular the crowdedness of their environment, is important for marketers to increase mobile

marketing effectiveness (Andrews et al. 2015).

Beyond the real-time snapshot about the static geographical location and the consumer

contextual information, the overall mobile trajectory of each individual consumer can provide

even richer information about consumer preferences. “Trajectory” hereby refers to the physical

behavioral trace of an individual’s offline movement. For example, it can include information on

the locations where the individual has been to in the past, at what time, for how long, as well as

the associated contexts. Considering the significant search costs of consumers in the offline

world, such physical behavioral trace of individuals can be highly informative in revealing

consumer preferences for real-time decision making. This information is analogous to the search-

click stream data that we have been studying in the online environment. Mobile and sensor

technologies allow us to digitalize such individual behavioral trajectory in the offline

environment. Hence, extracting and measuring the digital trace of consumers’ offline behavior

has become increasingly critical for businesses today in order to understand consumers’ inherent

preferences to improve user digital experiences and business marketing strategies.

In this study, we propose a new mobile advertising strategy that infers consumer preferences

by leveraging full information on consumers’ offline moving trajectories from four different

mobility dimensions: temporal duration, spatial dispersion, semantic information, and movement

velocity. We extract these multidimensional trajectory features from large-scale, fine-grained

user-level behavioral data. We learn individual preferences from the trajectory information,

using statistical and machine learning methods such as kernel-based similarity functions, dense

sub-graph detection algorithms for graph-based clustering, and collaborative filtering. Finally, to

examine the effectiveness of this new trajectory-based mobile advertising strategy, we conduct a

randomized field experiment in one major large shopping mall in Asia in June 2014. Our

experiment results are validated based on 83,370 unique user responses for a 14-day period. Our

follow-up econometric analyses from both group level and individual user level demonstrate

high consistency in our final results.

Our main findings are the following. First, we find that trajectory-based mobile advertising

can significantly increase the likelihood of a consumer redeeming a mobile promotion at the

focal advertising store. On average, the coupon-redemption rate is 34.78% higher when

compared to mobile advertising based on real-time location information, and it is 93.75% higher

when compared to mobile advertising based on purely random recommendations. Second, the

trajectory-based mobile advertising leads to the fastest redemption behavior from customers. On

average, the time elapsed between receiving the mobile promotion and redeeming it under

trajectory-based mobile advertising is approximately one-third the time elapsed under the static

location-based mobile advertising (4.55 vs. 12.83 minutes), and one-fourth the time under the

purely random-based mobile advertising (4.55 vs. 16.43 minutes). Third, our follow-up user

survey shows that trajectory-based mobile advertising leads to the highest overall satisfaction

rates and the highest future-willingness-to-redeem rates from the customers. Fourth,

interestingly, our findings indicate that trajectory-based mobile advertising has a significant and

positive direct effect on the revenues of the focal advertising store. However, its impact on the

overall revenues of the shopping mall is significantly lower during the weekend (as opposed to a

weekday), compared to pure location-based advertising or random advertising strategies.

Customers are likely to be in an “exploratory” shopping stage during the weekend and are likely

to engage in “impulse purchases.” However, closely targeted mobile advertising tends to have

the inadvertent effect of significantly reducing the impulse purchase behavior. This finding

suggests that businesses and marketers need to be careful when implementing mobile advertising

strategies, depending on different business scopes. Finally, we also find that trajectory-based

mobile advertising is especially effective in attracting high-income consumers, suggesting the

potential of mobile advertising in approaching high-end customers to achieve better customer

lifetime value.

Our major contributions can be summarized as follows. First, we demonstrate the value of

mining large-scale, fine-grained offline mobile trajectory information on understanding

individual user preferences, and the importance of leveraging such information to improve the

effectiveness of digital marketing. Second, we are able to establish a link between individuals’

offline behavioral trajectories and their digital behaviors. To the best of our knowledge, our work

is the first to bridge the understanding of individual behavioral path and preference from both

physical and digital worlds. Third, our analyses based on a combination of field experiments and

surveys allow us to quantify the economic effects of the new trajectory-based mobile advertising

approach from a causal perspective. Advertisers can learn from our results in order to improve

the design and effectiveness of their mobile marketing strategies. Finally, our interdisciplinary

approach incorporates methodologies from large-scale statistical and machine learning,

econometric modeling, field experiments, and surveys and paves a path on which future studies

can build to further analyze similar problems.

This paper is organized as follows. Section 2 discusses related work. Section 3 discusses how

we extract multidimensional mobility features from granular individual mobile trajectory data.

Section 4 introduces how we design a new mobile-recommendation approach based on the

extracted trajectory features using large-scale statistical and machine learning methods. Section 5

discusses the design of our field experiment. Section 6 discusses the econometric analyses and

final experimental results, and Section 7 concludes.

2. Related Literature

Our study builds on the following three streams of research.

Mobile Marketing and Location-based Advertising. Our paper is related to mobile and

location-based advertising. Recent studies have shown that mobile ads have a synergistic

relationship with PC ads (Ghose et al. 2013b), and geographical proximity matters more for

mobile ads than for PC ads (Ghose et al. 2013a). Using randomized field experiments,

researchers have causally shown that mobile advertisements based on static location and time

information can significantly increase consumers’ likelihood of redeeming a geo-targeted mobile

coupon (Molitor et al. 2014, Luo et al. 2014, Fong et al. 2015). For example, Molitor et al.

(2014) have shown that higher the discount from mobile coupons and the closer the consumers

are to the physical store offering the coupon, the more likely they are to download the mobile

coupons. Luo et al. (2014) have found that temporal targeting and geographical targeting

individually increase sales. However, the sales effects of employing these two strategies

simultaneously are not straightforward, and advertisers need to carefully choose both temporal

and spatial dimensions when designing mobile targeting strategies. Furthermore, Fong et al.

(2015) have focused on the effectiveness of competitive locational targeting, and found that

competitive locational targeting can produce increasing returns to promotional discount depth.

Sahni & Nair (2015) use a field experiment to manipulate whether advertising is tagged as a

native ad in a mobile setting.

More recently, studies have shown that understanding consumers’ hyper-context, such as the

crowdedness of their immediate environment, is critical for marketers to measure mobile

marketing effectiveness (Andrews et al. 2015). In particular, the authors have found that the

more crowded the customer’s current location environment, the more likely the customer will

respond to a mobile ad. Danaher et al. (2015) show that besides location and time of delivery,

how long coupons are valid (expiry length) can influence redemption rates, because redemption

times for mobile coupons are much shorter than for traditional coupons. Our paper is distinct

from all the existing studies in that instead of using static location or context information, it

leverages the full historical information on consumers’ offline trajectories from four different

mobility dimensions, including temporal duration, spatial dispersion, semantic information, and

movement velocity, to infer consumer preferences, formulate a recommendation system and

thereafter serve targeted mobile advertising in the form of carefully curated mobile coupons.

Previous studies have examined consumer perceptions and attitudes toward mobile location-

based advertising (e.g., Brunner and Kumar 2007; Xu et al. 2009). Gu (2012) examines both the

short- and long-term sales effects of location-based advertising. Bart et al. (2014) study mobile

advertising campaigns and find they are effective at increasing favorable attitudes and purchase

intentions for higher- (versus lower-) involvement products, and for products that are seen as

more utilitarian (vs. more hedonic).

Spatial-Temporal Mining and Trajectory Clustering. Second, our study builds on the

spatial-temporal mining and trajectory-clustering literature from the machine learning literature.

Trajectory clustering is a diverse research area, with significant diversity in clustering measures

and the end goals. Researchers have studied trajectory, using a variety of measures, ranging from

mining frequent trajectory patterns for activity monitoring (Liu et al. 2012), probability function

of time (Gaffney et al. 1999), behavior correlation representation (Xiang et al. 2010), density-

based distance function (Nanni et al. 2006) and uncertainty measurement of trajectories (Pelekis

et al. 2011). Different similarity measures (e.g., time and location distances) and clustering

methodologies have their strengths and weaknesses. In contrast to most prior work, our method

is able to handle multiple information sources (not just movement trajectories, but also the

semantics of the underlying space) and apply a general metric-based learning framework to the

clustering problem. Studies have used trajectory-based clustering for different broad objectives,

such as discovering common sub-trajectories (Lee et al. 2007) and identifying spatial structures

(Ng et al. 2002). But such work is based purely on spatial locations, making extending it to

incorporate semantic, velocity, or other information that may contain distinctive markers of real

community interaction difficult. It is also related to the community-detection literature from

machine learning and computer science. Communities in networks/graphs are groups of vertices

within which connections are dense, but between which connections are sparser. Four types of

methods primarily exist: hierarchical clustering (Huang et al. 2010), similarity in edge-

betweenness scores (Leskovec et al. 2010), counts of short loops (Newman et al. 2004), and

voltage differences in resistor networks (Shi et al. 2011). However, these existing methods focus

on detection given a network structure and social-link distance between nodes, which are

difficult to capture from physical mobile trajectories. Instead, in our study, we focus on detecting

communities of similar users based purely on their movement trajectory patterns.

Behavior-Based Recommendation. Finally, our work is related to the stream of literature on

recommendation systems, especially behavior-based recommendation. Link, content, and

location can be viewed as the results of users’ different behaviors, but little previous work builds

trajectory community models to provide the online recommendation. In recommender systems,

behavior models are proposed for different purposes, such as effects of behavior monitoring and

perceived system benefits (Nowak et al. 2012), navigational patterns to model relationships

between users (Esslimani et al. 2011), effect of context-aware recommendations on customer

purchasing behavior and trust (Adomavicius et al. 2011), and utility query recommendation by

mining users’ search behaviors (Ghose et al. 2012, 2014). Compared with the previous studies,

one unique feature in this paper is that we aim to model individual preferences based on the

large-scale and granular information extracted from individuals’ heterogeneous offline behavior

using mobile trajectories and offline contexts.

3. Extraction of Multidimensional Mobility Features from Individual Trajectories

In this section, we discuss how we characterize individual mobility by extracting unique

movement features from various dimensions of individuals’ mobile trajectories. Building upon

prior literature (Liu et al. 2013), we focus on four different dimensions of mobility features:

temporal duration, spatial dispersion, semantic information, and movement velocity. Through

these four mobility dimensions, we aim to capture similar patterns in individuals’ physical

movement from different perspectives. Note that this step allows us to learn consumer behavior

not only through static locational or contextual proximity information, but also through dynamic

movement similarity from the underlying mutual interaction or shared relationship.

3.1 Temporal Duration

• We define temporal duration to contain information on the starting and ending time of the

mobile trajectory, as well as the day-of-the-week index.

• More specifically, for each consumer, we extract a vector with three different temporal

features: the starting time of a consumer’s trajectory, the ending time of this trajectory, and

the day index. These temporal features aim to capture the temporal activity pattern for real-

life communities. To measure the similarity between two user trajectories on their temporal

dimension, we adopt a similar approach as in Liu et al. (2013) using a temporal kernel.

3.2 Spatial Dispersion

Spatial dispersion measures the spatial alignment of different user trajectories. The close

alignment of two trajectories might indicate high behavioral similarity between the two users.

Note that to account for the popularity of the location, we inversely weigh the spatial similarity

in proportion to the crowdedness of a specific location. Intuitively, this approach is similar to

TF-IDF in text mining (e.g., Manning et al. 2008).

More specifically, our method builds on the Global Alignment Kernel (GAK) to measure the

spatial similarity between two trajectories (Cuturi 2011, Liu et al. 2013). The intuition is to

capture the spatial closeness between two individuals over time. However, the popularity of a

location can potentially bias GAK. For example, if customers A, B, and 100 other customers are

waiting in a concourse area, the spatial closeness between A and B becomes less informative of

the similarity between them, because this concourse is clearly a popular location for almost

everyone. However, if A and B are the only two customers in the concourse, this spatial

closeness can instead reveal significant information about the similarity between them. Based on

this intuition, we apply the Global Alignment Kernel with Inverse Proportion method (GAK-IP)

(Liu et al. 2013), which weighs the spatial similarity in inverse proportion to how many other

people are co-located within the nearby area.

3.3 Semantic Information

Semantic information aims to capture the contextual information related to the mobile

trajectory. For example, it contains the stationary probabilistic distribution of individuals’ visits

to different stores in the mall, time spent at each store, time spent to transit from one store to

another, and the transition probability between two stores.

More specifically, let X denote the set of all trajectories. Our goal is to measure the traverse

statistic on the sites, and use this statistic to measure the semantic correlation of user trajectories.

Let L denote the total number of spatially distinct sites. We can extract the following features of

the sites visited by an individual user.

Markov state transition. We construct the Markov state transition matrix A ∈ 𝑅𝐿×𝐿, where

A(𝑠𝑎, 𝑠𝑏) represents the transition probability from site 𝑠𝑎 to site 𝑠𝑏 . To calculate A, we first

collect all the pairs from the overall trajectory set X. Then we count the number of occurrence of

each transition pair. Finally, we conduct the column normalization of A , satisfying

∑ A(𝑠𝑎, 𝑠𝑏)𝑠𝑎 = 1.

Temporal intervals. We measure the time spent at each site and the time taken to transit

from site 𝑠𝑎 to site 𝑠𝑏 to capture the “level of interest” shown by the users (e.g., when a shop is

very “interesting,” the shoppers may choose to stay longer), and the convenience of moving from

site 𝑠𝑎 to site 𝑠𝑏, which indicates the semantic relation of two sites.

3.4 Movement Velocity

Finally, movement velocity contains information about the speed of moving subjects (users).

The information encoded in the velocity pattern of moving subjects is critical. However, we face

two challenges when modeling the velocity pattern. The first challenge is that the overall length

of each individual trajectory is different, which causes difficulty in directly measuring their

pairwise similarity from a velocity aspect. The second challenge is that even within the same

individual mobile trajectory, velocity can vary largely at different times and locations; therefore,

constructing the direct measurement is hard. To account for these challenges and to make

velocity comparable across heterogeneous individual trajectories, we normalize the velocity by

applying a temporal pyramid matching method (Liu et al. 2013). This method is inspired by the

normalization method in calculating the image similarity in image classification while

accounting for the different scales of resolution (Lazebnik et al. 2006).

More specifically, each trajectory has a velocity vector 𝑣𝑘 with unequal lengths. We

uniformly quantize the velocity into L levels. Given 𝑣𝑘 with length 𝑙𝑘 , we calculate the

normalized histogram ℎ𝑘(0) on 𝑣𝑘 . Then we equally divide 𝑣𝑘 into two parts 𝑣𝑘 →

[𝑣𝑘(1),𝑣𝑘(2)], where both 𝑣𝑘(1) and 𝑣𝑘(2) are also velocity vectors with length 𝑙𝑘/2. We then

calculate the normalized histogram ℎ𝑘(1) and ℎ𝑘(2) on 𝑣𝑘(1) and 𝑣𝑘(2) , respectively, and

normalize them so that ∑ℎ𝑘(1) + ∑ℎ𝑘(2) = 1. Consequently, we further equally divide 𝑣𝑘(1)

or 𝑣𝑘(2) into two parts again and calculate the histograms in the same way.

We continue this process until we achieve a predefined level. We concatenate all the

histograms with predefined weights. Based on this approach, we can extract a velocity histogram

ℎ𝑘 of equal length with coarse-to-fine temporal resolution. The pair-wise similarity between user

trajectories 𝑘 and 𝑘′ can be calculated using different similarity functions, such as histogram

intersection or chi-square kernel.2

4. A New Mobile-Recommendation Approach

In this section, we discuss how we design a new mobile-recommendation strategy based on

the multidimensional mobility features we extract as described above.

4.1 Measuring Consumer Similarity from Multiple Trajectory Dimensions

We first discover consumers with similar behavioral paths based on their mobile trajectories.

This step aims to identify groups of similar consumers based on the behavior-driven mobility

features of individuals as they move together in a shopping mall. It allows us to learn consumer

preferences not only through static spatial or contextual proximity information, but also dynamic

movement similarity from the underlying mutual interaction or shared relationship.

We focus on inferring consumer similarity based on a combination of the four dimensions of

mobility features we extract from the previous stage. Specifically, based on the above four

dimensions of mobility features, we calculate the consumer similarity by combining the features

as follows:

𝑆(𝑖, 𝑖′) = ∑ 𝛼𝑚𝑆𝑚(𝑖, 𝑖′)𝑀𝑚=1 ,𝛼𝑚 ≥ 0,∑ 𝛼𝑚 = 1𝑀

𝑚=1 , [1]

where 𝑆(𝑖, 𝑖′) denotes the similarity of consumer 𝑖 and consumer 𝑖′, 𝑀 denotes the number of

dimensions of mobility features (here 𝑀 = 4), 𝑆𝑚(𝑖, 𝑖′) denotes the similarity on the m-th

dimension of mobility features, and 𝛼𝑚 denotes the pre-assigned weights reflecting the specific

interests of the problem domain.3

2 Similar to the spatial dimension, we have also applied inverse weighting and normalization to the other dimensions to control for the popularity of certain temporal durations, travel speed, or site transitions. 3 In this study, we obtain the weight 𝛼𝑚 using two different approaches. First, we assume an equal weight of 0.25 for each dimension. Alternatively, we are able to learn the weight using machine learning methods. In particular, we

4.2 Using Graph-based Clustering to Identify Similar Consumers

Based on the pairwise similarity of consumers derived from the previous steps, we can

cluster similar individuals from their pairwise similarities. The main goal of this step is to

identify clusters of consumers where the consumers within a cluster are similar to each other

with regard to their mobile trajectories but dissimilar to consumers not in the cluster.

To do so, we use a graph-based clustering method. In particular, we apply the Markov

Clustering Algorithm (MCL) for dense sub-graph detection (Van Dongen 2012). It is an

unsupervised learning method that allows us to leverage the network structure to extract groups

of similar items. MCL has several advantages (Satuluri et al. 2010) over distance-based

clustering algorithms such as k-means (MacQueen 1967) and hierarchical clustering (Eisen et al.

1998). First, unlike the k-means-based algorithm, MCL is less sensitive to the initial starting

conditions. Second, MCL does not take any default number of clusters as an input. Instead, it

allows the internal structure of the network to determine the granularity of the cluster. Third,

compared with many state-of-the-art network clustering algorithms, MCL is more noise tolerant

and more effective at discovering the cluster structure (Brohee and Helden 2006, Liu et al. 2013).

More specifically, we first construct an undirected probabilistic graph of individual

trajectories (an example is shown in Figure 1), where each node in the graph represents a

consumer’s trajectory, and the weight on each edge between two nodes represents the pairwise

similarity between two consumers. Therefore, if two consumers are very similar to each other in

construct a training data set by manually rating the overall pairwise similarity between two trajectories on a scale from 0 to 1. Then, we use logistic regression to learn the corresponding weights based on the training set. For model evaluation, we use 10-fold cross-validation to avoid overfitting. We find the two approaches give us very consistent results. Hence, we applied equal weights to the four mobility dimensions in our experiment.

their trajectory patterns, the weight on the edge between the two corresponding consumer nodes

would be very high. Our goal is to detect a set of highly connected sub-graphs (cliques) from the

graph where the weight on the edge between each pair of two nodes in the sub-graphs is

relatively high (i.e., dense sub-graph). The basic intuition of the MCL algorithm is based on the

idea of a random walk. The probability of visiting a connected node is proportional to the weight

on the edge. In other words, the random walk will stabilize inside the dense regions of the

network after many steps. The stabilized regions shape the clustered sub-graph and reflect the

intrinsic structure of the network. The sub-graphs hence represent the identified clusters of

similar consumers.

4.3 Mobile Recommendation

With the detected clusters of similar consumers from the previous steps, we then target

advertising promotions by offering recommendations to a consumer from stores that are most

frequently visited or products that are most frequently purchased by similar consumers. This

approach is similar to the collaborative filtering approach widely used in traditional

recommender systems.

In practice, recommendations are achieved by calculating the ratings of the consumers for the

stores. More specifically, the rating of a consumer for a store is a measurement of one’s interest

in that store. It is defined as a weighted sum of time and money the consumer spent in that store.

Given consumer 𝑖 and store 𝑗, one common approach to predict the rating 𝑅�(𝑖, 𝑗) is to average

the ratings of similar consumers on store j weighted by their similarity information. Thus, the

average rating can be calculated by

𝑅�(𝑖, 𝑗) =∑ 𝑅(𝑖′,𝑗)𝑆(𝑖,𝑖′)𝑁𝑖𝑖′=1

∑ 𝑆(𝑖,𝑖′)𝑁𝑖𝑖′=1

, [2]

where 𝑁𝑖 denotes the number of similar consumers to consumer 𝑖 , and 𝑅(𝑖′, 𝑗) denotes the

observed rating of consumer 𝑖′ on store 𝑗.

More specifically, our recommendation approach consists of four distinct phases:

1. Model trajectory similarities from four distinct mobility feature dimensions: semantic

properties of the locations, spatial proximity to other objects, temporal duration of the

trajectory, and movement velocity at different timescales;

2. Compute a weighted similarity measure that linearly combines the similarity measures

along each dimension;

3. Cluster similar consumers using graph-clustering techniques (e.g., dense sub-graph

detection algorithms) over a graph with edges corresponding to the computed pairwise

similarity score;

4. Make recommendations to a consumer regarding most frequently visited stores or most

frequently purchased products by similar consumers.

5. A Field Experiment on Trajectory-based Mobile Advertising

To examine the effectiveness of the trajectory-based mobile advertising strategy, we

designed and executed a large-scale randomized field experiment in collaboration with one of the

largest shopping malls in Asia in June 2014.

5.1 Experimental Setting

The shopping mall contains over 300 stores spanning 1.3 million square feet. On average, it

attracts over 100,000 visitors daily. At the entrance of the shopping mall, if a consumer wanted

to enjoy free WiFi, she was required to complete a Form A with information on age, gender,

income range, credit card type (gold, platinum, gift card, others), and phone type (iPhone,

Android, others). At each store, when the consumer purchased a product, she was required to

complete a Form B, which involved similar socio-demographic information plus the amount

spent and whether the purchase was related to a mobile coupon. We cross-validated Form A and

Form B to check for the accuracy of the individual-level information.

Once the consumer connected to the WiFi, we were able to track the detailed mobile

trajectory information during her visit to the shopping mall with precise time and location stamps.

Finally, when the consumer left the mall, we conducted a short follow-up survey via the mobile

phone asking whether or not she followed the mobile recommendation, whether she wanted to

follow such recommendations in the future, overall satisfaction with the shopping experience,

and additional personal information (first-time visitor or not, WiFi user or not, shop alone or

with others, money spent in the focal advertising store, total money spent in the mall).4

Figure 2 provides an example of movement trajectories of individual customers traveling

upstairs and downstairs in the shopping mall. More specifically, the trajectories of customers

contain information such as what kind of stores the customers visited, how long they stayed in

each store, the transition probability between two stores, how fast they were walking, and so on.

We are then able to generate mobile recommendations based on the four dimensions of mobility

features extracted from the trajectory information as described in the previous section.

5.2 Profiling Stores Based on Crowd-sourced Data

To better contextualize the mobile trajectories and to capture detailed store-profile

information in the shopping mall, we had to resort to crowdsourcing techniques. We applied a

similar approach as in (Guo et al. 2013). In particular, we deployed ShopProfiler, a shop-

4 Note that in the follow-up survey, the control group was not asked the two mobile-related questions (i.e., whether she followed the mobile recommendation, and whether she wanted to follow such recommendations in the future). We used users’ self-reported mobile redemption behavior from the survey as a source for verification of the redemption data we collected from the store sales. Furthermore, the survey based metrics like “satisfaction rate” and “future-willingness-to redeem” are useful snippets of data in that they can indicate a long-term effect of the mobile advertising, which is a topic of considerable discussion amongst practitioners.

profiling system on crowdsourcing data. ShopProfiler only relies on sensor readings from mobile

devices. The process of data collection is automatic and running in the background. Inertial

sensor readings reflect human movement information such as acceleration, heading, and speed.

Microphone and WiFi modules provide additional information on surrounding conditions. From

the customer’s point of view, mobile-sensing data in a mall contain information about what

shops that customer visited, how long she stayed, and how fast she was walking. From the shop’s

point of view, mobile-sensing data reveal information about the shops’ inside layout and how

many people visit the shops in a particular time period. Through mining and learning mobile-

sensing data, we capture unique features of different shops and categorize them into different

types (e.g., restaurant, books, supermarket, electronics, fashion, etc.). Furthermore, a complete

profile of shops contains specific location and precise brand names. We extract the brand-name

information from the Service Set Identifier (SSID) from each store using the text-mining method.

We provide more details on the ShopProfiler in Online Appendix A.

5.3 Randomized Experiment Design

We designed our randomized experiment to contain the following four groups:

1) Control Group (C): Do not send any mobile promotion;

2) Treatment Group 0 (T0− Random): Send mobile promotion from randomly selected store;

3) Treatment Group 1 (T1− Location): Send mobile promotion based on real-time location

information;

4) Treatment Group 2 (T2− Trajectory): Send mobile promotion based on consumers’

trajectory information.

We sent mobile coupons by using short message service (SMS) texts. Note that to control for

the potential bias introduced by the stores and products, we randomized the participation among

252 stores from various categories including fashion, dining, supermarket, and so on. To control

for the potential bias introduced by the coupon type, we considered different coupon designs

with regard to both format and price discount, and randomized these coupon designs among the

four experimental groups. For example, for the same store, we randomized the level of price

discount (e.g., 10% off, 20% off, 30% off, or 50% off). For the same level of price discount, we

also randomized the coupon format (e.g., “price 50% off” vs. “buy one get one free”) to

minimize the potential bias the coupon format might introduce.5 Moreover, to make sure our

results were comparable across groups, we considered the same set of mobile promotions (in

terms of both format and price discount) in T0 as the ones used in T1 and T2. The only

difference is that the mobile promotions were sent randomly in T0, whereas they were tailored in

T1 and T2.

Note that to design real-time location-based mobile promotions (T1), we used a similar

approach as the one used in the previous studies (e.g., Spiekermann et al. (2011), Ghose et al.

(2013), Luo et al. 2014). In particular, we define “distance to a store” as the mobile user’s

physical distance from the center of the store. We sent the real-time location-based mobile

promotion to a consumer based on the store that had the shortest distance to the consumer at the

time of sending the coupon.

Finally, to control for any potential bias introduced by the timing of the coupon, we

randomized the timing of when the mobile coupon was sent. Note that for efficiency and

effectiveness of the recommendation, we conducted trajectory mining based on a large pool of

historic individual consumer trajectories collected by the shopping in the past one year. This

process allowed us to quickly identify trajectory similarity when a new customer walked into the

5 For robustness check, we also ran separate logit regressions using subsamples under different coupon designs. We found the results stay qualitatively consistent among different cases.

shopping mall. Moreover, to avoid “cold start,” we waited for a random time period (>=10 mins)

after the customer walked into the mall before sending mobile coupons. It is also important to

note that these m-coupons are tied to a specific mobile phone number and cannot be exchanged

between individuals. This alleviates concerns about any potential interference between units

from possible exchanges.

On each day, we randomly assigned approximately 6,000 unique consumers who visited the

shopping mall to one of the above four groups. To account for potential daily variation in a week,

we conducted the same experiment for 14 consecutive days over two weeks from June 9, 2014,

to June 22, 2014. Our experiment results are based on 83,370 unique user responses for a 14-day

period.6 For a better understanding of our data, we present the definitions and summary statistics

of all variables in Table 1.

6. Main Results

In this section, we discuss our experimental results based on different levels of econometric

analyses. We demonstrate first our results from group-level analyses on the mean treatment

effect. Then, we discuss our findings from individual-level analyses on the distribution of the

treatment effect. Finally, we discuss several robustness tests that we conduct to further account

for unobserved heterogeneity for better interpretation of our findings.

6.1 Group-level Analyses and Findings

First, we conduct group-level analyses. We compare daily group means (based on 14-day

average) on consumer coupon-redemption rate, time elapsed until redemption, money and time

6 Due to privacy policies of the mall, we could not identify repeat customers if they visit the shopping mall multiple times during the experimental period. In this study, we treated each individual trajectory as a unique customer. In reality the proportions of customers who visit the same mall more than once in the 2 week period is likely to be very small.

spent in store, total money and time spent in the mall To examine the statistical significance in

the difference in-group means, we conduct a one-way ANOVA test. The results from group-level

analyses are shown in Table 2a.

We find that on average, the new trajectory-based mobile advertising can lead to a

statistically significant increase in the coupon-redemption rate, and higher overall spending in

the shopping mall compared with all the baseline strategies. In particular, we find that on average,

the trajectory-based mobile advertising strategy leads to a 34.78% increase in the coupon-

redemption rate when compared with the static location-based advertising strategy, and a 93.75%

increase when compared with the random advertising strategy.

Interestingly, we also find that the new advertising strategy can lead to significantly lower

amount of time customers spend in the focal advertising store (9.82m vs. 13.24m/28.19m), but

higher revenues ($56.78 vs. $41.25/$23.50). This finding indicates that trajectory-based mobile

advertising can help make customers’ shopping experience more efficient. We also note that

based on group mean-level comparison, random advertising strategies on average perform the

worst. Such strategies can lead to lower customer satisfaction rates due to the potential annoying

effect from the improper ads.

Furthermore, to understand how the treatment effect may vary across different demographic

subgroups, we compare the mean treatment effects by breaking down the overall subject

population into different demographic subgroups. In particular, we are most interested in

comparing the coupon-redemption rates across different age and income subgroups. Again, we

conduct a one-way ANOVA test to examine the statistical significance in the difference in group

means. The results from different demographic subgroups are shown in Table 2b.

First, we find that on average, the younger age group (i.e., 20-30) is more responsive to

mobile advertising, whereas the older age group (i.e., 40-50+) is less responsive, regardless of

the type of mobile ads. Second, our results show that on average, customers with lower monthly

income (i.e., 2k-5k) are more active in responding to mobile promotions. However, they are not

sensitive to the type of mobile ads. This finding is reasonable because low-income customers are

often highly price sensitive; hence, any mobile ads that offer price promotions would attract

them. On the contrary, interestingly, customers with higher monthly income (i.e., 11k-50k) on

average are not as responsive to random ads (2%) or regular static location-based mobile ads

(9%). However, they are highly attracted by mobile trajectory-based ads (36%). Our findings

indicate the potential of trajectory-based mobile advertising in attracting high-end customers to

achieve better customer lifetime value. In particular, these high-income customers are usually the

“challenging type” for mobile advertising. They are likely to be extremely sensitive to the quality

of mobile targeting. They will not respond to a mobile ad just because it offers a lower price,

unless it is carefully designed and is a good fit for their personal preferences.

6.2 Individual-level Analyses and Findings

Beyond the group-level analyses, our unique data set acquired from the field experiment also

allows us to conduct individual-level analyses on the effect of mobile advertising on consumer

mobile coupon redemption and purchase behavior. In particular, we observe individual-level

consumer characteristics, mobile advertising response, and purchase behavior. Such data help us

further examine the distribution of the treatment effect of mobile advertising, as well as its

interaction with consumer heterogeneity.

6.2.1 Individual Mobile Coupon-Redemption Rate

First, we aim to examine the effect of different mobile advertising strategies (i.e., random,

static location-based, and trajectory-based) on the likelihood of consumer mobile coupon-

redemption behavior. To do so, we apply a Logit model at the individual consumer level and

model the consumer coupon-redemption rate as a function of consumer characteristics and

different mobile advertising strategies. To account for the potential variation in the mobile

advertising effects induced by the consumer heterogeneity, we include interaction effects

between consumer characteristics and different mobile advertising strategies.

More specifically, we model the utility for consumer i to redeem a mobile coupon as follows:

𝑈𝑖 = 𝑈�𝑖 + 𝜀𝑖

[3] where Xi is an individual-specific vector representing characteristics of consumer i (e.g., age,

gender, income level, credit card type, first-time visitor, shop alone, phone type.), Ti is an

individual-specific vector containing three binary indicators for three treatment groups (T0 −

Random, T1 − Location, T2 − Trajectory), and Di represents other individual-specific control

variables for consumer i (e.g., day of the week, time of day, coupon type, advertising store

category). 𝜀𝑖 is an individual stochastic error term that captures any randomness during

consumer i’s decision process. We assume the error term follows the type I extreme value

distribution.

Hence, the probability of consumer i redeeming a mobile coupon is

. [4] We provide the estimation results from the Logit model in Table 3. First, we find that on

average, trajectory-based mobile advertising outperforms all the baseline advertising strategies at

the individual consumer level. In particular, mobile trajectory-based ads show the highest

= 𝛼 + 𝛽𝑿𝑖 + 𝛾𝑻𝑖 + 𝜆𝑫𝑖 + 𝛿𝑻𝑖 × 𝑿𝑖 + 𝜑𝑻𝑖 × 𝑫𝑖 + 𝜀𝑖 , 𝜀𝑖~ i.i.d., EV(0,1),

𝑃𝑃𝑖(𝑅𝑅𝑅𝑅𝑅𝑅 = 1) =𝑅𝑒𝑒 (𝑈�𝑖)

1 + 𝑅𝑒𝑒 (𝑈�𝑖)

significant and positive effect on customer coupon-redemption rates (𝛾2=2.3381), compared to

the corresponding effects by location-based ads (𝛾1=1.0068) and by random ads (i.e., baseline).7

Second, we find significant differences in coupon-redemption behavior at different times. On

average, customers are more likely to redeem a mobile coupon during weekends than during

weekdays, and they are more likely to respond to a mobile coupon in the afternoon and evening

than in the morning.

Interestingly, our model with interaction effects demonstrates significant heterogeneity in the

treatment effect. In particular, we find that although on average, mobile trajectory-based ads

perform the best in increasing coupon-redemption responses, they become 85.78% less effective

during the weekends (i.e., interaction effect between T2 and Weekend is -2.0057, whereas the

average T2 effect is 2.3381). On the other hand, a random advertising strategy becomes much

more effective during the weekends and for first-time visitors. In particular, random advertising

outperforms all the advertising strategies during the weekends, showing a significant and

positive interaction effect (i.e., 1.8696) on coupon-redemption probability. Meanwhile, we also

find a similar trend for first-time visitors. These findings are intriguing. They seem to suggest

that weekend customers and first-time visitors tend to react to the random ads more frequently

compared to the targeted ads.

One potential reason is that customers who visit the shopping mall during the weekends tend

to be “explorers,” who often walk around the mall randomly and may not have a concrete

purchase plan beforehand. As a result, they may be more open to random ads, which can help

them with their explorative and variety-seeking habits in the shopping mall. Moreover, previous

marketing and psychology literature has shown that when customers are in the unplanned

7 Note that when studying the effect of mobile advertising on coupon redemption rate, we used the “random coupon” group (T1) as the baseline because participants in control group (C) did not receive any mobile coupon by experimental design.

purchasing stage, they are more likely to engage in impulse purchases (e.g., Stern 1962, Engel

and Blackwell 1982), that a positive relationship exists between variety-seeking behavior and

impulse buying (e.g., Sharma et al. 2010), and that a consumer’s propensity to purchase on

impulse receives a further impetus when she sees a random item on sale (Ramaswamy and

Namakumar, 2009). Therefore, being exposed to a random promotion can significantly increase

the likelihood of impulse purchases for customers who are in an unplanned shopping and

explorative stage.

On the other hand, targeted ads tend to directly lead the customers to the focal advertising

stores. As a result, they may constrict the scope and physical range of exploration, and reduce the

probability of impulse purchases, especially for customers who are in an unplanned shopping

stage. Previous literature has found that shorter in-store travel length has a negative effect on

consumers’ in-store impulse buying behavior (Hui 2013a, 2013b), and that historical-behavior-

based targeting may lead to less variety-seeking behavior from consumers (e.g., Fleder and

Honsanagar 2009). Our findings are in line with these previous studies indicating that marketers

need to carefully design the targeted campaign based on the shopping context and mental stage

of customers.

To further validate our findings, we examine the consumer spending and the length of

consumer in-mall travel distance in addition to the mobile-coupon-redemption rate. We find very

consistent evidence. We will discuss these results in the following subsections.

6.2.2 Individual Spending in Focal Advertising Store

In addition to the mobile ads’ effect on the individual-level coupon-redemption rate, we

conduct analyses on individual consumer spending in the focal advertising store. We model

consumer i’s spending in the focal advertising store, 𝑆𝑖𝑠𝑡𝑜𝑟𝑒, as the following:

[5]

where 𝜀𝑖𝑠𝑡𝑜𝑟𝑒~ i.i.d., N(0,1). We provide the estimation results from this model in Column 2 in

Tables 4.

On average, we find that trajectory-based mobile advertising outperforms all the baseline

advertising strategies in individual consumer spending in the focal advertising store, with 2.81

times the effect of static location-based ads (13.0159 vs. 4.6343) and 5.76 times the effect of

random ads (13.0159 vs. 2.2608). Similarly, we find significant time-level and day-level

heterogeneity. On average, customers tend to spend more in the focal advertising store during

weekends and in the afternoon and evening. Meanwhile, we also find that on average, female

customers tend to spend significantly more money male customers at the focal store. In addition,

age has a diminishing negative effect on focal-store spending.

On the other hand, interestingly, we did not find significant interaction effects between

various mobile advertising strategies and different consumer characteristics. This finding seems

to indicate the absence of any significant heterogeneity in the direct effect of mobile ads on

individual consumer spending at the focal advertising store. In other words, the focal advertising

store always benefits from well-designed mobile ads. Trajectory-based advertising leads to the

highest increase in focal-store spending, followed by static location-based advertising and then

random advertising.

6.2.3 Individual Total Spending in Shopping Mall

Meanwhile, we conduct individual analysis on consumer total spending in the entire

shopping mall. In particular, we model the overall spending of consumer i in the mall, 𝑆𝑖𝑚𝑎𝑙𝑙, as

the following:

𝑆𝑖𝑠𝑡𝑜𝑟𝑒 = 𝛼𝑠𝑡𝑜𝑟𝑒 + 𝛽𝑠𝑡𝑜𝑟𝑒𝑿𝑖 + 𝛾𝑠𝑡𝑜𝑟𝑒𝑻𝑖 + 𝜆𝑠𝑡𝑜𝑟𝑒𝑫𝑖 + 𝛿𝑠𝑡𝑜𝑟𝑒𝑻𝑖 × 𝑿𝑖 + 𝜑𝑠𝑡𝑜𝑟𝑒𝑻𝑖 × 𝑫𝑖 + 𝜀𝑖𝑠𝑡𝑜𝑟𝑒 ,

𝑆𝑖𝑚𝑎𝑙𝑙 = 𝛼𝑚𝑎𝑙𝑙 + 𝛽𝑚𝑎𝑙𝑙𝑿𝑖 + 𝛾𝑚𝑎𝑙𝑙𝑻𝑖 + 𝜆𝑚𝑎𝑙𝑙𝑫𝑖 + 𝛿𝑚𝑎𝑙𝑙𝑻𝑖 × 𝑿𝑖 + 𝜑𝑚𝑎𝑙𝑙𝑻𝑖 × 𝑫𝑖 + 𝜀𝑖𝑚𝑎𝑙𝑙 ,

[6] where 𝜀𝑖𝑚𝑎𝑙𝑙~ i.i.d., N(0,1). The corresponding results are shown in Column 3 in Tables 4.

Again, we find consistent evidence on the average treatment effects of mobile ads. On

average, trajectory-based mobile advertising is the most effective in increasing consumer total

spending in the shopping mall (𝛾2= 2.0003), compared to the corresponding effects by static

location-based ads (𝛾1= 1.4912) and by random ads (𝛾0= -0.7527). Meanwhile, we find

consistent and significant positive effects from weekend customers and female customers. We

also find that age has a diminishing negative effect on consumer overall spending in the mall.

Interestingly, we find significant interaction effects between various mobile advertising

strategies and different consumer characteristics on individual total spending at the entire mall

level. This finding suggests significant heterogeneity in the indirect effect of mobile ads on

individual consumer spending in the mall. In particular, we find that male customers are more

sensitive to well-designed mobile ads. This result is in line with previous findings on why people

make purchases (Underhill 1999), where the author finds that men prefer better guidance while

shopping. Our result suggests that well-designed mobile advertising can help provide better

shopping guidance. Moreover, we find that high-income customers are more sensitive to well-

designed mobile ads. Trajectory-based ads become significantly more effective for such

customers than for others. This result is consistent with our previous finding from the group-

level analyses. It demonstrates the great potential of trajectory-based mobile advertising in

attracting high-end customers in achieving better customer life-time value.

Moreover, we find consistent evidence that trajectory-based mobile advertising becomes the

least effective during the weekend, followed by static location-based advertising. In particular,

trajectory-based ads are 49.6% less effective in attracting individual consumer total spending in

the mall, and static location-based ads become 16.8% less effective. By contrast, random

advertising becomes the most effective in attracting individual total spending during the weekend,

with its effect more than doubled (i.e., interaction effect between T0 and Weekend is 0.8995,

whereas the mean T0 effect is 0.7527). These results are highly consistent with our previous

finding in the case of coupon-redemption rates, indicating weekend customers are much more

attracted to random ads than to trajectory-based or static location-based ads. The potential reason

is intuitive and in line with what we discussed previously. Weekend customers can be in the

exploration stage without a clear purchase plan, and hence they are likely to make impulse

purchases. However, targeted mobile ads may constrict the scope of exploration and reduce the

probability of impulse purchases, because they tend to directly lead the customers to the focal

advertising stores. By contrast, random ads are likely to increase the mobility range of customers

and facilitate their exploration in the mall. Consequently, random ads may lead to more impulse

spending by the customers. Our analysis in the next subsection further supports this finding by

looking into the total in-mall travel distance of customers.

6.2.4 In-Mall Total Travel Distance

To further validate our explanation, we look into the in-mall total travel distance for each

consumer during the weekend. The in-mall travel distance can largely indicate a customer’s

mobility range. We compare the average of this individual total travel distance across all four

experimental groups.

The results are shown in Table 5. We find that on average, an individual travels

approximately 1 kilometer in the shopping mall (C). Trajectory-based mobile advertising (T2)

results in the shortest average individual travel distance of 427.2 meters in the mall. Static

location-based advertising (T1) results in an average individual travel distance of 756.4 meters.

By contrast, random advertising (T0) results in the longest individual travel distance of 1304.1

meters, which is more than three times the length of that of the trajectory-based ads group.

This result provides further evidence to support our previous findings. It indicates that

trajectory-based advertising indeed can significantly reduce the mobility range of customers

during the weekend, whereas random advertising on average can increase the total travel distance

of an individual customer in the shopping mall. Interestingly, our findings demonstrate high

consistency with the previous marketing research on retailing and in-store consumer behavior.

Previous studies combining archival data analysis using an RFID (Radio-Frequency

Identification) in-store tracking system with a randomized field experiment has found that longer

customer in-store travel distance can lead to a higher probability of unplanned purchases (Hui et

al. 2013a, 2013b). Our results dovetail with the previous literature. Our findings indicate

significant heterogeneity in the mobile advertising effect depending on the shopping context.

Targeted mobile trajectory-based ads may not always work the best. They may reduce the

amount of impulse purchases from customers, especially during the weekends. Therefore,

businesses must understand the heterogeneity in the effect of different mobile ads. Marketers

should carefully take into account the business scope (e.g., revenues for a focal store vs. entire

shopping mall) when designing mobile advertising strategies.

6.3 Robustness Tests

To verify the robustness of our results, we further conduct different robustness tests. We

provide the detailed discussions below.

6.3.1 Robustness Test I: Store-level Heterogeneity

Moreover, different stores may provide different products, which may vary significantly in

category, price, quality, brand affinity, and so on. In our experimental setting, we observe

different stores, such as restaurants, cosmetics, supermarkets, and so forth. Randomization of

store participation across different control and treatment groups in our experiment can alleviate

such concerns to some extent, but potential store-level unobservables may still exist. For

example, trajectory-based mobile ads could be more effective than random ads for cosmetics

stores, though not necessarily for restaurants or supermarkets. To better control for the potential

store-level unobservables, we conduct panel data analysis to examine daily revenues using a

store-level fixed-effects model.

In particular, we model the overall revenues for store j on day t as a function of the number

of different mobile promotions sent by the store (𝑁𝑗𝑡𝑇1, 𝑁𝑗𝑡𝑇2, 𝑁𝑗𝑡𝐶2 ), a weekend dummy, interaction

effects, and a store-level fixed effect (𝜁𝑗):

𝑅𝑗𝑡 = 𝜃0 + 𝜃1𝑁𝑗𝑡𝑇1 + 𝜃2𝑁𝑗𝑡

𝑇2 + 𝜃3𝑁𝑗𝑡𝐶2 + 𝜃4𝑊𝑅𝑅𝑘𝑅𝑛𝑅𝑗𝑡 + 𝜃5𝑁𝑗𝑡

𝑇1 × 𝑊𝑅𝑅𝑘𝑅𝑛𝑅𝑗𝑡 + 𝜃6𝑁𝑗𝑡𝑇2 ×

𝑊𝑅𝑅𝑘𝑅𝑛𝑅𝑗𝑡 + 𝜃7𝑁𝑗𝑡𝐶2 × 𝑊𝑅𝑅𝑘𝑅𝑛𝑅𝑗𝑡 + 𝜁𝑗 + 𝜀𝑗𝑡. [7]

The estimation results from the store-level fixed-effects model are highly consistent with our

previous findings from both group-level and individual-level analyses. The results from the

store-level fixed-effects model are shown in Column 2 in Table 6. For a robustness test, we also

consider a store-level random-effects model. We found the results from both fixed-effects and

random-effects models stay very consistent. The results from the store-level random-effects

model are shown in Column 3 in Table 6.

In particular, our analyses show that on average, trajectory-based mobile advertising has the

highest positive effect on a store’s daily revenues after accounting for the potential store-specific

unobserved factors. Meanwhile, store revenues are significantly higher during the weekends than

during the weekdays. Interestingly, we find that although trajectory-based advertising becomes

less effective during the weekends compared to the weekdays, it always outperforms other

advertising strategies at store level. This result also supports our previous finding that the focal

advertising store always benefits from well-designed mobile ads.

6.3.2 Robustness Test II: Day-level Heterogeneity

To further test the robustness of our results, we conduct additional day-level analyses by

separately conducting the group-level and individual-level analyses for each day. Our day-level

separate analyses show high consistency with the pooled 14-day results. In particular, we find

that on average, mobile trajectory-based ads outperform all strategies across all days. Meanwhile,

we observe a significant decrease in the effect of the trajectory-based ads during the weekends,

compared to the corresponding effect from the weekdays. By contrast, we observe a significant

increase in the effect of the random ads during the weekends.

6.4 Summary of Findings

All levels of analyses demonstrate consistent findings. In particular, our main results can be

summarized as the following. First, we find that trajectory-based mobile advertising can

significantly increase the likelihood of a consumer redeeming a mobile promotion at the focal

advertising store. On average, the coupon-redemption rate is 34.78% higher when compared to

mobile advertising based on real-time location information, and it is 93.75% higher when

compared to mobile advertising based on purely random recommendations. Second, the

trajectory-based mobile advertising leads to the fastest redemption behavior from customers. On

average, the time elapsed between receiving the mobile promotion and redeeming it under

trajectory-based mobile advertising is approximately one-third the time elapsed under the static

location-based mobile advertising (4.55 vs. 12.83 minutes), and one-fourth the time under the

purely random-based mobile advertising (4.55 vs. 16.43 minutes). Third, our follow-up user

survey shows that trajectory-based mobile advertising leads to the highest overall satisfaction

rates and the highest future-willingness-to-redeem rates from the customers. Fourth, trajectory-

based mobile advertising has a significant and positive direct effect on the revenues of the focal

advertising store. It also increases the overall shopping mall revenues during weekdays. However,

it has a significant and negative indirect effect on the overall revenues of the shopping mall

during the weekend. This finding suggests customers are likely to be in an exploratory shopping

stage during the weekend and are likely to make impulse purchases. However, targeted mobile

trajectory-based ads tend to constrict consumer focus and significantly reduce the impulse

purchase behavior. This finding suggests businesses and marketers need to be careful when

implementing mobile advertising strategies, depending on different business scopes. Finally, we

also find that trajectory-based mobile advertising is especially effective in attracting high-income

consumers, suggesting the high potential of mobile advertising in converting customers with a

higher expected lifetime value.

7. Conclusion and Future Work

The proliferation of mobile and sensor technologies make it possible to leap beyond the real-

time snapshot of the static location and context information about consumers. In this study, we

propose a novel mobile advertising strategy that infers consumer preferences by leveraging full

information on consumers’ offline moving trajectories from four different mobility dimensions.

To measure the effectiveness of this new kind of mobile advertising, we conduct a large-scale

randomized field experiment in a major shopping mall in Asia based on 83,370 unique user

responses for a 14-day period.

We find that by extracting and incorporating the overall offline behavioral trajectory of each

individual consumer, we are able to significantly improve the performance of mobile advertising.

In particular, our results show that on average, trajectory-based mobile advertising can help

businesses achieve higher coupon-response rate and higher revenues compared to the existing

baseline location-based advertising strategies.

Meanwhile, our study also reveals significant heterogeneity in the mobile advertising effect.

Targeted mobile trajectory-based ads may not always perform the best. They may reduce the

amount of impulse purchases from customers, especially during the weekends. Therefore,

businesses would be better off reflecting on the heterogeneity in the effect of different kinds of

mobile ads on different days of the week.

On a broader note, to our knowledge, our paper is the first to analyze the digital trace of

individuals’ offline shopping behavior and how it can be linked to better understand individual

preferences and decision-making. Our work can be viewed as a first step to study the digitization

of offline behavior at a large-scale and granular level. We demonstrate the value of leveraging

mobile and sensor technologies to digitize, measure, understand, and predict individual

behavioral trajectory in the physical environment to improve user digital experiences and

business marketing strategies.

Our paper has some limitations, which can serve as fruitful areas for future research. First,

because of technical limitations of our GPS tracking system, we could recruit only customers

who were interested in accessing Wi-Fi, which resulted in approximately 80% of the customers

in the shopping mall.8 However, this number could potentially improve in the future with a

tracking system based on more advanced sensor technologies (e.g., wearable devices). Second, in

the current analysis, we were able to control for various observed individual characteristics, such

as age, income, gender, and so on. However, individual-level unobserved heterogeneity might

still exist. In the future, we can incorporate random coefficient models to better account for such

8 We obtained this percentage based on the shopping mall’s statistics of its historical customer visits and Wi-Fi usage at the daily average levels.

individual-specific unobservables. Third, currently our recommendations are based on similarity

between customers. In the future, we could potentially experiment with alternative

recommendation strategies, for example, recommendation based on dissimilarity between

customers. Fourth, due to privacy policy of the shopping mall, we could not identify repeated

customers if they visit the shopping mall multiple times during the 14-day experimental period.

In this study, we treated each individual trajectory as a unique customer. In the future, it would

be interesting if we can identify return customers or the same customers who visit different

shopping malls, to study the individual long-term learning behavior facilitated by the mobile

advertising interventions.

References

• Adomavicius, G., Mobasher, B., Ricci, F., Tuzhilin, A. 2011. Context-Aware Recommender Systems. AI Magazine. 35(4), 217-253.

• Andrews, M., Luo, X., Fang, Z., and Ghose, A. 2015. Mobile Ad Effectiveness: Hyper-Contextual Targeting with Crowdedness forthcoming, Marketing Science. Forthcoming.

• Bart, Y, Stephen, A, and Sarvary, M. 2014. Which Products are Best Suited to Mobile Advertising? A Field Study of Mobile Display Advertising Effects on Consumer Attitudes and Intentions. Journal of Marketing Research. 51(3), 270-285.

• Brohee, S. and Jacques, V. H. 2006. Evaluation of Clustering Algorithms for Protein Interaction Networks. BMC Bioinformatics, 7 (1), 488.

• Cuturi, M. 2011. Fast Global Alignment Kernels. Proceedings of the 2011 International Conference on Machine Learning. 929-936.

• Danaher, P., Smith, M., Ranasinghe, K., and Danaher, T. 2015. Where, When and How Long: Factors that Influence the Redemption of Mobile Phone Coupons. Journal of Marketing Research. Forthcoming.

• Eisen, M. B., Spellman, P. T., Brown, P. O., and Botstein, D. 1998. Cluster Analysis and Display of Genome-wide Expression Patterns. Proceedings of the National Academy of Sciences, 95 (25), 14863–14868.

• Ericsson. 2014. Ericsson Mobility Report. Nov. 2014. • eMarketer. 2014. Worldwide Mobile Phone Users: H1 2014 Forecast and Comparative

Estimates. http://www.emarketer.com/Article/Smartphone-Users-Worldwide-Will-Total-175-Billion-2014/1010536

• Engel, J. and Blackwell, R. 1982. Consumer Behavior. Chicago: Dryden Press.

http://www.emarketer.com/Article/Smartphone-Users-Worldwide-Will-Total-175-Billion-2014/1010536

http://www.emarketer.com/Article/Smartphone-Users-Worldwide-Will-Total-175-Billion-2014/1010536

• Esslimani, I., Brun, A., and Boyer, A. 2009. From Social Networks to Behavioral Networks in Recommender Systems. Proceedings of The 2009 International Conference on Advances in Social Network Analysis and Mining. 143–148.

• Fleder, D. and Hosanagar, K. 2009. Blockbuster Culture's Next Rise or Fall: The Impact of Recommender Systems on Sales Diversity. Management Science, 55(5), 697-712.

• Fong, N. M., Fang, Z., and Luo, X. 2015. Real-Time Mobile Geo-Conquesting Promotions. Journal of Marketing Research. Forthcoming.

• Gaffney, S. and Smyth, P. 1999. Trajectory Clustering with Mixtures of Regression Models. Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 63-72.

• Ghose, A., Ipeirotis, P. and Li, B. 2012. Designing Ranking Systems for Hotels on Travel Search Engines by Mining User-Generated and Crowdsourced Content. Marketing Science. 31(3), 493-520.

• Ghose, A., Ipeirotis, P. and Li, B. 2014. Examining the Impact of Ranking on Consumer Behavior and Search Engine Revenue. Management Science. 60(7).

• Ghose, A., Han, S. and Park, S. 2013a. Analyzing the Interdependence Between Web and Mobile Advertising: A Randomized Field Experiment. Working paper, NYU.

• Ghose, A., Goldfarb, A., and Han, S. 2013b. How is the Mobile Internet Different? Search Costs and Local Activities, Information Systems Research, 24(3), 613-631.

• Guo, X., Chan, Eddie C. L., Liu, C., Wu, K., Liu, S., and Ni, M. L. 2014. ShopProfiler: Profiling Shops with Crowdsourcing Data. Proceedings of the 33rd Annual IEEE International Conference on Computer Communications. 1240 – 1248.

• Gu, B. 2012. Quantifying the Dynamic Sales Impact of Location-based Mobile Advertising Technologies. Working Paper, Arizona State University.

• Huang, J., Sun, H., Han, J., Deng, H., Sun, Y., and Liu, Y. 2010. Shrink: a Structural Clustering Algorithm for Detecting Hierarchical Communities in Networks. Proceedings of the 19th ACM International Conference on Information and Knowledge Management. 219-228.

• Hui, S., Inman, J., Huang, Y., and Suher, J. 2013a. Estimating the Effect of Travel Distance on Unplanned Spending: Applications to Mobile Promotion Strategies. Journal of Marketing, 77 (March), 1-16.

• Hui, S., Huang, Y., Suher, J. and Inman, J. 2013b. Deconstructing the ‘First Moment of Truth’: Understanding Unplanned Consideration and Purchase Conversion Using In-Store Video Tracking. Journal of Marketing Research, 50 (4), 445-462.

• Lee, J.-G., Han, J., and Whang, K.-Y. 2007. Trajectory Clustering: a Partition-and-Group Framework. Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. 593-604.

• Leonard, H. 2013. There Will Soon Be One Smartphone for Every Five People in the World. http://www.businessinsider.com/15-billion-smartphones-in-the-world-22013-2.

• Leskovec, J., Lang, K. J., and Mahoney, M. W. 2010. Empirical comparison of algorithms for network community detection. Proceedings of the 19th International Conference on World Wide Web. 631-640.

• Liu, S, Wang, S., Jayarajah, K., Misra, A., and Krishnan, R. 2013. TODMIS: Mining Communities from Trajectories. Proceedings of the 22nd ACM International Conference on Information & Knowledge Management. 2109-2118.

http://www.businessinsider.com/15-billion-smartphones-in-the-world-22013-2

• Liu, Y., Zhao, Y., Chen, L., Pei, J., and Han, J. 2012. Mining Frequent Trajectory Patterns for Activity Monitoring Using Radio Frequency Tag Arrays. IEEE Transactions on Parallel and Distributed Systems, 23 (11), 2138–2149.

• Luo, X., Andrews, M., Fang, Z., and Phang, Z. 2014. Mobile Targeting. Management Science. 60 (7), July, 1738-1756.

• MacQueen, J. 1967. Some Methods for Classification and Analysis of Multivariate Observations. Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, 1 (14), 281- 297.

• Manning, C. D., Raghavan, P., Schutze, H. 2008. Scoring, Term Weighting, and the Vector Space Model. Introduction to Information Retrieval. 100.

• Molitor, D., Reichhart, P., Spann, M., and Ghose, A. 2014. Measuring the Effectiveness of Location-Based Advertising: A Randomized Field Experiment. Working paper, NYU.

• Nanni, M. and Pedreschi, D. 2006. Time-focused Clustering of Trajectories of Moving Objects. Journal of Intelligent Information Systems. 27(3), 267 – 289.

• Newman, M. 2004. Detecting Community Structure in Networks. The European Physical Journal B. 28(2), 321-330.

• Ng, R. T. and Han, J. 2002. Clarans: A Method for Clustering Objects for Spatial Data Mining. IEEE Transactions on Knowledge and Data Engineering. 14(5), 1003 – 1016.

• Nowak, M. and Nass, C. 2012. Effects of Behavior Monitoring and Perceived System Benefit in Online Recommender Systems. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2243–2246.

• Pelekis, N., Kopanakis, I., Kotsifakos, E. E., Frentzos, E. and Theodoridis, Y. 2011. Clustering Uncertain Trajectories. Knowledge and Information Systems. 28(1), 117-147.

• Ramaswamy, V. S. and Namakumari, S. 2009. Marketing Management (4th Edition.). New Delhi. McMillan.

• Sahni, N. and Nair, H. (2015). Native advertising: Evidence from mobile advertising field experiment. Presented at Marketing Science (Baltimore).

• Venu, S., Parthasarathy, S., and Ucar, D. 2010. Markov Clustering of Protein Interaction Networks with Improved Balance and Scalability. Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology. 247-256.

• Venkatesh,S., and Balasubramanian, S.. 2009. Mobile Marketing: A Synthesis and Prognosis. Journal of Interactive Marketing, 23 (2), 118-129.

• Venkatesh, S., Venkatesh, A., Hofacker, C. and Naik, P. 2010. Mobile Marketing in the Retailing Environment: Current Insights and Future Research Avenues. Journal of Interactive Marketing, 24 (2), 111-120.

• Sharma, P., Sivakumaran, B. and Marshall, R. 2010. Impulse Buying and Variety Seeking: A Trait-correlates Perspective. Journal of Business Research, 63, 276-83.

• Shi, C, Yu, P. S., Cai, Y., Yan, Z., and Wu, B. 2011. On Selection of Objective Functions in Multi-objective Community Detection. Proceedings of the 20th ACM International Conference on Information and Knowledge Management. 2301-2304.

• Spiekermann, S., Rothensee, M., Klafft, M. 2011. Street Marketing: How Proximity and Context Drive Coupon Redemption. Journal of Consumer Marketing. 28(4), 280–289.

• Stern, H. 1962. The Significance of Impulse Buying Today. Journal of Marketing. April, 59-62.

http://nlp.stanford.edu/IR-book/pdf/06vect.pdf

• Paco, U. 1999. Why We Buy: The Science of Shopping. Random House Audio. • Stijn, V. D., Cei, A.. 2012. Using MCL to Extract Clusters from Networks. Bacterial

Molecular Networks. 281-295. • Xiang, T. and Gong, S. 2006. Beyond Tracking: Modelling Activity and Understanding

Behaviour. International Journal of Computer Vision. 67(1), 21-51.

Table 1. Definitions and Summary Statistics of Variables

Variable Definition Mean Std. Dev. Min Max C Control Group, Do nothing 0.2472 0.4436 0 1 T0 Treatment Group 0, Random promotions 0.2434 0.4291 0 1 T1 Treatment Group 1, Location-based ads 0.2517 0.4546 0 1 T2 Treatment Group 2, Trajectory-based ads 0.2577 0.4656 0 1 Sunday Whether the visit was on Sunday 0.1416 0.3681 0 1 Monday Whether the visit was on Monday 0.1406 0.3413 0 1 Tuesday Whether the visit was on Tuesday 0.1433 0.3399 0 1 Wednesday Whether the visit was on Wednesday 0.1418 0.3405 0 1 Thursday Whether the visit was on Thursday 0.1418 0.3404 0 1 Friday Whether the visit was on Friday 0.1431 0.3397 0 1 Saturday Whether the visit was on Saturday 0.1477 0.3754 0 1 Morning Whether the visit was in the morning 0.3217 0.4013 0 1 Afternoon Whether the visit was in the afternoon 0.3407 0.4900 0 1 Evening Whether the visit was in the evening 0.3376 0.4762 0 1 TimeElapse Time elapse between receiving a coupon and

redeeming it 11.0285 9.3735 0 36

Male Whether the customer is male customer 0.3475 0.4762 0 1 Age Age of the customer 37.9407 10.8149 18 64 Income Monthly Income (1000RMB) 16.9538 8.0364 3 33 FirstTimeVisit Whether the customer is first-time visitor 0.0227 0.1488 0 1 Credit Type Indicator: Gold, Platinum, Gift Card, or Others -- -- -- -- Phone Type Indicator: iPhone, Android, or Others -- -- -- -- Shopping Alone Whether the customer is shopping alone 0.2612 0.4395 0 1 Store Category Indicator: Fashion, Beauty, Home, Kids, Grocery,

Restaurant, Electronics, or Others -- -- -- --

Coupon Type Indicator for different coupon designs -- -- -- -- Redeem Whether the customer redeemed the coupon 0.2484 0.4973 0 1

FutureRedeem Whether the customer is willing to redeem the coupon in the future

0.3451 0.4754 0 1

TimeSpentStore Total time spent in the focal advertising store (min) 16.2987 19.0283 0 51 TimeSpentMall Total time spent in the mall (min) 60.7137 36.2377 9 273 Satisfaction Satisfaction rate of customer 2.8813 1.7551 0 5 𝑆𝑖𝑠𝑡𝑜𝑃𝑅 Customer i’s total spending in the focal

advertising store (RMB) 45.9112 98.7585 0 5028

𝑆𝑖𝑅𝑎𝑙𝑙 Customer i’s total spending in the mall (RMB) 140.8533 1812.76

0 15742 𝑅𝑗𝑡 Store j’s total revenue on day t (RMB) 4522.746

30012.1

0 63500

# of Users in Control Group (Daily Average) 1472 # of Users in Treatment Group 0 (Daily Average) 1449 # of Users in Treatment Group 1 (Daily Average) 1499 # of Users in Treatment Group 2 (Daily Average) 1535 Time Period: 6/9/2014-6/22/2014 (14 days)

Total # of Observations: 83,370

Table 2a. Group-Level Comparisons (Daily Mean, 14-Day Period)

Group Redeem Rate

Future Redeem

Rate

Money Spent

in Focal Store

($)

Total Money

Spend in Mall ($)

Time Elapse

Until Redee

m

Satisfaction Rate

Time Spent

in Focal Store (min)

Total Time Spent

in Mall (min)

C -Control -- -- -- 84.98 -- 2.6 -- 46.75 T0-Random 16% 21% 23.50 88.19 16.43 2.1 28.19 56.72 T1-Location 23% 34% 41.25 166.87 12.83 3.2 13.24 63.85 T2-Trajectory 31% 56% 56.78 193.06 4.55 4.3 9.82 71.98

P<0.05 (ANOVA)

Table 2b. Demographic Subgroup-Level Redemption Rate Comparisons

(Daily Mean, 14-Day Period)

Group /Redeem Rate

Age (20-30)

Age (30-40)

Age (40-50+)

Income9 (2k-5k)

Income (6k-10k)

Income (11k-50k)

C -Control -- -- -- -- -- -- T0-Random 25% 17% 6% 34% 15% 2% T1-Location 24% 20% 10% 33% 23% 9% T2-Trajectory 21% 23% 15% 32% 27% 36%

P<0.05 (ANOVA)

Table 5. Total In-Mall Travel Distance during the Weekends (Daily Mean)

Group Total In-Mall Distance (Meters) C -Control 1002.8

T0-Random 1304.1 T1-Location 756.4

T2-Trajectory 427.2 P<0.05 (ANOVA)

9 Income is measured in RMB at monthly level.

Table 3. Logit Model Estimation Results on Consumer Redemption Probability

Variables Coefficient (Std. Err.) T0-Random −− T1-Location 1.0068 (.2851) ***

T2-Trajectory 2.3381 (.2747) *** Weekend 0.3612 (.0129) *** Afternoon 0.4326 (.0165) *** Evening 0.3010 (.1069) **

Male -0.0027 (.0264) Age -1.1987 (1.0025)

Age^2 0.3968 (2.4472) Income 0.5758 (.4521)

Income^2 -0.1882 (.2168) FirstTimeVisitor Yes

Credit Type Yes Phone Type Yes

Shopping Context Yes Advertising Store Category Yes

Coupon Type Yes T0-Random × Weekend 1.8696 (.0305) *** T1-Location × Weekend -0.0091 (.0259)

T2-Trajectory × Weekend -2.0057 (.0286) *** T0-Random × Male 0.0113 (.0301) T1-Location × Male -0.0237 (.0254)

T2-Trajectory × Male 0.0092 (.0357) T0-Random × Income -0.0356 (.0284) T1-Location × Income 0.0026 (.0242)

T2-Trajectory × Income 0.0170 (.4693) T0-Random × FirstTimeVisit 1.1648 (.0940) * T1-Location × FirstTimeVisit -0.0702 (.0841)

T2-Trajectory × FirstTimeVisit -0.0168 (.0894) *** P<0.001, ** P<0.05, * P<0.1 Total # of Observations: 83,370

Table 4. Estimation Results on Consumer Spending in Focal Store (I) and in the Mall (II)

Variables Coefficient I Coefficient II T0-Random 2.2608 (1.0236) ** 0.7527 (.0378) *** T1-Location 4.6343 (1.2117) *** 1.4912 (.3774) ***

T2-Trajectory 13.0159 (1.2169) *** 2.0003 (.3628) *** Weekend 1.1870 (.7896) * 0.6723 (.0486) *** Afternoon 1.2018 (.5101) ** 1.0312 (.4238) ** Evening 1.7918 (.6357) ** 1.0033 (.5044) *

Male -0.7663 (.3775) * -0.1138 (.0484) ** Age -0.4680 (.2659) * -0.0238 (.0074) ***

Age^2 0.0063 (.0034) * 0.0045 (.0008) *** Income -0.1749 (.2637) -1.0361 (.6691)

Income^2 0.0791 (.1209) 0.4640 (.3020) FirstTimeVisitor Yes Yes

Credit Type Yes Yes Phone Type Yes Yes

Shopping Context Yes Yes Advertising Store Category Yes Yes

Coupon Type Yes Yes T0-Random × Weekend 0.0135 (.0132) 0.8995 (.0615) *** T1-Location × Weekend -0.0012 (.0128) -0.2502 (.0595) ***

T2-Trajectory × Weekend 0.0137 (.0126) -0.9923 (.0587) *** T0-Random × Male 0.0076 (.0131) 0.0617 (.0109) ** T1-Location × Male -0.0035 (.0127) 0.0735 (.0590)

T2-Trajectory × Male -0.0130 (.0125) 0.1138 (.0582) ** T0-Random × Income 0.0014 (.0122) 0.0933 (.0566) * T1-Location × Income 0.0079 (.0125) 0.1442 (.0578) **

T2-Trajectory × Income -0.0058 (.0124) 0.2334 (.0573) *** T0-Random × FirstTimeVisit -0.0085 (.0401) 0.1941 (.1864) T1-Location × FirstTimeVisit -0.0272 (.0399) -0.0870 (.1855)

T2-Trajectory × FirstTimeVisit -0.0249 (.0382) 0.0828 (.1774) *** P<0.001, ** P<0.05, * P<0.1 Total # of Observations: 83,370

Table 6. Estimation Results from Store Fixed/Random Effect Models on Store Daily Revenues

Variables Coefficient Fixed-Effect Coefficient Random-Effect # of Random Coupons -0.1566 (.0092) *** -0.1596 (.0091) ***

# of Location-based Coupons 0.0140 (.0081) * 0.0147 (.0080) * # of Trajectory-based Coupons 0.1562 (.0059) *** 0.1566 (.0059) ***

Weekend 0.9149 (.0735) *** 0.9177 (.0735) *** #Random × Weekend 0.0287 (.0145) ** 0.0288 (.0145) ** #Location × Weekend -0.0060 (.0029) * -0.0057 (.0029) *

#Trajectory × Weekend -0.0346 (.0096) *** -0.0343 (.0095) *** *** P<0.001, ** P<0.05, * P<0.1

Total # of Observations: 14,112 (252 Stores * 14 Days * 4 Groups)

Figure 1. Example of Graph-Based Trajectory Clustering

Figure 1 illustrates an example of the output from our graph-based clustering method. We first construct an undirected probabilistic graph of individual trajectories, where each node in the graph represents a consumer’s trajectory, and the weight on each edge between two nodes represents the pairwise similarity between two consumers. Therefore, if two consumers are very similar to each other in their trajectory patterns, the weight on the edge between the two corresponding consumer nodes would be very high. Our goal is to detect a set of highly connected sub-graphs (cliques) from the graph where the weight on the edge between each pair of two nodes in the sub-graphs is relatively high (i.e., dense sub-graph). The basic intuition of the MCL algorithm is based on the idea of random walk. The probability of visiting a connected node is proportional to the weight on the edge. In other words, the random walk will stabilize inside the dense regions of the network after many steps. The stabilized regions shape the clustered sub-graph and reflect the intrinsic structure of the network. The sub-graphs (as highlighted in different colors in Figure 1) hence represent the identified clusters of similar consumers.

Figure 2. Example of Mobile Trajectories of Consumers in a Large Shopping Mall

Figure 2 visualizes an example of movement trajectories of individual customers traveling upstairs and downstairs in the shopping mall. Each line in the figure represents a trajectory from an individual consumer.

Online Appendix A: Profiling Store Based on Crowdsourcing

To better contextualize the mobile trajectories and to capture detailed store-profile

information in the shopping mall, we applied a similar approach as in Guo et al. (2013). In

particular, we deployed ShopProfiler, a shop-profiling system on crowdsourcing data.

ShopProfiler only relies on sensor readings from mobile devices. The process of data collection

is automatic and runs in the background. Inertial sensor readings reflect human movement

information such as acceleration, heading, and speed. Microphone and WiFi modules provide

additional information of surrounding conditions. From the customer’s point of view, mobile-

sensing data in a mall contain information about what shops that customer visits, how long she

stays, and how fast she is walking. From a shop’s point of view, mobile-sensing data reveal

information about the inside layout of the shops and how many people visit the shops in a

particular time period. Through mining and learning mobile-sensing data, we capture unique

features of different shops and categorize them into different types (e.g., restaurant, books,

supermarket, electronics, fashion, etc.). Furthermore, a complete profile of shops contains

specific location and precise brand names. We use text mining to extract the brand-name

information from the Service Set Identifier (SSID) from each store.

More specifically, ShopProfiler has four modules: (1) Collecting sensor data: Sensor data are

collected from inertial sensors, microphones, and a WiFi module while people walk around in

shopping malls. Inertial sensors including accelerometer and gyroscope provide sufficient

information for tracking customers. In addition, acoustic information reflects environmental

features. Moreover, a WiFi module captures data from all available access points (APs). (2)

Capturing movement pattern: Customer traces in the mall reveal features of what kind of shops

the customers visit, how long they stay, and how fast they are walking. On the other hand, from a

shop’s point of views, traces show the layout inside shops and how many people visit the shops

in a particular time period. (3) Differentiating basic units: We first differentiate between shop

and corridor, which are the two elemental units in typical malls. ShopProfiler extracts unique

features from movement patterns to differentiate between these basic units. Then, we propose a

novel gradient-based room-boundary-detection algorithm that reduces the false positive rate by

traditionally applying the density-based algorithm. (4) Categorizing shops and discovering brand

names: ShopProfiler adopts SVM to categorize shops (i.e., restaurants, electronics, books, etc.).

This module is based on the data collected in module 1 and analysis of movement patterns in

module 2. Furthermore, ShopProfiler performs an AP localization algorithm via crowdsourcing

and matches AP to floor outline. Finally, we use SSID mining to label a shop’s brand name.

In Figure 3, we provide an example of the resulting floor plan after the profiling process

below for illustration.

Figure 3. Example of the Resulting Floor Plan after the Profiling Process in the Mall