
arXiv:2201.01376v1 [physics.soc-ph] 4 Jan 2022


PREDICTABILITY STATES IN HUMAN MOBILITY

Diogo Pacheco
BioComplex Laboratory
Department of Computer Science
University of Exeter
[email protected]

Marcos Oliveira
BioComplex Laboratory
Department of Computer Science
University of Exeter, UK
[email protected]

Zexun Chen
Business School
University of Edinburgh, UK
[email protected]

Hugo Barbosa
BioComplex Laboratory
Department of Computer Science
University of Exeter, UK
[email protected]

Brooke Foucault Welles
Northeastern University
Boston, MA, USA
[email protected]

Gourab Ghoshal
Department of Physics and Astronomy
University of Rochester, NY, USA
[email protected]

Ronaldo Menezes
BioComplex Laboratory
Department of Computer Science
University of Exeter, UK
[email protected]

January 6, 2022

ABSTRACT

Spatio-temporal constraints coupled with social constructs have the potential to create fluid predictability to human mobility patterns. Accordingly, predictability in human mobility is non-monotonic and varies according to this spatio-socio-temporal context. Here, we propose that the predictability in human mobility is a state and not a static trait of individuals. First, we show that time (of the week) explains people’s whereabouts more than the sequences of locations they visit. Then, we show that not only does predictability depend on time but also the type of activity an individual is engaged in, thus establishing the importance of contexts in human mobility.

1 Introduction

Human beings are routine-oriented to the extent that lack of predictability in daily mobility patterns is linked to high levels of stress [1, 2]. This change-averse behaviour leads to people having well-defined routines, which allows for high predictability in daily mobility patterns. Human trajectories have been shown to exhibit regularities at multiple scales, despite the inherent complexity that exists in the choices humans can make for the routes of their daily travels. Indeed, the analysis of large populations via mobile phone data has suggested the possibility of predicting up to 93% of human movement [3].

The predictability of human mobility, however, tells us only part of the story, since it neglects spatio-temporal constraints and the social embedding behind mobility regularities. Therefore, this work demonstrates that mobility predictability should be seen as a transient state, rather than a trait of individuals. The understanding of mechanisms governing human travelling behaviour is crucial to a variety of domains such as epidemic modelling [4], traffic management [5], and national security [6], to name but a few [7]. The modelling of human predictability as a state dependent on activity being performed (spatio-social) and the time of such activity (temporal) can lead to better decisions within the aforementioned domains as it offers a finer, more detailed, view of human dynamics.


A PREPRINT - JANUARY 6, 2022

Several factors in our daily lives, such as work schedules and biological processes, restrict our travelling behaviour. For instance, our internal circadian and ultradian (i.e., less than 24h) rhythms have a direct impact on our activity schedules and therefore on our mobility patterns [8, 9, 10, 11, 12]. Interestingly, a sudden absence of constraints can completely alter the predictability levels and rhythms in human mobility (e.g., as in the recent COVID-19 lock-down procedures [13] where people were not bound to usual constraints and constructs). These constraints are likely to shape the uncertainty on the whereabouts of people, which in turn brings more demand to services such as transport networks, grocery stores, and hospitals; the lack of predictability reduces our ability to plan based on demand.

Guessing that a person will be at home on a Tuesday at 4am will most likely be correct for most individuals under “normal” circumstances. The same cannot be said for the same individual at 11am on a Saturday; in fact, it may be much harder to know a person’s whereabouts during times in which they are not bound by typical daily rhythms. Such variations lead to predictability being a temporal/transient state of individuals. In all certainty, this state also depends on social and spatial aspects, but differences imposed by socio-spatial aspects are also captured by predictability (or lack thereof) in the temporal dimension. For instance, the presence of someone at a pub at 12pm or 9pm is not only a temporal difference; the location of the pub (spatial) and who will be there with this person (social) are related to the time of the event.

In this work, we describe the temporal regularities of the theoretical predictabilities of human mobility, and examine their different frequency and time components. Our results suggest that in addition to the daily routines, mobility diversity is also marked by periods of approximately 12h and 6h, which correspond to the second and fourth harmonics of our internal circadian rhythm. These findings suggest that the processes responsible for our visitation regularities are governed by our internal biological cycles beyond the sleeping and feeding needs, as evidenced by the predominance of the 12h periods over the 8h and 6h cycles. We argue that such patterns should lead to the predictability in human mobility being considered a transient state rather than a fixed characteristic of individuals.

2 Results

In a seminal study in human mobility, Song et al. [3] used cellphone data and proposed an information-theoretic approach to estimate mobility predictability based on the uncertainty of visiting a number of different locations. Knowing the sequence of visitations (entropy rate) in addition to their frequencies (Shannon entropy) was the key to estimating predictability. They showed that one’s next visited location could be predicted at most 93% of the time. In the present work, the locations of social media check-ins are used to approximate one’s mobility trajectory. This data is, however, fundamentally different from cellphone data. For instance, the location of a mobile user is given by the triangulation of tower positions, whereas for social media users it represents the latitude/longitude coordinates of the place checked in. Also, check-in data is always provided intentionally, while cellphone users can be tracked continuously without explicit consent, e.g., while receiving a call (see more details in the Materials and Methods section).

Before introducing the new concepts of the proposed predictability state, we validate our data by replicating Song et al. [3]. Figure 1 shows that for our datasets the predictabilities derived from sequence of check-ins (visited locations) Πc peak around 40%. They are indeed higher than the one based on the frequency of check-ins Πu, around 20%. It is worth noting that these figures are markedly lower than the ones previously reported, 93% [3] and 71% [14]. Such difference can be explained by the fact that, in our work, locations are determined by the places in which users have checked-in without any spatial coarse-graining. As pointed out by Ikanovic and Mollgaard [14], the predictability measure is dependent on the spatial resolution such that when the spatial resolution ∆s → ∞, the predictability Π → 1. Conversely, when ∆s → 0, the predictability Π → 0.

2.1 Predictability as a time-dependent state of individuals

Predictability based on trajectory (sequence of visited locations) indirectly encapsulates a temporal order, but it ignores any time property larger than the one used to define the trajectory itself. That is, if a trajectory is the sequence of visited locations during one day, the days themselves are not always distinguished. Moreover, the time interval in which humans move between locations is not constant over time. Sleeping, eating, and working are just a few examples of activities constrained by distinct duration, time, and space. Therefore, a 93% predictability of the next location turns out to be relatively meaningless for predicting when it will be visited. Moreover, it reveals the predictability as a transient, time-dependent state rather than a constant property of an individual. We can go further and say that the transiency arises from time, space, and social aspects, although in this work we focus on the first two aspects.

Traditionally, two trajectories would be considered identical if the sequence of visited places is the same, regardless of which day of the week they were performed. We break the visited location trajectories into weekday-hour bins to represent the time dimension embedded in the data beyond a single day. For instance, these trajectories would be represented differently if they are performed on distinct days of the week or shifted within a day (see Materials and Methods for more details).

Figure 1: Although humans’ choices seem complex and unpredictable, Song et al. [3] have shown human mobility to be fairly predictable when considering temporal sequence in addition to frequency of visitation. Here, we replicate this finding using the three datasets that capture human mobility. The distributions of (A) entropy H and (B) predictability values. Sequences predict more than frequencies, Πc > Πu, and are peaked around 40% for all three datasets.

In this work, we measure how typical it is for someone to be in a specific place at a specific hour on a specific day of the week, rather than how typical a sequence of visitation throughout any day is. We measure the predictability Π of each of the 168 independent hourly bins based on the frequency of visited locations in that particular bin. With an information-theoretic measure instead of a simpler quantity such as the relative location frequencies or the pure location diversity, we have encapsulated in a single quantity both the location diversity and their relative frequencies.

Figure 2 shows the predictability timeline for different groups of users in three datasets. It reveals remarkable 24h periodic patterns of predictability. We analysed the impact of user heterogeneity by grouping users based on the number of unique visited locations S. As one would expect, the predictability decreases as the number of visited places increases. We also grouped users based on their radius of gyration (the geographical coverage based on the coordinates of the visited places) and their engagement level with the social media platform (the monthly average number of days with mobility trajectories, i.e., days with at least two check-ins). In all cases, the results show it is harder to predict the mobility of users with more diverse routes, regardless of how we measure this diversity. Moreover, the predictability is higher during nighttime, peaking around 4-5 AM when most people are sleeping at home, regardless of dataset.

Although all datasets are from location-based social network (LBSN) platforms, the predictability amplitude in the Weeplaces data is much larger than what we observed in the other datasets. A possible cause is the fact that the Weeplaces data was provided by Foursquare users who were interested in visualising their check-in history. It is reasonable to believe that these users were, on average, more active (in terms of their check-in history) than a regular LBSN user. In fact, the median number of check-ins of Brightkite and Gowalla users were 11 and 25 respectively, whereas for Weeplaces this figure was 329 check-ins. Also, a median Weeplaces user has visited 131 unique locations while for Brightkite and Gowalla these numbers were 8 and 19 unique locations respectively.

To better understand the generalisation of the results shown in Figure 2, we also analyse the aggregated patterns of the entire population. Figure 3-A shows the analysis of the wavelet power spectrum and determines the significance (dashed line) of the 24h and 12h cycles for the populations in all of our datasets. Not surprisingly, this analysis reveals that the circadian period (approximately 24h) is the most prominent component of the predictability regularity. However, finding the 12h component (i.e., the circasemidian period) as the second strongest was not expected. For instance, given the working schedules and the sleeping cycles, one would expect to find significance for the 8h period as well. The third-strongest component is centred approximately around the 6h regime during the day, even though the signal is not significant at the population level.

By combining the strongest component with the variance of predictability observed throughout the week, we reveal a strong spectral agreement among all datasets. Figure 3-B shows the standardised predictability timeline for each dataset based on the sample mean of the predictabilities Pi across all users within each time bin t. Though the three datasets are from different sources, they seem to capture the same regularities. This finding suggests that the temporal variation of the mobility diversity is activity-independent and therefore is likely to be a characteristic manifestation of the underlying human dynamics.

An alternative hypothesis to the influence of the 12h rhythm is that these periods are in fact rooted in population-level heterogeneity in the activity routines. For instance, the 12h component could be explained by the same 24h rhythms if a large proportion of the users had their 24h schedules offset by 12h. To test this hypothesis, we performed the wavelet analysis on individual-level data. Interestingly, this analysis reveals the 6h period to be more prevalent than the 12h. Figure 3-C shows the percentage of users and their strongest component. Gowalla users are distinct from the others, as they are less likely to have circadian periods and have significantly more 6h periods. A possible explanation for this observation is the trips feature offered by this platform, which incentivizes same-category visits, such as pub crawls.

As already mentioned, the emergent circadian pattern across groups of individuals and datasets, not surprisingly, reveals the predictability peaking at night when people tend to go (and stay at) home. Yet, this result brings up a new aspect of predictability that was never considered before. This finding implies that, for instance, stating that “an individual is 80% predictable” must be interpreted in an averaged sense. Missing from this are the instantaneous changes in a person’s predictability state over time.

Figure 2: Time-dependent predictabilities. The average predictability Πu across all users exhibits daily peaks (4-5am) and secondary peaks (12-5pm) of predictability throughout the time series. For all datasets, Πu is lower for groups visiting more unique locations S.

Figure 3: Predictability distribution over three datasets. A. Estimated global wavelet spectrum. B. Standardised predictability over time. C. Stacked bar chart for the strongest component period (24, 12, 6 hours) of each individual.

Figure 4-A shows how time-granularity affects the predictability. As the time window decreases, from a week-day (seven 24-hour bins per week) to a week-day-hour (one hundred and sixty-eight 1-hour bins) representation, the average predictability increases. Interestingly, if one only distinguishes the days of the week (i.e., 24h bins), the average predictability is equivalent to Song’s time-frequency predictability [3]. Figure 4-B explores the effects of different time windows in our datasets. It shows the average predictability monotonically decreasing until the 24-hour window, where it seems to stabilise. Specifically for Gowalla, predictability sharply drops when the time window is bigger than 3h, as the strong 6h-period component shown in Figure 3-C might be scattered.

Figure 4: Effects of time-granularity in the limits of predictability. As the time window decreases, the average predictability increases.

Figure 5: Effects of activity type on the limits of predictability. In all subplots, the dotted-lines represent the distribution of individual predictabilities given their full trajectories. Solid-lines represent distributions for the same individuals but when limiting their trajectories to locations of distinct types.

2.2 Predictability as a context-dependent state of the individual

The temporal dependency in people’s whereabouts arguably results from daily routines due to individual, social, and spatio-temporal constraints. Likewise, we expect that the predictability in human mobility depends on the context. For example, it is plausible that people are more predictable about their workplace than the restaurants they visit.

To investigate the role of context in predictability, we examine the places individuals visit over time using the Weeplaces dataset. The rightmost panel in Figure 1B shows the distribution of predictabilities when the trajectories are composed of all data available. For this investigation, they are the baselines. Figure 5 shows distributions of predictabilities when trajectories are filtered to only contain places of specific categories (e.g., food, shop, home). For instance, a context-dependent trajectory for food could be: breakfast-place-A, lunch-place-B, coffee-place-C, etc.; while a full baseline trajectory could be: home-place-X, breakfast-place-A, work-place-Y, lunch-place-B, coffee-place-C, gym-place-Z, etc. As expected, Figure 5 shows the limits of predictability (solid-lines) can increase or decrease depending on context (comparison against the dotted-lines, i.e., full trajectories). For example, activities related to home and work are more predictable than the average baseline. In contrast, leisure activities (i.e., nightlife and entertainment) are less predictable than the average activity. This finding suggests that knowing about the context in which people are embedded can inform us about their predictability.

2.3 Estimating users’ predictability

To understand the role of context on human mobility, we investigate the extent to which context helps us estimate an individual’s predictability. Precisely, we analyse how well we can estimate an individual’s predictability based on this individual’s context preference. For instance, we want to understand whether knowing that a person goes shopping will often inform us about this person’s overall predictability. We first represent this context preference by using the relative frequency that an individual stayed at places from each category. Then, we use linear regression to determine the predictability of an individual based on context preference.
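The context-preference representation described above can be illustrated with a minimal sketch. It only builds the relative-frequency vector that would serve as the regression input; the regression itself is not shown. The function name `context_preference` and the category labels are hypothetical and are not taken from the datasets.

```python
from collections import Counter


def context_preference(visit_categories, categories):
    """Relative frequency of visits per category, in the order of `categories`.

    `visit_categories` holds one category label per check-in (hypothetical labels).
    """
    total = len(visit_categories)
    if total == 0:
        # No check-ins: return an all-zero profile rather than dividing by zero.
        return [0.0 for _ in categories]
    counts = Counter(visit_categories)
    return [counts[c] / total for c in categories]


# Illustrative category set and check-in labels (assumed, not from the paper).
CATEGORIES = ["home", "work", "food", "nightlife"]
profile = context_preference(["home", "work", "home", "food"], CATEGORIES)
```

Each individual then contributes one such vector, and the vectors would form the rows of the design matrix for the linear regression.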

Figure 6 shows that such a simple model can estimate well an individual’s predictability (R² = 0.424, MAE = 0.045). Figure 6A plots individual predictabilities Πc (estimated based on full trajectories over multiple days, e.g., home, restaurant, work, restaurant, etc.) against the estimated predictabilities Π̂c (based on context profiling, e.g., 80% home/work, 18% food, 1% nightlife, etc.). Figure 6B shows the residuals are normally distributed and centered on zero. This result implies that having a piece of coarse-grained information about individuals (i.e., context preference) informs us about an individual’s characteristic (their intrinsic mobility uncertainty) that we would otherwise need fine-grained data (i.e., check-in data) to calculate.


3 Discussion

Meeting people’s needs requires governments, industries, and other stakeholders to be able to plan for demands (e.g., hospital admissions, public transportation, store opening times, etc.). The predictability of human movement is at the heart of planning, hence more accurate modelling should lead to better planning and better decision-making. Previous research models predictability as an aggregate value for an individual, which fails to capture the temporal variations of such regularities for different times of the day and days of the week. In this work, we model predictability as a state that is time-dependent but also context-dependent. With this definition, we show that the state of predictability of individuals varies significantly and hence should not be seen as an intrinsic characteristic of the individuals.

The main implication of this work is that the planning of activities related to human mobility (e.g., city events, epidemic modelling, road maintenance) needs to consider time-space variations of individual activities. Furthermore, during periods of restrictions, such as the COVID-19 pandemic, the understanding and characterisation of these time-spatial variations can help governments make the correct decisions. In 2020 and 2021, many governments imposed curfew/lockdown measures on citizens after certain hours (e.g., Spain, Colombia) in a “blanket” way. Effective curfews depend on the time and the location, and the consideration of such variations can lead to a better approach where not all areas are treated equally. Using predictability as a state and using the states for planning could lead to more just/equitable outcomes.

Figure 6: Knowing the type of activity people engage in helps us to estimate their predictability in mobility.

Figure 7: Mean absolute error, until 2000. (Panels: random sampling and sequential sampling; y-axis: MAE; curves: trajectory-based and context-based.)

Figure 8: Mean absolute error, until 5000. (Panels: random sampling and sequential sampling; y-axis: MAE; curves: trajectory-based and context-based.)

This work has limitations, but in most cases they are related to a dependence on high-resolution data. We have demonstrated here that even using a somewhat mediocre data resolution, we can characterise the state of predictability of individuals. With access to higher resolution data, such as the data being collected as part of “track-and-trace” systems in certain countries, our modelling can lead to more accurate characterisations. Many governments and companies (as part of “data-for-good” efforts) are starting to open their datasets to scientists, which will naturally lead to better urban analytics including human dynamics modelling.

Materials and Methods

Data

We use data from different location-based social networking services (Table 1). Brightkite and Gowalla were two popular social networking sites that existed from 2007 until 2012. Weeplaces was a website on which users could upload their check-in activities from other social network services (e.g., Facebook Places, Foursquare). These datasets contain users’ check-in activity including user identification, location coordinates (i.e., latitude and longitude), and time stamp. Additionally, the Gowalla dataset contains a description (i.e., the category) of the locations (e.g., nightlife, outdoors). All datasets are publicly available.

Table 1: Data from location-based social networking sites.

Dataset              Users      Records      Period
Brightkite [15]      58,228     4,491,143    Apr/2008–Oct/2010
Gowalla [16]         107,002    6,405,492    Feb/2009–Oct/2010
Weeplaces [16, 17]   15,799     7,658,368    Nov/2003–Jun/2011

Uncertainty and predictability in human mobility

To study individuals’ mobility, we use their check-in activity to create trajectories, described as time series of the form

$$ X = \{x(1), x(2), \ldots, x(T)\}, $$

where x(t) ∈ V is a place, and V is the set of visited places. In our work, we want to investigate the dynamics of this time series and its embedded uncertainty. Specifically, we examine the visitation preferences of an individual and the emerging patterns in these preferences.

To investigate an individual’s visitation preferences, we analyze the probability p(i) of visiting a location i ∈ V and the number of locations N = |V| visited by this individual. The value of N tells us about the extent to which an individual visits places broadly. However, N neglects visitation frequency, missing the existence of favorite locations. To account for frequency, we are also interested in the spread of the probability distribution p(i). To do so, we use the Shannon entropy Sunc(p) of the random variable X, defined as the following:

$$ S_{\mathrm{unc}}(p) = -\sum_{i \in V} p(i) \log_2 p(i), \qquad (1) $$

and expressed in bits. The entropy Sunc quantifies the uncertainty regarding the places that a specific user visits over time. For example, when an individual visits one place only, Sunc(p) is zero because there is no uncertainty about this individual’s whereabouts. In contrast, an individual without any favourite location has the highest uncertainty and the maximum entropy, which occurs when p(i) = 1/N for all i. In this hypothetical case, the entropy simplifies to

$$ S_{\mathrm{rand}}(p) = \log_2 N, \qquad (2) $$

and represents the uncertainty of a visitation preference following a uniform distribution. That is, the uncertainty regarding this individual is the same as guessing randomly. Note that the Shannon entropy captures the uncertainty of an individual without accounting for any temporal correlation or patterns of visitation, since it neglects the sequence in which events (i.e., visitation) take place.
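As an illustration of Eqs. (1) and (2), the following minimal sketch computes both entropies from a list of visited places, taking p(i) as the empirical visit frequency. The function names and the example check-in sequence are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(trajectory):
    """Frequency-based entropy S_unc in bits, Eq. (1), with p(i) = empirical frequency."""
    counts = Counter(trajectory)
    total = len(trajectory)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def random_entropy(trajectory):
    """Entropy S_rand = log2(N) of a uniform preference over the N visited places, Eq. (2)."""
    return math.log2(len(set(trajectory)))


# Hypothetical check-in sequence (place labels are illustrative only).
visits = ["home", "work", "home", "cafe", "home", "work", "home", "gym"]
s_unc = shannon_entropy(visits)   # lower than s_rand because "home" dominates
s_rand = random_entropy(visits)   # log2(4) for four distinct places
```

Neither quantity depends on the order of the visits, which is why the sequence-aware entropy rate is treated separately.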

To investigate emerging patterns in the visitation preferences of individuals, we can understand X as the result of processes that generate such sequences. Precisely, we analyse the source entropy rate hµ of a stochastic process [18, 19].


The entropy rate tells us the uncertainty of a trajectory while discounting for the recurrent patterns in it. In our case, hµ quantifies the irreducible uncertainty of an individual even when we learn about their visitation patterns that emerge throughout time. In our study, we use the Lempel–Ziv compression algorithm to estimate the entropy rate of an individual, following a previous work [3]. We call this estimate the time-correlated entropy and denote it as S.
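A Lempel–Ziv style estimate of this kind can be sketched as follows. The specific form used here, n·log2(n) divided by the sum over positions of the shortest previously unseen substring length, is an assumption about the estimator; the text above does not spell out the formula, and conventions for matches that run into the end of the sequence vary. The function names are hypothetical and the implementation favours clarity over speed.

```python
import math


def _seen_before(seq, start, length):
    """True if seq[start:start+length] occurs entirely within seq[:start]."""
    target = seq[start:start + length]
    return any(seq[j:j + length] == target for j in range(0, start - length + 1))


def lz_entropy_rate(trajectory):
    """Assumed Lempel-Ziv estimate of the entropy rate, in bits per visit."""
    n = len(trajectory)
    if n < 2:
        return 0.0
    total = 0
    for i in range(n):
        # Shortest substring starting at i that has not appeared before position i.
        k = 1
        while i + k <= n and _seen_before(trajectory, i, k):
            k += 1
        total += k
    return n * math.log2(n) / total


# A fully repetitive trajectory versus one that never revisits a place.
repetitive = ["home"] * 64
novel = list(range(64))
rate_repetitive = lz_entropy_rate(repetitive)
rate_novel = lz_entropy_rate(novel)
```

In this sketch, a trajectory with no repeated place yields log2(n), while a fully repetitive one yields a much smaller value, matching the intuition that recurrent patterns reduce the estimated uncertainty.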

Not only do these entropy values enable us to assess the uncertainty in the mobility of individuals, but they also help us to characterise the extent to which an individual’s trajectory is indistinguishable from random; that is, they enable us to estimate the limits of predictability of an individual’s mobility. In our work, we are interested in estimating the probability Π of correctly predicting future locations of an individual, given a past series of observations. Song et al. [3] showed that Π is subject to Fano’s inequality and has an upper bound, denoted as Πmax. This upper bound reveals the theoretical upper limit to predict the future location of an individual by restricting ourselves to the trajectory data only. For example, Πmax = 0.7 tells us that an individual’s trajectory exhibits an intrinsic uncertainty that makes their behaviour indistinguishable from random 30% of the time. This randomness restricts the predictive power of any algorithm seeking to predict the future locations of this individual. Though it is infeasible to measure Πmax directly, the quantity has an explicit relationship with the time-correlated entropy:

$$ S = H_b(\Pi_{\max}) + (1 - \Pi_{\max}) \log_2 (N - 1), \qquad (3) $$

where Hb(x) is the binary entropy function, defined as Hb(x) = −x log2 x − (1 − x) log2(1 − x). To estimate Πmax of an individual, we need to solve Eq. (3) using a numerical solver, given that we know the number of locations N and entropy S. Similarly, we can estimate the hypothetical predictability of an individual if this individual lacked favorite locations or visitation patterns. To estimate these special cases, we replace S with Srand and Sunc, respectively, to find Πrand and Πunc.
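One possible way to solve Eq. (3) numerically is bisection, since on the interval [1/N, 1] the right-hand side decreases from log2 N to 0. The sketch below is only an illustration under that observation; the choice of solver, the function names, and the tolerance are assumptions, not the solver used in this work.

```python
import math


def binary_entropy(x):
    """Binary entropy H_b(x) in bits, with H_b(0) = H_b(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def fano_rhs(pi, n_locations):
    """Right-hand side of Eq. (3) for a candidate predictability `pi`."""
    return binary_entropy(pi) + (1 - pi) * math.log2(n_locations - 1)


def max_predictability(entropy, n_locations, tol=1e-10):
    """Solve Eq. (3) for the predictability limit by bisection on [1/N, 1]."""
    if n_locations < 2 or entropy <= 0.0:
        return 1.0  # a single location or zero entropy is fully predictable
    lo, hi = 1.0 / n_locations, 1.0
    if entropy >= fano_rhs(lo, n_locations):
        return lo  # entropy at (or above) log2(N): no better than uniform guessing
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano_rhs(mid, n_locations) > entropy:
            lo = mid  # right-hand side still too large: root lies above mid
        else:
            hi = mid
    return (lo + hi) / 2


# Example: 8 distinct locations and an entropy of 1 bit (illustrative numbers).
pi_example = max_predictability(1.0, 8)
```

Passing Srand or Sunc instead of S gives Πrand and Πunc with the same routine.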

We are also interested in the temporal dependencies of these limits of predictability, thus we analyze individuals at different moments of the week. Specifically, we split the trajectory of each individual into time slots representing the time of the week. We create 168 slots (i.e., 24 hours × 7 days of the week) and define Xt=t0 as a random variable representing the places that an individual visits at the time slot t = t0 ∈ {1, . . . , 168}. We measure entropy and predictability limits of this random variable regarding each individual in the datasets, which enables us to construct their respective time series of predictability. We note that, in our approach, the idea of a mobility trajectory as an arbitrarily long sequence of visits is demoted in favor of a routine-oriented perspective. Thus, the sequential information, which the entropy rate leverages, is less relevant.
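The slotting step can be sketched as below: check-ins are grouped by hour of the week and the frequency-based entropy is computed per slot. The slot index here runs from 0 to 167 with Monday 00:00 as slot 0, which is an assumed convention (the text uses 1 to 168), and the check-in records are hypothetical.

```python
import math
from collections import Counter, defaultdict
from datetime import datetime


def hour_of_week(ts):
    """Slot index in 0..167, with Monday 00:00-00:59 as slot 0 (assumed convention)."""
    return ts.weekday() * 24 + ts.hour


def slot_entropies(checkins):
    """Shannon entropy (bits) of the visited places within each hour-of-week slot."""
    by_slot = defaultdict(list)
    for ts, place in checkins:
        by_slot[hour_of_week(ts)].append(place)
    entropies = {}
    for slot, places in by_slot.items():
        total = len(places)
        counts = Counter(places)
        entropies[slot] = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropies


# Hypothetical check-ins: always home on Tuesdays at 4am, varied on Saturdays at 11am.
checkins = [
    (datetime(2010, 3, 2, 4), "home"),
    (datetime(2010, 3, 9, 4), "home"),
    (datetime(2010, 3, 6, 11), "cafe"),
    (datetime(2010, 3, 13, 11), "park"),
]
per_slot = slot_entropies(checkins)
```

Each slot entropy, together with the number of distinct places in that slot, could then be converted into a per-slot predictability limit through Eq. (3).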

Wavelet analysis

To understand the temporal dimension of human mobility predictability, we use the continuous wavelet transform to describe the regularities in the time series of the individuals. With the wavelet transform, we extract both time and frequency components from a time series. The method has a long history of successful applications to a variety of domains such as climate prediction [20], digital image processing [21], and crime dynamics analysis [22]. The wavelet transform decomposes a time series using functions called wavelets, which dilate (scale) to capture different frequencies and translate (shift) in time to include changes with time. We can define the wavelet transform of a discrete sequence Y = {y(1), y(2), . . . , y(N)} having observations with a uniform step δt as the following:

$$
W_Y(s, n) = \sqrt{\frac{\delta t}{s}} \sum_{t=1}^{N} y(t)\, \psi^{*}\!\left[\frac{(t - n)\,\delta t}{s}\right], \tag{4}
$$

where the ‘∗’ denotes the complex conjugate and s is the wavelet scale. The wavelet transform can be seen as the cross-correlation between the time series y(t) and a set of functions ψ*_{s,τ}(t), distributed over time and having different widths [23]. By varying the scale s and translating over time (i.e., varying n), we have a representation of the amplitude of the different periodic features of Y and how they vary with time.
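A direct, unoptimized evaluation of Eq. (4) can be sketched in a few lines of Python. The text does not name the mother wavelet, so the sketch assumes the Morlet wavelet with ω0 = 6, a common complex choice in Torrence and Compo [25]; the function names and the period-to-scale helper are illustrative.

```python
import cmath
import math


def morlet(eta, omega0=6.0):
    """Morlet mother wavelet (assumed choice; the text only says 'complex')."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * eta) * math.exp(-0.5 * eta * eta)


def period_to_scale(period, omega0=6.0):
    """Convert a Fourier period to the equivalent Morlet scale."""
    return period * (omega0 + math.sqrt(2.0 + omega0 ** 2)) / (4.0 * math.pi)


def cwt(y, scales, dt=1.0, wavelet=morlet):
    """Evaluate Eq. (4) by direct summation; returns W[scale_index][n].

    Cost is O(len(scales) * N^2), which is fine for a 168-point weekly series.
    """
    N = len(y)
    W = []
    for s in scales:
        norm = math.sqrt(dt / s)
        row = []
        for n in range(N):
            acc = 0j
            for t in range(N):
                acc += y[t] * wavelet((t - n) * dt / s).conjugate()
            row.append(norm * acc)
        W.append(row)
    return W
```

Applied to an hourly series with a daily rhythm, the power |W_Y(s, n)|² at the scale matching a 24-hour period should dominate the power at the scale of a 6-hour period.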

To examine the overall periodicity in Y, we average the wavelet power |W_Y(s, n)|² over n:

$$
\overline{W}^{2}(s) = \frac{1}{N} \sum_{n=1}^{N} \left| W_Y(s, n) \right|^{2}, \tag{5}
$$

called the global wavelet spectrum, which provides us with an unbiased estimate of the true power spectrum [24]. We analyze its statistical significance by using the method developed by Torrence and Compo [25], which tests the wavelet power against a null model that generates a background power spectrum Pk. The test is given by:

$$
D\!\left( \frac{\left| W_X(s, n) \right|^{2}}{\sigma_X^{2}} < p \right) = \frac{1}{2} P_k\, \chi^{2}_{\nu}, \tag{6}
$$

where ν = 2 for complex wavelets (our case) [25].
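The averaging in Eq. (5) and the threshold implied by Eq. (6) can be sketched as below. The snippet takes a precomputed matrix of wavelet coefficients `W[scale][n]`. For ν = 2 the chi-square quantile has the closed form −2 ln(1 − p), so no statistics library is needed. The red-noise background is the lag-1 autoregressive spectrum from Torrence and Compo [25] and is an assumption here, since the text does not state which null model was used; all function names are illustrative.

```python
import math


def global_wavelet_spectrum(W):
    """Eq. (5): time-averaged wavelet power for each scale."""
    return [sum(abs(w) ** 2 for w in row) / len(row) for row in W]


def chi2_quantile_2dof(p):
    """Quantile of the chi-square distribution with nu = 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - p)


def red_noise_background(alpha, period, dt=1.0):
    """Normalized AR(1) spectrum P_k at a given Fourier period.

    Assumption: red-noise null model with lag-1 autocorrelation alpha;
    alpha = 0 gives white noise (P_k = 1 at every period).
    """
    angle = 2.0 * math.pi * dt / period
    return (1.0 - alpha ** 2) / (1.0 + alpha ** 2 - 2.0 * alpha * math.cos(angle))


def significance_threshold(variance, background_power, p=0.95):
    """Power level that wavelet power must exceed to be significant at level p."""
    return 0.5 * background_power * variance * chi2_quantile_2dof(p)
```

With a white-noise background and unit variance, the 95% threshold is about 3.0, so wavelet power above three times the series variance would be flagged as significant under these assumptions.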


Acknowledgements

This work was supported by the US Army Research Office under Agreement Number W911NF-17-1-0127.

References

[1] Gary W Evans, Richard E Wener, and Donald Phillips. The morning rush hour: Predictability and commuter stress. Environment and behavior, 34(4):521–530, 2002.

[2] Amadeu Quelhas Martins, David McIntyre, and Christopher Ring. Aversive event unpredictability causes stress-induced hypoalgesia. Psychophysiology, 52(8):1066–1070, 2015.

[3] Chaoming Song, Zehui Qu, Nicholas Blumm, and Albert-László Barabási. Limits of predictability in human mobility. Science (New York, N.Y.), 327(5968):1018–21, feb 2010.

[4] Duygu Balcan, Vittoria Colizza, Bruno Gonçalves, Hao Hu, José J Ramasco, and Alessandro Vespignani. Multiscale mobility networks and the spatial spreading of infectious diseases. Proceedings of the National Academy of Sciences of the United States of America, 106(51):21484–9, dec 2009.

[5] Pu Wang, Timothy Hunter, Alexandre M Bayen, Katja Schechtner, and Marta C. González. Understanding road usage patterns in urban areas. Scientific reports, 2:1001, jan 2012.

[6] Rikard Laxhammar. Conformal Anomaly Detection: Detecting Abnormal Trajectories in Surveillance Applications. 2014.

[7] Hugo Barbosa, Marc Barthelemy, Gourab Ghoshal, Charlotte R James, Maxime Lenormand, Thomas Louail, Ronaldo Menezes, José J Ramasco, Filippo Simini, and Marcello Tomasini. Human mobility: Models and applications. Physics Reports, 734:1–74, 2018.

[8] M. Stupfel and A. Pavely. Ultradian, circahoral and circadian structures in endothermic vertebrates and humans. Comparative Biochemistry and Physiology – Part A: Physiology, 96(1):1–11, 1990.

[9] Frank A.J.L. Scheer, Kenneth P. Wright, Richard E. Kronauer, and Charles A. Czeisler. Plasticity of the intrinsic period of the human circadian timing system. PLoS ONE, 2(8), 2007.

[10] Jameson L. Toole, C. Herrera-Yaque, Christian M. Schneider, and Marta C. González. Coupling human mobility and social ties. Journal of The Royal Society Interface, 12(105):20141128–20141128, 2015.

[11] Christian M. Schneider, Vitaly Belik, Thomas Couronne, Zbigniew Smoreda, Marta C. González, and Thomas Couronné. Unravelling daily human mobility motifs. Journal of The Royal Society Interface, 10(84):20130246, 2013.

[12] Samiul Hasan, Christian M. Schneider, Satish V. Ukkusuri, and Marta C. González. Spatiotemporal Patterns of Urban Human Mobility. Journal of Statistical Physics, 151:304–318, 2012.

[13] Clodomir Santana, Federico Botta, Hugo Barbosa, Filippo Privitera, Ronaldo Menezes, and Riccardo Di Clemente. Analysis of socioeconomic aspects related to mobility patterns in the UK during the COVID-19 pandemic, 2020. https://covid19-uk-mobility.github.io/Second-report, Last accessed on 2020-06-01.

[14] Edin Lind Ikanovic and Anders Mollgaard. An alternative approach to the limits of predictability in human mobility. EPJ Data Science, 6(1), 2017.

[15] Eunjoon Cho, Seth A Myers, and Jure Leskovec. Friendship and mobility: user movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1082–1090, 2011.

[16] Yong Liu, Wei Wei, Aixin Sun, and Chunyan Miao. Exploiting geographical neighborhood characteristics for location recommendation. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, pages 739–748, 2014.

[17] Ramesh Baral and Tao Li. Maps: A multi aspect personalized poi recommender system. In Proceedings of the 10th ACM conference on recommender systems, pages 281–284, 2016.

[18] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. John Wiley & Sons, Dordrecht, 2nd edition, 2006.

[19] Juhi Kulshrestha, Marcos Oliveira, Orkut Karacalik, Denis Bonnay, and Claudia Wagner. Web routineness and limits of predictability: Investigating demographic and behavioral differences using web tracking data, 2020. arXiv:2012.15112.


[20] D. Nalley, J. Adamowski, B. Khalil, and A. Biswas. Inter-annual to inter-decadal streamflow variability in Quebec and Ontario in relation to dominant large-scale climate indices. Journal of Hydrology, 536:426–446, 2016.

[21] J-P Antoine, Pierre Carrette, Romain Murenzi, and Bernard Piette. Image analysis with two-dimensional continuous wavelet transform. Signal processing, 31(3):241–272, 1993.

[22] Marcos Oliveira, Eraldo Ribeiro, Carmelo Bastos-Filho, and Ronaldo Menezes. Spatio-temporal variations in the urban rhythm: the travelling waves of crime. EPJ Data Science, 7(1):29, 2018.

[23] Bernard Cazelles, Mario Chavez, Guillaume Constantin de Magny, Jean-Francois Guégan, and Simon Hales. Time-dependent spectral analysis of epidemiological time-series with wavelets. Journal of the Royal Society, Interface / the Royal Society, 4(15):625–36, 2007.

[24] Donald P. Percival. On estimation of the wavelet variance. Biometrika, 82(3):619–631, 1995.

[25] Christopher Torrence and Gilbert P Compo. A Practical Guide to Wavelet Analysis. Bulletin of the American Meteorological Society, 79(1):61–78, jan 1998.
