Duke University
A Mathematical Analysis of Crime
Paper 2
Benjamin C. Lawrence
Math 89S: Mathematics of the Universe
Professor Hubert Bray
November 1, 2016
Introduction
In the TV show “Numb3rs,” famous mathematician Charlie Eppes aids his FBI agent
brother in fighting crime by using advanced mathematics to analyze past crimes in hopes of
preventing future ones. In the episode “The O.G.,” an FBI agent is shot and killed while working
undercover in a local gang. The police jump to the immediate conclusion that his cover was
blown and that they need to find justice for the execution. Charlie, however, claims that gang
territories can be modeled in much the same way as plant growth and that he can use this
knowledge to identify the gang most likely to have shot the agent, thereby allowing law
enforcement to determine whether his cover was, in fact, blown. While this show was intended
as fiction, it turns out that mathematicians have started to develop methods of analyzing gang
movements, although not by analyzing plant life. Instead, they apply the equations used to
predict seismic activity.
The Hawkes Process
To solve a math problem, one often has to start somewhere seemingly
unrelated to the original problem and work toward the solution. For example, the
wave equations that seismologists use today were originally developed from analyzing the
motion of a violin string. Once mathematicians understood this one-dimensional wave motion,
they could advance to two and three dimensions until they had the formulas that describe
earthquakes [13]. In the same way, to understand gang movements, mathematicians first
needed to understand how events of the past affect events of the future (i.e., how what one
gang does affects how other gangs react).
This task was accomplished by studying a point process known as a Hawkes Process. A
point process is a “probabilistic model for random scatterings of points on some space X often
assumed to be a subset of R^d for some d” [4]. The Hawkes Process is special because it is a
“self-exciting, temporal point process” [14]. “Self-exciting” simply means that events of the past
make events of the future more likely, and a temporal point process is one whose points are
discrete events occurring at instants in time. The general equation for a Hawkes Process is
$$\lambda(t) = \mu(t) + \sum_{i:\,\tau_i < t} v(t - \tau_i)$$
where λ(t) is the expected rate at which events are to occur, μ(t) is the background rate of the
process, the τ_i are the temporal points occurring before time t, and v is the clustering density
function of the process. The function v can also be considered an excitation function because it
is the part of the equation which makes future events more likely based on the occurrences of
previous events [14].
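To make the roles of μ and v concrete, the general equation can be evaluated directly. The following is only a minimal sketch: the constant background rate, the exponentially decaying excitation function, and the event times are hypothetical choices made for illustration and are not taken from the cited sources.

```python
import math

def hawkes_intensity(t, event_times, mu, v):
    """Evaluate lambda(t) = mu(t) + sum of v(t - tau_i) over events with tau_i < t."""
    return mu(t) + sum(v(t - tau) for tau in event_times if tau < t)

# Hypothetical background rate: constant in time.
def mu(t):
    return 0.5

# Hypothetical excitation function: a jump of 0.8 that decays exponentially.
def v(s):
    return 0.8 * math.exp(-s)

events = [1.0, 2.0]  # hypothetical past event times

before_any_event = hawkes_intensity(0.5, events, mu, v)
after_both_events = hawkes_intensity(2.5, events, mu, v)
print(before_any_event)   # only the background rate contributes
print(after_both_events)  # background rate plus two decaying contributions
```

Before any event has occurred the rate equals the background rate alone; after the two events it is temporarily higher, which is the self-exciting behavior described above.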
The first major use of this function was to study the after-effects of earthquakes and the
likelihood of future earthquakes based upon the size of the most recent ones. In their paper on
assessing point process models for earthquake forecasts, Andrew Bray and Frederic Schoenberg
concluded that earthquakes create aftershocks that are dependent upon the original
earthquake, so a Hawkes Process can be used to analyze their likely frequencies [10].
The Poisson Process
Another form of point process is the Poisson Process. The major difference between the
Poisson Process and the Hawkes Process is that the Poisson Process assumes events to be
completely random. The frequency, duration, and intensity of events that have already
occurred have no effect on events of the future. This process is useful because it can analyze
“scenarios where we are counting the occurrences of certain events that appear to happen at a
certain rate, but [in reality are] completely at random” [11].
The formal definition of the Poisson Process is:
“Let λ > 0 be fixed. The counting process {N(t), t ∈ [0, ∞)} is called a Poisson Process with rates λ if all the following conditions hold:
1. N(0) = 0;
2. N(t) has independent increments;
3. The number of arrivals in any interval of length τ > 0 has Poisson(λτ) distribution.”
In this case, λ plays the same role as in the Hawkes Process: it is the rate at
which events occur, although here it is a fixed constant rather than a quantity that depends
on past events. The first condition simply states that the number of events that have taken
place at time zero is zero since no events have occurred yet. The second condition is the
distinguishing point between the Hawkes and Poisson Processes. In the Hawkes Process,
the increments of N(t) depend on earlier events, while in the Poisson Process they are
independent. Finally, the third condition says that the number of arrivals in an interval of
length τ is a Poisson random variable with parameter λτ [11].
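The definition above can be illustrated with a short simulation. This is a sketch under stated assumptions: it uses the standard fact that the waiting times between arrivals of a Poisson Process with rate λ are independent exponential random variables with mean 1/λ, and the rate, time horizon, and random seed are arbitrary illustrative values.

```python
import random

def simulate_poisson_process(rate, horizon, rng):
    """Return sorted arrival times in [0, horizon) for a constant-rate Poisson process."""
    times = []
    t = rng.expovariate(rate)  # first waiting time
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)  # independent exponential gap to the next arrival
    return times

rng = random.Random(0)       # fixed seed so the run is repeatable
rate, horizon = 2.0, 1000.0  # hypothetical values
arrivals = simulate_poisson_process(rate, horizon, rng)

# The observed number of arrivals per unit time should be close to the rate.
print(len(arrivals) / horizon)
```

Because each gap is drawn without reference to earlier arrivals, nothing in this simulation lets past events raise or lower the chance of future ones.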
It is important to understand the Poisson Process because it is used to analyze and
predict car accidents, requests for documents on the internet, and locations of users in a
wireless network [11]. Therefore, when mathematicians first started to analyze crime, they
originally looked at using a Poisson Process because crime was assumed to be a set of random
events. But further analysis of the data showed that the crimes predicted by a Poisson Process
did not match what was actually occurring. Accordingly, mathematicians Stomakhin
and Egesdal started using a Hawkes Process to obtain more accurate predictions.
Statistical and Stochastic Modeling of Gang Rivalries in Los Angeles
In hopes of better understanding gang movements in the Hollenbeck area of Los
Angeles, mathematician Egesdal and his colleagues used a Hawkes Process to model the
actions of rival gangs over time, and they used a form of agent-based modeling for gangs’
actions through space. They combined these two models to create graphs that allowed
mathematicians to predict the likely times and locations of future crimes [2].
To test their hypothesis, they analyzed gang related crimes between 1999 and 2002.
They limited their data to the 33 gangs who had committed over four crimes against each other
so as to ensure there was enough data for each gang. Figure 1 shows the territories of
Hollenbeck’s census block groups. Note that the government census website defines census blocks
as “Statistical areas bounded by visible features such as roads, streams, and railroad tracks,
and by nonvisible boundaries such as property lines, township, school district, county limits and
short line-of-sight extensions of roads” [12]. In this paper the nonvisible boundary lines are
gang territory boundaries.
When Egesdal and his colleagues first started working on their paper, they used a
Poisson Process to analyze the given data. However, when they compared the predicted time
intervals between crimes using this process to actual time intervals, their predictions were not
accurate. Therefore, they switched to a self-excitation process and chose the Hawkes Process.
In this case, the Hawkes Process is modeled by:
$$\lambda(t) = \mu(t) + k_0 \sum_{t > t_i} g(t - t_i; w)$$
where μ(t) is the same as in the original function, but the clustering density function is now g(t − t_i; w),
multiplied by a scaling factor k_0 that depends on the effect of the crime [2]. Egesdal’s
team then used this equation to simulate what the crime rates of the Locke Street and Lowell
Street gangs would be over the three-year period. Surprisingly, the Hawkes Process was very
good at predicting the crime rates. The top row of Figure 3 shows the actual crime rate
between the two gangs and the bottom row shows the predicted rate using the Hawkes
Process. Although not perfect, this analysis turned out to be far more accurate than the original
Poisson Process [2].
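A simulation of this kind can be sketched with a standard thinning procedure (often attributed to Ogata). The sketch below is not necessarily the procedure used in [2]. It assumes a constant background rate μ and an exponential kernel g(s; w) = w·exp(−w·s), since the text above does not specify the form of g, and all parameter values are hypothetical.

```python
import math
import random

def simulate_hawkes(mu, k0, w, horizon, rng):
    """Thinning simulation of lambda(t) = mu + k0 * sum_i w * exp(-w * (t - t_i))."""
    events = []
    t = 0.0
    excitation = 0.0  # value of k0 * sum_i w * exp(-w * (t - t_i)) at the current time t
    while True:
        # Between events the intensity only decays, so its current value is an upper bound.
        upper_bound = mu + excitation
        wait = rng.expovariate(upper_bound)
        t += wait
        if t >= horizon:
            return events
        excitation *= math.exp(-w * wait)  # decay the excitation to the candidate time
        # Accept the candidate with probability (true intensity) / (upper bound).
        if rng.random() * upper_bound <= mu + excitation:
            events.append(t)
            excitation += k0 * w  # each accepted event raises the intensity by k0 * w

rng = random.Random(1)                        # fixed seed so the run is repeatable
mu, k0, w, horizon = 0.5, 0.5, 1.0, 2000.0    # hypothetical parameters
events = simulate_hawkes(mu, k0, w, horizon, rng)

# With this kernel and k0 < 1, the long-run average rate is mu / (1 - k0).
print(len(events) / horizon, mu / (1 - k0))
```

With these assumptions the simulated events arrive in bursts: each accepted event raises the rate for a while, so the long-run rate exceeds the background rate μ alone, which a Poisson Process with rate μ cannot reproduce.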
Once Egesdal and his team had established a method of predicting crimes over time,
they moved to analyzing crimes spatially using agent-based modeling. Agent-based modeling
essentially analyzes how a predetermined set of agents interact with each other. The
components of an agent-based model are (1) agents (e.g., a group of gang members since
individual members’ crimes might not be associated with the gang); (2) relationships between
agents (e.g., the rivalries between gangs); and (3) a framework for simulating interactions [3].
Also, Egesdal assumed that each agent acts individually to “exhibit the behavior of interest”
[2]. The analysis was then divided into a variety of sections, each delving into different aspects
of how a gang moves in space. For example, one portion of the experiment looked at gangs as a
network and how the physical distance between territories affected rival gangs’ abilities to
commit crimes against each other. Another portion assessed how the constantly changing
strength of a rivalry affects two gangs’ desire to commit crimes. So if two gangs were strong
enemies but on opposite sides of the city, they would not be as likely to commit crimes against
each other as two gangs that are closer together but have a weaker rivalry (see figure below) [2].
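The trade-off between rivalry strength and distance can be illustrated with a toy interaction rule. This is purely a hypothetical sketch and not the model used in [2]: the exponential distance discount, the length scale, the gang labels, the coordinates, and the rivalry values are all invented for illustration.

```python
import math

def interaction_weight(rivalry, home_a, home_b, length_scale=1.0):
    """Toy score: rivalry strength discounted exponentially by distance between territories."""
    distance = math.dist(home_a, home_b)
    return rivalry * math.exp(-distance / length_scale)

def target_probabilities(gang, homes, rivalry):
    """Probability that `gang` targets each other gang, proportional to the toy score."""
    weights = {
        other: interaction_weight(rivalry[frozenset((gang, other))], homes[gang], homes[other])
        for other in homes
        if other != gang
    }
    total = sum(weights.values())
    return {other: weight / total for other, weight in weights.items()}

# Hypothetical territory centers (arbitrary units) and rivalry strengths.
homes = {"A": (0.0, 0.0), "B": (5.0, 0.0), "C": (1.0, 0.0)}
rivalry = {
    frozenset(("A", "B")): 0.9,  # strong rivalry, far apart
    frozenset(("A", "C")): 0.3,  # weaker rivalry, close together
    frozenset(("B", "C")): 0.5,
}

probs = target_probabilities("A", homes, rivalry)
print(probs)
```

Under this rule, gang A is far more likely to target the nearby gang C than the distant gang B, even though its rivalry with B is stronger.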
Egesdal and his team concluded that, thus far, the most efficient and accurate method
of modeling gang movement in time is the Hawkes Process and in space is agent-based
modeling. Interestingly, Egesdal also concluded that individual member movements are too
erratic to predict mathematically. Therefore, if law enforcement were to use this method of
analysis, it would be useful only for predicting the crimes of gangs in general and not those of individual
members within the gangs.
Reconstructing Missing Data in Gang Related Crimes
Egesdal’s methods of predicting gang activity are useful for future events, but many
times a crime occurs and law enforcement has no way of knowing who was involved. Therefore,
when Alexey Stomakhin read Egesdal’s paper, he decided to employ the same method of
mathematical analysis but used it to look at the past as well as the future.
In his paper, Stomakhin analyzed the movements of gangs in the Los Angeles area,
except he studied crimes that already happened instead of trying to predict future ones. In the
figure below, let α , β, and γ be agents (or gangs) involved in a series of crimes. The black dots
represent crimes and the gangs involved, white dots represent unsolved crimes, and the
vertical bar shows that there is no knowledge of which gangs were involved. Stomakhin’s goal
was to show that a Hawkes analysis could predict which gangs were most likely involved in a
given crime by analyzing the temporal distributions of the crimes between certain gangs over
time [1].
Originally, mathematicians used a process similar to that discussed by Egesdal: use
data before the crimes to predict what crimes would likely have taken place during the times of
the white dots and use that to determine the most likely gangs involved. The problem with this
method is that it does not account for the evolution of a gang network. Therefore, Stomakhin
used the Hawkes equation
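$$\lambda_{\alpha\beta}(t) = \mu_{\alpha\beta}(t) + \theta_{\alpha\beta} \sum_{t_i^{\alpha\beta} < t} \omega_{\alpha\beta}\, e^{-\omega_{\alpha\beta}\,(t - t_i^{\alpha\beta})}$$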
where the subscript denotes a given agent pair (note that αβ could have been replaced with
any agent pair) and θ is the scaling factor here instead of the k_0 that was used before. An
exponentially decaying function is used for the clustering density because the longer that time
passes from the original crime, the less likely a gang is to retaliate [1].
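The effect of the exponential decay can be seen by evaluating the pairwise intensity at different times after a crime. This is a minimal sketch that assumes a constant background rate for the pair, and the parameter values and event time are hypothetical.

```python
import math

def pair_intensity(t, pair_event_times, mu, theta, omega):
    """Evaluate mu + theta * sum of omega * exp(-omega * (t - t_i)) over past events."""
    excitation = sum(
        omega * math.exp(-omega * (t - t_i))
        for t_i in pair_event_times
        if t_i < t
    )
    return mu + theta * excitation

# Hypothetical parameters for one pair of gangs, with a single crime at time 0.
mu, theta, omega = 0.1, 0.5, 1.0
crimes = [0.0]

soon_after = pair_intensity(0.5, crimes, mu, theta, omega)
long_after = pair_intensity(10.0, crimes, mu, theta, omega)
print(soon_after, long_after)
```

Shortly after the crime the intensity is well above the background rate, and long afterward it has decayed almost back to it. Because the kernel ω·exp(−ω·s) integrates to one over s ≥ 0, θ can be read as the expected number of follow-up events that a single event triggers directly.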
The largest problem with using this form of Hawkes Process is that the parameters for
this equation usually are unknown (due to missing data), making it hard to establish an accurate
density rate. One way to circumvent this issue is to “use the complete events to estimate
parameters, use these parameters to estimate participants in unknown events, and then use
these estimates to re-estimate parameters” [1]. The problem with this method is that there is a
great deal of estimation involved, which tends to skew the actual probabilities. As a solution,
Stomakhin and his team focused less on the probabilities of each gang being involved and more
on the order of likelihood of possible gangs. Law enforcement is not very interested in whether
a gang had a 90% or 93% probability of being the one who committed the crime. They just want
to know who was the most likely to have done it, which makes ranking the gangs more efficient
than finding their exact probabilities [1].
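One simple way to produce such a ranking is to evaluate each pair's intensity at the time of the unsolved crime and sort the pairs by that value. The sketch below is only an illustration of the ranking idea, not the exact procedure in [1]: the gang labels, event histories, and shared parameters are hypothetical, and a constant background rate is assumed.

```python
import math

def pair_intensity(t, pair_event_times, mu, theta, omega):
    """Evaluate mu + theta * sum of omega * exp(-omega * (t - t_i)) over past events."""
    excitation = sum(
        omega * math.exp(-omega * (t - t_i))
        for t_i in pair_event_times
        if t_i < t
    )
    return mu + theta * excitation

def rank_pairs(t_unknown, history, params):
    """Order gang pairs from most to least likely, by intensity at the unsolved crime's time."""
    scores = {
        pair: pair_intensity(t_unknown, times, *params[pair])
        for pair, times in history.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical solved-crime times for each pair of gangs.
history = {
    ("alpha", "beta"): [1.0, 9.5],
    ("alpha", "gamma"): [2.0],
    ("beta", "gamma"): [8.0],
}
# Hypothetical (mu, theta, omega) values, shared by all pairs for simplicity.
params = {pair: (0.1, 0.5, 1.0) for pair in history}

ranking = rank_pairs(10.0, history, params)
print(ranking)
```

The pair with the most recent activity before the unsolved crime ranks first. Only the order matters here, so moderate errors in the estimated parameters change the result only if they are large enough to swap two pairs.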
In the end, Stomakhin concludes that using a Hawkes Process to predict those involved
in a past crime is mathematically possible. However, he concedes that his work is still in
progress and that the aforementioned issue of obtaining the correct parameters, along with a lack of
accurate data, are both possible sources of error. Nevertheless, law enforcement is better off using
a Hawkes Process to choose who to question for a given crime because it was shown to always
be better than random guessing [1].
Extended Application
The Hawkes Process is not just limited to analyzing gang violence. Martin Short of UCLA
read Egesdal’s paper and decided to apply it to burglary in the Long Beach, CA area.
Conventional wisdom states that once someone has been burglarized, he is more likely to be
burglarized again than someone who never has been. Short decided to test this claim
mathematically by applying a Hawkes Process to burglary events to see if past burglaries
affected future ones. Based upon his analysis, he concluded that once a house (his study
looked only at houses and not at other types of residence) had been burglarized, that house and those
around it had an increased probability of being burglarized again. Therefore, it is true that past
burglaries affect the probabilities of future ones in a Hawkes Process-like manner [6].
Based on these results, it may behoove law enforcement to start analyzing all forms of
crime using point processes, starting with the Hawkes Process and Poisson Process. These
techniques would allow law enforcement to more accurately and efficiently solve past crimes
and prevent future ones.
Conclusion
As the world relies more and more on science, it is inevitable that crime will eventually
be analyzed entirely mathematically. Although it has not yet been perfected, Egesdal and
Stomakhin have shown that gang crimes, both past and present, can be predicted
mathematically, and Martin Short has shown that extensions of these processes can be applied
to burglary. So while society might not yet be at the level of exactness that Charlie Eppes was in
“Numb3rs,” it will not be long before all crimes are analyzed by methods similar to
his.
WORKS CITED
[1] Alexey Stomakhin, Martin B. Short, Andrea Bertozzi. “Reconstruction of missing data in social networks based on temporal patterns of interactions.” iopscience.iop.org. Web. 28 October 2011.
[2] Mike Egesdal, Chris Fathauer, Kym Louie, Jeremy Neuman. “Statistical and Stochastic Modeling of Gang Rivalries in Los Angeles.” www.siam.org. Web.
[3] Charles Macal and Michael North. “Introduction to Agent-based Modeling and Simulation.” Argonne National Laboratory. www.mcs.anl.gov. Web. 29 November 2006.
[4] Weisstein, Eric W. "Poisson Process." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/PoissonProcess.html
[5] Weisstein, Eric W. "Poisson Distribution." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/PoissonDistribution.html
[6] M. B. Short, M. R. D’Orsogna, P. J. Brantingham, G. E. Tita. “Measuring and Modeling Repeat and Near-Repeat Burglary Effects.” SpringerLink. http://link.springer.com. Web. 20 May 2009.
[7] Oxford Dictionary. “Repeat Victimization.” www.oxfordbibliographies.com. Web. 29 June 2011.
[8] FindLaw. “Burglary Overview.” http://criminal.findlaw.com. Web. 2013.
[9] Vocabulary.com. “heterogeneity.” https://www.vocabulary.com. Web.
[10] Andrew Bray and Frederic Schoenberg. “Assessment of Point Process Models for Earthquake Forecasting.” The ArXiv. https://arxiv.org. Web. 20 December 2013.
[11] Introduction to Probability, Statistics and Random Processes. “Basic Concepts of the Poisson Process.” Web. https://www.probabilitycourse.com.
[12] Rossiter, Katy. “What are census blocks?” US Government. http://blogs.census.gov. Web. 20 July 2011.
[13] Stewart, Ian. In Pursuit of the Unknown: 17 Equations that Changed the World. Basic Books, 9 February 2012. Print.
[14] Stover, Christopher. "Hawkes Process." From MathWorld--A Wolfram Web Resource, created by Eric W. Weisstein. http://mathworld.wolfram.com/HawkesProcess.html