7/24/2019 Interana-Understanding Event in Event Data
Understanding Event in
Event Data
eBook
Table of Contents
What is Event Data?
Breaking Down Event Data
What Makes Event Data Different?
Where Does Event Data Come From?
Analysis Perfect for Event Data
Challenges of Event Data
Summary
What is Event Data?
By definition, event data is data from any identifiable occurrence that has
significance for system hardware or software. User-generated events include
keystrokes and mouse clicks, among a wide variety of other possibilities. An event
describes an action performed by or associated with an entity at a certain time.
Event data is a continuous stream of actions that reveals the patterns that
people, products, and machines make over time. It helps describe when and how
things happen. Event data is the foundation for behavioral analytics, enabling an
understanding of how customers behave and how products are used.
Event data is simply any data point that has a timestamp, entity, and attributes
of an action. As simple as that sounds, events are at the heart of many
businesses. Clickstreams, logs, data from IoT devices, sensor data, and more
are all event data. A mouse click is an event; it happens at a point in time and
its context includes attributes such as where the entity clicked and what was
clicked.
Analysis of event data is based on key concepts about chronologically ordered
data and its relationship to the world. For example, event data is generated by
an entity that follows a path through a conversion flow, taking action at certain
points along the way. If we examine the events of all entities that went through
the conversion flow, we can understand their behavior and start to answer
questions such as:
What are the characteristics of entities that converted or dropped off?
Why did some entities take longer to convert than others?
What happened between each step of the conversion flow?
Breaking Down Event Data
So what does event data look like? Each piece of event data has three key pieces
of information: a timestamp, one or more entities, and attributes.
Timestamp: Just like it sounds, it records at what point in time the action took
place.
Entity: Who took the action. This could be a person, machine, sensor, etc.
Attributes: These are inherent characteristics that describe what happened,
like a click or a call. The more properties and information captured here, the
richer the data.
Here is a simple example of an event captured on a website in JSON:
{"timestamp": "2015-06-30T13:50:00-06:00", "id": "05632",
 "attributes": {"type": "click", "page": "request_demo",
  "previous_page": "product_tour", "session_length": 1060,
  "browser": "chrome", "ip_address": "10.0.0.1",
  "ip_region": "united states", "ip_state": "california",
  "ip_city": "san francisco"}}
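As a minimal sketch of how such an event might be read into a program, the Python below parses a shortened version of the event into its three parts. It assumes the event is serialized as strict JSON (quoted keys and strings) with an ISO 8601 timestamp, and it uses June 30 so that the date is a valid calendar date. The `Event` class and `parse_event` helper are hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict


@dataclass(frozen=True)
class Event:
    """One event: when it happened, who acted, and what happened."""
    timestamp: datetime
    entity_id: str
    attributes: Dict[str, Any] = field(default_factory=dict)


def parse_event(raw: str) -> Event:
    """Parse a JSON string into an Event (hypothetical helper)."""
    record = json.loads(raw)
    return Event(
        timestamp=datetime.fromisoformat(record["timestamp"]),
        entity_id=record["id"],  # kept as a string so the leading zero survives
        attributes=record.get("attributes", {}),
    )


raw = (
    '{"timestamp": "2015-06-30T13:50:00-06:00", "id": "05632", '
    '"attributes": {"type": "click", "page": "request_demo", '
    '"previous_page": "product_tour", "session_length": 1060}}'
)
event = parse_event(raw)
print(event.timestamp.isoformat(), event.entity_id, event.attributes["type"])
```

Keeping the attributes in a free-form dictionary, rather than as fixed fields, lets each event carry as much or as little context as was captured.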
Let's take this one step further and explore a conversion flow for an e-commerce
site. Let's look at some high-level events in the flow:
Event #1: Shopper D (the entity) follows a link from your advertisement on
a third-party website
Event #2: Views a suggested item on your site using the quick-view feature
Event #3: Views your sizing guide
Event #4: Selects the sweater shown in the advertisement
Event #5: Selects size large
Event #6: Checks out with a credit card
Each of these events can be represented by a differently shaped marker on a
timeline.
Each event above has several important attributes. Some attributes of Event #1
above are:
The timestamp: exactly when the shopper clicked through to the site (when)
The entity (Shopper D)
The session ID (this is context, or the how: the event happened within a
defined session)
The advertisement location (more about how the event happened)
The item pictured in the ad (another attribute that provides context)
Attributes of Event #2 (views a suggested item) include:
The timestamp: exactly when the shopper viewed the suggested item
(when)
The entity (again, Shopper D)
The session ID (how)
The item viewed (context)
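The six events in Shopper D's flow can be laid out on a timeline in a few lines of Python. This is only a sketch: the start time, the second offsets, the session ID, and the event type names are all invented for illustration. It shows that every event shares a timestamp, entity, and session ID while carrying its own context attributes, and that the time between steps falls out of the ordering.

```python
from datetime import datetime, timedelta

start = datetime(2015, 6, 30, 13, 50, 0)  # hypothetical start of the session

# (seconds after start, event type, event-specific context) -- all invented
flow = [
    (0, "ad_click", {"ad_location": "third_party_site", "ad_item": "sweater"}),
    (40, "quick_view", {"item": "suggested_item"}),
    (95, "view_sizing_guide", {}),
    (150, "select_item", {"item": "sweater"}),
    (170, "select_size", {"size": "large"}),
    (300, "checkout", {"payment_method": "credit_card"}),
]

events = [
    {
        "timestamp": start + timedelta(seconds=offset),  # when
        "entity": "shopper_d",                           # who
        "session_id": "session-001",                     # how
        "type": event_type,
        **context,                                       # further context
    }
    for offset, event_type, context in flow
]

# Order by time, then measure the gap leading into each step.
events.sort(key=lambda e: e["timestamp"])
gaps = [
    (later["type"], (later["timestamp"] - earlier["timestamp"]).total_seconds())
    for earlier, later in zip(events, events[1:])
]
for step, seconds in gaps:
    print(f"{seconds:>6.0f}s before {step}")
```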
What Makes Event Data Different?
Event Data is Attribute-Rich
Event data can have hundreds of attributes that describe each event. Because
we use event data to discover behavior patterns, we want to have the full context
for every event. Every attribute we store is context we can analyze; this makes
event data rich. For Shopper D in the example above, we can store attributes
like first and last names, birth date, gender, favorite color, home town, and
preferred payment method. Then we could define a cohort of shoppers who
are over 50 and whose home town is New York, and follow their behavior over
time. Another reason events can have hundreds of properties is that they may
describe not just one entity, but multiple entities involved in a single event. The
attributes of each entity become part of the event data. For every transaction on
an e-commerce site there may be a supplier, a vendor, a shopper, and a third-party
payer (credit card company, PayPal), any of whom may participate in a given
event during the transaction.
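The cohort described above (shoppers over 50 whose home town is New York) can be sketched as a simple filter over events that carry the shopper's attributes. The shoppers, dates, and field names below are made up for illustration, and `age_on` and `in_cohort` are hypothetical helpers.

```python
from datetime import date


def age_on(birth_date: date, on_date: date) -> int:
    """Age in whole years on a given date."""
    years = on_date.year - birth_date.year
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday has not happened yet this year
    return years


# Each event repeats the shopper's attributes (invented sample data).
events = [
    {"entity": "shopper_a", "date": date(2015, 6, 30), "type": "checkout",
     "birth_date": date(1960, 1, 15), "home_town": "New York"},
    {"entity": "shopper_b", "date": date(2015, 6, 30), "type": "view_item",
     "birth_date": date(1990, 3, 2), "home_town": "New York"},
    {"entity": "shopper_c", "date": date(2015, 7, 1), "type": "checkout",
     "birth_date": date(1955, 9, 9), "home_town": "Chicago"},
    {"entity": "shopper_a", "date": date(2015, 7, 2), "type": "view_item",
     "birth_date": date(1960, 1, 15), "home_town": "New York"},
]


def in_cohort(event: dict) -> bool:
    """Over 50 at the time of the event, with New York as home town."""
    return (event["home_town"] == "New York"
            and age_on(event["birth_date"], event["date"]) > 50)


cohort_events = [e for e in events if in_cohort(e)]
cohort_members = sorted({e["entity"] for e in cohort_events})
print(cohort_members, len(cohort_events))
```

Because each event carries the attributes it needs, the cohort can be defined after the fact without looking anything up in a separate table.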
Event Data is Massive
For most companies, event data is their fastest-growing type of data. But why is it so big?
Event data captures the actions that an entity takes over time, so for every one
entity, you could have tens of thousands of actions. Imagine a popular wearables
company with hundreds of thousands of devices in the market. Each wearable
device could generate thousands of rows of event data daily, quickly adding up to
billions of events in just a short period of time.
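A quick back-of-the-envelope calculation shows how fast this adds up. The two input figures below are assumptions picked from within the ranges the text gives ("hundreds of thousands of devices", "thousands of rows ... daily"); they are not measurements.

```python
# Assumed figures, chosen only to fall inside the ranges named in the text.
devices = 300_000                   # "hundreds of thousands of devices"
events_per_device_per_day = 2_000   # "thousands of rows of event data daily"

events_per_day = devices * events_per_device_per_day

# Ceiling division: whole days needed to pass one billion events.
days_to_one_billion = -(-1_000_000_000 // events_per_day)

events_per_30_days = events_per_day * 30

print(f"{events_per_day:,} events per day")
print(f"{days_to_one_billion} days to exceed one billion events")
print(f"{events_per_30_days:,} events in 30 days")
```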
Event Data is Denormalized
In an event data store, data is structured but never normalized. This is unlike
a relational database, in which redundant data is normalized and referenced
from a single location in a single table. There, every time a value changes, the previous
value is overwritten and only the last update is available. But when we analyze
event data, we want to know the state of the world at the moment of the event.
For example, imagine storing data from an anemometer, which measures
windspeed. The meter takes a reading every 30 seconds, and the windspeed
value is automatically updated in the weather database. In this case, we will
always know how fast the wind was blowing in the last 30 seconds, but we will
never know how the windspeed has changed over the last hour. This is why,
in an event data store, data is always appended and never updated. Every
windspeed event is stored permanently. For a weather station that measures
not just windspeed but also temperature, humidity, barometric pressure, and
precipitation, every attribute is stored for every sensor reading. Only when event
data is denormalized can we use it to find patterns and gain insight into change
over time.
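The difference between updating in place and appending can be sketched with the anemometer example. The readings, the sensor name, and the start time below are invented; the point is only that the overwritten record can answer "how fast is the wind now?" while the appended log can also answer "how has it changed?".

```python
from datetime import datetime, timedelta

# Hypothetical windspeed readings, taken 30 seconds apart.
readings = [12.0, 15.5, 9.0, 20.5]
start = datetime(2015, 6, 30, 12, 0, 0)

current_state = {}  # update-in-place: one row per sensor, overwritten each time
event_log = []      # append-only: one row per reading, never modified

for i, speed in enumerate(readings):
    ts = start + timedelta(seconds=30 * i)
    current_state["anemometer_1"] = {"timestamp": ts, "windspeed": speed}
    event_log.append({"timestamp": ts, "entity": "anemometer_1", "windspeed": speed})

# The overwritten record only knows the latest value.
latest = current_state["anemometer_1"]["windspeed"]

# The appended log still holds every reading, so change over time is recoverable.
speeds = [e["windspeed"] for e in event_log]
change = speeds[-1] - speeds[0]
peak = max(speeds)

print(latest, change, peak, len(event_log))
```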
Event Data can be Schemaless
As mentioned earlier, different types of events, and even individual events of
the same type, may have different numbers of attributes. In other words, the
data does not necessarily follow a particular schema. Since event data may be
schemaless or adhere loosely to a schema, storing event data does not require
a declared schema and accepts any number of attributes per event. A time
attribute and an entity attribute are required for each event; any other attributes
can be arbitrary. For example, while a group is running, their activity trackers
could record four attributes: distance, stride length, heart rate, and speed. But
when they start to walk, their activity trackers may only capture two attributes:
heart rate and stride length.
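A loose schema of this kind can be sketched as a check that insists on only the two required fields and accepts whatever else arrives. The field names, sample values, and the `is_valid_event` helper are assumptions made for illustration.

```python
REQUIRED = ("timestamp", "entity")  # the only fields every event must carry


def is_valid_event(event: dict) -> bool:
    """Accept any event that has a time and an entity, whatever else it holds."""
    return all(event.get(key) is not None for key in REQUIRED)


# Invented sample events from an activity tracker.
running = {"timestamp": "2015-06-30T07:00:00", "entity": "tracker_17",
           "distance": 1200, "stride_length": 1.1, "heart_rate": 150, "speed": 3.2}
walking = {"timestamp": "2015-06-30T07:30:00", "entity": "tracker_17",
           "heart_rate": 95, "stride_length": 0.7}
missing_entity = {"timestamp": "2015-06-30T07:31:00", "heart_rate": 90}

store = [e for e in (running, walking, missing_entity) if is_valid_event(e)]

# Number of optional attributes on each stored event.
attribute_counts = [len(e) - len(REQUIRED) for e in store]
print(attribute_counts)
```

The running and walking events are both stored even though they carry different attributes; only the event with no entity is rejected.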
Event Data is Connected by Time
Event data has a native concept of time and illustrates the connections between
related events in a specified time period. This makes it easy to combine multiple
data streams, because they all have time in common. For example, three
separate data streams from mobile logs, web logs, and purchase history have
time as a common reference and can thus be merged into a single source for
even richer insights.
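Merging streams on their shared time axis can be sketched with the standard library's `heapq.merge`, which interleaves inputs that are each already sorted. The three small streams below are invented sample data, and the event type names are placeholders.

```python
import heapq
from datetime import datetime


def at(hour: int, minute: int) -> datetime:
    """Timestamp on a fixed, arbitrary day (helper for the sample data)."""
    return datetime(2015, 6, 30, hour, minute)


# Three separate streams, each already in time order (invented data).
mobile_logs = [
    {"timestamp": at(9, 0), "source": "mobile", "type": "app_open"},
    {"timestamp": at(9, 20), "source": "mobile", "type": "push_tap"},
]
web_logs = [
    {"timestamp": at(9, 5), "source": "web", "type": "page_view"},
    {"timestamp": at(9, 15), "source": "web", "type": "add_to_cart"},
]
purchase_history = [
    {"timestamp": at(9, 18), "source": "purchases", "type": "purchase"},
]

# Time is the one key all streams share, so it is the merge key.
merged = list(heapq.merge(mobile_logs, web_logs, purchase_history,
                          key=lambda e: e["timestamp"]))

for e in merged:
    print(e["timestamp"].time(), e["source"], e["type"])
```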
Where Does Event Data Come From?
Event data is everywhere and produced in just about every company today.
Remember, it is produced from the actions and interactions people or machines
have with applications and products such as:
Websites
Servers
Sensors
Automobiles
Home/Building Automation
Wearables
Smart Appliances
Connected Electronics
Call Detail Records
Engineers and developers can capture just about any action or interaction
that is made by an application, product, or machine. These events are stored in files such as
clickstreams and logs.
Analysis Perfect for Event Data
Root Cause: Examines what precipitates an event and is often used to solve
problems or identify catalysts. Focuses on why an event happened.
A/B Testing: A form of hypothesis testing with two variants to show how
they are similar or how they differ. Experiment results frequently inform
product direction.
Growth: Uncovers what entities are communicating and how they are interacting
with products and services, so that businesses can use this information to
develop ways to foster growth of the business.
Retention: Reveals how often something is used and how often the entity
returns over time. Often, this is explored by tracking a rate across different
entity groups.
Conversion: Tracks how one or more entities move through a predetermined path
and locates where along the path the entity takes an action. Typical tools
used in this process are funnels.
Engagement: A method for looking at how much an entity is using a product
or service. Typical metrics are average session length and daily, weekly, or
monthly active use.
Churn: Commonly known as attrition, turnover, or defection, churn is the
measurement of the likelihood of an entity disengaging. In addition to this
probability, churn analysis looks for the exact point where (in the usage flow) and when (in
time) disengagement happens.
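As one concrete illustration of these analyses, the sketch below counts how many entities reach each step of a conversion funnel. It assumes a strict funnel in which steps only count when they happen in order; the step names, the sample events, and the `funnel_counts` function are all hypothetical.

```python
from collections import defaultdict

# Ordered funnel steps (hypothetical event types).
FUNNEL = ["view_item", "add_to_cart", "checkout"]

# (timestamp in seconds, entity, event type) -- invented sample data
events = [
    (1, "a", "view_item"),
    (2, "b", "view_item"),
    (3, "a", "add_to_cart"),
    (4, "c", "add_to_cart"),  # out of order: "c" has not viewed an item yet
    (5, "a", "checkout"),
    (6, "b", "add_to_cart"),
    (7, "c", "view_item"),
]


def funnel_counts(events, steps):
    """Return how many entities completed each step, in order."""
    progress = defaultdict(int)  # entity -> number of steps completed so far
    for _, entity, event_type in sorted(events):  # process in time order
        done = progress[entity]
        if done < len(steps) and event_type == steps[done]:
            progress[entity] = done + 1
    return [sum(1 for d in progress.values() if d > i) for i in range(len(steps))]


counts = funnel_counts(events, FUNNEL)
for step, count in zip(FUNNEL, counts):
    print(f"{step}: {count}")
```

The drop between consecutive counts shows where entities leave the flow, which is the kind of question conversion analysis is meant to answer.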
Challenges of Event Data
Most companies struggle with event data because they are using technologies
meant for relational data. A traditional RDBMS (relational database management
system) relies on indexes to make point lookups fast, always trying to
minimize the number of rows that need to be scanned. This works well when
an index matches the workload, but for the most part, scanning indexes is slow.
This is especially apparent when we consider the massive volumes of event data
that need to be analyzed. Query times can range from a few hours to
days, depending on the complexity of the query and the length of time being scanned.
Remember, with event data, time is a first-order principle. You need to be able to
scan all rows within a specific time period. A solution built for event data should
assume massive scanning workloads and make those queries efficient.
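One common way to make a time-range scan cheap, sketched here as an assumption rather than a description of any particular product, is to keep events ordered by time so that every row in a period sits in one contiguous run. The boundaries are found with a binary search and the rows between them are read sequentially. The data and the `scan_range` helper are invented for illustration.

```python
from bisect import bisect_left

# Events stored in time order, one list per column (invented sample data).
timestamps = [100, 130, 160, 190, 220, 250]   # seconds since some epoch
windspeeds = [12.0, 15.5, 9.0, 20.5, 18.0, 11.0]


def scan_range(start: int, end: int) -> list:
    """Return the values of all events with start <= timestamp < end."""
    lo = bisect_left(timestamps, start)  # first row at or after start
    hi = bisect_left(timestamps, end)    # first row at or after end
    return windspeeds[lo:hi]             # one contiguous, sequential read


window = scan_range(130, 250)
average = sum(window) / len(window)
print(window, average)
```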
Additionally, an RDBMS is usually queried with SQL or another query language
designed for relational data. Again, these query languages are great for point
lookups, but struggle with questions about events over periods of time.
Such questions almost always require multiple scans and computations that make the queries
slow and inefficient, not to mention complex to write.
When performing analytics on event data, the query language should
have primitives that turn many-step processes into a single pass to allow
for maximum efficiency. Using an RDBMS to analyze event data brings two
predominant challenges to the business. The first has to do with scale. Event
data is massive in scale, and traditional relational databases do not store and
analyze this data efficiently. There should be no disincentive to log as many
events as possible; instead, businesses sample from the event data, potentially
losing valuable attributes, and then wait hours or days for results. The second
challenge is that the complexities inherent in query languages often prevent
many business teams, like Product or Marketing, from accessing data to generate
needed insights. Rather, business teams rely on data teams to query event data,
often receiving incomplete answers because this process is not iterative; it is one
question at a time.
Using an RDBMS to store and analyze event data is a little like using a screwdriver
to pound in a nail. You can get it done, but it isn't the best idea.
Summary
Investments in big data technologies are expected to top 60% in 2014. The
question is not whether big data is here; it's how big this data will get. Much
of this data is event data, growing by the millions daily and overwhelming
businesses.
Interana is a purpose-built solution for event data at scale. The full-stack
configuration consists of a highly scalable backend combined with a
visual and interactive frontend to deliver comprehensive analytics on event data.
Consequently, Interana scales to trillions of events, while keeping query times to
just seconds.
Questions about conversion, retention, root cause analysis, and more across
endless dimensions are a few short clicks away with behavior-based tools such
as cohorts, funnels, and sessions. With event data at the core of the solution,
Interana provides behavioral analytics to help companies unlock the insights
they need to create new opportunities to grow their customer base, deepen
engagement, and maximize retention in their products and services. Redefining
self-service, Interana has done the hard work by eliminating the need to
generate long and complicated queries that take hours to write and even longer
to run. We aim to make data part of everyone's day.
68 Willow Road
Menlo Park, CA 94025
www.interana.com