[email protected] NL +31 20 522 44 66 US +1 888 263 3917 WWW.BLOOMREACH.COM WHITEPAPER The BloomReach Web Relevance Engine







The BloomReach Web Relevance Engine




The continuously learning Web Relevance Engine (WRE)

uses natural language processing and machine-learning

to take the data from your site, your customers’ behavior, your

content & products, and BloomReach’s understanding of web-

wide demand, and weave it into the most valuable solution for

your business. We’ve forged and sharpened the sword; all you

have to do is pick it up and swing.

The process of taming this data and turning it into revenue-

driving action is one that BloomReach has been perfecting

for nearly a decade, and our deep understanding of language

and user intent is available right out of the gate for our

customers. Of course, once you plug in the BloomReach WRE

it immediately gets to the task of learning the ins-and-outs of

your particular business - and then learns and learns and learns.

The more data fed into the WRE, the further our powerful

BloomReach applications can take you.

BloomReach gathers the data from your site, pairs it with

the data from relevant web-wide sources, lets the machine

get to work, and provides you with insights to optimize your

business’s digital experience. Sounds great in theory, but how

does BloomReach’s continuously-learning WRE actually do this?

Gathering Site Content:

The BloomReach WRE starts by identifying and understanding

the content and products on your site. This is done through two sources:


Customer Product Feed - This is the data that you, as a

BloomReach customer, provide us. This information includes

titles, product attributes, prices, inventory, out of stock items,

and language usage most important and unique to your

business. The majority of BloomReach customers provide an

updated product feed daily.

Today’s ever-connected world

produces a never-ending stream

of data. For companies that serve

millions of visitors a day, have

dozens of different channels and touchpoints,

and especially for commerce companies with

thousands - or millions - of products, gathering

this data can be overwhelming. And once you

have it, you have to actually do something with

it. You have to gather the right data, understand

it, and apply it in the right way to drive a more

successful online experience.

The BloomReach Web Relevance Engine turns data into value.




Site Crawl - BloomReach uses every tool in our toolkit to deeply

understand your site. We look across your site for updates,

crawling it much as a search engine would. We also

observe through a BloomReach listening pixel on your site

which gives a clear picture of how your visitors interact across

the digital experience. Once we’ve fetched all the data, we

parse it - looking at the title, heading, content, products, and

components to create a module for each piece that we fetch.

This module is what our machine-learning algorithms use

to organize, canonicalize, and build relationships within your

site. Parsing also flags pages that return a 404 and sends them to a

blacklist - making it easy to keep large, multi-page and -product

environments clean.
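As a rough illustration of the parsing step described above, the sketch below builds a small module per fetched page and sends 404s to a blacklist. It is not BloomReach's implementation: `PageModule`, `parse_pages`, and the shape of the `fetched` tuples are hypothetical names chosen for this example, and only the title and headings are extracted.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class PageModule:
    """Hypothetical per-page record built from a fetched URL."""
    url: str
    status: int
    title: str = ""
    headings: list = field(default_factory=list)


class _TitleHeadingParser(HTMLParser):
    """Collects the <title> text and any <h1>/<h2> headings."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1", "h2"):
            self._current = tag
            if tag != "title":
                self.headings.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current in ("h1", "h2"):
            self.headings[-1] += data


def parse_pages(fetched):
    """Split fetched pages into parsed modules and a blacklist of 404 URLs."""
    modules, blacklist = [], []
    for url, status, html in fetched:
        if status == 404:
            blacklist.append(url)  # flagged pages are never turned into modules
            continue
        parser = _TitleHeadingParser()
        parser.feed(html)
        modules.append(PageModule(url, status, parser.title.strip(),
                                  [h.strip() for h in parser.headings]))
    return modules, blacklist


# Made-up sample input: (url, HTTP status, HTML body)
fetched = [
    ("/mugs", 200, "<html><title>Travel Mugs</title><h1>Ceramic travel mugs</h1></html>"),
    ("/old-sale", 404, ""),
]
modules, blacklist = parse_pages(fetched)
```

A production crawler would also extract products and page components; those are left out here to keep the sketch short.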

Merge Pipeline:

The next step is to merge the Customer Product Feed and Site

Crawl data together and then clean it. As part of the cleaning

we will drop any duplicates and merge multiple versions

of content. Once we have the merged, clean data for your

environment, we pass it on for processing.
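To make the merge-and-clean idea concrete, here is a minimal sketch. It assumes records are plain dictionaries keyed by URL, that a lowercased URL without a trailing slash is a good enough canonical key, and that feed values win when the feed and the crawl disagree. All three are assumptions made for illustration, as is the `merge_records` name.

```python
def merge_records(feed, crawl):
    """Merge feed and crawl records by URL, collapsing duplicate versions."""
    merged = {}
    # Feed records go last so that, in this sketch, feed values win on conflict.
    for record in list(crawl) + list(feed):
        key = record["url"].rstrip("/").lower()  # assumed canonical key
        # Skip empty values so a blank field never overwrites real data.
        merged.setdefault(key, {}).update(
            {k: v for k, v in record.items() if v not in (None, "")}
        )
        merged[key]["url"] = key
    return list(merged.values())


# Made-up sample data: one feed row and two crawled versions of the same page.
feed = [{"url": "/Mugs/", "title": "Ceramic Travel Mug", "price": 18.0}]
crawl = [
    {"url": "/mugs", "title": "Travel Mug", "heading": "Mugs"},
    {"url": "/mugs/", "title": "", "heading": "Mugs"},
]
merged = merge_records(feed, crawl)
```

The three input records collapse into a single clean record that carries the feed's title and price alongside the crawled heading.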

Data Processing:

This is where BloomReach’s natural language processing (NLP)

and core machine-learning come into play. Our algorithms take

in the gathered data, determine the optimal logic and order,

and produce actionable insights to increase your environment’s

performance. Key ways this occurs are:

Link Generation - The WRE simulates a process akin to a search engine to determine which URLs have a strong relationship and should link to each other. For example, if a page has a generated tag of “Eco Friendly” (URL 1), our algorithms run that tag through our search. If they find that a page with the tag “Recycled” (URL 2) also performs well for “Eco Friendly”, we know we can link URLs 1 and 2 to each other. The machine-learning algorithms create these relationships for each URL, building a relationship map.

[Figure: WRE data pipeline. Stages: Gather (Feed, Site Crawl, Pull & Parse Data), Merge (Merge Data, Clean Data), Process (Link Generation, Link Optimization). Other labels: BloomReach Data Store, Quality Review, Customer Site.]
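The “Eco Friendly” example can be sketched in a few lines. The code below assumes we already have a score for how well each URL performs for each tag; the scores, the `min_score` threshold, and the `build_relationship_map` name are invented for illustration and do not reflect real WRE signals.

```python
from collections import defaultdict
from itertools import combinations


def build_relationship_map(performs_for, min_score=0.5):
    """Link URLs that both perform well for the same tag.

    performs_for maps url -> {tag: score}; scores and the threshold are
    made-up illustration values.
    """
    urls_by_tag = defaultdict(set)
    for url, scores in performs_for.items():
        for tag, score in scores.items():
            if score >= min_score:
                urls_by_tag[tag].add(url)

    relationship_map = defaultdict(set)
    for urls in urls_by_tag.values():
        # Every pair of URLs sharing a strong tag becomes a two-way link.
        for a, b in combinations(sorted(urls), 2):
            relationship_map[a].add(b)
            relationship_map[b].add(a)
    return dict(relationship_map)


performs_for = {
    "/eco-friendly": {"eco friendly": 0.9},
    "/recycled": {"recycled": 0.8, "eco friendly": 0.7},
    "/office-chairs": {"office chair": 0.9},
}
relationship_map = build_relationship_map(performs_for)
```

Here `/eco-friendly` and `/recycled` end up linked to each other, while `/office-chairs` shares no strong tag with any other page and stays out of the map.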

Link Optimization - We narrow down the

best relationships to power elements of

the page such as Related Searches and

Related Products/Content. First, we use

these relationships to link pages that are

most relevant to each other. Additionally,

we maximize traffic to your site by linking

URLs that are crawled every day by

search engines to URLs that don’t have as

much traffic - boosting them up. We also

link thematic pages to drive traffic and

improve efficiency.
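One way to picture this narrowing step is shown below. The sketch keeps the most relevant candidates per page and, on pages that already get plenty of traffic, swaps the weakest pick for a related page that gets very little. The relevance scores, the traffic numbers, the `low_traffic` cutoff, and the swap rule itself are all assumptions for illustration rather than a description of how the WRE weighs these factors.

```python
def optimize_links(candidates, traffic, max_links=2, low_traffic=100):
    """Keep the strongest related links per URL, then reserve one slot on
    high-traffic pages for a related low-traffic page.

    candidates: url -> {related_url: relevance}; traffic: url -> daily visits.
    """
    links = {}
    for url, related in candidates.items():
        # Most relevant first; ties broken alphabetically for determinism.
        ranked = sorted(related, key=lambda r: (-related[r], r))
        chosen = ranked[:max_links]
        if traffic.get(url, 0) >= low_traffic:
            starved = [r for r in ranked if traffic.get(r, 0) < low_traffic]
            if starved and chosen and starved[0] not in chosen:
                chosen[-1] = starved[0]  # swap the weakest pick for the boost
        links[url] = chosen
    return links


candidates = {
    "/eco-friendly": {"/recycled": 0.9, "/organic": 0.8, "/compost-kits": 0.6},
}
traffic = {"/eco-friendly": 5000, "/recycled": 2000, "/organic": 1500, "/compost-kits": 20}
links = optimize_links(candidates, traffic)
```

In this example the busy `/eco-friendly` page keeps its strongest link and gives its second slot to the low-traffic `/compost-kits` page.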

Continuous Self-Learning

The BloomReach data pipeline runs automatically in a continuous loop. The only manual action needed is to turn it on or off, then sit back and let it learn and learn and learn. Every day - and every night - our machine-learning algorithms get to know your business and your visitors in more depth.

The WRE’s intelligence means that it is able to adapt to context. For instance, when a user types the letter ‘s’ on a department store site, the WRE might offer “shoes” as a site search query. Type in ‘s’ on an athletic team gear site and “Seattle Seahawks” is a more relevant query.

Similarly, machine learning provides the engine with the ability to adapt to different seasons. Typing the letter ‘v’ on a florist’s site in February is likely to produce “Valentine’s Day” as a query. Type ‘v’ a few months later and the auto suggestion is more likely to be “violets.”
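The florist example can be approximated with a simple frequency count. This sketch assumes a query log of (site, month, query) tuples and ranks only the queries seen on the same site in the same month. The log format and the `suggest` function are hypothetical, and counting past queries is a stand-in for whatever learned model the WRE actually uses.

```python
from collections import Counter


def suggest(prefix, query_log, site, month, limit=1):
    """Rank past queries starting with prefix, counting only queries seen on
    this site in this month."""
    counts = Counter(
        q for s, m, q in query_log
        if s == site and m == month and q.startswith(prefix.lower())
    )
    # Most frequent first; ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [q for q, _ in ranked[:limit]]


# Made-up log entries: (site, month number, lowercased query)
query_log = [
    ("florist", 2, "valentine's day"), ("florist", 2, "valentine's day"),
    ("florist", 2, "violets"),
    ("florist", 5, "violets"), ("florist", 5, "violets"),
    ("florist", 5, "vases"),
]
february = suggest("v", query_log, "florist", 2)
may = suggest("v", query_log, "florist", 5)
```

With this sample log, the same letter produces “valentine's day” in February and “violets” in May.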



[Figure: relationship map around a node of interest. Node labels: reclaimed wood bowl, reusable shopping bags, ceramic travel mug, energy saver appliances, home composting kit, recycled glass vase, recycle bin, organic robe, organic skinny jeans, rechargeable, organic gift basket.]





Our capacity to link your content is accomplished by a deep

understanding of content and user behavior, which combine

to give a holistic view of the user experience. We reach

this understanding in a multitude of ways, with the leading

strategies including:

Synonym Database

The heart of the WRE is a deep understanding of language.

Our algorithms are continuously refining and increasing our

knowledge of language and intent, and the foundation of

that understanding is the unique way BloomReach handles synonyms.


To show why the way we use synonyms is so

critical, we need to look at how BloomReach

uses natural language processing. Outside

of BloomReach (mainly in academic settings)

when people talk about NLP they refer to

breaking down text by tagging the noun, verb,

adjective, etc. However, understanding a

search query is different, as people don’t tend

to type complex sentences in a search box.

But they do describe attributes of products

and services in many different ways and

combinations - which is where our synonym

dictionary comes into play.

Building this dictionary is not trivial. It has to

be both large and clean, meaning you have

to weed out a lot of noise caused by creative

marketing terms (e.g., “thundercloud” is not a

color). Of course, you could perform manual quality control of

your entire dictionary, but doing so efficiently and at scale is

expensive and difficult. Leaving a dictionary completely

up to computers, however, means you may miss some of the subtle

nuances or the terms unique to specific industries. To handle

this, BloomReach uses algorithms to extract likely synonym

candidates and has human filters on top of that to ensure the

quality of our library. We have years’ worth of synonyms - over

15 million pairs - that are immediately available to you when

you flip the WRE on and, through crawling, pixel data, product

feeds, and queries, BloomReach continuously develops this

dictionary with the language your customers are specifically

using. Of course, while this data is used to optimize your environment,

customer-specific site data and product data are not shared.
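The split between algorithmic candidates and human filtering might look something like the sketch below. `SynonymDictionary`, its methods, and the reviewer callback are hypothetical names for this example; the real dictionary is far larger and its extraction logic is not described here.

```python
class SynonymDictionary:
    """Toy synonym store: candidates only count once a reviewer approves them."""

    def __init__(self):
        self._canonical = {}  # approved term -> canonical term
        self.pending = []     # candidate pairs awaiting human review

    def propose(self, term, canonical):
        """Record an algorithmically extracted candidate; unused until approved."""
        self.pending.append((term.lower(), canonical.lower()))

    def review(self, approve):
        """Apply a reviewer callback (pair -> bool) to every pending pair."""
        for term, canonical in self.pending:
            if approve((term, canonical)):
                self._canonical[term] = canonical
        self.pending = []

    def canonicalize(self, term):
        """Return the approved canonical form, or the term itself if unknown."""
        return self._canonical.get(term.lower(), term.lower())


synonyms = SynonymDictionary()
synonyms.propose("crimson", "red")
synonyms.propose("thundercloud", "grey")  # marketing noise, not a real color
synonyms.review(lambda pair: pair[0] != "thundercloud")  # stand-in for a human filter
```

After review, `synonyms.canonicalize("Crimson")` resolves to `red`, while the rejected `thundercloud` is left as it is.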

Web-Wide Data

With 100 million pages and up to 10 terabytes processed daily, all of the BloomReach algorithms and applications are built on a strong foundation of data. In addition to BloomReach customer sites, the WRE also crawls a myriad of other public sites such as Wikipedia, industry blogs, and competitors’ sites to gain further understanding of language, context, and consumer intent.

For example, if we have a client in the home goods sector, we will crawl several other sites in that vertical and target customer market to understand the language used by that audience and the marketers who speak to them.

The Red Dress

For an idea of just how much there is to learn about consumer language and behavior, consider an exercise BloomReach conducted, asking participants to describe a red dress. The first 500 users who took the quiz came up with 129 ways to describe the dress’s color, 194 ways to describe its neckline and 275 ways to describe its belt, which some might say defies description. Luckily, with BloomReach’s WRE synonym database, including the 1,077 different colors it understands, our algorithms can find the relationship between these variations to ensure your users get what they’re looking for - no matter how they say it.




Interpreting Search

When a user types in a search, the WRE scans the query to

break it up into attributes and the product. If a visitor enters

“black office chair mesh back” we scan the phrase, match it to

our site data and dictionary, and determine the core product (or

content). Then we determine the surrounding attributes. In the

case here, this would look like <color><style><product><style>.

We run the product and attributes against our synonym

dictionary, compare that with your product & content feed, and

use performance data, ranking weighting, and user preference

to return the most relevant results.
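The attribute breakdown above can be sketched with a greedy longest-match lookup. The `lexicon` here is a hand-written stand-in for the combination of site data and synonym dictionary, and `interpret_query` is a hypothetical helper; the ranking step that follows tagging is not shown.

```python
def interpret_query(query, lexicon, max_words=2):
    """Greedy longest-match tagging of a query into (phrase, role) pairs."""
    words = query.lower().split()
    tagged, i = [], 0
    while i < len(words):
        # Try the longest phrase first so "mesh back" beats "mesh".
        for size in range(min(max_words, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + size])
            if phrase in lexicon:
                tagged.append((phrase, lexicon[phrase]))
                i += size
                break
        else:
            tagged.append((words[i], "unknown"))
            i += 1
    return tagged


# Hand-written illustration lexicon: phrase -> role
lexicon = {
    "black": "color",
    "office": "style",
    "chair": "product",
    "mesh back": "style",
}
tagged = interpret_query("black office chair mesh back", lexicon)
roles = [role for _, role in tagged]
```

For the sample query, `roles` comes out as color, style, product, style, matching the pattern given in the text.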

Collaborative Filtering

You can learn a great deal about user intent with aggregate

behavioral data. For example, if a site search query has a

high volume yet few conversions, it can be valuable to look

at what those shoppers did next. Did a significant number of

them navigate to a particular category or product page? Did

they refine their search query and try again? This type of

information is a key way the WRE determines complementary

content and products.

One of the techniques the WRE uses is a process called

“collaborative filtering,” which utilizes data to cluster

recommended products. Collaborative filtering clusters

visitors around shared preferences using vector mathematics

and identifies specific users who are within a natural cluster,

but who have not yet seen recommendations related to the

cluster’s shared preferences. The WRE uses NLP to interpret

the attributes (and their synonyms) for the products and

content in the cluster and identifies other pieces that should

be in the cluster based on their attributes. The system then

offers the recommendations. The visitor’s interaction with the

recommendation helps refine the statistical clustering.
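For readers who want to see the vector mathematics, here is a small sketch. Note that it swaps in a simpler technique than the clustering described above: it compares visitors pairwise with cosine similarity and recommends items liked by similar visitors. The visitor names, preference weights, and `min_similarity` threshold are invented for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse preference vectors (dicts)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def recommend(target, visitors, min_similarity=0.5):
    """Suggest items liked by similar visitors that the target has not seen."""
    seen = visitors[target]
    scores = {}
    for other, prefs in visitors.items():
        if other == target:
            continue
        sim = cosine(seen, prefs)
        if sim < min_similarity:
            continue  # not close enough to share preferences
        for item, weight in prefs.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sim * weight
    return sorted(scores, key=lambda item: (-scores[item], item))


# Made-up visitors and the products they engaged with
visitors = {
    "ana": {"recycled glass vase": 1, "reclaimed wood bowl": 1},
    "ben": {"recycled glass vase": 1, "reclaimed wood bowl": 1, "home composting kit": 1},
    "cat": {"office chair": 1, "desk lamp": 1},
}
recommendations = recommend("ana", visitors)
```

Ana shares two preferences with Ben and none with Cat, so the only recommendation is the composting kit that Ben engaged with and Ana has not yet seen.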

Behavioral modeling is also used to identify areas on a site

where demand is going unmet. For example, if product

information does not contain a commonly used synonym for

that product, yet that synonym is used by shoppers as a site

search query, the WRE can learn that new term and use it

for recommendations and results.
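A minimal sketch of this kind of term learning follows. It assumes we can observe which product a shopper clicks after a site search and treats a query as a learned term once enough shoppers make the same jump. The `min_clicks` threshold, the data shapes, and the `learn_terms` name are assumptions for illustration.

```python
from collections import Counter, defaultdict


def learn_terms(search_clicks, product_terms, min_clicks=3):
    """Attach a query term to a product when shoppers repeatedly search that
    term and then click the product, and the term is not already listed."""
    clicks = defaultdict(Counter)
    for query, product in search_clicks:
        clicks[product][query.lower()] += 1

    learned = {}
    for product, counts in clicks.items():
        known = {t.lower() for t in product_terms.get(product, ())}
        new = sorted(q for q, n in counts.items() if n >= min_clicks and q not in known)
        if new:
            learned[product] = new
    return learned


# Made-up behavior: (site search query, product clicked afterwards)
search_clicks = (
    [("sofa", "couch-123")] * 3
    + [("couch", "couch-123")] * 5
    + [("settee", "couch-123")]
)
product_terms = {"couch-123": ["couch", "loveseat"]}
learned = learn_terms(search_clicks, product_terms)
```

Here “sofa” is learned for the product because shoppers used it repeatedly, “couch” is skipped because it is already in the product data, and “settee” has too little evidence so far.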


Every individual visitor to your digital experience has their

own personality, likes, dislikes, needs, and language style - and

they expect you to deliver recommendations relevant to them,

whether they are just starting their research or returning to a

site on a different device altogether.

Luckily, a visitor’s engagement across your digital touch points

reveals a great deal about their evolving intent. They signal

their intent by typing queries, sure, but also through the email

links they click, particular pages they open on particular devices,

shopping cart additions, and their click-through behavior.

Given a small amount of engagement - say, three page views -

the WRE can begin to learn the affinities of an individual user.

And not only is each person different, their circumstances and

needs change minute-by-minute. A user visiting an insurance

site when there is bad weather in their area may like to see a

fact sheet of weather damage covered by home insurance; the

same user visiting the same site on their mobile phone abroad

might be more interested in a direct link to file a traveler’s

insurance claim.



[Figure: series of clicks by Users 1, 2, and 3 for the same keyword query, across category pages and product pages P1, P2, and P3. Legend: category page, product page, page associated with concept tag T. Product page P1 was already tagged with a concept. We propagate the same tag to category page C1, since it occurs with P1 often and within a few clicks.]
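The tag propagation shown in the figure can be sketched as a co-occurrence count over click sessions. The window size (`max_gap`), the support threshold (`min_sessions`), and the session data below are assumptions for illustration; the figure itself only says that the pages occur together often and within a few clicks.

```python
from collections import Counter


def propagate_tags(sessions, page_tags, max_gap=2, min_sessions=3):
    """Copy a concept tag from a tagged page to pages that appear within
    max_gap clicks of it in at least min_sessions sessions."""
    support = Counter()
    for clicks in sessions:
        pairs = set()  # count each (page, tag) pair once per session
        for i, page in enumerate(clicks):
            for tag in page_tags.get(page, ()):
                for other in clicks[max(0, i - max_gap): i + max_gap + 1]:
                    if other != page:
                        pairs.add((other, tag))
        support.update(pairs)

    proposed = {}
    for (page, tag), n in support.items():
        if n >= min_sessions and tag not in page_tags.get(page, ()):
            proposed.setdefault(page, set()).add(tag)
    return proposed


# Made-up click paths for three users issuing the same keyword query
sessions = [
    ["C1", "P1", "P2"],
    ["C1", "P3", "P1"],
    ["P2", "C1", "P1"],
]
page_tags = {"P1": {"T"}}
proposed = propagate_tags(sessions, page_tags)
```

Category page C1 appears near P1 in all three sessions, so it inherits tag T; P2 and P3 appear near P1 less often and are left untagged.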












This combination of circumstantial data, aggregate and

individual user behavior, and the WRE’s understanding of

product and content relationships, can provide a contextually

relevant experience to each individual - whether they are

known or anonymous.

BloomReach’s understanding of language and search intent,

combined with our algorithmic knowledge of user behavior

and performance data, is how the WRE data pipeline returns

contextually relevant related searches, products, and content to

your visitors.

Our Applications

The Web Relevance Engine is the beating heart behind the

data-driven personalization, relevance, and ranking within all

BloomReach applications. Each application weaves together

the data-based intelligence from our algorithms to deliver the

insights and automated actions that are the most impactful to

your business.

BloomReach Experience leverages the WRE to bring together

a deep understanding of an organization’s content with a

contextual understanding of an individual’s intent. Provide

relevant experiences to your visitors - whether they are logged-

in or anonymous - across every device, channel, and touchpoint

they interact with. BloomReach Experience’s separation of that

content from its presentation means you can go beyond simply

offering visitors the relevant pages and personalize each

individual component for a truly agile experience.

[Figure: 2 individuals, same demographic: Female, 28, College grad, Mountain View, CA, Income >$80k]

Cross-Channel Data

Your audience uses multiple devices, and they expect their experience to be consistent across those devices. In an ideal world, your visitors would be logged in across devices, but we don’t live in that world (in fact, our research found that only around 1% of eCommerce customers are logged in across desktop, tablet, and smartphone).

To overcome this challenge, the WRE uses pattern detection to connect anonymous users across multiple devices using a number of behavioral and technical signals.

This cross-device connection can be useful for subtly personalizing the experience of a shopper and for proving the “mobile influence” for visitors who browse on the smartphone, yet convert on the desktop.




[Figure: BloomReach Digital Experience Marketecture. BloomReach Personalization: Search, Merchandising, Testing & Targeting, Insights, Open Integrations. BloomReach Experience: Content Management, Forms, Languages, Trends, Relevance, Experiments, Open Integrations. BloomReach AI: Algorithmic Optimization (SEO, Channel, Device), Predictive Analytics (Behavioral, Transactional), Machine Learning (Relevance, Testing, Performance), Natural Language Processing (Word Sense Disambiguation, Semantic Interpretation). Also shown: BloomReach Open APIs, BloomReach Marketplace, Thematic Pages. Data sources: Aggregated, Search, Owned, Purchased, Feeds, UGC, Social, Transactional.]


Using the WRE, BloomReach Personalization optimizes and

personalizes product discovery for every visitor. A shopper’s

path to purchase may include site search, navigating to category

pages, and engaging with product recommendations, and that

journey may take place across multiple devices. BloomReach

Personalization learns from every interaction with a shopper to

better tailor the product mix, ranking, and recommendations

to suit her expressed affinities. The result is an improved,

frictionless customer experience that measurably impacts


BloomReach Organic uses the WRE to increase crawlability,

improve site content, and cluster products on high-quality

thematic pages. Each of these helps our customers capture and

convert consumer demand from search engines. The WRE also

helps identify additional opportunities, facilitates creating new

category pages (which can also be used for other marketing

channels, like email, paid search or social), and continuously

monitors the health of those pages - both from a technical

and consumer perspective. Achieving quality coverage for

organic search at scale is a challenge that necessitates having

technology like the WRE.

Experience is the battleground for today’s digital businesses.

The bar for this experience continues to rise: not only does your

audience expect every interaction to be as intuitive as flipping

a switch, they also want it all tailored to them in the moment -

getting exactly what they want right now.

The Web Relevance Engine gives you the power to do just

that, harnessing big data, machine learning, and

natural language processing to deliver the relevant results

consumers have come to expect in the era of always-on search

and discovery. The WRE is driving business success, while

delivering the promise of a relevant and reliable web.




About BloomReach

BloomReach is a Silicon Valley firm that brings businesses the first open and intelligent Digital Experience Platform (DXP). BloomReach drives customer experience to accelerate the path to conversion, increase revenue, and generate customer loyalty.

With applications for content management, site search, page management, SEO, and role-based analytics, BloomReach is a central location for all players who manage customer experience to come together and intelligently drive business outcomes. BloomReach’s Web Relevance Engine (WRE) algorithmically understands content and users, matching demand and intent data from across the web. BloomReach’s industry-leading tools unlock the powerful creativity of humans to improve omnichannel customer experiences at scale. Together, our users and our intelligent tools generate millions of dollars of proven incremental sales.

BloomReach’s portfolio of customers includes Neiman Marcus, Staples, REI, Mailchimp, and Autodesk. Created in 2009, BloomReach is headquartered in Mountain View, CA, with offices worldwide, and is backed by investment firms Bain Capital Ventures, Battery Ventures, NEA, Salesforce Ventures, and Lightspeed Ventures.

[email protected]


+1 888 263 3917


+31 20 522 4466


+44 20 35 14 99 60


+1 877 414 4776


+91 80 42 278 526