Web Scrapers and Your Property Portal: High Risk Lessons
Speaker
Rami Essaid, CEO, Distil Networks
Awards and Analyst Recognition
“Distil’s ability to analyze behavior provides the best chance of detecting
and blocking bot-driven attacks.”
5 Stars across the board.
“Verdict: For monitoring the impact of bots on a network this is the tool one needs.”
The only anti-bot solution to be included in Gartner’s Online Fraud
Detection Market Guide
Ovum puts Distil Networks On The Radar. “Clear innovation compared to
similar services.”
Fortune 500 & Alexa Global 10,000 Customers
Ecommerce
Travel
Publishers
Directories
Traditional Media
Marketplace
Services
Distil Protects Over 50 RE Portals Globally!
Protecting Your Data
Enhancing Your Data
Cleaning-Up Your Data
Protecting Your Data
A Brief Intro to Bots and Web Scraping
What Is Web Scraping?
Web Scraping: Also known as screen scraping, web scraping is the act of copying large amounts of data from a website, either manually or with an automated program.
Legitimate Scraping: Scraping can sometimes be benevolent and totally acceptable, for example, the search engine bots that index your website.
Malicious Scraping: A systematic theft of intellectual property accessible on a website, including pricing, content, images, and proprietary data.
Who is behind Web Scraping?
Competitors: Content Theft, Competitive Intel, Price Scraping
Aggregators: Start-ups, Unauthorized Middlemen
Hackers: Content for Fake Pages
Search Engines: Google, Bing, Yahoo, Baidu
Bad Bots Cause the Majority of Website Problems
In 2015 the most targeted verticals were digital publishing and real estate. Real Estate sites saw a 300% increase in
bad bot traffic!
Traffic by Type of Site, 2014 vs 2015
Bad Web Scraping
Web scraping is the act of taking content from a website with the intent of using it for purposes outside the direct control of the site owner.
It can be used to:
○ Steal intellectual property
○ Gain competitive advantage
○ Create aggregation or meta-sites
○ Perform market research
○ Damage SEO rankings
Alexa – monitor traffic levels
SE Ranking – track search rankings
InfiniGraph – watch social media trends
Open Site Explorer – monitor backlinks
SpyFu – view advertising keywords
Moat – find where ads are running
iSpionage – organic search keywords
Compete PRO – get demographic info
Quantcast – view audience insights
SpyOnWeb – see behind the curtain
What is Contributing to the Growth in Web Scraping?
Cheap scraping software
Inexpensive cloud computing resources
Botnet-as-a-Service
Freelancer.com Rates
Scraping three real estate sites
Data Manipulation (de-duping, etc.)
Importing into new software
Average Cost - $130 USD
The Going Rate for Scraping Less than $130/day
Posting Stolen Data is Quick and Easy due to Turnkey Platforms
Real Estate Portal Platforms start at $299
The Cost of Replicating your Website
Scraped Data: $130
Classified Ad Website: $299
Total: $429
Bottom Line: Scrapers scrape because they are making money with your listings!
And the Real Estate industry is left with...
Higher Costs
Lost Revenues
Why Bots / Scraping is a Problem in Real Estate
Case Study
Enhancing Your Data
Delivering a Clear Picture of Your Web Traffic
Low Resolution Fingerprint: “Unactionable”
Hi-Def Fingerprint: “Actionable”
Hi-Def Fingerprinting Eliminates Blind Defense
IP Address
Header & User Agent Information
Cookie
Browser
200+ attributes of data: Navigator, WebGL, Plugins, Audio, Video, etc.
Tamper-proofing layer
Hi-Def Fingerprint
The Majority of Bad Bots Now Use Multiple IP Addresses
Bots that dynamically rotate IP addresses or distribute attacks are significantly harder to detect and mitigate.
Sticky Bot Tracking With No Impact On Real Users
Device Fingerprinting
Fingerprints stick to the bot even if it attempts to reconnect from random IP addresses or hide behind an anonymous proxy or peer-to-peer network
Tracks distributed attacks that would normally fly under the radar
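To illustrate why a fingerprint can track a bot that rotates IP addresses, here is a minimal Python sketch. It is not Distil's implementation: the attribute names, the hashing scheme, and the sample traffic are all hypothetical, chosen only to show that counting requests per fingerprint exposes a distributed attack that counting per IP misses.

```python
import hashlib
from collections import defaultdict


def fingerprint(attrs: dict) -> str:
    """Hash device attributes (deliberately excluding the IP) into a stable ID.

    Assumption: a real product would combine far more signals; this toy
    version just hashes a sorted key=value string.
    """
    canonical = "|".join(f"{key}={attrs[key]}" for key in sorted(attrs))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def count_by(requests: list, key) -> dict:
    """Count requests grouped by the value returned from key(request)."""
    counts = defaultdict(int)
    for request in requests:
        counts[key(request)] += 1
    return dict(counts)


# Hypothetical devices: one bot and one human visitor.
bot_device = {"user_agent": "Mozilla/5.0 (X11; Linux x86_64)",
              "webgl_renderer": "SwiftShader", "plugins": "", "audio": "none"}
human_device = {"user_agent": "Mozilla/5.0 (Windows NT 10.0)",
                "webgl_renderer": "Intel HD 4000", "plugins": "pdf", "audio": "ok"}

# The bot sends 50 requests, each from a different (documentation-range) IP.
requests = [{"ip": f"203.0.113.{i}", "device": bot_device} for i in range(1, 51)]
requests.append({"ip": "198.51.100.7", "device": human_device})

per_ip = count_by(requests, lambda r: r["ip"])
per_fingerprint = count_by(requests, lambda r: fingerprint(r["device"]))

# Per-IP counting sees at most one request per address; per-fingerprint
# counting attributes all 50 bot requests to a single device.
print(max(per_ip.values()), max(per_fingerprint.values()))
```

In this toy data a per-IP rate limit never triggers, while the per-fingerprint count makes the bot stand out, and the human visitor on a different device is unaffected.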
Without Distil
With Distil
Without Impacting Users Sharing the Same IP
Avoids blocking residential users or organizations that might share the same NAT as the bot or botnet
Case Study
Cleaning-Up Your Data
How Web Scrapers Impact KPIs
Web scraping hurts your KPIs...
○ Slowdowns, downtime, and poor user experiences
○ Increase in costs (infrastructure and people)
○ Distortion of web analytics
○ Digital ad fraud, reputation and trust (bad leads)
Majority of Bots are Advanced Persistent Bots (APBs)
APBs have one or more of the following abilities:
Advanced
○ Mimic human behavior
○ Load JavaScript
○ Load external resources
○ Support cookies
○ Browser automation (Selenium, PhantomJS)
Persistent
○ Dynamic IP rotation
○ Distribute attacks across IP addresses
○ Hide behind anonymous and peer-to-peer proxies
Source: 2016 Distil Bad Bot Report
Loading Assets & Bots Mimicking Humans
[Chart: % of bots able to load external assets (e.g. JavaScript); % of bots able to mimic human behavior]
Bots Throw Off Analytics
These bots will skew marketing tools such as Google Analytics, A/B testing, and conversion tracking.
These bots will fly under the radar of most security tools.
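A small Python sketch of how undetected bot sessions distort a conversion metric. Every number below is a made-up illustration, not data from the deck; the only assumption is that bots add sessions but never become leads.

```python
def conversion_rate(sessions: int, conversions: int) -> float:
    """Conversions divided by sessions, as an analytics tool would report it."""
    return conversions / sessions


# Hypothetical figures for one reporting period.
human_sessions = 10_000
human_leads = 200
bot_sessions = 5_000  # bots load pages (and JavaScript) but submit no real leads

true_rate = conversion_rate(human_sessions, human_leads)
reported_rate = conversion_rate(human_sessions + bot_sessions, human_leads)

# prints "true: 2.00%, reported: 1.33%"
print(f"true: {true_rate:.2%}, reported: {reported_rate:.2%}")
```

With these illustrative inputs the reported rate understates the real one by a third, which is enough to change the outcome of an A/B test or a campaign decision.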
Impressions and Clicks Remain the Biggest Targets
Impressions (CPM/CPV) and Clicks (CPC): $42.5B, 86% digital spend
• Search: $18.8B
• Display: $7.9B
• Video: $3.5B
• Mobile: $6.2B, $6.2B
Leads (CPL) and Sales (CPA): $7B
• Lead Gen: $2.0B
• Other (classifieds, sponsorship, rich media): $5.0B
Legend: estimated fraud, not at risk
Bots Don't Buy Houses
Case Study
The Only Easy and Accurate Way to Protect Web Applications from Bad Bots, API Abuse, and Fraud.
Detect and Distil Traffic
No Longer Blind Defense
Complete Visibility into False Positives
17 million CAPTCHAs served
78 solved
False Positive Rate = 0.00000458
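The rate above is the number of solved CAPTCHAs divided by the number served. The Python sketch below reproduces that arithmetic, assuming "17 million" means exactly 17,000,000 (the slide does not give the exact count) and that a solved CAPTCHA is what the slide counts as a false positive.

```python
# Figures from the slide; the exact served count is an assumption.
captchas_served = 17_000_000
captchas_solved = 78

false_positive_rate = captchas_solved / captchas_served
one_in = captchas_served / captchas_solved

print(f"rate: {false_positive_rate:.8f}")
print(f"about 1 in {one_in:,.0f} challenged requests")
```

With exactly 17,000,000 served, the ratio is about 4.59e-6, or roughly 1 in 218,000. The slide's 0.00000458 is consistent with this if the real served count was slightly above 17 million or the figure was truncated rather than rounded.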
Two Months of Free Service + Traffic Analysis
www.distilnetworks.com/trial/
Offer Ends: October 30, 2016
www.distilnetworks.com
QUESTIONS... COMMENTS?
INFO@DISTILNETWORKS.COM
OR CALL US ON 1.866.423.0606