FREE data available. * * Just scrape it

Scraping

  • Upload
    pamspb


Public vs. Private data

“Paid” sources

Surveys

Research and experiments

Official statistics

Internal data

Get whatever you need, whenever you need.

What is scraping?

HTML/CSS
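At its simplest, scraping means parsing a page's HTML and picking out the elements you care about. A minimal sketch using only Python's standard library follows; the sample markup, the `LinkExtractor` class and the `extract_links` helper are illustrative assumptions, not part of the original slides.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return all (href, link text) pairs found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# Hypothetical page fragment standing in for a downloaded page.
SAMPLE_HTML = """
<ul class="listing">
  <li><a href="/item/1">First item</a></li>
  <li><a href="/item/2">Second item</a></li>
</ul>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, text)
```

In practice many people reach for a third-party parser with CSS selector support instead; the standard-library version is used here only to keep the sketch dependency-free.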

Dynamic sites?

AJAX, REST, SOAP, RSS
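Feeds such as RSS are already structured, so they are often easier to read than the page itself. A small sketch, assuming a plain RSS 2.0 feed; the sample feed and the `parse_rss_items` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document standing in for a downloaded feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>Post A</title><link>https://example.com/a</link></item>
    <item><title>Post B</title><link>https://example.com/b</link></item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return a list of {'title', 'link'} dicts, one per <item>."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_RSS):
        print(entry["title"], entry["link"])
```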

And APIs too?
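When a site offers an API, or loads its data through AJAX calls that return JSON, there is no HTML to parse at all. The sketch below builds a query URL and reads a JSON payload; the endpoint, the parameter names and the response shape are all hypothetical.

```python
import json
from urllib.parse import urlencode


def build_api_url(base_url, **params):
    """Append query parameters (sorted for stable output) to a base URL."""
    return base_url + "?" + urlencode(sorted(params.items()))


# Hypothetical JSON body, standing in for what such an endpoint might return.
SAMPLE_RESPONSE = (
    '{"results": [{"name": "Acme", "employees": 12},'
    ' {"name": "Globex", "employees": 40}], "next_page": null}'
)


def extract_names(payload_text):
    """Pull the 'name' field out of every row in the 'results' list."""
    payload = json.loads(payload_text)
    return [row["name"] for row in payload.get("results", [])]


if __name__ == "__main__":
    print(build_api_url("https://api.example.com/v1/companies", industry="retail", page=2))
    print(extract_names(SAMPLE_RESPONSE))
```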

Documents?

How?

In whatever way you prefer

Python, Perl, C#, Java

So hard?

Tools

“Scraper” Chrome extension

webharvy.com - desktop tool

mozenda.com - SaaS solution

grepsr.com - another SaaS solution

Maybe a little bit more technical.

Selenium, Twill

Robot

= Browser automation

Where’s the catch?

Be responsible

Name your user agent

Check what you can/cannot use on the website.
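Both points can be handled in a few lines: send an identifying User-Agent header, and consult the site's robots.txt before fetching a URL. This is a sketch using Python's standard library; the bot name, contact address and sample robots.txt are assumptions, and robots.txt does not replace reading the site's terms of use.

```python
from urllib import robotparser
from urllib.request import Request

# Hypothetical identity; use a name and contact that really point to you.
USER_AGENT = "example-research-bot/0.1 (+mailto:you@example.com)"

# Hypothetical robots.txt content, standing in for the downloaded file.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, url, user_agent=USER_AGENT):
    """Return True if robots.txt permits this user agent to fetch the URL."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)


def build_request(url, user_agent=USER_AGENT):
    """Build a request object that names the scraper in its User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    print(is_allowed(SAMPLE_ROBOTS, "https://example.com/public/page"))
    print(is_allowed(SAMPLE_ROBOTS, "https://example.com/private/page"))
```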

Never copy and paste content

But be persistent

Induce delays, emulate a browser, distribute traffic
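Delays are the easiest of these to add. The sketch below waits a randomised interval between requests; the default of 2 to 3 seconds is an arbitrary assumption, and `fetch` is a placeholder for whatever download function you use.

```python
import random
import time


def polite_delay(base_seconds=2.0, jitter_seconds=1.0):
    """Return a wait time of base_seconds plus a random jitter."""
    return base_seconds + random.uniform(0, jitter_seconds)


def crawl(urls, fetch, base_seconds=2.0, jitter_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests.

    `sleep` is injectable so the loop can be exercised without waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:  # no need to wait before the first request
            sleep(polite_delay(base_seconds, jitter_seconds))
        results.append(fetch(url))
    return results


if __name__ == "__main__":
    # Stand-in fetch function; a real one would download the page.
    pages = crawl(["page-1", "page-2"], fetch=str.upper, base_seconds=0.1, jitter_seconds=0.1)
    print(pages)
```

The jitter keeps requests from arriving at a fixed rhythm, which is gentler on the server than a tight loop.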

Proxies, “Tor” network
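One simple way to spread requests over several proxies is round-robin assignment. A sketch with Python's standard library follows; the proxy addresses are placeholders, and the helper names are illustrative. Tor exposes a SOCKS proxy, which `urllib` does not handle on its own, so it is left out here.

```python
from itertools import cycle
from urllib.request import ProxyHandler, build_opener

# Hypothetical proxy endpoints; replace with proxies you are allowed to use.
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
]


def assign_proxies(urls, proxies):
    """Pair each URL with a proxy, cycling through the proxy list in order."""
    return list(zip(urls, cycle(proxies)))


def opener_for(proxy):
    """Build a urllib opener that routes HTTP and HTTPS through one proxy."""
    return build_opener(ProxyHandler({"http": proxy, "https": proxy}))


if __name__ == "__main__":
    for url, proxy in assign_proxies(["/a", "/b", "/c"], PROXIES):
        print(url, "via", proxy)
```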

Other issues? Legal!

BizWorld

Project BizWorld is a free tool that uses multiple sources to create an integrated picture of a business, group of businesses or an industry.

Use it to research your target business market, potential partners or competition. Or even use it to monitor aspects of your own business.

Market research and review

Customer research

Competitor research

Company image in the media

What We Pull in and Track

[Diagram labels: BizWorld at the centre; LinkedIn, Twitter, Facebook, Business Website, Google/web; business, keywords, industry, subsidiaries & outlets, social media activity, themes]

How you can pull the data

Flexible filter

Pivot with drill-down

Detailed listing

Create shortlist

Opportunity analysis

[Diagram labels: BizWorld; pull data via API results; your data; publish; $]

ozplace.com.au (shadow)

ozplace = Research & Find

The place to live and buy in

Price/Rent

Profile

Transport

Environment

Everything is scrape-able.

en.wikipedia.org/wiki/Open_data