Data Strategy Aija Leiponen, Cornell University, [email protected]


Page 1: Data strategy aija leiponen_01112016

Data Strategy

Aija Leiponen, Cornell University, [email protected]

Page 2: Data strategy aija leiponen_01112016
Page 3: Data strategy aija leiponen_01112016

Which technology fields are the most influential? Predicted control patent citations with co-listed technology fields

[Figure: y-axis, patent citations (logged coefs); x-axis, 1920 to 2010. Series: Computer & Control, Digicomms & Control, Data process & Control, Semiconductor & Control, Thermal ap. & Control, Transport & Control, Material & Control, Machine & Control, All control. Annotation: patents listed in both digital comm & control.]

Koutroumpis-Leiponen-Thomas: “Invention Machines: How Instruments and Information Technologies Drive Global Technological Progress”

Page 4: Data strategy aija leiponen_01112016

Why?

Invention machines: Applicable in many sectors; Facilitate invention in other sectors; A broad and catalytic impact by enabling follow-on invention in many application sectors; Generate massive knowledge spillovers over long periods of time

• Control instruments

• Digital communication

• Computer technologies

Instruments enable manipulation of material; computers enable manipulation of information

Automation requires instrumentation

Internet of Things

Page 5: Data strategy aija leiponen_01112016

Web 3.0

Control instruments – sensors, indicators, logic devices, actuators

+ Data – social, administrative, industrial, personal

+ Artificial Intelligence – algorithms, machine learning, prescriptive analytics

= “Second Wave of the Second Machine Age” (Erik Brynjolfsson/MIT)

Page 6: Data strategy aija leiponen_01112016

Earlier communication revolutions

• Printing press
• Steam engine
• Telegraph
• Telephone
• Radio
• Television
• Networked data?

Page 7: Data strategy aija leiponen_01112016

Printing press

• Johannes Gutenberg (Germany 1452)
• Reading became accessible to common people
• Fiction, entertainment, propaganda
• Mass education
• Network externalities: availability of books → incentives to learn to read → demand for books

Page 8: Data strategy aija leiponen_01112016

Impact of the printing press

• 30 years later a printing shop in Florence run by nuns charged 3 florins for 1000 copies of Plato’s Dialogues, while a scribe would have charged 1 florin for 1 copy

• Availability of paper from China → prices fell, demand increased

• #books produced in 50 years following the invention = #books produced by European scribes in preceding 1000 years!

• Fust was suspected in Paris to be in league with the devil – fear of novelty

⇒ How did the printing press change society, lifestyles, economy?
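The price comparison on this slide can be worked through as a per-copy calculation. The sketch below uses only the two price points quoted above; the variable and function names are illustrative, and the figures are taken at face value rather than adjusted for quality or format.

```python
from fractions import Fraction

# Figures quoted on the slide (prices in florins).
PRESS_PRICE = Fraction(3)    # price charged by the printing shop
PRESS_COPIES = 1000          # copies delivered for that price
SCRIBE_PRICE = Fraction(1)   # price charged by a scribe
SCRIBE_COPIES = 1            # copies delivered for that price


def cost_per_copy(price, copies):
    """Average price per copy, kept exact by using Fraction."""
    return price / copies


press = cost_per_copy(PRESS_PRICE, PRESS_COPIES)      # 3/1000 florin
scribe = cost_per_copy(SCRIBE_PRICE, SCRIBE_COPIES)   # 1 florin
ratio = scribe / press                                # 1000/3

print(f"press:  {float(press):.3f} florin per copy")
print(f"scribe: {float(scribe):.3f} florin per copy")
print(f"a scribe's copy cost about {float(ratio):.0f} times as much")
```

On these numbers a printed copy cost 0.003 florin, so a hand-copied book was roughly 333 times more expensive per copy.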

Page 9: Data strategy aija leiponen_01112016

Expect societal changes due to Web 3.0

• Privacy needs to be defined
• Ownership of data
• Right to be forgotten – in/alienability
• Intellectual property for data
• Data security
• Legal framework
• Radical transparency
• Real-time visibility
• Data integration, inference, prediction
• New business models, new platforms, new winners

Page 10: Data strategy aija leiponen_01112016

Information Economy vs. Data Economy

Are data information goods?

• Nonrival? Yes
• Partially excludable? NOT excludable
• Experience good? Yes if no metadata; no if proper metadata
• High fixed cost/low or constant marginal cost? Varies: exhaust data vs. data collected for a purpose

Page 11: Data strategy aija leiponen_01112016

PUZZLE: How can data be commercially exploited?

• Data are not intellectual property
  – Individual data points have no legal protection

• Data essentially need to be controlled contractually (secrecy, organization forms, product design, non-compete and confidentiality contracts)
  – Not via intellectual property rights

• How can something so “leaky” be valuable, commercialized?

Page 12: Data strategy aija leiponen_01112016

Money vs. Data

• Data is viewed as the “new oil”, an asset class
• Digital currency is data on a fundamental level – streams of bits
• Currencies rely on trust in the medium – data have intrinsic value
• Increasing subjectivity of data goods as we go from raw to tagging/cleaning, aggregating, combining, processing
• Provenance is hard to prove for data; currencies are verifiable
• Non-exchangeability of data – there is no quantum of data with a minimum value
• Nevertheless, CS researchers are starting to consider “data as money”, developing conceptual models of a “central bank for data”

Page 13: Data strategy aija leiponen_01112016

Content vs. Data

• Both have (some) intrinsic value
• Both governed by copyright
• On a fundamental level, content IS data and subject to analytics (Natural Language Processing)
• But the value of record data largely comes from combination with other data and algorithms (models, statistics, prediction, deep learning…)
• And as a result, copyright is very weak on data

Page 14: Data strategy aija leiponen_01112016

Economic features of digital goods –all controversial and legally contested

 | Record Data | Content | Software | Currency
Information Type | Raw records or structured databases | Knowledge (insights) | Knowledge (instructions) | Pure value
Good Type | Intermediate/Final | Final | Final | Final
Alienability | Variable | Medium | High | High
Inferability | High | Low | Low | Zero
Excludability | None | Variable | Variable | High
Fungibility | Variable | Low | Low | High
Protection Method | Secrecy | Copyright | Copyright or patents in some cases | Blockchain or other verification technology
Protection Aspect | Reuse | Expression (patterns) | Expression (patterns) or insight (invention) | Transaction value

?

Page 15: Data strategy aija leiponen_01112016

Characteristics of different data sources

Source of data | Privacy implications | Alienability | Duration/useful life | Sampling frequency | Inferability
Health care | High | Low (health, retail, social network, locational) | >50 years | Very low | Low
Public sector administration | Medium | Medium (public sector) – these usually have specific data protection protocols (confidential, etc.) | >50 years | Low | Low
Manufacturing/Operations (sensor networks) | Medium | Medium (manufacturing) – these usually have specific data protection protocols (confidential, etc.) | 10-20 years | Medium | Low
Individual behavior | High | Low (health, retail, social network) | 1-5 years | High | High
Personal Locational Data | Medium | Medium | 1-5 years | Very high | Medium

Page 16: Data strategy aija leiponen_01112016

Summary I

• The economics of data goods depend on an analysis of data characteristics
• Data are very heterogeneous
• Description, classification of data and its institutional framework is necessary for understanding its commercialization potential
• Overall, data goods substantially differ from other information goods
  – Excludability (protection)
  – Transparency (metadata)
  – Alienability (ongoing implications for individuals)
  – Inferability (implications of data integration for individuals)

Page 17: Data strategy aija leiponen_01112016

Emergence of data markets?

• Data markets will work differently in different industries
• The legal framework is evolving → data attributes
• Competitive strategies & outcomes will depend particularly on the fungibility, excludability, alienability/inferability of the data in question
• Business model design will determine profit potential of fungible, poorly excludable, alienable data

Page 18: Data strategy aija leiponen_01112016

Types of market matching mechanisms

Matching | Marketplace design | Terms of Exchange | Examples
One-to-one | Bilateral | Negotiated | Data brokers
One-to-many | Dispersal | Standardized | Twitter API
Many-to-one | Harvest | Implicit barter | Google Services
Many-to-many | Multilateral | Standardized or negotiated | InfoChimps, Microsoft Azure

“The (unfulfilled) promise of Data Marketplaces”, P. Koutroumpis, A. Leiponen, L. Thomas
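The four matching mechanisms on this slide can be sketched as a lookup keyed on how many parties sit on each side of the exchange. The rows are copied from the table; reading the first term of each matching label as the data-providing side and the second as the data-receiving side is an assumption, and `Mechanism` and `classify` are illustrative names, not part of the cited work.

```python
from typing import NamedTuple


class Mechanism(NamedTuple):
    matching: str
    design: str
    terms: str


# Key: (more than one provider?, more than one recipient?).
# Assumption: "one-to-many" is read as one provider serving many recipients.
MECHANISMS = {
    (False, False): Mechanism("one-to-one", "Bilateral", "Negotiated"),
    (False, True): Mechanism("one-to-many", "Dispersal", "Standardized"),
    (True, False): Mechanism("many-to-one", "Harvest", "Implicit barter"),
    (True, True): Mechanism("many-to-many", "Multilateral",
                            "Standardized or negotiated"),
}


def classify(n_providers: int, n_recipients: int) -> Mechanism:
    """Return the slide's marketplace design for the given party counts."""
    if n_providers < 1 or n_recipients < 1:
        raise ValueError("need at least one party on each side")
    return MECHANISMS[(n_providers > 1, n_recipients > 1)]


print(classify(1, 1))        # a single negotiated deal
print(classify(1, 5000))     # one provider, many recipients
print(classify(100000, 1))   # many contributors, one collector
print(classify(40, 900))     # a marketplace with many on both sides
```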

Page 19: Data strategy aija leiponen_01112016

Bilateral: Proprietary data vs. other IP licenses

 | Data | Patents | Trademarks | Copyrights
License duration | 1-2 years | 10-20 years | Up to 20 years | 1-5 years
Exclusivity | Rare | Frequent | Often regional | Rare
Confidentiality | Frequent | Rare | Rare | Rare
Use restrictions | Abundant | Concise | Specific | Concise
Warranty | ‘As is’ | Frequent | -- | --
Obligation & remedy | Correct/refund/replace/update | -- | -- | --
Audit | Frequent | -- | -- | --
Modal fee schedule | Annual subscription | % of sales or flat fee | NA | Per device

“Data Contracts”, P. Koutroumpis, A. Leiponen, L. Thomas & J. Wu (2016)

Page 20: Data strategy aija leiponen_01112016

Dispersal: 366 Open Data Contracts (T&C)

[Charts: three stacked bars on a 0% to 100% scale, plus a breakdown by category.]

• Contract type: Proprietary License, Open Database Commons, GNU, FOI / Open Government
• Commercial use: Not Noted, No Commercial Use Permitted, Commercial Use Permitted
• Data sharing: Sharing Permitted, Share Alike, Not Noted, No Sharing
• Shares by category: Academic 37%, Government 21%, Commercial 19%, Non-Profit 17%, Personal 4%, International 2%

Page 21: Data strategy aija leiponen_01112016

Multilateral: Centralized Data Platform

• Selling data outside the firm through the platform

• Platform provider takes the risk, provides services, takes a cut

• Technical challenges in standardization, rights management,

• Strategic challenges in revenue sharing, chicken & egg etc

[Diagram: a Data Marketplace linking Data Providers (supply) to Customers (demand), with Algorithm Providers and Expert Advice labeled as complements.]

Page 22: Data strategy aija leiponen_01112016

Common Pool Resources (Ostrom 1990)

• Costly but not impossible to exclude potential beneficiaries from obtaining benefits from use
• CPR → Tragedy of the Commons
• Collective action resolves TOTC and maintains resource if
  – Clearly defined boundaries identify legitimate users
  – Rules define how CPR should be used; metarules to change rules
  – Effective monitoring to enforce rules, boundaries

Page 23: Data strategy aija leiponen_01112016

Decentralized Data Platform –blockchain for data?

[Diagram: network nodes A to I and a Public Ledger (… transactionXX1, transactionXX2, transactionXX3, transactionXX4, transactionXX5 …). Labeled stages: User content & sensor data, Tagging & Cleaning, Aggregators, Trading, Processing.]

• “Bottom-up” approach in information exchange

• Users and sensors collect data

• Aggregators can buy/sell data for profit; data owners get paid and have control over future uses

• Processing, analysis and insights are separate

“The (unfulfilled) promise of data marketplaces”, P. Koutroumpis, A. Leiponen & L. Thomas (2016)
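To make the public-ledger idea on this slide concrete, here is a minimal sketch of a hash-chained transaction log for data trades. It is not the authors' design: the field names, the `append`, `verify` and `provenance` helpers, and the example dataset identifiers are all assumptions, and a real decentralized platform would also need signatures, consensus among nodes, and payment handling. It only illustrates how chaining entries makes a recorded trade history tamper-evident, which is what lets provenance be checked.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append(ledger: list, seller: str, buyer: str, dataset_id: str) -> dict:
    """Append a transaction that commits to the previous entry's hash."""
    body = {
        "index": len(ledger),
        "seller": seller,
        "buyer": buyer,
        "dataset_id": dataset_id,
        "prev": ledger[-1]["hash"] if ledger else GENESIS,
    }
    entry = dict(body, hash=entry_hash(body))
    ledger.append(entry)
    return entry


def verify(ledger: list) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev = GENESIS
    for entry in ledger:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev"] != prev or entry["hash"] != entry_hash(body):
            return False
        prev = entry["hash"]
    return True


def provenance(ledger: list, dataset_id: str) -> list:
    """Chain of custody for one dataset: (seller, buyer) pairs in order."""
    return [(e["seller"], e["buyer"])
            for e in ledger if e["dataset_id"] == dataset_id]


# Hypothetical trades among the diagram's nodes.
ledger = []
append(ledger, "A", "B", "sensor-feed-1")
append(ledger, "B", "C", "sensor-feed-1")
append(ledger, "D", "E", "survey-7")

# A copy with one altered field no longer verifies.
tampered = [dict(e) for e in ledger]
tampered[0]["buyer"] = "X"

print(verify(ledger))
print(provenance(ledger, "sensor-feed-1"))
print(verify(tampered))
```

Every verification pass rehashes the whole log, a small-scale version of the computing cost that the deck associates with verification technology.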

Page 24: Data strategy aija leiponen_01112016

Decentralization tasks

Page 25: Data strategy aija leiponen_01112016

Marketplace and data typology

Matching | Marketplace design | Transaction costs | Provenance | Boundary definition | Rules definition | Effective monitoring | Characteristics of data
One-to-one | Bilateral | High | High | High | High | High | High value, High privacy
One-to-many | Dispersal | Low | Low | Low | Low | Minimal | Low value, Low privacy
Many-to-one | Harvest | Low | Low | Low | Low | Minimal | Low value, Low privacy
Many-to-many | Multilateral Centralized | Low | Medium | Medium | Medium | Low | Medium value, Medium privacy
Many-to-many | Multilateral Decentralized | Medium | High | Low | High | High | High value, Medium privacy

Data is no longer a Common Pool Resource!
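The typology on this slide can be read as a lookup from data characteristics to candidate marketplace designs. The sketch below encodes the table's rows as records and filters them by the "Characteristics of data" column; the `Design` class and `designs_for` helper are illustrative names, and treating the table as a selection rule is an interpretation rather than something the slide states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    matching: str
    design: str
    transaction_costs: str
    provenance: str
    boundary_definition: str
    rules_definition: str
    effective_monitoring: str
    data_value: str
    data_privacy: str


# Rows as they appear in the slide's table.
TYPOLOGY = [
    Design("One-to-one", "Bilateral",
           "High", "High", "High", "High", "High", "High", "High"),
    Design("One-to-many", "Dispersal",
           "Low", "Low", "Low", "Low", "Minimal", "Low", "Low"),
    Design("Many-to-one", "Harvest",
           "Low", "Low", "Low", "Low", "Minimal", "Low", "Low"),
    Design("Many-to-many", "Multilateral Centralized",
           "Low", "Medium", "Medium", "Medium", "Low", "Medium", "Medium"),
    Design("Many-to-many", "Multilateral Decentralized",
           "Medium", "High", "Low", "High", "High", "High", "Medium"),
]


def designs_for(value: str, privacy: str) -> list:
    """Marketplace designs whose row matches the given data characteristics."""
    return [d.design for d in TYPOLOGY
            if d.data_value == value and d.data_privacy == privacy]


print(designs_for("High", "High"))
print(designs_for("High", "Medium"))
print(designs_for("Low", "Low"))
```

Combinations that the table does not list, such as high value with low privacy, return an empty list instead of a guess.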

Page 26: Data strategy aija leiponen_01112016

Performance of centralized and decentralized market designs

 | Centralized | Decentralized
Thickness | Variable depending on the rules and membership/usage fees | Assumed to have full participation
Congestion | Assumed to have minimal effect | Assumed to have minimal effect
Transaction costs | Very low | Increased friction for each transaction (can be limited by using trusted third-party licensing)

Decentralized: Trading off market thickness against increasing (technical) transaction costs

http://hackingdistributed.com/2016/08/04/byzcoin/
https://www.technologyreview.com/s/600781/technical-roadblock-might-shatter-bitcoin-dreams/

Page 27: Data strategy aija leiponen_01112016

Conclusions

• Platforms/multisided markets bring together multiple different types of parties
• There are complementarities among the parties
• Need to engage the different sides
• Pricing and integration strategies may help in reaching critical mass for the platform
• Successful platforms benefit from strong network effects and scale economies and can become very profitable …and very powerful
• Monopolization of communication and information platforms can be societally harmful
• Algorithmic transparency/monitoring will be necessary
• How digital platforms are operationalized depends on the nature of the service/good provided, institutional setting, Digital Rights Management – IoT

Page 28: Data strategy aija leiponen_01112016

Summary II

• Data really is a different kind of intellectual asset
  – Careful attention to technical, institutional detail is required!

• Trading regimes: secrecy & trust or verification technology (blockchain?) – or ‘FREE’
  – Bilateral trading sets up a complex relationship with remedies, audits, subscriptions as contractual features
  – Multilateral based on verification tech could be anonymous and one-off – probably for more high-value data due to computing cost

• Continuing evolution in control technologies and Artificial Intelligence will be the “invention machines” of the 21st century – data will be the lubricant