Data Strategy
Aija Leiponen, Cornell University
Which technology fields are the most influential?
Predicted control patent citations with co-listed technology fields
[Figure: predicted patent citations (logged coefficients) by year, 1920 to 2010, for control patents co-listed with other technology fields. Series: Computer & Control; Digicomms & Control; Data process & Control; Semiconductor & Control; Thermal ap. & Control; Transport & Control; Material & Control; Machine & Control; All control. Annotation: patents listed in both digital comm & control.]
Koutroumpis-Leiponen-Thomas: “Invention Machines: How Instruments and Information Technologies Drive Global Technological Progress”
Why?
Invention machines: Applicable in many sectors; Facilitate invention in other sectors; A broad and catalytic impact by enabling follow-on invention in many application sectors; Generate massive knowledge spillovers over long periods of time
• Control instruments
• Digital communication
• Computer technologies
Instruments enable manipulation of material; computers enable manipulation of information
Automation requires instrumentation
Internet of Things
Web 3.0
Control instruments – sensors, indicators, logic devices, actuators
+ Data – social, administrative, industrial, personal
+ Artificial Intelligence – algorithms, machine learning, prescriptive analytics
= “Second Wave of the Second Machine Age” (Erik Brynjolfsson/MIT)
Earlier communication revolutions
• Printing press
• Steam engine
• Telegraph
• Telephone
• Radio
• Television
• Networked data?
Printing press
• Johannes Gutenberg (Germany 1452)
• Reading became accessible to common people
• Fiction, entertainment, propaganda
• Mass education
• Network externalities: availability of books → incentives to learn to read → demand for books
Impact of the printing press
• 30 years later a printing shop in Florence run by nuns charged 3 florins for 1000 copies of Plato’s Dialogues, while a scribe would have charged 1 florin for 1 copy
• Availability of paper from China → prices fell, demand increased
• Number of books produced in the 50 years following the invention = number of books produced by European scribes in the preceding 1000 years!
• Fust was suspected in Paris to be in league with the devil – fear of novelty
⇒ How did the printing press change society, lifestyles, economy?
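The Florence price comparison can be made explicit with a little arithmetic. The sketch below uses only the two prices quoted on the slide (3 florins per 1000 printed copies, 1 florin per scribed copy); the variable names are illustrative.

```python
from fractions import Fraction

# Prices quoted on the slide, in florins.
press_price = Fraction(3)     # price of one print run
press_copies = 1000           # copies in that run
scribe_price = Fraction(1)    # price of one hand-copied book
scribe_copies = 1

press_cost_per_copy = press_price / press_copies      # 3/1000 florin
scribe_cost_per_copy = scribe_price / scribe_copies   # 1 florin

# How many times more a scribed copy cost than a printed one.
cost_ratio = scribe_cost_per_copy / press_cost_per_copy

print(f"Press:  {float(press_cost_per_copy):.3f} florins per copy")
print(f"Scribe: {float(scribe_cost_per_copy):.3f} florins per copy")
print(f"A scribed copy cost about {float(cost_ratio):.0f} times as much")
```

On these figures a printed copy cost roughly one three-hundredth of a scribed one.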
Expect societal changes due to Web 3.0
• Privacy needs to be defined
• Ownership of data
• Right to be forgotten – in/alienability
• Intellectual property for data
• Data security
• Legal framework
• Radical transparency
• Real-time visibility
• Data integration, inference, prediction
• New business models, new platforms, new winners
Information Economy vs. Data Economy
Are data information goods?
• Nonrival? Yes
• Partially excludable? NOT excludable
• Experience good? Yes if no metadata; no if proper metadata
• High fixed cost/low or constant marginal cost? Varies: exhaust data vs. data collected for a purpose
PUZZLE: How can data be commercially exploited?
• Data are not intellectual property
– Individual data points have no legal protection
• Data essentially need to be controlled contractually (secrecy, organization forms, product design, non-compete and confidentiality contracts)
– Not via intellectual property rights
• How can something so “leaky” be valuable and commercialized?
Money vs. Data
• Data is viewed as the “new oil”, an asset class
• Digital currency is data on a fundamental level – streams of bits
• Currencies rely on trust in the medium – data have intrinsic value
• Increasing subjectivity of data goods as we go from raw to tagging/cleaning, aggregating, combining, processing
• Provenance is hard to prove for data; currencies are verifiable
• Non-exchangeability of data – there is no quantum of data with a minimum value
• Nevertheless, CS researchers are starting to consider “data as money”, developing conceptual models of a “central bank for data”
Content vs. Data
• Both have (some) intrinsic value
• Both governed by copyright
• On a fundamental level, content IS data and subject to analytics (Natural Language Processing)
• But the value of record data largely comes from combination with other data and algorithms (models, statistics, prediction, deep learning…)
• And as a result, copyright is very weak on data
Economic features of digital goods – all controversial and legally contested
Feature | Record Data | Content | Software | Currency
Information Type | Raw records or structured databases | Knowledge (insights) | Knowledge (instructions) | Pure value
Good Type | Intermediate/Final | Final | Final | Final
Alienability | Variable | Medium | High | High
Inferability | High | Low | Low | Zero
Excludability | None | Variable | Variable | High
Fungibility | Variable | Low | Low | High
Protection Method | Secrecy | Copyright | Copyright or patents in some cases | Blockchain or other verification technology
Protection Aspect | Reuse | Expression (patterns) | Expression (patterns) or insight (invention) | Transaction value
?
Characteristics of different data sources
Source of data | Privacy implications | Alienability | Duration/useful life | Sampling frequency | Inferability
Health care | High | Low (health, retail, social network, locational) | >50 years | Very low | Low
Public sector administration | Medium | Medium (public sector) – these usually have specific data protection protocols (confidential, etc.) | >50 years | Low | Low
Manufacturing/Operations (sensor networks) | Medium | Medium (manufacturing) – these usually have specific data protection protocols (confidential, etc.) | 10-20 years | Medium | Low
Individual behavior | High | Low (health, retail, social network) | 1-5 years | High | High
Personal Locational Data | Medium | Medium | 1-5 years | Very high | Medium
Summary I
• The economics of data goods depend on an analysis of data characteristics
– Data are very heterogeneous
– Description, classification of data and its institutional framework is necessary for understanding its commercialization potential
• Overall, data goods substantially differ from other information goods
– Excludability (protection)
– Transparency (metadata)
– Alienability (ongoing implications for individuals)
– Inferability (implications of data integration for individuals)
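Inferability, the last item above, can be illustrated with a small linkage example: two datasets that are harmless on their own reveal something about an individual once integrated. This is a minimal sketch with invented toy records. The field names, the `link` helper, and the choice of zip code and birth year as join keys are all assumptions for illustration, not taken from the talk.

```python
# Toy records, invented for illustration only.
health_records = [  # no names attached
    {"zip": "14850", "birth_year": 1975, "diagnosis": "asthma"},
    {"zip": "14850", "birth_year": 1982, "diagnosis": "diabetes"},
    {"zip": "10001", "birth_year": 1975, "diagnosis": "migraine"},
]
loyalty_cards = [  # names, but no health information
    {"name": "Person A", "zip": "14850", "birth_year": 1975},
    {"name": "Person B", "zip": "10001", "birth_year": 1975},
]

def link(named, anonymous, keys):
    """Join two datasets on shared attributes; keep only unique matches."""
    inferred = {}
    for person in named:
        matches = [r for r in anonymous
                   if all(r[k] == person[k] for k in keys)]
        if len(matches) == 1:  # a unique match re-identifies the record
            inferred[person["name"]] = matches[0]["diagnosis"]
    return inferred

print(link(loyalty_cards, health_records, keys=("zip", "birth_year")))
```

Neither dataset links a name to a diagnosis, but the join does. This is the sense in which integration has implications for individuals.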
Emergence of data markets?
• Data markets will work differently in different industries
• The legal framework is evolving → data attributes
• Competitive strategies & outcomes will depend particularly on the fungibility, excludability, alienability/inferability of the data in question
• Business model design will determine profit potential of fungible, poorly excludable, alienable data
Types of market matching mechanisms
Matching | Marketplace design | Terms of Exchange | Examples
One-to-one | Bilateral | Negotiated | Data brokers
One-to-many | Dispersal | Standardized | Twitter API
Many-to-one | Harvest | Implicit barter | Google Services
Many-to-many | Multilateral | Standardized or negotiated | InfoChimps, Microsoft Azure
“The (unfulfilled) promise of Data Marketplaces”, P. Koutroumpis, A. Leiponen, L. Thomas
Bilateral: Proprietary data vs. other IP licenses
Contract feature | Data | Patents | Trademarks | Copyrights
License duration | 1-2 years | 10-20 years | Up to 20 years | 1-5 years
Exclusivity | Rare | Frequent | Often regional | Rare
Confidentiality | Frequent | Rare | Rare | Rare
Use restrictions | Abundant | Concise | Specific | Concise
Warranty | ‘As is’ | Frequent | -- | --
Obligation & remedy | Correct/refund/replace/update | -- | -- | --
Audit | Frequent | -- | -- | --
Modal fee schedule | Annual subscription | % of sales or flat fee | NA | Per device
“Data Contracts”, P. Koutroumpis, A. Leiponen, L. Thomas & J. Wu (2016)
Dispersal: 366 Open Data Contracts (T&C)
[Figure: three stacked bar charts (0% to 100%). Contract type: Proprietary License, Open Database Commons, GNU, FOI / Open Government. Commercial use: Commercial Use Permitted, No Commercial Use Permitted, Not Noted. Data sharing: Sharing Permitted, Share Alike, No Sharing, Not Noted.]
[Pie chart shares: Academic 37%, Government 21%, Commercial 19%, Non-Profit 17%, Personal 4%, International 2%]
Multilateral: Centralized Data Platform
• Selling data outside the firm through the platform
• Platform provider takes the risk, provides services, takes a cut
• Technical challenges in standardization, rights management
• Strategic challenges in revenue sharing, chicken & egg, etc.
[Diagram: a Data Marketplace linking Data Providers (supply) to Customers (demand), with Algorithm Providers and Expert Advice as complements]
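The "platform takes a cut" arrangement amounts to a revenue split on each sale. The following is a minimal sketch under assumed numbers: the 20% cut, the provider names, and the `settle` helper are hypothetical, and the source does not specify any fee level or settlement rule. Amounts are kept in integer cents to avoid rounding surprises.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sale:
    provider: str      # data provider who listed the dataset
    price_cents: int   # price paid by the customer, in cents

def settle(sales, cut_percent):
    """Split each sale between the platform and the data provider."""
    if not 0 <= cut_percent <= 100:
        raise ValueError("cut_percent must be between 0 and 100")
    platform_revenue = 0
    payouts = {}
    for sale in sales:
        fee = sale.price_cents * cut_percent // 100  # platform's share
        platform_revenue += fee
        payouts[sale.provider] = (
            payouts.get(sale.provider, 0) + sale.price_cents - fee
        )
    return platform_revenue, payouts

# Hypothetical sales and a hypothetical 20% platform cut.
sales = [Sale("provider-A", 10_000), Sale("provider-B", 5_000),
         Sale("provider-A", 20_000)]
platform_revenue, payouts = settle(sales, cut_percent=20)
print(platform_revenue, payouts)
```

The strategic questions on the slide (revenue sharing, chicken and egg) correspond to how `cut_percent` is chosen for each side, which this sketch leaves as a free parameter.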
Common Pool Resources (Ostrom 1990)
• Costly but not impossible to exclude potential beneficiaries from obtaining benefits from use
• CPR → Tragedy of the Commons
• Collective action resolves TOTC and maintains resource if
– Clearly defined boundaries identify legitimate users
– Rules define how CPR should be used; metarules to change rules
– Effective monitoring to enforce rules, boundaries
Decentralized Data Platform – blockchain for data?
[Diagram labels: User content & sensor data; Tagging & Cleaning; Aggregators; Trading; Public Ledger (… transactionXX1, transactionXX2, transactionXX3, transactionXX4, transactionXX5 …)]
• “Bottom-up” approach in information exchange
• Users and sensors collect data
• Aggregators can buy/sell data for profit; data owners get paid and have control over future uses
• Processing, analysis and insights are separate
[Diagram: network of nodes A to I; label: Processing]
“The (unfulfilled) promise of data marketplaces”, P. Koutroumpis, A. Leiponen & L. Thomas (2016)
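The public ledger of transactions can be sketched as a hash-chained, append-only list, which is the core data structure behind blockchain-style verification. This is only an illustration under stated assumptions: the field names, participant labels, and helper functions are hypothetical, and it omits what a real decentralized platform would need (digital signatures, consensus among nodes, replication). It is not the design proposed by the authors.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder back-link for the first entry

def entry_hash(body):
    """SHA-256 over a canonical JSON encoding of the entry body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def append_transaction(ledger, seller, buyer, dataset_id, price):
    """Append a transaction that commits to the previous entry's hash."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    body = {
        "index": len(ledger),
        "seller": seller,
        "buyer": buyer,
        "dataset_id": dataset_id,
        "price": price,
        "prev_hash": prev_hash,
    }
    entry = dict(body, hash=entry_hash(body))
    ledger.append(entry)
    return entry

def verify(ledger):
    """Return True if every entry's hash and back-link are intact."""
    prev_hash = GENESIS_HASH
    for index, entry in enumerate(ledger):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["index"] != index or entry["prev_hash"] != prev_hash:
            return False
        if entry_hash(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True

ledger = []
append_transaction(ledger, "sensor-owner-A", "aggregator-B", "dataset-001", 5)
append_transaction(ledger, "aggregator-B", "analyst-C", "dataset-001", 8)
print(verify(ledger))    # True: ledger is untouched
ledger[0]["price"] = 1   # tamper with an earlier transaction
print(verify(ledger))    # False: the stored hash no longer matches
```

Because each entry commits to its predecessor, the chain of sales of `dataset-001` doubles as a provenance record, which is what lets data owners trace future uses.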
Marketplace and data typology
Decentralization tasks: provenance, boundary definition, rules definition, effective monitoring
Matching | Marketplace design | Transaction costs | Provenance | Boundary definition | Rules definition | Effective monitoring | Characteristics of data
One-to-one | Bilateral | High | High | High | High | High | High value, High privacy
One-to-many | Dispersal | Low | Low | Low | Low | Minimal | Low value, Low privacy
Many-to-one | Harvest | Low | Low | Low | Low | Minimal | Low value, Low privacy
Many-to-many | Multilateral Centralized | Low | Medium | Medium | Medium | Low | Medium value, Medium privacy
Many-to-many | Multilateral Decentralized | Medium | High | Low | High | High | High value, Medium privacy
Data is no longer a Common Pool Resource!
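The typology above can be encoded as a small lookup table, which makes it easy to ask which marketplace designs the table associates with a given kind of data. This sketch assumes one reading of the flattened table: the columns are taken in the order transaction costs, provenance, boundary definition, rules definition, effective monitoring, then data value and privacy. The class and function names are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MarketDesign:
    matching: str
    design: str
    transaction_costs: str
    provenance: str
    boundary_definition: str
    rules_definition: str
    effective_monitoring: str
    data_value: str
    data_privacy: str

# Rows transcribed from the typology table (column reading is an assumption).
TYPOLOGY = [
    MarketDesign("One-to-one", "Bilateral",
                 "High", "High", "High", "High", "High", "High", "High"),
    MarketDesign("One-to-many", "Dispersal",
                 "Low", "Low", "Low", "Low", "Minimal", "Low", "Low"),
    MarketDesign("Many-to-one", "Harvest",
                 "Low", "Low", "Low", "Low", "Minimal", "Low", "Low"),
    MarketDesign("Many-to-many", "Multilateral centralized",
                 "Low", "Medium", "Medium", "Medium", "Low", "Medium", "Medium"),
    MarketDesign("Many-to-many", "Multilateral decentralized",
                 "Medium", "High", "Low", "High", "High", "High", "Medium"),
]

def designs_for(data_value, data_privacy):
    """List the designs the table pairs with this value/privacy profile."""
    return [d.design for d in TYPOLOGY
            if d.data_value == data_value and d.data_privacy == data_privacy]

print(designs_for("High", "Medium"))
print(designs_for("Low", "Low"))
```

A profile with no matching row returns an empty list, which simply means the table does not cover that combination.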
Performance of centralized and decentralized market designs
Criterion | Centralized | Decentralized
Thickness | Variable depending on the rules and membership/usage fees | Assumed to have full participation
Congestion | Assumed to have minimal effect | Assumed to have minimal effect
Transaction costs | Very low | Increased friction for each transaction (can be limited by using trusted third-party licensing)
Decentralized: Trading off market thickness against increasing (technical) transaction costs
http://hackingdistributed.com/2016/08/04/byzcoin/
https://www.technologyreview.com/s/600781/technical-roadblock-might-shatter-bitcoin-dreams/
Conclusions
• Platforms/multisided markets bring together multiple different types of parties
– There are complementarities among the parties
– Need to engage the different sides
– Pricing and integration strategies may help in reaching critical mass for the platform
• Successful platforms benefit from strong network effects and scale economies and can become very profitable …and very powerful
– Monopolization of communication and information platforms can be societally harmful
– Algorithmic transparency/monitoring will be necessary
• How digital platforms are operationalized depends on the nature of the service/good provided, institutional setting, Digital Rights Management – IoT
Summary II
• Data really is a different kind of intellectual asset
– Careful attention to technical, institutional detail is required!
• Trading regimes: secrecy & trust or verification technology (blockchain?) – or ‘FREE’
– Bilateral trading sets up a complex relationship with remedies, audits, subscriptions as contractual features
– Multilateral trading based on verification tech could be anonymous and one-off – probably for higher-value data due to computing cost
• Continuing evolution in control technologies and Artificial Intelligence will be the “invention machines” of the 21st century – data will be the lubricant