Structured Data for the Financial Industry

The extensions to schema.org and their benefits for: Ranking | Analytics | Search | Reporting

Trusted Open Data Ecosystem, September 28, 2017, Madrid, Spain

Dr. Mirek Sopek, Dr. Robert Trypuz

TRUSTED OPEN DATA ECOSYSTEMS, MADRID, SPAIN, September 28, 2017

THE WORKSHOP AGENDA

PART I – THE PRESENT

• A Quest for Meaning

  • On the open web

  • In the business world

• The principle of least power

• The rise of schema.org

• Intro to schema.org

• Under the hood of schema.org

• Extending schema.org

• Applications of schema.org

  • Rank

  • Analytics

  • Search

THE WORKSHOP AGENDA

PART II – THE FUTURE

• The reporting horizon

• The goal – to test the idea of further simplification of reporting

• The relevant development:

  • Semantics for XBRL

  • The movement from within

• What have we done so far?

  • MakoLab POCs & exercises

• A vision for the future steps

• Discussion

The Present

SCHEMA.ORG and its existing applications

INTRO: A QUEST FOR MEANING ON THE WEB

SHORTLY AFTER THE WEB WAS INVENTED WE NOTICED THAT:

• The Web is (mostly) a Mess

• Metadata becomes (very often) Meta-Crap (after: Cory Doctorow*)

• There is no such thing as an Esperanto of the Web (despite its importance, English is not a lingua franca)

• Trust is lost – people of the Web (often) live in echo chambers

THE WEB WAS IN DEEP NEED OF A PRAGMATIC APPROACH

* https://www.well.com/~doctorow/metacrap.htm

INTRO: WEB FULL OF MEANING INVENTION

TO COUNTERBALANCE THE MESS …

• The “Web Full of Meaning” was invented (a.k.a. the “Semantic Web” or Web 3.0)

• Web gurus borrowed a fundamental term from philosophy – ONTOLOGY – to name their Vocabularies.

• Using Ontologies (aka Vocabularies) they started to create and promote new models for Data (Linked Data, Graph Data, Smart Data)

INTRO: DISSATISFACTION

HOWEVER…

• Most of the results were (so far) only good for academic research

• Almost none of our ontologies enjoyed wide adoption

• Promises to build Web 3.0 quickly went unfulfilled

THE WEB WAS IN DEEP NEED OF A PRAGMATIC APPROACH

MEANWHILE IN THE BUSINESS WORLD …

MULTIPLICITY of STANDARDS and PROJECTS

• ISO20022, FpML, FIX, MISMO, XBRL, DPM, RIXML, IFX, OFX, BPM6, SDMX, SDDS, MDDL, ACORD

• FIBO, ACTUS, DPM2ISO, SMCube

From: Michał Piechocki, „Trusted Open Data Ecosystems”, Data Amplified 2016

SIR TIM BERNERS-LEE

PRINCIPLE OF LEAST POWER - 1998

• Principle: Powerful languages inhibit information reuse.

• Good Practice: Use the least powerful language suitable for expressing information, constraints or programs on the World Wide Web.

• Tradeoff: Choosing between languages that can solve a broad range of problems and languages in which programs and data are easily analyzed.


…EARLY ATTEMPTS TO ACT WITH LESS POWER ..

• MCF (Meta Content Framework) – R. Guha 1995-1997

https://en.wikipedia.org/wiki/Meta_Content_Framework

• SHOE - Simple HTML Ontology Extensions – Sean Luke, Lee Spector, James Hendler, Jeff Heflin, and David Rager, 1996

https://en.wikipedia.org/wiki/Simple_HTML_Ontology_Extensions

• RSS - RDF Site Summary – Dan Libby and Ramanathan V. Guha, 1999

• MICROFORMATS (μF) – a grassroots movement, 2005

https://en.wikipedia.org/wiki/Microformat

NONE OF THEM RECEIVED WIDESPREAD ADOPTION !!!!

…SO SCHEMA.ORG WAS INVENTED!

• Schema.org (2011), sponsored by the most important search engines (Google, Microsoft, Yahoo and Yandex), is a large-scale collaborative activity with a mission to create, maintain, and promote schemas for structured data on web pages and beyond.

• It contains more than 2000 terms: 753 types, 1207 properties and 220 enumerations.

• Schema.org covers entities, relationships between entities and actions.

• Today, about 15 million sites use schema.org. Random yet representative crawls (Web Data Commons) show that about 30% of URLs on the web return some form of triples from schema.org.

• Many applications from Google (Knowledge Graph), Microsoft (like Cortana), Pinterest, Yandex and others already use schema.org to power rich experiences.

• Think of schema.org as a global Vocabulary for the web, transcending domain and language barriers.

• The principal authors of the schema.org conceptual framework are R. Guha, D. Brickley and P. Mika


http://bl.ocks.org/danbri/raw/1c121ea8bd2189cf411c/

WHAT IS SCHEMA.ORG?

http://schema.org

THE SIMPLICITY OF USING SCHEMA.ORG – AN ILLUSTRATION

http://finances.makolab.com/HTML/LoanStudents/LoanStudents.html

Under the hood OF SCHEMA.ORG

UNDER THE HOOD OF SCHEMA.ORG

DESIGN DECISIONS

• „The driving factor in the design of Schema.org was to make it easy for webmasters to publish their data. In general, the design decisions place more of the burden on consumers of the markup.”

R.V. GUHA, D. BRICKLEY, S. MACBETH – „Schema.org - Evolution of Structured Data on the Web”

DATA MODEL

• Derived from RDFS (RDF Schema)

• Multiple inheritance hierarchy

• POLYMORPHIC PROPERTIES - Each property may have one or more types as its domain and its range („domainIncludes” and „rangeIncludes”)
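To make the polymorphic-property idea concrete, here is a minimal sketch in Python. The `PROPERTIES` table is a hand-written, incomplete excerpt assumed for illustration (it is not loaded from schema.org), and `is_valid_use` is a hypothetical helper that ignores type inheritance.

```python
# Hand-written, illustrative excerpt; the real vocabulary lists more
# types for these properties. Treat every entry as an assumption.
PROPERTIES = {
    "amount": {
        "domainIncludes": {"MoneyTransfer", "LoanOrCredit", "InvestmentOrDeposit"},
        "rangeIncludes": {"MonetaryAmount", "Number"},
    },
    "currency": {
        "domainIncludes": {"MonetaryAmount", "ExchangeRateSpecification"},
        "rangeIncludes": {"Text"},
    },
}


def is_valid_use(prop: str, subject_type: str, value_type: str) -> bool:
    """Check a (subject type, property, value type) triple against the
    domainIncludes / rangeIncludes sets. Inheritance is ignored here."""
    definition = PROPERTIES.get(prop)
    if definition is None:
        return False
    return (
        subject_type in definition["domainIncludes"]
        and value_type in definition["rangeIncludes"]
    )


# One property, several admissible domains and ranges:
print(is_valid_use("amount", "MoneyTransfer", "MonetaryAmount"))  # True
print(is_valid_use("amount", "LoanOrCredit", "Number"))           # True
print(is_valid_use("amount", "MoneyTransfer", "Text"))            # False
```

Unlike a single `rdfs:domain` / `rdfs:range` pair, each set here can hold several unrelated types, which is what lets one property name be reused across the vocabulary.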

UNDER THE HOOD OF SCHEMA.ORG

USAGE MODELS

• Under full control of site/messages/data publishers

• Data EMBEDDED into page, data representation or into message markup (HTML, XML)

• Harvested during standard crawling, message or data processing

SERIALIZATIONS

• RDFa - CANONICAL

• Microdata (native to HTML5)

• JSON-LD

UNDER THE HOOD OF SCHEMA.ORG

EXTENSION MECHANISM: RULES FOR URIs

Rules:

              | Documentation URI              | Canonical URI
CORE          | http://schema.org/<term>       | http://schema.org/<term>
HOSTED EXT.   | http://<ext>.schema.org/<term> | http://schema.org/<term>
EXTERNAL EXT. | http://<ext.domain>/<term>     | http://<ext.domain>/<term>

Examples:

              | Documentation URI                  | Canonical URI
CORE          | http://schema.org/Car              | http://schema.org/Car
HOSTED EXT.   | http://auto.schema.org/Motorcycle  | http://schema.org/Motorcycle
EXTERNAL EXT. | http://fibo.org/voc/BusinessEntity | http://fibo.org/voc/BusinessEntity
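The URI rules can be written as a small function. This is a sketch under the assumption that a hosted extension is recognizable purely by a `*.schema.org` host name; `canonical_uri` is a hypothetical helper, not part of any schema.org tooling.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_uri(documentation_uri: str) -> str:
    """Map a documentation URI to its canonical URI.

    Core and external-extension terms keep their URI; terms of hosted
    extensions (http://<ext>.schema.org/<term>) collapse onto the core
    namespace http://schema.org/<term>.
    """
    parts = urlsplit(documentation_uri)
    host = parts.hostname or ""
    if host.endswith(".schema.org"):
        # Hosted extension: drop the extension subdomain.
        return urlunsplit(
            (parts.scheme, "schema.org", parts.path, parts.query, parts.fragment)
        )
    return documentation_uri


for uri in (
    "http://schema.org/Car",
    "http://auto.schema.org/Motorcycle",
    "http://fibo.org/voc/BusinessEntity",
):
    print(uri, "->", canonical_uri(uri))
```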

UNDER THE HOOD OF SCHEMA.ORG

EXAMPLES – MICRODATA

<div itemscope itemtype="http://schema.org/MoneyTransfer">
  <h1>If you want to donate</h1>
  Send <span itemprop="amount" itemscope itemtype="http://schema.org/MonetaryAmount">
    <span itemprop="value">30</span>
    <span itemprop="currency" content="USD">$</span>
  </span>
  via bank transfer to the
  <span itemprop="beneficiaryBank">European ExampleBank, London</span>
  Put "<i itemprop="name">Donate wikimedia.org</i>" in the transfer title.
</div>

UNDER THE HOOD OF SCHEMA.ORG

EXAMPLES – RDFa

<div vocab="http://schema.org/" typeof="MoneyTransfer">
  <h1>If you want to donate</h1>
  Send <span property="amount" typeof="MonetaryAmount">
    <span property="value">30</span>
    <span property="currency" content="USD">$</span>
  </span>
  via bank transfer to the
  <span property="beneficiaryBank">European ExampleBank, London</span>
  Put "<i property="name">Donate wikimedia.org</i>" in the transfer title.
</div>

UNDER THE HOOD OF SCHEMA.ORG

EXAMPLES – JSON-LD

<script type="application/ld+json">
{
  "@context": "http://schema.org/",
  "@type": "MoneyTransfer",
  "name": "Donate wikimedia.org",
  "amount": {
    "@type": "MonetaryAmount",
    "value": "30",
    "currency": "USD"
  },
  "beneficiaryBank": "European ExampleBank, London"
}
</script>
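On the consumer side, JSON-LD is the easiest of the three serializations to harvest, because the data sits in one `script` element. The sketch below uses only the Python standard library; the `PAGE` sample is an assumption modelled on the donation example (written with the `MoneyTransfer` type and the `value` property of `MonetaryAmount`), and `JsonLdExtractor` / `extract_jsonld` are hypothetical names.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed content of every application/ld+json script."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.items.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html: str) -> list:
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Assumed sample page, modelled on the donation example.
PAGE = """
<html><body>
<script type="application/ld+json">
{"@context": "http://schema.org/",
 "@type": "MoneyTransfer",
 "name": "Donate wikimedia.org",
 "amount": {"@type": "MonetaryAmount", "value": "30", "currency": "USD"},
 "beneficiaryBank": "European ExampleBank, London"}
</script>
</body></html>
"""

transfer = extract_jsonld(PAGE)[0]
print(transfer["@type"], transfer["amount"]["value"], transfer["amount"]["currency"])
```

Microdata and RDFa carry the same statements, but a consumer has to walk the HTML element tree to recover them.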

SCHEMA.ORG VS. ONTOLOGIES AND LINKED DATA

SIMILARITIES

• Common elements: a graph data model of typed entities with named properties

• Schema.org uses the RDFS schema language and the JSON-LD and RDFa syntaxes

• Schema.org shares (with Linked Data and Ontologies) many of the same goals

• Linked data and ontologies have brought to the Web a much smaller number of data sources than Schema.org, but their quality is (often) very high. This opens up many opportunities for combining the two approaches; for example, professionally published ontologies can often authoritatively describe the entities mentioned in Schema.org descriptions from the wider mainstream Web.

DIFFERENCES

• Schema.org's approach can be seen as less noisy and decentralized than Linked Data

• Schema.org promotes syntaxes (microdata, RDFa) that are a tradeoff between machine-friendly and human-friendly formats

• Linked RDF data publication practices have not been adopted in the Web at large

• Schema.org shares the Linked-Data community's skepticism toward the premature ontologies (rule systems, description logics, etc.) found in much of the academic work that is carried out under the Semantic Web banner.

• Schema.org avoids assuming that rule-based processing will be commonplace

• Schema.org’s approach, in contrast to the methodologies of building Linked Data and ontologies, does not assume that various kinds of cleanup, reconciliation, and post-processing will usually be needed before structured data from the Web can be exploited in applications.

• Many frame-based knowledge representation systems, including RDF Schema and OWL, have a single domain and range for each relation. Schema.org assumes polymorphism.

• Schema.org allows for multiple inheritance.

Extensions OF SCHEMA.ORG

UNDER THE HOOD OF SCHEMA.ORG

EXTENSION MECHANISM: SEQUENCE OF SPECIFICITY

CORE | HOSTED EXTENSIONS | EXTERNAL EXTENSIONS

• CORE – „Core, basic vocabulary for describing the kind of entities the most common web applications need”*

• HOSTED/REVIEWED EXTENSIONS – Domain specific basic vocabularies.

• EXTERNAL EXTENSIONS – More specialized, fully independent domain specific vocabularies. Built by a third party.

• Today: autos, finance, bibliography, health & life-sciences, iot

* http://schema.org/docs/extension.html

CREATING EXTENSIONS TO SCHEMA.ORG

AUTOMOTIVE EXTENSION

• Extension URI: auto.schema.org

• Designed as the first phase of the GAO project (Generic Automotive Ontology - http://automotive-ontology.org)

• First step: extending core vocabulary by a minimal set of new terms (May 2015)

• Second step: creating auto.schema.org hosted extension (May 2016)

• Third step: creating POC of the external extension (March 2017)

FINANCIAL EXTENSION

• Extension URI: fibo.schema.org

• Inspiration from FIBO project (Financial Industry Business Ontology – http://fibo.org )

• Going through a BOC (Bag-Of-Concepts) phase and using an „Occam's Razor” approach.

• First step: extending core vocabulary by a minimal set of new terms (May 2016)

• Second step: creating fibo.schema.org hosted extension (published in pending.schema.org, March 2017)

• Third step: creating POC of the external extension (March 2017)

AUTO.SCHEMA.ORG

May 13, 2015 – official introduction of the Automotive extension to schema.org

Collaborative project of Hepp Research GmbH, MakoLab SA and many other individuals.

FIBO.SCHEMA.ORG

Extension of the core vocabulary by a minimal set of new terms (May 2016)

The hosted extension (published March 2017) as pending.schema.org

Collaborative project of an international group of individuals led by MakoLab SA.

Described in: http://schema.org/docs/financial.html

FIBO.SCHEMA.ORG

The financial extension of schema.org refers to the most important real world objects related to banks and financial institutions:

• A bank and its identification mechanism

• A financial product

• An offer to the client

• Described in: http://schema.org/docs/financial.html

CLASSES

Thing
  Action
    TransferAction
      MoneyTransfer
  Intangible
    Service
      FinancialProduct
        BankAccount
          DepositAccount
        CurrencyConversionService
        InvestmentOrDeposit
          BrokerageAccount
          DepositAccount
          InvestmentFund
        LoanOrCredit
          CreditCard
          MortgageLoan
        PaymentCard +
        PaymentService
    StructuredValue
      ExchangeRateSpecification
      MonetaryAmount
      RepaymentSpecification
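DepositAccount is listed twice in the class list above, which is how a tree view renders multiple inheritance. A minimal sketch of that structure follows; the `PARENTS` table is a hand-copied subset of the list, with DepositAccount's two parents (BankAccount and InvestmentOrDeposit) read off its two positions, and `ancestors` is a hypothetical helper.

```python
# Child -> list of direct parents (subset of the class list; a type may
# have more than one parent).
PARENTS = {
    "Action": ["Thing"],
    "TransferAction": ["Action"],
    "MoneyTransfer": ["TransferAction"],
    "Intangible": ["Thing"],
    "Service": ["Intangible"],
    "FinancialProduct": ["Service"],
    "BankAccount": ["FinancialProduct"],
    "InvestmentOrDeposit": ["FinancialProduct"],
    "DepositAccount": ["BankAccount", "InvestmentOrDeposit"],
    "LoanOrCredit": ["FinancialProduct"],
    "MortgageLoan": ["LoanOrCredit"],
}


def ancestors(type_name: str) -> set:
    """Return every supertype reachable through any parent link."""
    seen = set()
    stack = list(PARENTS.get(type_name, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


print(sorted(ancestors("DepositAccount")))
```

A property defined on either parent is therefore usable on a DepositAccount, which is why a consumer has to follow every parent link and not just one.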


FIBO.SCHEMA.ORG

THE BASIC MODELS OF THE FINANCIAL OBJECTS

[Diagrams: A BANK | A DEPOSIT ACCOUNT | A PAYMENT CARD]

CREATING EXTENSIONS – THE ART OF HARD CHOICES

Why is the creation of a schema.org extension the Art of Hard Choices?

• All additions to schema.org, whether to its core or to a „hosted” extension, must meet extremely strict conditions:

• Their number must be minimal compared to the size of the vocabulary of the domain the extension represents. The making of an extension is an endless trade-off between the need for the expressive vocabulary of the domain and the requirement for its minimalism.

• They must represent the CUSTOMER NEEDS and follow „bottom-up” design rules – not the demands of the domain specialists and practitioners.

• The bottom-up approach assumes the „BOC” (Bag Of Concepts) approach, where the elements of the bag stem from the public „discourse” (the search on the web, social media)

• The extensions must reuse the existing schema.org terms wherever possible, even if their current meaning may differ from the expected meaning.

The Applications. I. Rank

WEB SEARCH REDEFINED

FUNDAMENTAL TRENDS IN WEB SEARCH

1. BIGGER SHARE ON THE TRANSACTION

2. RICHER INTERACTION

3. STRONGER INDIVIDUALIZATION

4. DYNAMICS AND VOLATILITY

These slides are based on the work of M. Hepp & M. Sopek "Web Search and Beyond: Digital Marketing for Automotive"

VISUAL FEATURES IN SEARCH ENGINES

• RICH SNIPPETS

• KNOWLEDGE PANEL

• FACTUAL ANSWERS

• TABULAR RESULTS

• And more …

These slides are based on the work of M. Hepp & M. Sopek "Web Search and Beyond: Digital Marketing for Automotive"

CONCRETE BENEFITS

CTR INCREASE EXAMPLE

Rich snippet results in 2nd position received a higher CTR than a standard snippet in 1st position

MEASURE WITH CARE

NEW METRICS NEEDED

What you measure in a traditional way may not reflect your actual performance

Solutions:

• Use KPIs with care

• New metrics based on external resources

• Add granular event handlers

SUMMARY OF “RANK” BENEFITS OF SCHEMA.ORG

• CTR increase (Rich Snippets effect)

• Better Brand visibility (Knowledge Panels and Factual Answers)

• Better Product positioning (Rich snippets & Tabular results)

• Faster way to reach searched content (more sitelinks)

• Better mobile device experience of search

11.09.2015 – Google:

„Over time, I think it [structured markup] is something that might go into the rankings as well. If we can recognize someone is looking for a car, we can say oh well, we have these pages that are marked up with structured data for a car, so probably they are pretty useful in that regard. We don’t have to guess if this page is about a car.”

John Mueller / Webmaster Trends Analyst @Google

WHAT ELSE CAN WE DO WITH SCHEMA?

• While schema.org was invented to help search engines do their job and to help site owners be more reliably discovered and ranked on the Search Engine Results Pages, its benefits are much more profound.

• This is why we say that the power of schema.org goes beyond RANK: it allows you to ANALYZE your site’s market environment better, improves site conversion and LEADS generation, and helps to deliver a new kind of SEARCH capacity for your site!

• What is more, to SEARCH and to ANALYZE you don’t need Google to cooperate

The Applications. II. Analyse

NEW KIND OF DATA ANALYTICS

SCHEMA.ORG DATA IN GOOGLE ANALYTICS

1. The markup in the website’s code

• Schema.org

2. Google Tag Manager

• Additional setup

3. Google Analytics

• Additional Dimensions and Metrics

SCHEMA.ORG DATA IN GOOGLE ANALYTICS

POC 1

Auto
  Model 1 (Name, Brand)
    Version 1 (Model, fuelConsumption, fuelType, numberOfDoors, Color)
    Version 2
    Version 3
  Model 2 (Name, Brand)
    Version 1
    Version 2
    Version 3
  Model 3 (Name, Brand)
    Version 1
    Version 2
    Version 3

SCHEMA.ORG DATA IN GOOGLE ANALYTICS

http://wisem.makolab.pl/ga/model1.html

SCHEMA.ORG DATA IN GOOGLE ANALYTICS

Usage within GA

• Which colour of a car should be used in Display Campaigns or in TV ads for Car1?

• Which engine model of Car1 is most popular online?

• Should we spend campaign money on Sport version or on Eco version?

FINANCIAL EXTENSION SCHEMA.ORG POC

POC 2

• http://finances.makolab.com

• Full use of fibo.schema.org

• Definitions of financial dimensions

• Analytics with Google “GA”

FINANCIAL EXTENSION SCHEMA.ORG POC

POC’s page             | JSON property        | Dimension                            | Dimension name
BankAccount.html       | price                | Bank Account Fee                     | Price
                       | name                 | Financial Product Name               | Financial Product Name
BrokerageAccount.html  | minValue             | Brokerage Account Minimum Investment | Minimum
                       | name                 | Financial Product Name               | Financial Product Name
CreditCard.html        | annualPercentageRate | Credit Card APR                      | Percentage Rate
                       | minValue             | Credit Card Required Collateral      | Minimum
                       | price                | Credit Card Annual Fee               | Price
                       | name                 | Financial Product Name               | Financial Product Name
CreditCard8.html       | name                 | Financial Product Name               | Financial Product Name
                       | minValue             | Credit Card Limit                    | Minimum
PaymentService.html    | name                 | Financial Product Name               | Financial Product Name
FinancialProducts.html | name                 | Financial Product Name               | Financial Product Name
                       | minValue             | Minimum Insurance Coverage           | Minimum
                       | maxValue             | Maximum Insurance Coverage           | Maximum
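The property-to-dimension mapping in the table can be sketched as a lookup applied to a page's JSON-LD. This is an illustration only: `CREDIT_CARD` is an invented sample (not the markup of the POC pages), `find_values` and `dimensions_for` are hypothetical helpers, and the actual POC passes the values through Google Tag Manager and not through Python.

```python
# JSON property -> analytics dimension name, taken from the table.
DIMENSIONS = {
    "name": "Financial Product Name",
    "price": "Price",
    "minValue": "Minimum",
    "maxValue": "Maximum",
    "annualPercentageRate": "Percentage Rate",
}


def find_values(node, key):
    """Yield every scalar value stored under `key`, at any nesting depth."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and not isinstance(v, (dict, list)):
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


def dimensions_for(markup: dict) -> dict:
    """Build {dimension name: first matching value} for one page."""
    result = {}
    for prop, dimension in DIMENSIONS.items():
        values = list(find_values(markup, prop))
        if values:
            result[dimension] = values[0]
    return result


# Invented sample markup for a credit card page.
CREDIT_CARD = {
    "@context": "http://schema.org/",
    "@type": "CreditCard",
    "name": "Example Gold Card",
    "annualPercentageRate": 18.5,
    "offers": {"@type": "Offer", "price": 25, "priceCurrency": "USD"},
}

print(dimensions_for(CREDIT_CARD))
```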


TRUE DATA ANALYTICS

SCHEMA.ORG DATA IN GOOGLE ANALYTICS

PROS:

• Analyse additional information available in Schema markup right in Web Analytics.

• Better insights into what people look at on the website. Deeper understanding of users’ needs.

• Better conclusions for website’s UX optimization.

• Better conclusions for campaigns optimization.

CONS:

• None.

The Applications. III. Search

ADD SMART SEARCH TO YOUR SITES

INTELLIGENT/SMART SEARCH BASED ON SCHEMA.ORG MARKUP

1. Mark your product data with schema.org markup

2. Run the smart Search Crawler for an Enterprise Website

3. Check for schema.org markup (Microdata or JSON-LD)

4. When markup is found, create property map and assign values

5. Display enhanced search results
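The "check for markup, then create a property map" steps could look roughly like the sketch below, which scans one page for Microdata with the Python standard library. It is a deliberate simplification and not the crawler used in the POC: nested items are flattened into one map, repeated properties overwrite each other, and `MicrodataScanner` and the `PAGE` sample are assumptions made up for the illustration.

```python
from html.parser import HTMLParser


class MicrodataScanner(HTMLParser):
    """Collect itemtype URIs and a flat itemprop -> value map."""

    def __init__(self):
        super().__init__()
        self.item_types = []
        self.properties = {}
        self._open = []  # stack of (tag, itemprop or None, text parts or None)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            self.item_types.append(attrs["itemtype"])
        prop = attrs.get("itemprop")
        if prop and "content" in attrs:
            # Explicit machine-readable value, e.g. <meta itemprop content>.
            self.properties[prop] = attrs["content"]
            self._open.append((tag, None, None))
        elif prop and "itemscope" not in attrs:
            # Value is the element's text; collect it until the end tag.
            self._open.append((tag, prop, []))
        else:
            self._open.append((tag, None, None))

    def handle_data(self, data):
        for _, prop, parts in self._open:
            if prop is not None:
                parts.append(data)

    def handle_endtag(self, tag):
        if not any(entry[0] == tag for entry in self._open):
            return  # stray end tag
        while self._open:
            open_tag, prop, parts = self._open.pop()
            if prop is not None:
                self.properties[prop] = "".join(parts).strip()
            if open_tag == tag:
                break


# Invented product page used only for this illustration.
PAGE = """
<div itemscope itemtype="http://schema.org/Product">
  <h1 itemprop="name">Example Silicone Foam</h1>
  <span itemprop="color">White</span>
  <meta itemprop="sku" content="EX-2370">
</div>
"""

scanner = MicrodataScanner()
scanner.feed(PAGE)
scanner.close()
print(scanner.item_types, scanner.properties)
```

A page with no `itemtype` simply yields an empty map, so the crawler can fall back to plain full-text indexing for it.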

INTELLIGENT/SMART SEARCH BASED ON SCHEMA.ORG MARKUP

Corporate product page + microdata

http://nusil.com/product/r-2370_rtv-silicone-rubber-foam

INTELLIGENT/SMART SEARCH BASED ON SCHEMA.ORG MARKUP

UNDER THE HOOD…

[Diagram labels: WebSite | Crawler | Microdata found | Semantic Data | Indexer (Lucene)]

INTELLIGENT/SMART SEARCH BASED ON SCHEMA.ORG MARKUP

SEARCH AGAINST BOTH CONCEPTS AND THEIR PROPERTIES’ VALUES

The real values taken from existing data found by crawler within the marked website pages

INTELLIGENT/SMART SEARCH BASED ON SCHEMA.ORG MARKUP

SEARCH AGAINST MULTIPLE CRITERIA
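Once every crawled page has a property map, searching against several criteria reduces to filtering on property values. The sketch below uses an in-memory dictionary as a stand-in for the Lucene index named on the previous slide; the `INDEX` contents and the `search` helper are invented for illustration.

```python
# Invented index: page URL -> property map harvested from its markup.
INDEX = {
    "/product/a": {"type": "Product", "color": "White", "material": "Silicone"},
    "/product/b": {"type": "Product", "color": "Black", "material": "Silicone"},
    "/product/c": {"type": "Product", "color": "White", "material": "Epoxy"},
}


def search(index: dict, **criteria) -> list:
    """Return the URLs whose property map matches every criterion
    (case-insensitive exact match on the property value)."""
    hits = []
    for url, props in index.items():
        if all(
            str(props.get(name, "")).lower() == str(wanted).lower()
            for name, wanted in criteria.items()
        ):
            hits.append(url)
    return hits


print(search(INDEX, color="white", material="silicone"))
```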

Practical Session

How to use it?

USING SCHEMA.ORG IS EASY!

POC for financial domain: http://finances.makolab.com

IMPLEMENTATION STEPS:

• Understand the domain your website belongs to

• Find schema.org types and properties that can be used to mark up your data

• Add markup to your web pages – use types and properties properly!

• As a general rule, you should mark up only the content that is visible to people who visit the web page

• The more content you mark up, the better

• Test your markup (use: Google’s rich snippets testing tool)

The Future

Reporting

PERSPECTIVES FOR BUSINESS REPORTING

SCHEMA.ORG EXTENSIONS

THE REPORTING HORIZON

BUSINESS REPORTING and BUSINESS INFORMATION EXCHANGE ARE DOMINATED BY THE XBRL STANDARD

• However, the cost of filing financial reports is still quite high, particularly for small companies*

• This is why, in the US, the “Small Company Disclosure Simplification Act”:

“(…) exempts emerging growth companies and issuers with total annual gross revenues of less than $250 million from the requirement to use Extensible Business Reporting Language (XBRL) for financial statements and other mandatory periodic reporting filed with the Securities and Exchange Commission (SEC). Such companies, however, may elect to use XBRL for such reporting.”

* $2,000 to $25,000 per year according to XBRL US.

THE PURPOSE OF THIS PART OF THE WORKSHOP

TO DISCUSS THE POSSIBILITY OF DEEPER SIMPLIFICATION OF XBRL by adoption of schema.org principles

• We have performed a series of simple technical exercises that pave the initial path for further studies

• While we neither propose any new standard here nor want to shake the foundations of the old one, we think it is worth considering whether schema.org principles offer a way to make business reporting even simpler and more accessible.

THE RELEVANT DEVELOPMENT

THE USE OF SEMANTIC WEB STANDARDS

• „Publishing XBRL as Linked Open Data” (Roberto García & Rosa Gil, Universitat de Lleida)

• „Triplificating and linking XBRL financial data” (Roberto García & Rosa Gil, Universitat de Lleida)

• „Adopting Semantic Technologies for Effective Corporate Transparency” (Maria Mora-Rodriguez, Ghislain Auguste Atemezing, Chris Preist)

• „Financial Report Ontology” (Charles Hoffman)

THE RELEVANT DEVELOPMENT

THE EVOLUTION WITHIN XBRL WORLD

• „Open Information Model” - https://specifications.xbrl.org/work-product-index-open-information-model-open-information-model.html
  The Open Information Model provides a syntax-independent model for XBRL data, allowing reliable transformation of XBRL data into other representations. The work product includes: xBRL-XML, xBRL-JSON, xBRL-CSV, OIM Common.

• XBRLS - XBRL Simple Application Profile (how a simpler XBRL can make a better XBRL)

• Inline XBRL - https://specifications.xbrl.org/spec-group-index-inline-xbrl.html

THE EXERCISES

HOW COULD IT WORK?

POC: FIBO as schema.org external extension

• The extension URI: http://fibo.org/voc/

• The conversion from FIBO-V (SKOS-compliant ontology)

• The markup example based on the extension

HOW COULD IT WORK?

INITIAL EXERCISE “I” – XBRL „Hello World” * expressed as schema.org compliant markup

• Converting taxonomy (XSD) to OWL ontology (with help of: http://rhizomik.net/html/redefer/)

• Writing schema.org compliant JSON-LD markup

* Charles Hoffman: http://xbrl.squarespace.com/journal/2008/12/18/hello-world-xbrl-example.html

HOW COULD IT WORK?

INITIAL EXERCISE “II” – iXBRL example

• Based on https://www.xbrl.org/ixbrl-samples/valeo-income-statement.html

• Expression of the data semantics in JSON-LD – schema.org compliant markup

HOW COULD IT WORK?

INITIAL EXERCISE “III” – GAAP TAXONOMY IN SCHEMA.ORG FORMAT

• Source: PROPOSED 2018 US GAAP FINANCIAL REPORTING TAXONOMY

• How: Extracting parent-child taxonomy with the definitions of terms + schema.org-like RDFa formatting of the obtained model

• Result: http://sdo-gaap-ee.appspot.com/GrossProfit
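The "parent-child taxonomy + schema.org-like RDFa" step might be sketched as follows. The output layout imitates the RDFa style of schema.org term definitions (`rdfs:Class`, `rdfs:label`, `rdfs:comment`, `rdfs:subClassOf`); the parent name and the definition text in the usage example are placeholders invented for illustration, not values from the GAAP taxonomy, and `term_to_rdfa` is a hypothetical helper.

```python
from html import escape
from typing import Optional

BASE = "http://sdo-gaap-ee.appspot.com/"  # namespace of the result URL above


def term_to_rdfa(name: str, comment: str, parent: Optional[str] = None) -> str:
    """Render one taxonomy term as a schema.org-like RDFa class definition."""
    lines = [
        f'<div typeof="rdfs:Class" resource="{escape(BASE + name)}">',
        f'  <span class="h" property="rdfs:label">{escape(name)}</span>',
        f'  <span property="rdfs:comment">{escape(comment)}</span>',
    ]
    if parent:
        lines.append(
            f'  <span>Subclass of: <a property="rdfs:subClassOf" '
            f'href="{escape(BASE + parent)}">{escape(parent)}</a></span>'
        )
    lines.append("</div>")
    return "\n".join(lines)


# Placeholder definition and placeholder parent, for illustration only.
print(term_to_rdfa("GrossProfit", "Placeholder definition text.", parent="PlaceholderParent"))
```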


A VISION for the future steps

CREATION of SCHEMA.ORG extensions and their applications

• Step I – the external extension based on selected XBRL taxonomy (like GAAP or IFRS)

• Step II – the external extension based on selected SBR taxonomy

• Creation of implementation guidelines and live POC

• Working with interested parties on the real-life tests

• Critical evaluation of the project

• If successful - working on the HOSTED EXTENSION to schema.org

• In general - Adopting the philosophy of bottom-up, empirical approach to the creation


Discussion

Let’s evaluate the soundness of the ideas presented here …

THANK YOU

Full slide deck at: http://ml.ms/makolabtode

PLEASE CONTACT US!

DR. MIREK SOPEK, CTO

[email protected]

Poland: MakoLab SA, Demokratyczna 46, 93430 Lodz, Poland. Phone: +48 600 814 537, www.makolab.com

USA: Makolab USA Inc, 20 West University Ave, Gainesville, FL 32601. Phone: +1 551 226 5488, www.makolab.com

Dr ROBERT TRYPUZ

INDUSTRY: MakoLab SA, Rzgowska 30, 93-172 Łódź, Poland, [email protected]

ACADEMIA: JPII University, Lublin, [email protected]