Presentation given at EDW 2012 (Atlanta)
Semantic MediaWiki Approach to Metadata
Scott E. Thompson
Manager - Data Architecture
Ontario Teachers’ Pension Plan
2
1. Why? 2. SMW? 3. The PoC 4. The Unexpected 5. Wrap Up
Agenda
Why?
Mashup of slides I’ve used before…
– What is Semantic MediaWiki?
– Proof of Concept
– The Unexpected
Wrap Up
Questions
3
pinterest.com/thompland777
SELECT ?Person WHERE {
  ?Person :hasExperience :Semantic_Technologies .
  ?Person :hasExperience :Meta_Data .
  ?Person :hasExperience :Capital_Markets .
}
4
Ontario Teachers’ Pension Plan
• Fixed Income
• Public Equities
• Private Capital
• Real Estate
• Infrastructure
• Foreign Currency
• Commodities
• Hedge Funds
5
The Challenge: Metadata
6
Current: Low Confidence
[Diagram: IT, ETL and Data Warehouse, with the steps Correct Trade, Reload Data, Rerun Report and Reload; the result is shown as "42?"]
7
Future: Nirvana
8
Business Requirements
• Findability of Data
• Ownership of Data
• Data Quality
• Consistent Business Terminology
Added later…
• Ownership of Metadata
• Metadata Quality
9
Business Requirements
• Allows business users and end users to gain the required insight into what the data and reports they are looking at mean
• Makes data available and visible to others
• Creates a searchable set of information about the firm’s data. This allows data developers and users to search for existing data and avoid data duplication.
• Provides a platform for sharing and publicizing data. This reduces the workload of developers (interfaces, reports, etc.) and users and increases efficiency.
• Quality control, data restrictions and uses can be applied to the entire data set.
• Metadata documentation transcends people and time. Staff turnover and balancing of multiple projects can be mitigated with metadata, providing data permanence and the documentation of institutional knowledge.
Value of Meta Data & Meta Data Tool
10
MDM?
MDM could stand for Master Data Management or Meta Data Management… coincidence?
“Let’s go get all the key pieces of data and put them in one place, which is really more of an enterprise data warehouse but master data management then says… it’s almost a map… here is what each of those data fields are, here is how you can find them, here is what they mean, here is where they came from.”
Blake Johnson, Consulting Professor, Stanford University
“The Truth and Power of Master Data Management” (Teradata)
http://www.youtube.com/watch?feature=player_embedded&v=p6VHpIlDfu4#!
11
[Diagram: business functions from Pre-Trade to Post-trade]
• Investment Strategy & Planning
• Securities Operations
• Portfolio Accounting
• Performance
• Market Risk Management
• Credit & Counterparty Risk Management
• Liquidity Risk Management
• Compliance
• Collateral & Cash Management
• Portfolio Research & Analytics
• Trade & Deal Management
• Total Fund Reporting
• Reconciliation
Inputs repeated across the diagram: Trades, Market Context, Model, Business Context
V = f(trade, market context, model, business context)
One Truth?
12
What is a Wiki?
• Hawaiian for “quick”
• Allows large numbers of people to create and edit the same content
• Effective for reaching a credible consensus from a large group
• Wikipedia is the world’s largest collaboratively edited source of encyclopedic knowledge
13
What is the Semantic Web?
14
MediaWiki (Web 2.0)
15
Semantic MediaWiki (Web 3.0)
16
Future Opportunities
Simple search algorithms would suffice to provide a precise answer to the question…
17
Faceted Search
18
Graphs (relate/infer)
otpp:Index-Linked Bond subClassOf otpp:Debt
otpp:Fixed-Rate Bond subClassOf otpp:Debt
otpp:Amortizing Index-Linked Bond subClassOf otpp:Index-Linked Bond
otpp:subtypeOf
otpp:Index-Linked Bond sameAs dbpedia:Inflation-Linked Bond
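The "relate/infer" idea on this slide can be sketched in a few lines of plain Python. This is only an illustration, not the tooling used in the PoC: the triples are a reading of the slide's graph (the edge directions are an assumption), and the helper names `superclasses` and `aliases` are hypothetical.

```python
# Triples are (subject, predicate, object). The terms come from the slide;
# the edge directions are an assumed reading of the diagram.
TRIPLES = [
    ("otpp:Index-Linked Bond", "subClassOf", "otpp:Debt"),
    ("otpp:Fixed-Rate Bond", "subClassOf", "otpp:Debt"),
    ("otpp:Amortizing Index-Linked Bond", "subClassOf", "otpp:Index-Linked Bond"),
    ("otpp:Index-Linked Bond", "sameAs", "dbpedia:Inflation-Linked Bond"),
]


def superclasses(term, triples):
    """Return every class reachable from `term` by following subClassOf edges."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def aliases(term, triples):
    """Return terms linked to `term` by sameAs, in either direction."""
    linked = set()
    for subject, predicate, obj in triples:
        if predicate != "sameAs":
            continue
        if subject == term:
            linked.add(obj)
        elif obj == term:
            linked.add(subject)
    return linked


if __name__ == "__main__":
    print(superclasses("otpp:Amortizing Index-Linked Bond", TRIPLES))
    print(aliases("otpp:Index-Linked Bond", TRIPLES))
```

Nobody stated that an amortizing index-linked bond is debt; it follows from chaining two subClassOf edges, and the sameAs edge is what connects the internal term to an external vocabulary.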
19
Who Needs Consistency?
20
Linked Open Data Graph (OLD)
21
FIBO
22
Proof of Concept
Build a knowledgebase about:
• Our structured data (schemas, tables, columns)
• Our business terminology (business process, products, attributes)
Prove that the technology could:
• Automatically load technical metadata and relate it with business metadata
• Customize workflow to collect and govern the manual business input
23
Data Architecture Ontology
Classes: Table, Schema, Schema Group
Relations: IsPartOfA, BelongsToA
Instances: TOOLKIT, CORE, PRODUCT, FUNCTIONAL, BUAD
Instances: ACCTMREFMKTFIQR
Instances: Table1, Table2, View1, View2
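One way to picture "automatically load technical metadata" against this ontology is to turn catalog rows into wiki pages carrying Semantic MediaWiki inline annotations of the form `[[Property::Value]]`. The sketch below is hypothetical: it assumes Table IsPartOfA Schema and Schema BelongsToA Schema Group (the slide does not show which relation sits on which edge), the schema names `SCHEMA_A` and `SCHEMA_B` are invented, and the actual loader used in the PoC is not described in the deck.

```python
def build_pages(rows):
    """Map page title -> wikitext for (table, schema, schema_group) rows.

    Assumed relation directions (not confirmed by the slide):
    Table IsPartOfA Schema, Schema BelongsToA Schema Group.
    """
    pages = {}
    for table, schema, group in rows:
        pages[table] = f"[[Category:Table]]\n[[IsPartOfA::{schema}]]"
        pages[schema] = f"[[Category:Schema]]\n[[BelongsToA::{group}]]"
        pages[group] = "[[Category:Schema Group]]"
    return pages


# Hypothetical catalog extract; only Table1, Table2, View1, CORE and PRODUCT
# appear on the slide, the schema names are made up for illustration.
ROWS = [
    ("Table1", "SCHEMA_A", "CORE"),
    ("View1", "SCHEMA_A", "CORE"),
    ("Table2", "SCHEMA_B", "PRODUCT"),
]
PAGES = build_pages(ROWS)

if __name__ == "__main__":
    for title, text in PAGES.items():
        print(f"== {title} ==\n{text}\n")
```

The output is a plain dictionary, so the generated text can be reviewed before anything is written to a wiki; pushing the pages would need a separate import step that is not sketched here.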
24
25
Data Management Ontology
Table
Classes: Table, Quality State, Organizational Group, SLA
Relations: hasDataOwner, hasDataSteward, hasA
Instances: User, Authoritative
Instances: Investment Division – Asset Mix & Risk, Finance Division – Data Management
Instances: SLA1, SLA2
26
27
28
29
30
Workflow
31
32
Product Attribute Ontology
Classes: Product Group, Product, Product Attribute, Quality Test, Stored Procedure, Table, Column
Relations: CallsA, ReferencesA, hasDMQualityTest, belongsToA, hasA, getsDataFrom, hasAttribute
Instances: Missing, Stale, Null Value, Comparative, Tolerance, Changed
Focus on this data entry form
Metadata to be curated by DM
Metadata to be curated by AM&R
33
% Sourced from Core Schemas?
{{#sparql:
SELECT DISTINCT ?Product ?Product_attribute ?Column ?Schema
WHERE {
  ?Product property:HasAttribute ?Product_attribute .
  ?Product_attribute property:GetsDataFrom ?Column .
  ?Column MDM:belongsToSchema ?Schema .
}
|merge=true|link=all}}
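The query returns one row per product attribute with the column and schema it is sourced from. The percentage in the slide title could then be derived along the following lines. This is a sketch under stated assumptions: the function name `percent_from_core`, the sample rows and the set of core schemas are all hypothetical, and the deck does not show how the figure was actually calculated.

```python
from collections import defaultdict


def percent_from_core(rows, core_schemas):
    """Per product, the share of attribute rows sourced from a core schema.

    `rows` are (product, attribute, column, schema) tuples, mirroring the
    four variables selected by the query.
    """
    totals = defaultdict(int)
    from_core = defaultdict(int)
    for product, _attribute, _column, schema in rows:
        totals[product] += 1
        if schema in core_schemas:
            from_core[product] += 1
    return {p: 100.0 * from_core[p] / totals[p] for p in totals}


# Hypothetical result set and core-schema list, for illustration only.
SAMPLE_ROWS = [
    ("Bond", "Coupon", "COL_1", "CORE_A"),
    ("Bond", "Maturity", "COL_2", "CORE_A"),
    ("Bond", "Issuer", "COL_3", "CORE_B"),
    ("Bond", "Rating", "COL_4", "LOCAL_X"),
    ("Swap", "Notional", "COL_5", "LOCAL_X"),
]
RESULT = percent_from_core(SAMPLE_ROWS, {"CORE_A", "CORE_B"})

if __name__ == "__main__":
    for product, pct in RESULT.items():
        print(f"{product}: {pct:.1f}% sourced from core schemas")
```

Counting rows rather than distinct attributes keeps the sketch short; if one attribute can draw on several columns, the rows would need de-duplicating first.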
34
Data Management Indexes
35
36
It’s a New Kind of Database!
37
SMW+ in a nutshell
• WYSIWYG extension
• Enhanced Retrieval Extension
• Deployment Framework
• MediaWiki
• Semantic MediaWiki
• Web Server
“The smartest organizations are not those with the smartest people but those with the quickest access to their collective knowledge”
- Rod Collins (wiki-management.com)