Semantic Applications for Financial Services: Presentation to the Silicon Valley Semantic Technology Group
David Newman
Strategic Planning Manager
Enterprise Technology Architecture and Planning
Wells Fargo Bank
January 14, 2010
Agenda
The Case for Semantic Technology
  Key Enterprise Business and IT Drivers for Semantic Technology
  Limitations of Conventional Integration and Database Technologies
  Benefits of Semantic Technology
Overview of Semantic Technology
  Origins, Ontology Model, Basic Principles, Languages, Basic Concepts
Semantic Technology Providers and Adopters
Semantic Applications for Financial Services
  Use Cases: Business and Technology Perspectives
Implications for Enterprise Architecture and Data Management Organizations
Recommended Semantic Technology Books and Articles
Disclaimer: The content in this presentation represents only the views of the presenter and does not represent or imply acknowledged adoption by Wells Fargo Bank
The Case for Semantic Technology
Key Business and IT Drivers for Semantic Technology
Problem: Enterprise data fragmentation as a result of:
  incompatible data meanings, definitions, and vocabulary
  multiple incompatible physical data formats and structures
  proliferation of unstructured data
  multiple heterogeneous data stores across multiple siloed organizations with redundant data
Impact:
  Results in less than optimum information/knowledge quality
  Dilutes effectiveness and business value of data
  Data integration is costly and difficult to achieve
  Negatively impacts enterprise bottom line and increases risk
Goal: Reintegrate the fragmented meanings and instances of data
Limitations of Conventional Integration and Database Technologies
[Figure: Conventional Technology Data Definition and Access Patterns. Labels: Data Schema; New Data Entity; Physical Database; New Physical Table for New Entity; Application Software; Business Rules in Code. Arrows: Access, Update, Define.]
Knowledge is encapsulated in opaque software
  Often represented in proprietary software and programs
  Hard to access, should be an institutional asset
Challenge to normalize disparate data from multiple sources
Limitations of Conventional Integration and Database Technologies (continued)
Data schemas reflect limited knowledge
  conceptual model or framework used to describe a pattern or a set of data structures
  segregation between the schematic structure of the data and programmatic logic or rules that are invoked at runtime to classify data
  limited to data structures and data constraints, but not to richer categorizations and rules
Data organization is tightly coupled with the schema
  physical representation of the data is dependent upon the content of the schema that defines the data
  change to the schema often requires, or results in, change to the physical representation of the data
Limitations of Conventional Integration and Database Technologies (continued)
Schemas support limited data integrity
  no inherent ability to define and manage real integrity constraints
  only very basic primitive data type checking or referential integrity checking is possible
  becomes a requirement challenge for the tools or programs that populate and access the data store
  challenge for the labor intensive quality assurance efforts to vet multiple error conditions
The problem of localization
  Localization is the process of gathering, collecting and concentrating data from disparate data sources into a common local data store
  the same source data may be localized redundantly by many systems
Benefits of Semantic Technology
[Figure: Semantic Technology Data Definition and Access Patterns. Labels: New Data Entity; Ontology / Semantic Schema; Physical Database; Some Business Rules Added to Ontology; Application Software; Some Inferred Data; Some Business Rules Removed from Code; Physical Format Unchanged after New Data Entity Added; TBox (terminology); ABox (assertions). Arrows: Access, Update, Define.]
Knowledge is open and represented by an ontology
  an ontology can be characterized as a knowledge schema
  provides a conceptual framework that classifies entities and their relationships to one another
  includes a set of integrity rules that govern the relationships between entities
Benefits of Semantic Technology (continued)
Data organization is decoupled from the schema
  semantic schema is independent from the physical organization of the data
  while the schema may require change, the underlying objects and data instances described by the ontology do not need to physically change for the new knowledge relationships to be realized
  semantic capabilities can offer faster time to market opportunities for projects, at potentially lower costs, due to the expected reduction in labor intensive tasks
Inferencing creates new knowledge
  ability to use rules asserted about classes in order to generate a super-set of facts that is logically derived from a sub-set of facts, to arrive at a conclusion
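The inferencing idea above can be illustrated with a minimal forward-chaining sketch. It is not a real reasoner: it applies a single subclass rule until no new facts appear, and every class and individual name in it is a hypothetical example rather than something taken from an actual ontology.

```python
# Minimal forward-chaining sketch: derive a super-set of facts from a sub-set.
# All names below (TeamMember, alice, ...) are hypothetical illustrations.

def infer_types(asserted, subclass_of):
    """Return asserted (individual, class) facts plus those implied by subclass rules."""
    facts = set(asserted)
    changed = True
    while changed:  # repeat until no rule produces a new fact (a fixpoint)
        changed = False
        for individual, cls in list(facts):
            for sub, sup in subclass_of:
                if cls == sub and (individual, sup) not in facts:
                    facts.add((individual, sup))
                    changed = True
    return facts

# Rules asserted about classes (TBox-style statements).
subclass_of = {("TeamMember", "Employee"), ("Employee", "Person")}
# Facts asserted about individuals (ABox-style statements).
asserted = {("alice", "TeamMember")}

inferred = infer_types(asserted, subclass_of)
print(sorted(inferred))
```

Only one fact is stored, yet the result also states that alice is an Employee and a Person. The stored data did not have to change for the new relationships to be available.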
Benefits of Semantic Technology (continued)
Defines meaning of data
  use of standardized semantic vocabulary
  relationships of data
  link analysis that traverses network graph of relationships
Enables data integration across heterogeneous silos
  accepts the notion that data representations of the same fact can be diverse and heterogeneous as long as the meaning is tied together by an ontology (owl:sameAs)
  No need to centralize data, just go to the source(s).
Utilizes “reasoners” to ensure data integrity
  flags contradictions
  guarantees consistent information
  provides automatic data integrity checking
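As a rough sketch of the integration point above, the following Python treats sameAs links as an equivalence relation and assembles one view of a customer from two separate stores without copying them into a central database. The store names, identifiers, and field values are hypothetical, and a real system would rely on an OWL reasoner rather than this hand-written grouping.

```python
# Sketch: tie records from separate silos together with sameAs links.
# Identifiers and field values are hypothetical.

def same_as_groups(links):
    """Group identifiers connected by sameAs links (symmetric and transitive)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())

def merged_view(identifier, links, *sources):
    """Collect attributes for every identifier that denotes the same individual."""
    for group in same_as_groups(links):
        if identifier in group:
            break
    else:
        group = {identifier}  # no sameAs link mentions this identifier
    view = {}
    for source in sources:
        for key in group:
            view.update(source.get(key, {}))
    return view

crm = {"crm:cust-17": {"name": "David Newman"}}
loans = {"loans:0042": {"balance": 1200}}
links = [("crm:cust-17", "loans:0042")]

view = merged_view("crm:cust-17", links, crm, loans)
print(view)
```

Each silo keeps its own identifier and format. Only the link between them is shared.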
Benefits of Semantic Technology (continued)
All semantic data can be Web addressable
  every resource and every semantic language construct can be configured as a Web addressable URI
Enables Web 3.0 “The Semantic Web”
  machine understanding of Web content – intelligent agents
  ubiquitous connectivity – every resource is a URI
  knowledge centric patterns of computing – via ontologies
  universally translated via self-describing ontology
  virtualized infrastructure and everything as a service (XaaS)
Overview of Semantic Technology
Origins
Philosophical Origins:
  Deductive Logic - Aristotle
  Epistemology - Study of knowledge
  Ontology - Study of Being, Existence, Reality, Nature of Things
Ontology (Computer Science)
  Knowledge representation so that machines as well as people can commonly understand the meaning of data in order to accomplish tasks.
  Knowledge is represented as a set of taxonomic classes, with relations and properties
  Ontology is a specification of a conceptualization [Gruber]
[Image: Aristotle]
Semantic Ontology Model
Small step forward towards reducing data chaos
Based upon Description Logic
  A symbolic logic that allows reasoning about properties that are shared by many objects through the use of variables
  Mathematically verifiable
Describes domains in terms of:
  Concepts (classes)
  Roles (relationships, properties)
  Individuals (instances)
[Figure: RDF Triples / Statements. Subject (domain), Predicate (property), Object (range).]
Aligns linguistically with how we think and speak
[Image: Jackson Pollock, “Convergence”]
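To make the subject, predicate, object structure concrete, here is a small sketch that stores statements as Python tuples and answers simple pattern queries. It stands in for a triple store only conceptually. The `match` helper and the resource names are hypothetical.

```python
# Sketch: RDF-style statements as (subject, predicate, object) tuples.
# The prefixes and resource names are hypothetical.

triples = {
    (":David_Newman", "rdf:type", ":Person"),
    (":David_Newman", ":worksFor", ":Wells_Fargo"),
    (":Wells_Fargo", "rdf:type", ":Bank"),
}

def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

about_david = match(triples, s=":David_Newman")  # every statement about one subject
typed = match(triples, p="rdf:type")             # every type assertion
print(about_david)
print(typed)
```

Each tuple reads like a short sentence, which is the linguistic alignment the slide refers to.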
Basic Principles of Semantic Technology
Open view of the Truth
  Closed World Assumption (CWA) – Any statement that is not known to be True is therefore False. (Conventional databases: if it is not in the database, it doesn’t exist.)
  Open World Assumption (OWA) – A statement is False only if it is known to be False. Web Ontology allows incomplete data. Designed for inferencing, search, informed answers.
Monotonic Logic
  Adding a new fact doesn’t invalidate previous facts or conclusions. (A person may live in many places.)
Unique Name Assumption Not Supported
  Unless specifically stated, any two instances might refer to the same thing, i.e. it doesn’t assume that because two individuals have different names they are not the same person
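The difference between the two assumptions can be sketched by asking the same question under each. In this simplified illustration the open-world answer has three possible values, with `None` standing for "unknown". The facts and names are hypothetical.

```python
# Sketch contrasting closed-world and open-world answers to the same question.
# Facts and names are hypothetical.

known_true = {("alice", "livesIn", "SanFrancisco")}
known_false = {("alice", "livesIn", "Paris")}  # explicitly asserted negative facts

def closed_world(statement):
    """CWA: anything not known to be true is treated as false."""
    return statement in known_true

def open_world(statement):
    """OWA: True, False, or None (unknown) when nothing is asserted either way."""
    if statement in known_true:
        return True
    if statement in known_false:
        return False
    return None

query = ("alice", "livesIn", "Oakland")
print(closed_world(query))  # the database-style answer
print(open_world(query))    # the open-world answer
```

Adding the Oakland fact later would not contradict the earlier open-world answer, which is the monotonic behavior described above.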
W3C Semantic Technology Languages
  RDF – Resource Description Framework
  RDFS – RDF Schema
  OWL – Web Ontology Language
  SPARQL – SPARQL Protocol and RDF Query Language
  SWRL – Semantic Web Rule Language – rules that can be applied to RDF graphs
  RIF – Rule Interchange Format
  GRDDL – Gleaning Resource Descriptions from Dialects of Languages
  POWDER – Protocol for Web Description Resources
[Figure: W3C Semantic Language Stack. Labels: OWL; SPARQL; RDFS; SWRL (RIF); RDF; GRDDL; POWDER; XML; URI; GRDDL/XSLT Transform.]
Foundational Concepts based on Description Logic
Class – a concept, a resource, a thing, a set, a collection of elements with similar properties.
  :Person rdf:type owl:Class
Individual – instance that belongs to one or more classes. A member of a set
  :David_Newman rdf:type :Person
Properties – describes the relationships between individuals.
  A property is also a class in its own right
  Resembles language constructs, how we think
  :subject :predicate :object = {domain property range}
  Object Property – range of property is another class
    :Service :hasOperation :Operation
  Datatype Property – range of property is a data primitive, e.g. literal value, number, string
    :Person :hasName “David Newman”
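One consequence of declaring a domain and range for an object property is that using the property implies the types of its subject and object. The sketch below assumes RDFS-style domain and range semantics. The individual names `:AccountLookup` and `:getBalance` are hypothetical.

```python
# Sketch of domain/range reasoning: using a property implies the types of its
# subject and object. Individual names are hypothetical.

properties = {
    ":hasOperation": {"domain": ":Service", "range": ":Operation"},
}

statements = [(":AccountLookup", ":hasOperation", ":getBalance")]

def infer_types_from_usage(statements, properties):
    """Return (resource, class) pairs implied by property domains and ranges."""
    types = set()
    for s, p, o in statements:
        decl = properties.get(p)
        if decl is None:
            continue  # no declaration, so nothing can be inferred
        types.add((s, decl["domain"]))
        types.add((o, decl["range"]))
    return types

types = infer_types_from_usage(statements, properties)
print(sorted(types))
```

Neither individual was given a type explicitly. Both types follow from the property declaration alone.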
Assertions
Equivalence – asserts that two classes, properties, or individuals are the same. Every individual member of one class is also a member of the equivalent class
  Class equivalence
    :TeamMember owl:equivalentClass :Employee
  Property equivalence
    :EmployedBy owl:equivalentProperty :WorksFor
  Individual equivalence
    :David_Newman owl:sameAs :Dave_Newman
Subsumption – asserts that if an individual is a member of a class, it is also a member of its superclass.
  :TeamMember rdfs:subClassOf :Person
  Class inheritance is transitive. (A -> B -> C), A -> C
  A class inherits all of the attributes or properties of its superclass
Disjointness – asserts that two things are different. Disjoint classes cannot have members in common
  :Religious owl:disjointWith :Atheist
  Because the Unique Name Assumption is not made, two things might be the same unless told otherwise
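Subsumption and disjointness can be sketched together. The code below follows subclass links transitively and then reports any pair of disjoint classes that an individual would belong to at the same time, which is the kind of contradiction a reasoner would flag. The class hierarchy is a hypothetical chain in the A -> B -> C form and is not meant to reproduce the exact assertions on the slide.

```python
# Sketch: subsumption (transitive subclass reasoning) plus a disjointness check.
# The hierarchy and class names are hypothetical.

subclass_of = {":TeamMember": {":Employee"}, ":Employee": {":Person"}}
disjoint = {frozenset({":Religious", ":Atheist"})}

def superclasses(cls):
    """All classes that cls is subsumed by, following subClassOf transitively."""
    seen, stack = set(), [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def all_types(asserted):
    """Asserted classes of an individual plus every inherited superclass."""
    result = set(asserted)
    for cls in asserted:
        result |= superclasses(cls)
    return result

def contradictions(asserted):
    """Disjoint class pairs that the individual would belong to simultaneously."""
    types = all_types(asserted)
    return [sorted(pair) for pair in disjoint if pair <= types]

print(sorted(superclasses(":TeamMember")))
print(contradictions({":Religious", ":Atheist"}))
```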
Property Expressions
Functional – asserts that a property can have only one unique value for each instance.
  :BiologicalMother rdf:type owl:FunctionalProperty
Inverse – asserts the property that is the reverse of the stated property.
  :Child owl:inverseOf :Parent
Symmetric – asserts that a property holds true even when the subject and object are reversed
  :Sibling rdf:type owl:SymmetricProperty
Transitive – asserts that if A has a relation to B, and B has a relation to C, then A has a relation to C.
  :Ancestor rdf:type owl:TransitiveProperty
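The four property characteristics can be sketched as rules over a set of triples. The property names here (`:hasParent`, `:hasChild`, and so on) are hypothetical stand-ins. The functional check is also a deliberate simplification: because unique names are not assumed, an OWL reasoner would normally conclude that two values of a functional property denote the same individual, and would report a problem only if they were declared different.

```python
# Sketch: expanding a set of triples using property characteristics.
# Property and individual names are hypothetical.

def expand(triples, inverse=(), symmetric=(), transitive=()):
    """Apply inverse, symmetric, and transitive rules until nothing new appears."""
    inv = dict(inverse)
    inv.update({b: a for a, b in inverse})  # inverseOf works in both directions
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in inv:
                new.add((o, inv[p], s))
            if p in symmetric:
                new.add((o, p, s))
            if p in transitive:
                new.update((s, p, o2) for s2, p2, o2 in facts if p2 == p and s2 == o)
        if new <= facts:
            return facts
        facts |= new

def functional_violations(triples, functional):
    """Subject/property pairs with more than one value (simplified check)."""
    values = {}
    for s, p, o in triples:
        if p in functional:
            values.setdefault((s, p), set()).add(o)
    return {key for key, objs in values.items() if len(objs) > 1}

triples = {
    ("carol", ":hasParent", "bob"),
    ("carol", ":hasAncestor", "bob"),
    ("bob", ":hasAncestor", "ann"),
    ("carol", ":hasSibling", "dave"),
}
facts = expand(
    triples,
    inverse=[(":hasParent", ":hasChild")],
    symmetric={":hasSibling"},
    transitive={":hasAncestor"},
)
print(sorted(facts - triples))  # only the newly derived statements
```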
Complex Classes
Intersection (And) – class that contains all of the individuals that are common to all classes in the intersection
  MainframeMQApp = intersectionOf(MQApp, MainframeApp)
Union (Or) – class that includes all members specified in the union
  SFOAirlines = unionOf(UnitedAirlines, AmericanAirlines, etc)
Complement (Not) – class that includes all members that do not belong to a specific class
  Vegetarian = complementOf(MeatEater)
Restriction – conditions that specify membership in a class. Reasoner determines whether an individual is a member of a class based upon predefined rules. Constrains the set of possible values or ranges for a property.
  TierOneApplication = restriction(onProperty(hasTier), hasValue(TierOne))
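Since classes behave like sets of individuals, the complex class expressions above map naturally onto set operations. The following sketch uses Python sets with hypothetical application names. The complement is computed against a fixed, fully known universe, which is a closed-world simplification that an OWL reasoner would not make.

```python
# Sketch: complex class expressions modeled as operations on Python sets.
# Individuals and property values are hypothetical.

universe = {"app1", "app2", "app3", "app4"}
mq_apps = {"app1", "app2"}
mainframe_apps = {"app2", "app3"}
has_tier = {"app1": "TierOne", "app3": "TierTwo"}

mainframe_mq_apps = mq_apps & mainframe_apps  # intersectionOf
mq_or_mainframe = mq_apps | mainframe_apps    # unionOf
not_mq = universe - mq_apps                   # complementOf, relative to the known universe

def has_value(prop, value):
    """Restriction: individuals whose property has the given value."""
    return {ind for ind, v in prop.items() if v == value}

tier_one_apps = has_value(has_tier, "TierOne")
print(mainframe_mq_apps, tier_one_apps)
```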
Semantic Technology Providers and Adopters
(Some) Providers of Semantic Technology
[Figure: provider logos grouped by category. Recoverable labels: Ontology Editors; Triple Stores; Middleware; Reasoners (Pellet, RacerPro); Sesame; OWLAPI; Languages.]
(Some) Adopters of Semantic Technology
Semantic Applications for Financial Services
Fraud Detection
  requires advanced capabilities for pattern matching, event correlation and link analysis
Know Your Customer (KYC)
  regulations require financial organizations to assimilate diverse information about their customers from multiple sources
Asset and IT Portfolio Management
  requires localization and integration of data from multiple sources
Customer Integration
  requires that a 360 degree view of the customer be assembled
Personalization and Cross-Sell
  requires that a 360 degree view of the customer be assembled
Records Management and eDiscovery
  requires categorizing, searching and accessing structured and unstructured content
Semantic Applications for Financial Services (continued)
Service Oriented Architecture and Service Discovery
  requires a canonical data schema that can auto-translate data content from one interface protocol to another, increasing the level of interoperability and reducing the need to continually version changes to Web service message interfaces
  requires capability to advertise and locate service interfaces defined by a Service Registry
Logging and Monitoring
  requires recording and monitoring of data that is often highly heterogeneous and diverse
Business Intelligence and Analytics
  requires ability to access distributed disparate data and perform complex queries and link analysis
Market Intelligence for Investment Analytics
  requires ability to scan the Web, parse RSS news feeds and other sources, to identify, in real time, subjects of interest to an organization
Implications for Enterprise Architecture and Data Management Organizations
Enterprise Ontology, Standards and Governance
  Upper Ontologies
  Line of Business Ontologies
  Federation of ontologies
Mission:
  Manage and provide standards and quality control for enterprise semantic content
  Limit risk of siloed ontologies
Utilize an enterprise Ontology Repository
  Enterprise, federated, collaborative RDF/OWL repository [OpenOntologyRepository Initiative]
Recommended Semantic Technology Books and Articles
Recommended Books
The Power of Semantic Technology: Mind over Meta, article in Data Strategy Journal, Spring 2009
Semantic Applications for Financial Services, FSTC Innovator: The Journal for Financial Services Technology Leaders, Volume 2, Issue 7, October 2009
Article in FSTC Innovator Journal