MICROSOFT SEMANTIC ENGINE
Unified Search, Discovery and Insight
Significant Content is Outside Structured Storage (RDBMS, OLAP, BI)
Integration of this Content is Prohibitively Expensive (Time, Money, Resources)
Extracting Insight, Analytics, and Recommendations is even harder
Situation is a Confluence of Search | Predictive Analytics | Large-Scale Collaborative Filtering
Having all forms of digital information on a single platform allows people to blend unstructured and structured content and to drive insight and decision making
Microsoft Semantic Engine provides a combination of technologies to form a contextual understanding of all digital content
Critical Business Need | Analysts gather documents, media and web content about “Business Analytics”, “Data Integration” and “Search and Discovery”
Core Machine Learning | Unsupervised learning infers “Unified Information Access” concept cluster based on automated analysis of content
Efficient Data Aggregation | Cluster gains in relevance from mining across unstructured and structured sources added from ERP and BI systems
User Relevance Boost | Users (BDM) re-label cluster as “Unified Search, Discovery and Insight” and engine adopts it, further boosting that cluster's relevance
Collaborative Boost | Analysts collate this content, requiring multi-resolution super-clusters with embedded sub-clusters
Business Decision Making | The CxO explores super-cluster and drafts business plan for her new division
|
Search and Collaboration | Personalized search, discovery and organization
Legal | Precedent- and subject-based search over large-scale textual corpora
Life Sciences | Systems biology with large volume data correlation and search
Government Services | Intelligence, real-time analytics, visualization, clustering
Social Networking | Social graph relevance mining, ranking criteria auto tuning
|
Unified Search, Discovery and Insight
Automatic Clustering and Organization
Meaning-Driven Indexing, Classification and Storage
Scalable Content Processing over all Content Types
Instant On Experience for Out of Box Value
|
Search, Discover and Organize features exposed via sample UX gallery
Seamless installation and indexing of desktop, email and web content
Fully documented Managed APIs used in UX gallery and JavaScript / C# samples
|
Streams | Descriptors (Properties) | Kinds (Concepts)
Streams processed into contextualized and indexed concepts for search | discovery | organization
KR_CLIENT_225.docx | STREAM
LEGAL DOCUMENT | CONCEPT
BILLABLE WORK | CONCEPT
EVIDENCE | CONCEPT
DEPOSITION | CONCEPT
EXTRACTED PROPERTIES | PROPERTY
LEGAL CASE [xxx] | CONCEPT CLUSTER
SEARCH AND SHARE | MDP
|
The engine consists of a self-contained set of pluggable services
Text Processing
Image Processing
Video Processing
Audio Processing
Supervised Machine Learning
Clustering MDI (RBV)
Conceptual Search
Inference
Sequence Store (Suffix Tree)
Distributed Content Store
Ontology and Taxonomy Management
Semantic Engine
Search and Markup
Trend and Predictive Analysis
Automatic Organization
Recommendation and Discovery
|
The logical architecture partitions analysis, indexing and storage
API1 API2 API3
Analysis1 Analysis2 Analysis3
Staging Core Index Stream
Store(<content>) Annotate(<kind>)
Index(<content>) Organize(<kinds>)
Search(<query>) …
Text Image Audio Video
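The five operations named on this slide can be pictured with a small in-memory sketch. It is an illustration only and not the engine's API: the class `ToyEngine`, its method names, the whitespace tokenizer and the sample text are all assumptions made here, since the deck gives only the operation names.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ToyEngine:
    """Hypothetical in-memory stand-in for Store / Index / Annotate / Organize / Search."""
    streams: dict = field(default_factory=dict)                     # stream id -> text
    kinds: dict = field(default_factory=lambda: defaultdict(set))   # kind -> stream ids
    index: dict = field(default_factory=lambda: defaultdict(set))   # token -> stream ids

    def store(self, stream_id, content):
        # Store(<content>): keep the raw content under its identifier.
        self.streams[stream_id] = content

    def index_content(self, stream_id):
        # Index(<content>): naive whitespace tokenizer (an assumption).
        for token in self.streams[stream_id].lower().split():
            self.index[token].add(stream_id)

    def annotate(self, stream_id, kind):
        # Annotate(<kind>): attach a kind (concept) to a stream.
        self.kinds[kind].add(stream_id)

    def organize(self, kinds):
        # Organize(<kinds>): group stream ids under each requested kind.
        return {k: sorted(self.kinds.get(k, set())) for k in kinds}

    def search(self, query):
        # Search(<query>): streams that contain every query token.
        tokens = query.lower().split()
        if not tokens:
            return []
        hits = set.intersection(*(self.index.get(t, set()) for t in tokens))
        return sorted(hits)


engine = ToyEngine()
# The document text below is invented for the example.
engine.store("KR_CLIENT_225.docx", "deposition transcript and billable work summary")
engine.index_content("KR_CLIENT_225.docx")
engine.annotate("KR_CLIENT_225.docx", "LEGAL DOCUMENT")

results = engine.search("billable work")
groups = engine.organize(["LEGAL DOCUMENT", "EVIDENCE"])
```

Keeping storage, kinds and the index in separate maps mirrors the slide's point that analysis, indexing and storage are partitioned.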
|
Designed to be hassle free out of the box
Several programming languages and frameworks supported
CLR/.NET, JavaScript, TSQL, C++
|
Sample of storing a stream in the system
Initiates the content processing, classification, and indexing
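The sample code from this slide did not survive in the text. As a rough stand-in, the sketch below shows how a single store call might trigger processing, classification and indexing in sequence. The function names, the extension-based classifier and the `SizeInBytes` property are assumptions for illustration and are not taken from the engine.

```python
from pathlib import PurePath

# Hypothetical extension-to-classification map; the deck does not describe
# how the real classifiers work.
CLASS_BY_EXTENSION = {".wma": "Audio", ".docx": "Text", ".jpg": "Image"}


def process(name, data):
    """Content processing step: extract a few basic properties."""
    return {"Name": name, "SizeInBytes": len(data)}


def classify(name):
    """Classification step: pick a class from the file extension."""
    return CLASS_BY_EXTENSION.get(PurePath(name).suffix.lower(), "Unknown")


def store_stream(store, index, name, data):
    """Store a stream, then run processing, classification and indexing."""
    record = {
        "data": data,
        "properties": process(name, data),
        "classification": classify(name),
    }
    store[name] = record
    # Indexing step: file the stream under its classification.
    index.setdefault(record["classification"], []).append(name)
    return record


store, index = {}, {}
record = store_stream(store, index, "05 Track 5.wma", b"\x00\x01")
```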
|
Sample of search and recommendations
Returns contextual results from the store and the web
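The slide's sample code is likewise missing. A minimal sketch of the idea, under stated assumptions, merges hits from a local store with hits from a web lookup and recommends documents that share kinds. The term-overlap scoring, the `fake_web` stub and all document names are hypothetical; the engine's actual ranking is not described in the deck.

```python
def search(query, store_docs, web_lookup):
    """Merge local store hits with hits from a caller-supplied web lookup."""
    terms = set(query.lower().split())
    results = []
    sources = [("store", store_docs.items()), ("web", web_lookup(query))]
    for source, docs in sources:
        for name, text in docs:
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                results.append((overlap, source, name))
    # Highest overlap first; ties broken by source, then name.
    results.sort(key=lambda r: (-r[0], r[1], r[2]))
    return [(source, name) for _, source, name in results]


def recommend(name, kinds_by_doc):
    """Recommend other documents that share at least one kind with `name`."""
    own = kinds_by_doc[name]
    scored = [
        (len(own & kinds), other)
        for other, kinds in kinds_by_doc.items()
        if other != name and own & kinds
    ]
    return [other for _, other in sorted(scored, key=lambda s: (-s[0], s[1]))]


def fake_web(query):
    # Stub standing in for a web search provider; returns (name, text) pairs.
    return [("example.org/article", "search discovery and insight overview")]


store_docs = {
    "plan.docx": "unified search and discovery plan",
    "notes.txt": "meeting notes",
}
hits = search("search discovery", store_docs, fake_web)

kinds_by_doc = {
    "plan.docx": {"Business Plan", "Search"},
    "notes.txt": {"Meeting"},
    "deck.pptx": {"Search"},
}
suggestions = recommend("plan.docx", kinds_by_doc)
```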
|
Seamless Integration in Windows Desktop Federated Search
Expose Meaning-Driven Indexing and Semantic Actions
Zero Learning Curve
|
Importers
Files
Plug-Ins
Semantic Engine
Database
Kind | Descriptor | Stream | KindLink | ListKind
|
Kind
KindID | SourceUri
00000000-1111 | C:\My Documents\Saint Germain Des Pres Cafe (Finest electro-jazz compilation)\05 Track 5.wma

Stream
StreamID | KindID | StreamUri | Format | Stream
11111111-2222 | 00000000-1111 | | audio/x-ms-wma | 0xFFD8FFE000104A4649460001…

Descriptor
DescriptorID | KindID | Type | Attribute | Value
10000000-0000 | 00000000-1111 | Classification | Audio | 1.0
20000000-0000 | 00000000-1111 | Metadata | Name | 05 Track 5.wma
30000000-0000 | 00000000-1111 | Metadata | Item Type | Windows Media Audio File
40000000-0000 | 00000000-1111 | Metadata | Length | 00:05:22
50000000-0000 | 00000000-1111 | Metadata | WM/ProviderStyle | Electronica
60000000-0000 | 00000000-1111 | Audio | Tonality/Major | 0.78
70000000-0000 | 00000000-1111 | Audio | Tempo/Moderato | 0.79
80000000-0000 | 00000000-1111 | Classification | Music | .8
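The rows above suggest a simple data model: a Kind owns a list of Descriptors, each a (Type, Attribute, Value) triple, and descriptors accumulate as analysis proceeds. The sketch below models that with dataclasses. The class and method names are assumptions for illustration, values are kept as the strings shown on the slide, and only a subset of the rows is loaded.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Descriptor:
    descriptor_id: str
    kind_id: str
    type: str
    attribute: str
    value: str


@dataclass
class Kind:
    kind_id: str
    source_uri: str
    descriptors: list = field(default_factory=list)

    def add(self, descriptor_id, type_, attribute, value):
        """Append one descriptor row linked to this kind."""
        self.descriptors.append(
            Descriptor(descriptor_id, self.kind_id, type_, attribute, value)
        )

    def by_type(self, type_):
        """Return {attribute: value} for all descriptors of one type."""
        return {d.attribute: d.value for d in self.descriptors if d.type == type_}


track = Kind(
    "00000000-1111",
    r"C:\My Documents\Saint Germain Des Pres Cafe (Finest electro-jazz compilation)\05 Track 5.wma",
)
# Rows are added in the order they appear on the slide.
track.add("10000000-0000", "Classification", "Audio", "1.0")
track.add("40000000-0000", "Metadata", "Length", "00:05:22")
track.add("60000000-0000", "Audio", "Tonality/Major", "0.78")
track.add("80000000-0000", "Classification", "Music", ".8")

classifications = track.by_type("Classification")
```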
|
All change data is returned to MSE as one XML block
MSE data is exposed through custom views keyed to the users’ primary keys
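To make the "one XML block" idea concrete, the sketch below batches several row changes into a single XML document and reads it back. The element and attribute names (`Changes`, `Row`, `Column`, `op`, `key`) and the sample table are invented for illustration; the deck does not show the actual schema.

```python
import xml.etree.ElementTree as ET


def build_change_block(table, changes):
    """Serialize a batch of (operation, primary key, columns) into one XML string."""
    root = ET.Element("Changes", {"table": table})
    for op, primary_key, columns in changes:
        row = ET.SubElement(root, "Row", {"op": op, "key": str(primary_key)})
        for column, value in columns.items():
            ET.SubElement(row, "Column", {"name": column}).text = str(value)
    return ET.tostring(root, encoding="unicode")


def read_change_block(xml_text):
    """Parse the block back into (operation, primary key, columns) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (
            row.get("op"),
            row.get("key"),
            {c.get("name"): c.text for c in row.findall("Column")},
        )
        for row in root.findall("Row")
    ]


block = build_change_block(
    "Documents",
    [("insert", 42, {"Title": "Case notes"}), ("delete", 7, {})],
)
parsed = read_change_block(block)
```

Sending one block per batch means a consumer parses a single document rather than handling one message per changed row.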
|
Seamless Integration of Meaning-Driven Indexing in ALL SQL Tables
Expose Meaning-Driven Indexing via T-SQL
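A view keyed to the user's own primary key can be sketched as follows. This uses SQLite through Python's `sqlite3` module rather than SQL Server and T-SQL, and the table names, view name and rows are all hypothetical; it only illustrates joining engine-produced descriptors back to a user table on that table's key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- User-owned table with its own primary key.
    CREATE TABLE Documents (DocumentID INTEGER PRIMARY KEY, Title TEXT);
    -- Hypothetical table of engine-produced descriptors, keyed by the user's key.
    CREATE TABLE MseDescriptor (DocumentID INTEGER, Type TEXT, Attribute TEXT, Value TEXT);
    -- Custom view that exposes descriptors next to the user's rows.
    CREATE VIEW DocumentsWithMeaning AS
        SELECT d.DocumentID, d.Title, m.Type, m.Attribute, m.Value
        FROM Documents AS d
        JOIN MseDescriptor AS m ON m.DocumentID = d.DocumentID;
    """
)
conn.execute("INSERT INTO Documents VALUES (1, 'KR_CLIENT_225.docx')")
conn.executemany(
    "INSERT INTO MseDescriptor VALUES (?, ?, ?, ?)",
    [
        (1, "Classification", "Legal Document", "1.0"),  # invented sample rows
        (1, "Metadata", "Item Type", "Word Document"),
    ],
)

rows = conn.execute(
    "SELECT Title, Attribute FROM DocumentsWithMeaning "
    "WHERE Type = 'Classification' ORDER BY Attribute"
).fetchall()
```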
PARTING THOUGHTS
Unified Search, Discovery and Insight over Every Digital Artifact
Extensible and Scalable Semantic Platform
Zero Learning Curve
channel9.msdn.com/learn
Built by Developers for Developers….
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT
MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.