Text Mining in Combination with Enterprise Search. Thomas Herbst, CEO B-S-S GmbH. 7th Fraunhofer Symposium on Text Mining, 5./6. October 2009


Text Mining in Combination with Enterprise Search

Thomas Herbst, CEO B-S-S GmbH

7th Fraunhofer Symposium on Text Mining, 5./6. October 2009


Today's Challenge: Information Overload

[Figure: content sources CMS, KM, Search, DMS, Apps, WWW]

30% of working time is spent searching for relevant information.

85% of all relevant data are unstructured.

The amount of unstructured information doubles approximately every 8 months.

Users need to get information in combined form.

Users lack a 360° view of all relevant content.

21.04.23 B-S-S Business Software Solutions GmbH


What customers ask for...

...provide a dynamic, holistic view of all information in a proper context.


Today's information system architecture


Classic information architecture

[Figure: Portal; News, App, KM, Intranet; DMS, WCMS, KM, MOSS, DB]

...siloed content that can't be used in a combined context.


Enterprise Search Today

[Figure: enterprise search box placed over the content silos]

...finds most of the content, but only links to the content silos.


Today's KM + Search Infrastructure

[Figure: Information Worker; KM, Search 1, Search 2, Web Search; Google, Oracle, Web, Lucene]

• Search or KM systems often address only one specific need or purpose

• Data must be transferred and transformed between the systems

  • Time-consuming

  • Information is lost

• A holistic view cannot be created because every system is a new data silo that can't be combined with the others

• Users must learn the query language of every system


Enterprise Search + Text Mining based on an Information Access Layer


Information access layer

[Figure: CMS, KM, Search, DMS, Apps, and WWW connected through the Information Access Layer (IAL)]


Create Virtual Datasources

[Figure: sources (CMS, DB, App, Search, DMS); virtual datasources (Marketing, Healthcare, Brand Protection, MarketWatch, Intranet); consumers (Portal 1, Portal 2, App 1, App 2)]


[Figure: pipeline components: Conversion, Language, Company, Geography, People, Lemmas, Ontology, Plug-in, Speech tagger, Alert, Search, Taxonomy, Sentiment, Entities]

Pipeline = Extract + Enrich

<pages>

<page id="1"><abstract id="0"><sentence id="0">dpa-afx <location country="Germany" long="46225533" lat="13452345">FRANKFURT</location>. </sentence><sentence id="1">„Wir werden weiter profitabel wachsen, die Qualität verbessern und die operative Marge vergrößern“, sagte Vorstandschef <person typ="male" class="economy">Wolfgang Mayrhuber</person> am Donnerstag in <location country="Germany" long="46225533" lat="13452345">Frankfurt</location>. </sentence>...

</page>
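The extract-and-enrich idea can be illustrated with a minimal Python sketch. This is not the product's pipeline: the gazetteer entries, the `enrich` function, and the regex-based sentence splitter are assumptions made for illustration. Only the tag names (`page`, `sentence`, `location`, `person`) follow the annotated sample on the slide.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical gazetteer: surface form -> (tag name, attributes).
GAZETTEER = {
    "Frankfurt": ("location", {"country": "Germany"}),
    "Wolfgang Mayrhuber": ("person", {"class": "economy"}),
}
ENTITY_PATTERN = re.compile("|".join(re.escape(k) for k in GAZETTEER))

def enrich(text):
    """Extract sentences, then enrich them by wrapping known entities in tags."""
    page = ET.Element("page", id="1")
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    for i, sent in enumerate(sentences):
        node = ET.SubElement(page, "sentence", id=str(i))
        last, pos = None, 0
        for m in ENTITY_PATTERN.finditer(sent):
            before = sent[pos:m.start()]
            if last is None:
                node.text = before      # text before the first entity
            else:
                last.tail = before      # text between two entities
            tag, attrs = GAZETTEER[m.group()]
            last = ET.SubElement(node, tag, attrs)
            last.text = m.group()
            pos = m.end()
        if last is None:
            node.text = sent[pos:]      # sentence without entities
        else:
            last.tail = sent[pos:]      # text after the last entity
    return page

page = enrich("Wolfgang Mayrhuber spoke in Frankfurt. Margins will grow.")
print(ET.tostring(page, encoding="unicode"))
```

A real pipeline would add further stages (language detection, lemmas, sentiment, taxonomy) as separate enrichment steps over the same annotated document.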


Advanced Linguistics

Example query: Cerebral infarct

Query variants shown on the slide:

• Cerebral infarcts, Apoplexy, Apoplectic insult, Stroke

• "Cerebral infarct", Cerebral infarkt, Serebral infarct, Cetebral ingarct

• Cerebral disease, Infarction

• Cerebral infarct / medicine, Cerebral infarct / biology

• Cerebral infarct / conferences

• Infarctus cérébral

Techniques: Phrasing, Doc type classification, Spellchecking (phonetic match), Synonymy, Thesaurus support, Refinement, Character normalization, Lemmatization, Topic classification, Ambiguous queries
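Three of these techniques (character normalization, spellchecking, and synonym expansion) can be sketched in a few lines of Python. The vocabulary, the thesaurus entries, and the function names are assumptions for illustration. The sketch uses plain string similarity from `difflib` as a stand-in for the phonetic matching named on the slide, which it does not implement.

```python
import difflib
import unicodedata

# Hypothetical index vocabulary and thesaurus; a real system would load these.
VOCABULARY = ["cerebral", "infarct", "apoplexy", "stroke"]
SYNONYMS = {"cerebral infarct": ["apoplexy", "apoplectic insult", "stroke"]}

def normalize(term):
    """Character normalization: lowercase and strip diacritics."""
    decomposed = unicodedata.normalize("NFKD", term.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def correct(word):
    """Spellchecking stand-in: closest vocabulary word by string similarity."""
    matches = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.8)
    return matches[0] if matches else word

def expand(query):
    """Return the corrected query followed by its thesaurus synonyms."""
    fixed = " ".join(correct(w) for w in normalize(query).split())
    return [fixed] + SYNONYMS.get(fixed, [])

print(normalize("Infarctus cérébral"))
print(expand("Serebral infarkt"))
```

With this toy vocabulary, the misspelled variants from the slide ("Cerebral infarkt", "Serebral infarct", "Cetebral ingarct") all resolve to the same corrected query before the thesaurus lookup.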


Architecture Overview

• Intuitive generation of dynamic applications and portals

• Enablement of search-driven portals

• Highly flexible to modify, adapt, and update

• Rank-based content delivery (popularity, expected sales, confidence)

[Figure: Portal / Frontend above the sources CMS, DB, App, Search, DMS]

• Building a real information layer

• Integrate all needed content

• Convert to one common access layer

• Combine all content into virtual datasources

• WITHOUT INFLUENCING THE EXISTING INFRASTRUCTURE

[Figure: Information Access Layer serving, e.g., Portal 1, Portal 2, App 1, App 2]
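The idea of combining content into virtual datasources without touching the existing systems can be sketched as a read-only view over several connectors. Everything named here (`VirtualDatasource`, the `cms` and `dms` connector functions, the `topic` field) is hypothetical and only illustrates the principle.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List

Document = Dict[str, str]

@dataclass
class VirtualDatasource:
    """A named, read-only view over several sources; the sources stay unchanged."""
    name: str
    sources: List[Callable[[], Iterable[Document]]]
    accept: Callable[[Document], bool] = lambda doc: True

    def documents(self) -> Iterator[Document]:
        # Pull from every connector and keep only documents the view accepts.
        for fetch in self.sources:
            for doc in fetch():
                if self.accept(doc):
                    yield doc

# Hypothetical connectors standing in for CMS and DMS adapters.
def cms():
    return [{"id": "cms-1", "topic": "marketing", "title": "Campaign plan"}]

def dms():
    return [{"id": "dms-7", "topic": "healthcare", "title": "Study report"},
            {"id": "dms-8", "topic": "marketing", "title": "Brand guide"}]

marketing = VirtualDatasource("Marketing", [cms, dms],
                              accept=lambda d: d["topic"] == "marketing")
intranet = VirtualDatasource("Intranet", [cms, dms])

print([d["id"] for d in marketing.documents()])
print([d["id"] for d in intranet.documents()])
```

The same connectors back both views, which mirrors the "integrate once, use in different scenarios" point: a new portal or application only needs a new view definition.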


Dynamic Content networking

[Figure: Portal with the sections Boulevard, Sport, Gallery, Events]

Automatic cross-linking of content based on user context, content context, or extracted entities

• A sports article about "Tiger Woods" links to galleries and tabloid ("Boulevard") news about him

• A tabloid article also offers upcoming events
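Entity-based cross-linking can be sketched with an inverted index from entity to articles. The article IDs, sections, and entity sets below are made-up sample data, and the sketch covers only the "extracted entities" case, not user or content context.

```python
from collections import defaultdict

# Hypothetical articles with entities already extracted by the pipeline.
ARTICLES = {
    "sport-1": {"section": "Sport", "entities": {"Tiger Woods", "Augusta"}},
    "gallery-4": {"section": "Gallery", "entities": {"Tiger Woods"}},
    "boulevard-2": {"section": "Boulevard", "entities": {"Tiger Woods"}},
    "events-9": {"section": "Events", "entities": {"Berlin"}},
}

def build_entity_index(articles):
    """Map every entity to the set of article IDs that mention it."""
    index = defaultdict(set)
    for art_id, art in articles.items():
        for entity in art["entities"]:
            index[entity].add(art_id)
    return index

def related(art_id, articles, index):
    """Articles from other sections that share at least one entity."""
    hits = set()
    for entity in articles[art_id]["entities"]:
        hits |= index[entity]
    section = articles[art_id]["section"]
    return sorted(h for h in hits if articles[h]["section"] != section)

index = build_entity_index(ARTICLES)
print(related("sport-1", ARTICLES, index))
```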


Automatic content linking

• Paragraphs

• Persons

• Locations

• Countries / Regions

• Companies

• Industries (sectors)

• Acronyms

• Chemical Structures

• Dates

• Other custom entities


Navigators + Tag clouds

Automatically generated navigators and clouds for the most common topics

Give the user an overview of the content and results and help them understand and navigate through them

Automatic search by relevant words or pairs of words
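Both widgets reduce to counting: a navigator counts results per field value, and a tag cloud maps keyword frequency to a display size. The sample results, field names, and the linear font-size scaling below are assumptions for illustration.

```python
from collections import Counter

# Hypothetical search results with metadata extracted upstream.
RESULTS = [
    {"topic": "Sport", "keywords": ["golf", "masters"]},
    {"topic": "Sport", "keywords": ["golf", "injury"]},
    {"topic": "Events", "keywords": ["golf", "tickets"]},
]

def navigator(results, field):
    """Count results per value of a field, most frequent first."""
    return Counter(r[field] for r in results).most_common()

def tag_cloud(results, min_size=10, max_size=30):
    """Map each keyword to a font size scaled linearly by its frequency."""
    counts = Counter(k for r in results for k in r["keywords"])
    top = max(counts.values())
    span = max(top - 1, 1)  # avoid division by zero when all counts are 1
    return {k: min_size + (max_size - min_size) * (c - 1) // span
            for k, c in counts.items()}

print(navigator(RESULTS, "topic"))
print(tag_cloud(RESULTS))
```

Clicking a navigator entry or a tag would then issue a refined search restricted to that value.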


Offering similar news

Offering of similar content, based on topic-sensitive matching techniques

Real-time provision of related content (Find, Refine, Exclude, Custom Logic)
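The slide does not say which matching technique is used. As a simple stand-in, the sketch below ranks documents by cosine similarity of bag-of-words vectors; the sample texts and function names are assumptions.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Bag-of-words term frequencies for a lowercase tokenized text."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def most_similar(query_id, docs, top_n=2):
    """IDs of the documents most similar to query_id, best first."""
    vecs = {d: vectorize(t) for d, t in docs.items()}
    scores = [(cosine(vecs[query_id], v), d)
              for d, v in vecs.items() if d != query_id]
    return [d for s, d in sorted(scores, reverse=True)[:top_n] if s > 0]

DOCS = {
    "a": "Lufthansa expects profitable growth and a higher operating margin",
    "b": "Airline Lufthansa reports growth in operating margin",
    "c": "Tiger Woods wins golf tournament",
}
print(most_similar("a", DOCS))
```

A topic-sensitive variant would weight terms (for example by TF-IDF) or compare extracted topics and entities instead of raw words.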


Document thumbnailing

Creates thumbnails from many document types in different sizes

Gives a user a quick look without opening a native application

Allows visual navigation on page level between text and images


Content Analysis

On-the-fly, multidimensional cross-tab content analysis

Discover trends, knowledge, or content relations in structured or unstructured content

e.g. sales per region, experts for products, relations between persons and locations
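A cross tab over extracted entities can be sketched as co-occurrence counting, here for the persons-and-locations example. The sample documents and field names are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical documents with entities extracted upstream.
DOCS = [
    {"persons": ["Wolfgang Mayrhuber"], "locations": ["Frankfurt"]},
    {"persons": ["Wolfgang Mayrhuber"], "locations": ["Frankfurt", "Berlin"]},
    {"persons": ["Tiger Woods"], "locations": ["Augusta"]},
]

def cross_tab(docs, row_field, col_field):
    """Count documents in which a row value and a column value co-occur."""
    table = Counter()
    for doc in docs:
        # Sets make sure each pair is counted once per document.
        table.update(product(set(doc[row_field]), set(doc[col_field])))
    return table

table = cross_tab(DOCS, "persons", "locations")
for (person, location), count in sorted(table.items()):
    print(person, location, count)
```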


User generated content

Comment on any piece of content

Comment lists show the latest comments or the content with the most comments

Let users rate your content

Use the ratings to boost or demote content in the results
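One way to fold ratings into ranking is to scale the relevance score by how far the average rating lies from a neutral value. The formula, the 1-5 rating scale, and the `neutral` and `weight` parameters are assumptions, not the product's ranking model.

```python
def boosted_score(relevance, ratings, neutral=3.0, weight=0.2):
    """Scale a relevance score up or down by the average user rating.

    Ratings are assumed to be on a 1-5 scale; unrated content is unchanged.
    """
    if not ratings:
        return relevance
    average = sum(ratings) / len(ratings)
    return relevance * (1.0 + weight * (average - neutral))

def rerank(hits):
    """hits: list of (doc_id, relevance, ratings); returns best first."""
    return sorted(hits, key=lambda h: boosted_score(h[1], h[2]), reverse=True)

hits = [("doc-a", 1.0, [1, 2]), ("doc-b", 0.9, [5, 5]), ("doc-c", 0.8, [])]
print([h[0] for h in rerank(hits)])
```

Here the poorly rated "doc-a" drops below both other hits even though its raw relevance is highest.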


Summary

The Information Access Layer can combine different kinds of data silos

Integrate content once and use it in different scenarios and from different perspectives

Full security and access control support

Seamless integration of different text mining products


Thank you

B-S-S Business Software Solutions GmbH
Wartburgstrasse 1
99817 Eisenach/Germany
Tel. +49 3691 [email protected]
