WP8: Linked Enterprise Data. “Active Hiring” use case demonstrator. Amar-Djalil MEZAOUR, PhD, Dassault Systèmes Exalead, France

LOD2 Plenary Vienna 2012: WP8: Linked Open Data for Enterprise Data Web

DESCRIPTION

State of Play presentation at the LOD2 Plenary Vienna 2012: WP8: Linked Open Data for Enterprise Data Web, by Amar-Djalil MEZAOUR, Dassault Systèmes Exalead.

Citation preview

  • 1. Creating Knowledge out of Interlinked Data. WP8: Linked Enterprise Data. “Active Hiring” use case demonstrator. Amar-Djalil MEZAOUR, PhD, Dassault Systèmes Exalead, France. LOD2 Presentation . 02.09.2010 . http://lod2.eu
  • 2. Luxembourg 1st Review Meeting: report analysis
      Criticism:
      ◦ Use case specifications are not convincing.
      ◦ RDF use and benefits in search are not clear.
      ◦ Concerns on the relevance of the use case application.
      ◦ Unclear identification of business actors and incentives to use the application.
      ◦ Doubts on the volume of HR data that could be used.
      Suggestions:
      ◦ Innovantage: http://www.innovantage.co.uk
      LOD2 Event . 06.09.2010 . http://lod2.eu
  • 3. Response and action plan (1/5): "Use case specifications are not convincing"
      Drop (for the moment) the application on resumes:
      ◦ Little data is available, so a convincing application cannot be showcased.
      ◦ Few RDF standards exist for managing resumes.
      ◦ Project resources are better spent on a first release of a convincing application before the 2nd review.
      Refocus the use case on job vacancies:
      ◦ More data is available (crawl of job boards or of the recruitment sections of company web sites).
      ◦ Market opportunities are identified: recruitment reports, dashboards and analysis for businesses and administrations (Innovantage is a potential competitor).
      ◦ RDF support exists for representing job posts (JobPosting by http://schema.org).
  • 4. Response and action plan (2/5): "RDF use and benefits in search are not clear"
      ◦ Linked data in RDF format will help increase the accuracy of the application, since it supports the recognition of data entities: identification and disambiguation of locations, hiring organizations, industry domains and job titles.
      ◦ Search benefit: key data entities are indexed efficiently, so retrieval and query suggestions are more relevant.
      ◦ Consolidation and analytics will bring clear added value: aggregated numeric indicators will reflect the actual state of the indexed data.
  • 5. Response and action plan (3/5): "Concerns on the relevance of the use case application"
      ◦ The existence of two potential competitors in the UK (Innovantage and Myresourcer) is a good indicator of the market need for intelligence surveys on job vacancies and skills.
  • 6. Response and action plan (4/5): "Unclear identification of business actors and incentives to use the application"
      ◦ HR departments of businesses and administrations are the target end users.
      ◦ Like Innovantage and Myresourcer, the WP8 use case application will provide tools for producing job market analyses by domain and by vacancies/skills.
      ◦ Linked data will be the means of enriching the end-user experience with additional information.
      ◦ The dashboard of market intelligence widgets is one of the incentives intended to make businesses use and benefit from the WP8 use case application.
  • 7. Response and action plan (5/5): "Doubts on the volume of data that could be used"
      ◦ Job vacancies are easy to get from the web. Crawling the major job boards and company web sites is enough to provide a large number of job postings.
  • 8. Next target: the 2nd review in September
      ◦ Timeline milestones: M24, M27, M48.
      ◦ First release of the use case application.
      ◦ Specification of the semantic features in the use case dataflow.
  • 9. Active Hiring use case specifications
      Job vacancies market dashboard and analytics using linked data:
      ◦ Monitor the hiring market on the web.
      ◦ Provide insights on job leads.
      ◦ Compute a comprehensive dashboard of analytics on job vacancy trends.
      ◦ Link and reuse linked datasets in an HR enterprise application.
      ◦ Enrich the search experience by mashing up content from different sources (social networks, for example).
  • 10. Architecture and data workflow (diagram). Component labels: crawl; HTML scraping + NLP; import of HR vocabularies; D2R Server; RDFisation pipeline; RDF store; CloudView; indexing; analytics; search.
  • 11. Requirements of the initial release of the enterprise demo
      ◦ R1: Geolocation
      ◦ R2: Entities identification and linking
      ◦ R3: Mapping with HR resources and vocabularies
      ◦ R4: Duplicate (repost) detection
      ◦ R5: Mapping with legal regulations
      ◦ R6: Hiring support
      ◦ R7: Analytics
      ◦ R8: Mashup
  • 12. R1: Geolocation
      ◦ R1.1: Disambiguate job vacancy locations. Example: Paris, IDF, France vs. Paris, Virginia, United States.
      ◦ R1.2: Enrich the job description with geo coordinates in the RDF store. For Paris, IDF, France: N 48° 51′ 12″, E 2° 20′ 55″.
      ◦ R1.3: Link with a reference resource (GeoNames, DBpedia, Freebase, ...). GeoNameId: 2988507; DBpedia: ??; Freebase: ??
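As a rough illustration of R1.2 and R1.3, the sketch below converts the degrees/minutes/seconds coordinates quoted for Paris into decimal degrees and emits them as N-Triples attached to the GeoNames resource. It is not taken from the deck: the function names are hypothetical, and the `http://sws.geonames.org/<id>/` URI pattern and the W3C WGS84 vocabulary are assumed to be the target representation.

```python
# Assumed vocabularies: W3C WGS84 geo properties and XSD decimal literals.
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value


def location_triples(geonames_id, lat, lon):
    """Return N-Triples lines attaching coordinates to a GeoNames resource."""
    place = f"http://sws.geonames.org/{geonames_id}/"  # assumed URI pattern
    return [
        f'<{place}> <{GEO}lat> "{lat:.4f}"^^<{XSD_DECIMAL}> .',
        f'<{place}> <{GEO}long> "{lon:.4f}"^^<{XSD_DECIMAL}> .',
    ]


# Values from the slide: N 48 51 12, E 2 20 55, GeoNameId 2988507.
lat = dms_to_decimal(48, 51, 12, "N")
lon = dms_to_decimal(2, 20, 55, "E")
triples = location_triples(2988507, lat, lon)
print("\n".join(triples))
```

Linking to the stable GeoNames identifier rather than to the string "Paris" is what keeps Paris, France and Paris, Virginia apart once the data is in the store.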
  • 13. R2: Entities identification and linking
      ◦ R2.1: Extraction of salaries in job postings, when available.
      ◦ R2.2: Linking of hiring organizations with a reference source, OpenCorporates for example.
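One simple way to approach R2.1 is pattern matching on the posting text. The sketch below is only an assumption about how salary extraction might start: the regular expression, the supported currency symbols and the function name are illustrative, and it handles only amounts written with a leading symbol and optional thousands separators.

```python
import re

# Currency symbol, optional space, then either a comma-grouped amount
# (35,000) or a plain run of digits (45000).
SALARY = re.compile(r"([£$€])\s?(\d{1,3}(?:,\d{3})+|\d+)")


def extract_salaries(text):
    """Return (currency_symbol, integer_amount) pairs found in the text."""
    return [
        (currency, int(amount.replace(",", "")))
        for currency, amount in SALARY.findall(text)
    ]


print(extract_salaries("Salary: £35,000 - £40,000 per annum"))
```

Postings that state salaries in words, per hour, or with the symbol after the amount would need additional patterns or an NLP step.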
  • 14. R3: Mapping with HR resources and vocabularies
      ◦ R3.1: Identification and import of occupation taxonomies.
      ◦ R3.2: Identification and import of industry domain taxonomies.
      ◦ R3.3: RDFisation of the taxonomies.
      ◦ R3.4: Mapping of job titles to occupation taxonomy labels.
      ◦ R3.5: Classification of job vacancies by domain.
      ◦ R3.6: Aggregation and matching of skills (skills in the job description against the skills required by the taxonomy).
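R3.4 can be pictured as approximate string matching between a free-text job title and the taxonomy labels. The sketch below uses Python's standard `difflib` for this; the labels, the `OCC-…` codes and the 0.8 cutoff are placeholders chosen for illustration, not entries from a real taxonomy or values from the project.

```python
import difflib

# Illustrative labels and codes only, not real taxonomy entries.
OCCUPATIONS = {
    "software developer": "OCC-001",
    "registered nurse": "OCC-002",
    "accountant": "OCC-003",
}


def normalise(title):
    """Lower-case the title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def map_title(title, cutoff=0.8):
    """Return (label, code) for the closest taxonomy label, or None."""
    matches = difflib.get_close_matches(
        normalise(title), list(OCCUPATIONS), n=1, cutoff=cutoff
    )
    if not matches:
        return None
    label = matches[0]
    return label, OCCUPATIONS[label]


print(map_title("Software  Developper"))  # misspelt title still maps
print(map_title("Plumber"))               # no label is close enough
```

Returning `None` below the cutoff, instead of forcing the nearest label, leaves unmatched titles visible for manual review or for a richer matcher.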
  • 15. R4: Duplicate detection
      ◦ R4.1: Detect job reposts and duplicates from different sources.
      ◦ R4.2: Merge the same job vacancy posted in different sources.
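The deck does not say how reposts are detected. A common baseline for R4.1, shown here purely as an assumption, is to compare postings by the Jaccard similarity of their word shingles; the shingle size and the 0.8 threshold are arbitrary illustrative choices.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of the text, ignoring case and punctuation."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_repost(text_a, text_b, threshold=0.8):
    """Treat two postings as duplicates when their shingle sets overlap strongly."""
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold


a = "Java Developer, Paris. Apply now for this role!"
b = "java developer paris apply now for this role"
c = "Registered nurse needed in Lyon, night shifts"
print(is_repost(a, b), is_repost(a, c))
```

Pairs flagged this way are candidates for the merge step in R4.2, for example by keeping one record and attaching all source URLs to it.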
  • 16. R5: Mapping with legal regulations
      ◦ R5.1: Map job vacancies with laws and regulations ??
  • 17. R6: Hiring support
      ◦ R6.1: Integration with social networks ??
  • 18. R7: Analytics
      ◦ R7.1: Provide analytics of job posts by region.
      ◦ R7.2: Provide analytics of job posts by occupation.
      ◦ R7.3: Provide analytics of job posts by industry domain.
      ◦ R7.4: Provide analytics of job posts in a timeline ??
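The R7 analytics are essentially grouped counts over the indexed postings. The sketch below shows that idea on a few made-up records; the field names and sample values are hypothetical, and in the described architecture this aggregation would presumably be done by the indexing and analytics layer rather than in application code.

```python
from collections import Counter
from datetime import date

# Made-up postings, for illustration only.
posts = [
    {"region": "Île-de-France", "occupation": "software developer", "posted": date(2012, 3, 5)},
    {"region": "Île-de-France", "occupation": "accountant", "posted": date(2012, 3, 20)},
    {"region": "Rhône-Alpes", "occupation": "software developer", "posted": date(2012, 4, 2)},
]


def count_by(posts, field):
    """Count postings per distinct value of a field (R7.1 to R7.3)."""
    return Counter(post[field] for post in posts)


def monthly_timeline(posts):
    """Count postings per month, in chronological order (R7.4)."""
    counts = Counter(post["posted"].strftime("%Y-%m") for post in posts)
    return sorted(counts.items())


print(count_by(posts, "region"))
print(count_by(posts, "occupation"))
print(monthly_timeline(posts))
```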
  • 19. R8: Mashup
      ◦ R8.1: Advanced search using the identified criteria.
      ◦ R8.2: Enhanced interface with analytics widgets.
      ◦ R8.3: Geolocation of search results in a map service.
      ◦ R8.4: Enhance the displayed content with information provided by external sources (organization info, country info, related news, stock exchange info, ...) ??
  • 20. Roadmap (table)
      ◦ Columns, one per requirement: R1.1, R1.2, R1.3, R2.1, R2.2, R3.1, R3.2, R3.3, R3.4, R3.5, R3.6, R4.1, R4.2, R5.1, R6.1, R7.1, R7.2, R7.3, R7.4, R8.1, R8.2, R8.3, R8.4.
      ◦ Rows: priority (legend: HIGH, MEDIUM, LOW), deadline, component, partner.
      ◦ Deadline: to be discussed during the WP8 breakout session.
      ◦ Partner: to be discussed during the WP8 breakout session.
  • 21. Preexisting assets for the first release: cloud platform
      Outscale platform for hosting the demo:
      ◦ Outscale is a Dassault Systèmes company providing cloud computing solutions for businesses and ISVs (independent software vendors).
      ◦ TINA is Outscale's IaaS (infrastructure as a service) cloud computing service and software.
      ◦ TINA is Amazon EC2 compatible.
      ◦ An access account will be created for WP8 partners, restricted to their organization's network (network IP mask).
      ◦ SSH login to the public IP of the VM host; for the moment, it is 46.231.151.11.
  • 22. Preexisting assets for the first release
  • 23. Preexisting assets for the first release: HR dataset
      Dataset of HR-XML v3.2 data crawled from the web:
      ◦ 7,035 CVs: 110 in English and the others in French.
      ◦ 42,186 job opening descriptions, all in English.
      ◦ The format is documented here: http://ns.hr-xml.org/schemas/org_hr-xml/3_2/Documentation/ComponentDoc/PositionOpening-noun.php
      XSLT processors to transform HR-XML v3.2 to RDF:
      ◦ ResumeRDF
      ◦ JobPosting: more than 1 million RDF triples in Virtuoso.
      JobPosting:
      ◦ JobPosting is a format for describing job descriptions in HTML pages.
      ◦ JobPosting is maintained by http://schema.org (Google, Yahoo! and Microsoft).
      ◦ An RDF schema of JobPosting is maintained by the LATC project: http://schema.rdfs.org/all.rdf
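The project performs the HR-XML to RDF conversion with XSLT processors. To give an idea of the output only, the Python sketch below (a swapped-in technique, not the project's XSLT) emits schema.org JobPosting triples in N-Triples form from an already-parsed record. The `example.org` URI, the record keys and the helper names are hypothetical; `title` and `datePosted` are existing JobPosting properties.

```python
SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value):
    """Quote a string as an N-Triples literal, escaping \\, \" and newlines."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def job_posting_triples(job_uri, record):
    """Return N-Triples lines describing one job posting."""
    lines = [f"<{job_uri}> <{RDF_TYPE}> <{SCHEMA}JobPosting> ."]
    for prop in ("title", "datePosted"):
        if record.get(prop):
            lines.append(f"<{job_uri}> <{SCHEMA}{prop}> {literal(record[prop])} .")
    return lines


# Hypothetical record, as it might look after parsing one job opening.
record = {"title": 'Senior "Java" Developer', "datePosted": "2012-03-05"}
triples = job_posting_triples("http://example.org/job/1", record)
print("\n".join(triples))
```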
  • 24. JobPosting overview
  • 25. JobPosting overview
  • 26. Preexisting assets for the first release: HR domain vocabulary
      ◦ O*NET data dictionary database 16.0 (Standard Occupational Classification), plus additional tab-separated text files.
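Tab-separated vocabulary files of this kind can be loaded with Python's standard `csv` module before RDFisation. The sketch below is an assumption about the file layout: the column names `O*NET-SOC Code` and `Title` are assumed, and the sample rows use placeholder codes and titles rather than real O*NET entries.

```python
import csv
import io

# Placeholder content standing in for a tab-separated occupation file.
SAMPLE = (
    "O*NET-SOC Code\tTitle\n"
    "00-0000.01\tExample Occupation A\n"
    "00-0000.02\tExample Occupation B\n"
)


def load_occupations(handle, code_col="O*NET-SOC Code", title_col="Title"):
    """Read a tab-separated file and return a {code: title} dictionary."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[code_col]: row[title_col] for row in reader}


occupations = load_occupations(io.StringIO(SAMPLE))
print(occupations)
```

With a real file, the same function would be called on `open(path, newline="", encoding="utf-8")`, where `path` is wherever the downloaded file is stored.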
  • 27. Conclusion: come to the WP8 breakout session.
  • 28. Thank you for your attention!