2 Publishable Summary - CORDIS

D6.2 – Interim Project Report

2 Publishable Summary

2.1 Context and Objectives

It is broadly acknowledged that a unified solution for transforming and renovating existing data sources, regardless of the original data format, would greatly enhance the ability of public organisations to provide usable, machine-processable linked data, while offering SMEs the opportunity to combine and link existing public sector information with privately-owned data in the most resourceful and cost-effective manner. In this direction, however, there is also a strong need to support consumers unfamiliar with the linked data paradigm through interfaces that hide the underlying complexity and allow the re-use of existing software apps and database management systems.

LinDA aims to assist SMEs and data providers in renovating public sector information, and in analysing and interlinking it with enterprise data, by developing:

• A cross-platform, extensible software framework that provides a simplified workflow for renovating and converting a set of common data containers, structures and formats into arbitrary RDF graphs. The framework can be used to develop custom solutions for SMEs and public sector organisations or be integrated into existing open data applications, in order to support the automated conversion of data into linked data. The platform will allow the export of arbitrary RDF graphs as tabular data, allowing SMEs to store the final results of data linking in relational databases or process them further with spreadsheet and data analysis software.

• A repository for accessing and sharing Linked Data vocabularies and metadata amongst SMEs that can be linked to the LOD (Linked Open Data) cloud. The system will allow SMEs to reference and enrich metadata shared by well-established vocabulary catalogues (LOV, prefix.cc, LODStats), thus contributing to easy and efficient mapping of existing data structures to the RDF format as well as to increasing the semantic interoperability of the SMEs' datasets.

• An ecosystem of Linked Data publication and consumption apps, which can be bound together in a dynamic manner, leading to new, unpredicted insights. While traditional RDF representations and SPARQL query access are provided to support advanced users, a Linked Data API will be deployed as a proxy to provide access in other widely established formats, such as CSV, JSON and XML, based on the internal RDF data (RDF2Any). This will allow both consumers familiar with the linked data paradigm and those unfamiliar with it to leverage the provided knowledge bases. In particular, this solution enables the re-use of advanced, JavaScript-based data visualisation components for data presentation, as well as Java-based analytics/data mining components. We aim to realise an ecosystem of data extractions and visualisations, which can be bound together in a dynamic and unforeseen way. This will enable users to explore datasets even if the publisher of the data does not provide any exploration or visualisation means. Most existing work related to visualising RDF is focused on concrete domains and concrete data types, so the envisioned visualisation ecosystem is one of the main innovations of LinDA.

• A library of visualisation tools for different data modalities (e.g. spatial, temporal and statistical) based on HTML, CSS and JavaScript that can consume output from the Linked Data API and generic web APIs. Such visualisations will include map views of spatial information (e.g. for WMS/WFS endpoints, geocoded data) as well as common graphs and charts for statistical information (e.g. statistical data in the Data Cube RDF vocabulary as well as CSV time series data).

• A library of end-user analytics and data mining apps, based on existing Java-based components (e.g. Weka) extended to accept RDF as a source format, specifically targeted to leverage the potential of Linked Data sources, especially in terms of pattern and link analysis.

• End-to-end business scenarios and models for Linked Data utilisation in analytics by SMEs.
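
As an illustration of the kind of tabular-to-RDF conversion described above, the following minimal Python sketch turns CSV rows into N-Triples statements. It is not the LinDA Transformation Engine: the function names, the http://example.org/ base URI, the "property/" path segment and the sample data are all assumptions made for this example, and every value is emitted as a plain string literal.

```python
import csv
import io
from urllib.parse import quote


def literal(value: str) -> str:
    """Serialise a string as an N-Triples plain literal (basic escaping only)."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def csv_to_ntriples(csv_text: str, base_uri: str, key_column: str) -> list:
    """Map each CSV row to one subject and each non-empty cell to one triple.

    base_uri and the 'property/' segment are hypothetical naming choices.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base_uri}{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            if column == key_column or not value:
                continue  # the key becomes the subject; empty cells yield no triple
            predicate = f"<{base_uri}property/{quote(column, safe='')}>"
            triples.append(f"{subject} {predicate} {literal(value)} .")
    return triples


# Hypothetical sample data, for illustration only.
SAMPLE = "id,name,region\n1,Station A,Tuscany\n2,Station B,\n"
triples = csv_to_ntriples(SAMPLE, "http://example.org/", "id")
for triple in triples:
    print(triple)
```

In practice the predicates would be taken from shared vocabularies rather than generated from column names, which is the role the vocabulary repository described above is meant to play.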

Figure 2-1: The LinDA Concept

The consortium partners will use their portfolios to bring along SMEs that will test the LinDA suite to be deployed and that are interested in adopting the LinDA solutions. The three pilots developed during the LinDA project are the following:

• Linked Data Analytics in Business Intelligence (CP Pilot) - The main objective of this pilot is to demonstrate innovative and gainful business intelligence-based consulting to customers and strategy planning through the LinDA transformation and analytic tools.

• Linked Data Analytics in the Environmental Sector (HYPERBOREA Pilot) - The objective of this pilot is to utilise the LinDA solutions for the efficient management and analysis of the Italian Regions' environmental data. Data available in the existing datagov initiatives and repositories will be transformed into the Linked Data format.

• Linked Data Analytics in the Media Industry (TTNEWS24 Pilot) - The purpose of this pilot is to demonstrate the potential of the LinDA renovation and consumption tools in the Media industry, as well as to set up an initial library of visualisation and exploration applications created for serving TTNEWS24 services and to be shared through the LinDA ecosystem.

The LinDA project will be realised through the following objectives:

• Objective 1: Enhance the ability of data providers, especially public organisations, to provide re-usable, machine-processable linked data.

• Objective 2: Provide out-of-the-box software components and analytic tools for SMEs that offer the opportunity to combine and link existing public sector information with privately-owned data in the most resourceful and cost-effective manner.

• Objective 3: Deliver an ecosystem of Linked Data publication and consumption applications that can be bound together in dynamic and unforeseen ways.

• Objective 4: Demonstrate the feasibility and impact of the LinDA approach in the European SMEs sector, over a set of pilot applications.

• Objective 5: Achieve international recognition and spread excellence for the research undertaken during the LinDA implementation towards enterprises, scientific communities, data providers and end-users. Diffuse and communicate readily-exploitable project results, of a pro-normative nature. Contribute to standardisation and education.

2.2 Consortium

The consortium of LinDA consists of 7 partners coming from 4 European countries. The partners of the consortium are shown below.

NATIONAL TECHNICAL UNIVERSITY OF ATHENS, DECISION SUPPORT SYSTEMS LABORATORY (NTUA-DSSLab), Co-ordinator – Greece

FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG E.V. (FOKUS) – Germany

GIOUMPITEK MELETI SCHEDIASMOS YLOPOIISI KAI POLISI ERGON PLIROFORIKIS ETAIRIA PERIORISMENIS EFTHYNIS (UBITECH) – Greece

UNIVERSITY OF BONN (UBONN) – Germany

PIKSEL SPA (PIKS) – Italy

CRITICAL PUBLICS LTD (CP) – United Kingdom

HYPERBOREA S.R.L. (HYPERBOREA) – Italy

TTNEWS24 S.R.L. (TTNEWS24) – Italy

2.3 Work Performed and Results Achieved

During the first year of the project, the 1st version of the LinDA tools has been developed and deployed in a common environment (LinDA Workbench) according to the user requirements and usage scenarios that were defined and communicated with the pilot users. Moreover, a comprehensive pilot operation and evaluation plan has been developed, which will guide the pilots' operation during the 2nd year of the project and provide feedback for further enhancement and fine-tuning of the LinDA tools. In general, the activities of the LinDA approach have proceeded according to plan and as per the DoW, and have produced significant results, which are summarised in the following categories:

User Requirements and Business Scenarios

During the 1st year of the LinDA project, emphasis was given to the definition of user requirements and Business Scenarios that set the groundwork for the creation of the LinDA tools. More specifically, the key achievements for the 1st year are:

• Definition and detailed description of 10 Business Scenarios for the utilization of Linked Data in the domains of Business Intelligence, Environmental Sector, Media Industry, Tourism, Analytics and Public Data providers. The Business Scenarios generated 35 user stories, which form the base elements for the LinDA project.

• A state-of-the-art analysis of existing open source/commercial methods and components that can be integrated into the LinDA Transformation and Analytics suites. For each of the Linked Data components, a respective testbed environment has been set up in order to analyse and report their benefits, capabilities, shortcomings and limitations (e.g. ease of integration, licensing issues, complexity).

• An online Landscape of Linked Data Tools⁴ section in the LinDA project website that provides a convenient overview of the state-of-the-art analysis and allows much more efficient maintenance and updating of the analysed linked data tools.

• An initial list of 75 technical functional and non-functional requirements driven by the business scenarios and user stories. The requirements have been represented and managed as GitHub issues (https://github.com/LinDA-tools/LindaWorkbench/issues) for the efficient management of the LinDA tools development.

• A complete set of Acceptance Criteria and an Acceptance Testing Procedure have been defined.

LinDA Development

During the 1st year of the LinDA implementation, the following key achievements have been reached:

• The 1st version of the LinDA Transformation Engine has been developed and deployed to the LinDA Workbench. The engine can be used to support the mapping and transformation of traditional data structures and formats into linked data. To this end, the Transformation Engine has focused on two widely used data sources in Year 1 of the project: a) relational data from databases such as PostgreSQL, and b) tabular data from .CSV files and Excel sheets.

• The 1st version of the LinDA Vocabulary and Metadata Repository has been created. The LinDA Vocabulary and Metadata Repository leverages and syncs with existing online linked data vocabulary services (LOV, prefix.cc, LODStats) in order to assist users and SMEs during the data transformation process to select appropriate vocabularies (in terms of popularity, rating and relevance to the specific domain / industry) for the semantic representation of their current data structures in the RDF format.

⁴ Landscape of Linked Data Tools - http://linda-project.eu/linked-data-tools/

• A “Suggest API” has been developed that provides automatic and intelligent mapping suggestions based on the LinDA vocabulary, to be used by external apps, including the LinDA Transformation Engine.

• The LinDA Workbench, an integrated environment for hosting the LinDA tools, has been created. The LinDA Workbench facilitates the workflow between the tools and handles the main communication with a selected triple store.

• The 1st version of the RDF2Any API for data transformation from the Linked Data format to a number of formats including RDB, XML, CSV and PDF.

• The Query Builder tool that enables non-experts to formulate a SPARQL query and explore open datasets.

• The development of the ConQuer Ontology, which defines transformations executed using the Publication and Consumption Framework.

• The Query Designer tool that provides an innovative and easy way to use graphical methods to interactively build a simple or complex query over multiple data sources and view the results in a SPARQL editor. The Query Designer follows the paradigm and quality of the SQL query designers of popular relational database management systems (Oracle, SQL Server, etc.) but is seamlessly adjusted to harness the potential of Linked Data. With simple drag-and-drop functionality, users can perform complex SPARQL queries and filtering, including interlinking with external SPARQL endpoints through the use of SPARQL 1.1 Federated Query.

• The LinDA SPARQL editor that provides the functionality of a text-based query wizard over linked data. More specifically, the LinDA SPARQL editor provides code-style formatting and intelligent code completion (by pressing Ctrl+Space) for suggesting SPARQL syntax, namespaces, available endpoints, classes and properties.

• The 1st version of the LinDA Visualization and Exploration system, which allows different visual representations of data sources in the Linked Data format and provides automatic recommendations by determining the compatibility between the selected dataset and the available visualisations and suggesting a list of visualisations accordingly.

• The 1st version of the LinDA Analytics Engine, which allows for the construction and processing of analytic algorithms and procedures on data coming from Linked Datasets.
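
To illustrate the kind of format conversion that an API such as RDF2Any is described as performing, the sketch below flattens a result set in the standard W3C SPARQL 1.1 Query Results JSON Format into CSV. It is a hedged illustration only and not the RDF2Any implementation; the function name and the sample result set are assumptions.

```python
import csv
import io
import json


def sparql_json_to_csv(results_json: str) -> str:
    """Flatten a SPARQL 1.1 JSON result set into CSV text.

    One column per query variable; unbound variables become empty cells.
    Datatype and language tags are dropped, so the conversion is lossy.
    """
    data = json.loads(results_json)
    variables = data["head"]["vars"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(variables)
    for binding in data["results"]["bindings"]:
        writer.writerow(
            [binding.get(var, {}).get("value", "") for var in variables]
        )
    return out.getvalue()


# Hypothetical result set, shaped like the response of a SPARQL endpoint.
SAMPLE_RESULTS = json.dumps(
    {
        "head": {"vars": ["station", "region"]},
        "results": {
            "bindings": [
                {
                    "station": {"type": "uri", "value": "http://example.org/1"},
                    "region": {"type": "literal", "value": "Tuscany"},
                },
                # 'region' is unbound in this row.
                {"station": {"type": "uri", "value": "http://example.org/2"}},
            ]
        },
    }
)
csv_text = sparql_json_to_csv(SAMPLE_RESULTS)
print(csv_text)
```

A tabular export of this kind is what lets results be loaded into spreadsheets or relational databases, at the cost of losing the distinction between URIs and literals.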

LinDA Pilots

For the first year, a comprehensive pilot operation and evaluation plan has been developed. More specifically, the following tasks have been performed:

• Three pilots have been set up with the direct contribution of the pilot users (SMEs), namely the Business Intelligence Analytics (BIA) pilot, the Environmental Analytics pilot and the Media Analytics pilot.

• A detailed description of the LinDA pilots and details regarding the operation and the evaluation phase have been examined and documented.

• An in-depth analysis has been performed for a) the redesign of business processes that is required based on the use of the LinDA tools, b) the datasets that have to be created or used for the analysis, c) the type of the analysis along with the selected algorithms and d) the consumption applications that are going to be developed.

• A defined set of evaluation criteria, targets and an evaluation plan for the LinDA Workbench and the LinDA Pilots has been identified and documented.

Networking, Dissemination and Exploitation

During the 1st year, LinDA dissemination, engagement and exploitation activities focused on the following tasks:

• Online tools and printed material (leaflets, brochures, etc.) have been created for dissemination purposes.

• A first version of the LinDA website has been launched for the dissemination of the project's results.

• Establishment of 5 social channels to keep the users' interest alive and drive traffic to the website (Facebook, Twitter, Google+, Youtube, Slideshare).

• A press release, written in English, has been produced to announce the start of the project. The press release has been translated into Greek and Italian and submitted to popular news channels.

• Liaison and collaboration activities with more than 10 related projects.

• Establishment of collaboration agreements (signed MoUs) with 3 projects in the LOD field (SDI4APPS, Policy Compass, E-SPACE).

• Active participation in 15 conferences and workshops with LinDA presentations and posters.

• 2 conference papers.

• Organization of 2 Project Workshops:

  o “Linked Open Data: Improving SME Competitiveness and Generating New Value” on 2nd September 2014, hosted in Leipzig in conjunction with Semantics 2014.

  o “Making (Linked) Open Data Available for Business” on 30th October 2014, hosted in Belfast in conjunction with eChallenges 2014.

• 1st version of the Exploitation and Sustainability plan, which identified the potential exploitable assets of LinDA, conducted a market analysis and proposed a set of exploitation and sustainability paths. These paths will be thoroughly discussed at consortium level during the second year of the project in order to conclude with the most appropriate options.

All public deliverables of the project can be accessed online at: http://linda-project.eu/deliverables/

LinDA Key Public Deliverables for the 1st year

D1.1 - Linked Data Components & Tools SotA and Business Scenarios for Linked Data Utilization - 1st Version
D1.2 - LinDA Technical Requirements
D2.1 - LinDA Architecture: Transformation Engine & Linked Data Vocabularies and Metadata Repository
D2.2 - LinDA Transformation Engine, Linked Data Vocabularies and Metadata Repository - 1st Version
D3.1 - LinDA Publication and Consumption Frameworks, Libraries and Interfaces - 1st Version
D5.3 - Dissemination Plan
D5.4 - Dissemination Activities Report - 1st Version

2.4 Website and Social Media Channels

In order to reach people at a pan-European scale, which is not possible through physical contact by the consortium members, the project has invested in the design, development and regular update of an interactive Web 2.0 portal that will operate as a one-stop-shop for any interested party that wishes to be informed about LinDA's advancements.

The project's website includes all the required information for the LinDA project. Since LinDA applies a Web 2.0 strategy to stakeholders' involvement besides the traditional communication channels, there is a noticeable presence on social media channels, which are linked to the website.

The LinDA website resides at http://www.linda-project.eu and is based on a 3-tier architecture, built using open source software. Specifically, the infrastructure contains a web server (Apache⁵) and a database server (running MySQL Community Edition⁶), which work together to provide a seamless experience to the visitors of the site. The CMS engine that is currently used is version 4.0.1 of WordPress⁷.

The site is complemented by the establishment of 5 social channels to keep the users' interest alive and drive traffic to the website. These are the following:

• Facebook (news), https://www.facebook.com/LinDAFP7
• Twitter (news), https://twitter.com/LinDa_FP7
• Google+ (news), https://www.google.com/+Linda-projectEu
• Youtube (videos), https://www.youtube.com/channel/UCfZHhkxIN_O1jRovE2lTWwA
• Slideshare (presentations), http://www.slideshare.net/LinDa_FP7

The following screenshot presents the landing page of the website in its present state. It has to be noted that the website will be optimised to offer an improved user experience, while content population will be continuous.

⁵ Apache Web Server from www.apache.org
⁶ MySQL is an open source product of Oracle Corporation and/or its affiliates, available from www.mysql.com
⁷ WordPress is an open source web authoring software available from www.wordpress.org

Figure 2-2: Website Landing Page