41
Project Deliverable D2.2 Reference Architecture and Integration Plan Project Number 700692 Project Title DiSIEM – Diversity-enhancements for SIEMs Programme H2020-DS-04-2015 Deliverable type Report Dissemination level PU Submission date August 31, 2017 (M12) Responsible partner FCiências.ID Editor Alysson Bessani Revision 1.0 The DiSIEM project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 700692.

D2.2 Reference Architecture and Integration Plandisiem-project.eu/wp-content/uploads/2017/09/D2.2.pdf · Reference Architecture and Integration Plan Project Number 700692 ... , IBM

Embed Size (px)

Citation preview

ProjectDeliverable

D2.2ReferenceArchitectureand

IntegrationPlanProjectNumber 700692ProjectTitle DiSIEM–Diversity-enhancementsforSIEMsProgramme H2020-DS-04-2015Deliverabletype ReportDisseminationlevel PUSubmissiondate August31,2017(M12)Responsiblepartner FCiências.IDEditor AlyssonBessaniRevision 1.0

The DiSIEM project has received funding from the European Union’s Horizon 2020researchandinnovationprogrammeundergrantagreementNo700692.

D2.2

2

EditorAlyssonBessani,FCiências.IDContributorsAlyssonBessani,FCiências.IDFernandoAlves,FCiências.IDAdrianoSerckumecka,FCiências.IDSusanaGonzálezZarzosa,ATOSGustavoGonzálezGranadillo,ATOSIlirGashi,CityFrancesBuontempo,CityCagatayTurkay,CityPhongNguyen,CityOlivierThonnard,AmadeusMiruna-MihaelaMironescu,AmadeusZayaniDabbabi,AmadeusAbdullahiAdamu,DigitalMRMichaelKamp,FraunhoferIAIS

D2.2

3

ExecutiveSummaryTheDiSIEMprojectaimstoaddressseverallimitationsofexistingSIEMsalreadydeployed in production by defining, designing, and implementing a set ofextensionstobedeployedtogetherwiththesesystems.Thisdeliverabledefinesthereferencearchitectureadopted in theproject,discussesall thecomponents(alsocalledextensions)thatgroupthetechnicalinnovationsunderdevelopmentintheproject,andpresentapreliminaryversionoftheintegrationplanofthesecomponents.The reference architecture (see figure bellow) takes into consideration thelimitationsofexistingSIEMs,asidentifiedinDeliverable2.1.Thisarchitectureisbasedon thedesignof fourprinciples– (1)nomodificationonexistingSIEMs,(2)theexistenceofextensionportsinSIEMs,(3)theexistenceofnon-integratedcomponents,and(4)nosubstantialadditionalconfigurationshouldbeneeded-whichconstrainttheexpectedbehaviouroftheextensionsunderdevelopmentinthe project. The architecture considers nine components that encapsulate thetechnical contributions being developed in work packages 3-6, namely:Enhanced ApplicationMonitoring, Network-based Behavior Anomaly Detector,Listening247 Threat Predictor, OSINT Threat Analyser, Context-awareIntelligence Integrator, Action Sequence Analysis for User BehaviorUnderstanding,VisualAnalysisofDiverseSIEMData,DiversityAssessmentandForecasting,andCloud-backedLong-termEventsArchive.AllthesecomponentscanbeintegratedtoSIEMsfollowingthreedifferentpatterns-EventGenerator,Event Inspector, and Event Collector – that are supported by the four SIEMsselected forvalidating theproject innovations (HPEArcSight,XL-SIEM,Splunk,andElasticStack).

8/28/17 2

EnhancedApplicationMonitoring

Network-basedBehavior Anomaly

Detector

PublicCloudStorage

Services

OSINTDataSources

Cloud-backedLong-term

EventArchival

Listening247Threat

Predictor

SIEMDatabase

DiversityAssessmentandForecasting

ActionSequenceAnalysisforUser

BehaviorUnderstanding

VisualAnalysisofDiverse

ModellingofSIEMData

Visualization andAnalysis Tools

Diversity-enhanced Monitoring

OSINTDataFusion &Analysis

SIEM

OSINTThreatAnalyser

InfrastructureEnhancements

Context-aware

IntelligenceIntegrator

ManagedInfrastructure

D2.2

4

The deliverable also contains a preliminary version of the integration plan forthe components. Within these plan, we assess the requirements for thecomponents execution environments and discuss how they can be integratedwiththeSIEMsbeingconsideredintheproject.Thisarchitectureandintegrationplan,togetherwiththeanalysisofthestateofthe art of the SIEM technologypresented inDeliverable2.1, correspond to themain result of the DiSIEM WP2 (“Requirements and Architecture for SIEMIntegration”). The information here presented will guide the design andimplementationofthetechnicalinnovationsdevelopedinthenexttwoyearsoftheproject.

D2.2

5

TableofContents1 Introduction................................................................................................................................81.1 LimitationsofSIEMs......................................................................................................81.2 DiSIEMReferenceArchitectureandComponents............................................91.3 SIEMIntegration..............................................................................................................91.4 OrganizationoftheDocument.................................................................................10

2 DiSIEMReferenceArchitecture........................................................................................112.1 MainObjectives..............................................................................................................112.2 DiSIEMArchitecture....................................................................................................122.3 SIEMIntegrationPatterns.........................................................................................142.4 DiSIEMComponentsOverview...............................................................................162.5 FinalRemarks.................................................................................................................18

3 DiSIEMComponents..............................................................................................................193.1 Diversity-enhancedMonitoring..............................................................................193.1.1 EnhancedApplicationMonitoring...............................................................193.1.2 Network-basedBehaviourAnomalyDetector........................................21

3.2 OSINTDataFusionandAnalysis............................................................................223.2.1 Listening247ThreatPredictor......................................................................223.2.2 OSINTThreatAnalyser.....................................................................................243.2.3 Context-awareIntelligenceIntegrator.......................................................25

3.3 VisualisationandAnalysisTools............................................................................273.3.1 ActionSequenceAnalysisforUserBehaviourUnderstanding........273.3.2 VisualAnalysisofDiverseSIEMData.........................................................283.3.3 DiversityAssessmentandForecasting......................................................29

3.4 InfrastructureEnhancements..................................................................................313.4.1 Cloud-backedLong-termEventsArchive.................................................31

3.5 FinalRemarks.................................................................................................................324 PreliminaryIntegrationPlan.............................................................................................334.1 IntegrationbetweenComponents.........................................................................334.1.1 OSINTDataFusionandAnalysis...................................................................334.1.2 ApplicationMonitoringandUserBehaviorAnalysis...........................334.1.3 DiversityAnalysisandVisualisation...........................................................34

4.2 IntegrationwithSelectedSIEMs.............................................................................344.2.1 EnhancedApplicationMonitoring...............................................................344.2.2 Network-basedBehaviorAnomalyDetector..........................................344.2.3 Listening247ThreatPredictor......................................................................354.2.4 OSINTThreatAnalyser.....................................................................................354.2.5 Context-awareIntelligenceIntegrator.......................................................364.2.6 ActionSequenceAnalysisforUserBehaviorUnderstanding..........364.2.7 VisualAnalysisofDiverseSIEMData.........................................................374.2.8 DiversityAssessmentandForecasting......................................................374.2.9 Cloud-backedLong-termEventsArchive.................................................38

4.3 FinalRemarks.................................................................................................................385 SummaryandConclusions..................................................................................................39References...........................................................................................................................................40

D2.2

6

ListofFiguresFigure1-DiSIEMArchitecture...................................................................................................13Figure2-DiSIEMarchitectureandcomponents................................................................17Figure3-EnhancedApplicationMonitoringextensionarchitecture........................20Figure4-Network-basedbehaviouranomalydetectorarchitecture.......................22Figure5–Listening247OSINTthreatpredictorproposal.............................................23Figure6–OSINTThreatAnalyserarchitectureofourframework............................25Figure7-Context-awareIntelligenceIntegratorarchitecture....................................26Figure8-Conceptualdiagramofthecomponent..............................................................28Figure9–VisualAnalysisofDiverseSIEMData.................................................................29Figure10-Thediversityassessmentandforecastingtoolforalertfiltering,

labellingandanalysis............................................................................................................30Figure11-Long-termCloudEventArchivalarchitecture.............................................32

D2.2

7

ListofTablesTable1-IntegrationPatternsandthefourSIEMsconsideredinDiSIEM...............16Table2–DiSIEMcomponentsandtheirintegrationpatterns.....................................18

D2.2

8

1 IntroductionOrganizationscurrentlymonitorandmanagethesecurityoftheirinfrastructuresby setting up Security Operation Centres (SOC) to make security-relateddecisions (e.g., which system is under attack, what has been compromised,wherehasanaccessbreachoccurred,howmanyattackshavehappened in thelast12hours).ASOCobtainsanintegratedviewofthemonitoredinfrastructureby employing a Security Information and Event Management (SIEM) system.Thesearecomplexsystemsthatincorporatethefunctionalitytocollectlogsandeventsfrommultiplesources,correlatetheseeventstogetherandthenproducesummarisedmeasurements,data trendsanddifferent typesofvisualisations tohelpsystemadministratorsandothersecurityprofessionals.Duetothenatureofthe functionality of these systems (the number of systems that feed events tothem,thedifferenttypesofeventstheyneedtocorrelateetc.)theyarecomplexandcostlytodeployandmaintain.TheSIEMmarketisagrowingone.AccordingtoarecentGartnerreport[Gartner2016],in2015theSIEMmarketgrewfrom$1.67billiontoapproximately$1.73billion.Therearemanyhigh-qualityproducts from large ITvendors.Examplesare IBM QRadar,1HPE ArcSight,2Splunk,3LogRhythm4and AlienVault OSSIM.5Overall, theGartner report identified twomaindrivers for suchgrowth: threatmanagement and compliance. The spectrum of new attacks (with hundreds ofnovel kinds ofmalware eachmonth, including the ones relatedwith advancedpersistent threats) and the complexityof the IT infrastructures require awell-structured and integrated monitoring of security events. Additionally, manyindustrieshavestrictrequirementsforcompliance[Swift2010],especiallywhenitcomes to logmanagement,whichoftenmandates theneed for integrated logmanagementfunctionality,asprovidedbySIEMsystems.

1.1 LimitationsofSIEMsDespitetheirwidespreaduseandtheimpressivemarketgrowth,currentSIEMsstillhavemanylimitations(seeamorecompleteanalysisinD2.1[D21]):

1. The threat intelligence capability of SIEMs is still in its infancy.Consequently, the systems are unable to automatically recognize novelthreatsthatmayaffect(whole,orpartsof)themonitoredinfrastructure,requiringconsiderablehumaninterventiontoadaptandreacttochangesinthethreat landscape.Thishappensdespitetheavailabilityofrichandup-to-date security-related information sources on the Internet (e.g.,socialmedia,blogs,securitynewsfeeds),whichcurrentSIEMsareunabletouse.

1http://www-03.ibm.com/software/products/en/qradar-siem2http://www8.hp.com/us/en/software-solutions/siem-security-information-event-management/3http://www.splunk.com/4https://www.logrhythm.com/5https://www.alienvault.com/products/ossim

D2.2

9

2. Current systems can show any “low-level” data related with thereceived events, but they have little “intelligence” to process thisdata and extract high-level information. These low-level data (e.g.,numberoffailedloginsinaserver)areonlyaccessibleandmeaningfultoalimitedsubgroupofsystemadmins,andaredifficulttotranslatetohigh-level metrics for senior, C-level managers (such as executives anddecision-makers who may need to make decisions on securityexpenditure, but may not necessarily be well versed in the technicaldetails). This impacts, for instance, the capacity of SOC coordinators tojustifythereturnoninvestmentinsecurityforanorganization.

3. Most data visualisation techniques in current SIEMs arerudimentary. Advanced data visualisation in current SIEMs is stilllimited. This can seriously impact the ability of the SOCs to deal withincidentsasandwhentheyhappen,inatimelymanner.

4. TheeventcorrelationcapabilitiesofSIEMsareasgoodasthequalityof the events fed to it. Imprecise events and alarms generated byimperfectmonitoringdeviceswillbetakenascorrectbytheSIEMandtheuncertaintiesassociatedwiththeseeventsarenevercommunicated.

5. Duetostorageandeventprocessingconstraints,SIEMsareincapableofretainingthecollectedeventsforalongduration.Thislimitstheirusein conducting forensic investigations in the long run, for example onadvancedpersistentthreats,orotherhistoricalincidents.

TheDiSIEMprojectaimstoaddresstheselimitations,byprovidingasetoftoolsand methods to enhance the technology available today. However, instead ofproposing novel SIEMs or extensivemodifications to existing ones, the projectapproach consists in extending current systems, leveraging their built-incapacityforextensionandcustomisation.

1.2 DiSIEMReferenceArchitectureandComponentsThis deliverable describes the reference architecture of DiSIEM, togetherwiththemaincomponentsunderdevelopmentintheproject.Thesecomponentsaimto improve the state-of-the-art in SIEM technology, solving or relieving thelimitations just described, without requiring radical changes in existing SIEMdeployments.Therefore,theprojectputastrongemphasisinclearlypositioningthese components with respect to the extensibility technology available onSIEMs.

1.3 SIEMIntegrationInapreviousdeliverableforthisworkpackage[D21],weprovidedananalysisofsevenprominentSIEMproductsthatmembersoftheprojecthavesomeaccess,namely: HPE ArcSight, IBM QRadar, Intel ESM, Alienvault OSSIM, XL-SIEM,Splunk,andElasticStack.66 Technically, neither Splunk nor Elastic Stack are complete SIEM products. However, since both platforms are being extensively used for implementing cybersecurity-related big data analysis (also, by some partners) we decided to include them in our analysis. During this report, we call them SIEMs (together with other products) just to simplify the exposition.

D2.2

10

After careful consideration about the available infrastructures for pilotdeployment that could be provided by the partners leading the technologyvalidation work package (WP7), and the limited resources of the project, theconsortiumdecided to focuson thedesignof components targeting immediateintegrationwith fourSIEMs:HPEArcsight (EDP),XL-SIEM(ATOS),Splunk,andElastic Stack (Amadeus). Although most of our work in the integration planfocuses on technological feasibility of the integration on these SIEMs, thereferencearchitectureassumesonlybasicextensibilityfeaturesareavailableoneverySIEMweareawareof.TheobjectiveistoensurethecomponentsmightbeintegratedwithotherSIEMsiftheneedarises(e.g.,forfurtherexploitation).

1.4 OrganizationoftheDocumentChapter2givesanoverviewofthemainobjectivesofDiSIEManddescribeshowtheseobjectivesaretranslatedtoareferencearchitectureandasetofconcretecomponentsthatcanbeintegratedinexistingSIEMs(andhowthiscanbedone).Chapter3describes–inhighlevel–theninecomponentsdevisedintheDiSIEMproject, while Chapter 4 presents the integration plan for these componentsconsidering the four SIEMs selected for validating the project contributions.Chapter5presentstheconclusionsandoutlinefuturework.

D2.2

11

2 DiSIEMReferenceArchitectureSIEMssystemsarefundamentalforlargeorganizationstoimplementintegratedsecuritymonitoring andmanagement policies. In this way, they represent thefundamentaltooltowhichSOCanalystsworkon.SIEMswereinitiallyadoptedbymostcompaniesforachievingcompliancewithsomeindustryspecificregulation[Swift2010].Thiswasespeciallyimportanttoensurealleventscollectedbythemultiplesensorsdeployedintheorganizationinfrastructure are kept on a normalized format, in a centralized database, andcanbepresentedinhuman-readablereports.Morerecently,theincreasingsophisticationofthreatsagainsttheorganizations,the complexityofmodern IT infrastructures, and the amountof cybersecurity-related information available on the web, has lead organizations to employSIEMsalsotocorrelateandintegrateinformationcomingfrommultipleinternalandexternalsources.Theobjectiveistodetectandreact,asfastaspossible,toanythreatorattackagainstthemanagedinfrastructure.TheDiSIEMprojectrecognizethecomplexenvironmentinwhichSIEMsoperate,and aims to improve several aspects of existing systems by providing a set ofcomponentsthatcanbeintegratedtoexisting,alreadydeployed,systems.Inthiscontext, the DiSIEM reference architecture aims to provide a blueprint fordefiningwhichcomponentswouldbe integrated, inwhichSIEMs,andhowthisshouldbedone.This chapter defines themain ideas for the project reference architecture.Westart by a description of the architecture main objectives (Section 2.1), thendescribethemainprinciplesandideasbehindit(Section2.2),andsomepatternsand techniques for integrating components to a selected set of SIEMs (Section2.3).WeconcludethechapterwithanoverviewofthecomponentsconsideredintheDiSIEMreferencearchitecture(Section2.4)andsomefinalremarks(Section2.5).

2.1 MainObjectivesOneofthecoreideasoftheDiSIEMprojectistoenhanceexistingSIEMsystemswith several advancedcapabilities, thatwillbeencapsulatedas components tobe integrated in the system. These components can be grouped in five mainadvancesbeyondthestateoftheart:

1. Integrate diverse OSINT (Open Source Intelligence) data sourcesavailableontheweb,suchastheNIST’sNationalVulnerabilityDatabase,7vulnerabilityandpatchdatabasesofferedbyvendors;threatintelligencedata that organisations share with each other (e.g., Internet addresses,

7 https://nvd.nist.gov/

D2.2

12

URLs and file reputation databases like SANS ISC,8malware domainslists);securityblogsanddatastreamsfromsocialnetworks(e.g.,Twitter,LinkedIn),collaborativeplatformsused in theDarkWeb(e.g.,Pastebin),search engines and online repositories (e.g., Google Hacking Database,Shodan, RIPE/ARIN), standards-based Indicators of Compromise (IoC)(e.g.,STIXandOpenIoC),andmanyothers.Thisdataneedstobefetched,analysed,normalisedandintegratedtoidentifyrelationships,trendsandanomalies, and hence helping to react to new vulnerabilities in theinfrastructure or even predict possible emerging threats against themonitoredinfrastructure.

2. Developnovelprobabilistic securitymodels and risk-basedmetricsto help security analysts to decide which infrastructure configurationsoffer better security guarantees and increase the capacity of SOCs tocommunicatethestatusoftheorganisationtoC-levelmanagers.

3. Designnovelvisualisationmethodstopresentthediversedatasets,tobettersupportthedecision-makingprocessbyenablingtheextractionofhigh-level security insight from the data which will be used by thesecurityanalystsoperatingtheSIEM.

4. To increase the value of the events fed to the systemwewill integratediverse,redundantandenhancedmonitoringcapabilitiestotheSIEMecosystem. The idea is to have enhanced sensors and protection toolsbuilt using a set of diverse tools. Implementing these mechanismsrequiresprobabilisticmodellingofdiversity forsecurity todefinewhichcombinationsoftoolsaremoreeffectiveandhowmuchimprovementcanbe expected. Likewise, we propose to deploy and integrate novelapplication-based anomaly detectors and thus improve the SIEM’svisibility into the functional security status of these monitoredapplicationsanditsusers.

5. Addsupport for longtermarchivalofevents inpubliccloudstorageservices. To satisfy the security requirements of such data (whichcontains sensitive information), the events will be encrypted anddistributed through multiple cloud providers (e.g., Amazon, WindowsAzure, Google), employing techniques such as secret sharing andinformationdispersal.

To support these improvements on existing SIEM systems, we need a flexiblearchitecturethat isgeneralenoughtobeemployedformultiplecomponents indifferentSIEMs.Thisarchitectureispresentedinthenextsection.

2.2 DiSIEMArchitectureTheadvancesproposedintheprojectwillbematerializedthroughasetoftoolsandcomponents,intheformofplugins,thatcanbeintegratedintoexistingSIEMsystems,asillustratedinFigure1.

8 https://isc.sans.edu/

D2.2

13

Figure1-DiSIEMArchitecture.

The figure shows all the improvements will be made around the basic SIEMfunctionality, without requiring modifications on the base system. This is themostfundamentalarchitecturalprincipleofDiSIEM:

This is feasible as the SIEMs and otherbig data platforms considered in theproject (e.g., HPE ArcSight ESM, Splunk, IBM QRadar, OSSIM from AlienVault,Elastic Stack) all support extensibility. The componentswewill designwill beintegrated in these systems using their extension interfaces.More specifically,mostSIEMsweareawareofsupporttheintegrationofnewconnectors(e.g.,HPEArcSightConnectors[HPDC2012],ortheIBMQRadarConnector[IBM2014])tocollect events andprovideRESTful interfaces for accessing stored events [IBM2014b].The importanceof thisprinciplecomes fromthe impact thatSIEM-agnosticismcanhaveontheexploitationoftheprojectresults.BysupportingmultipleSIEMs,we increase the possibilities of adoption of DiSIEM components by differentorganizations.These extensibility mechanisms open the possibility of creating add-ons andextensions toexistingsystems.TheDiSIEMarchitectureexploits this feature toenhancethequalityoftheeventsfedtothesystemthroughcustomconnectors,and provide new visualisation tools, collecting data from the SIEM datarepository.Our secondarchitecturalprincipledefineshowweplan toenhancethesystem.

DiSIEM Architecture

7/12/17 1

OSINTDataSources(internet)

Reporting

CorrelationNormalization

SIEM

EventStorage

VisualisationandAnalysisTools

OSINTDataAnalysisandFusion

ManagedInfrastructure

Diversity-EnhancedMonitoring

Cloud-of-cloudsEventArchival

PublicCloudStorageServices

Principle1:DiSIEMcomponentsmustbeintegratedtoSIEMswithoutrequiringanymodificationsinthebasesystem.

D2.2

14

This principle is important as it constraints the integration ports we areconsideringforSIEMs.WedonotassumeanykindofdeepintegrationAPIthatwouldallowthemodificationofeventprocessingandstorageforthesystem.Onthecontrary,thisprinciplewasdesignedbyfollowingthein-depthstate-of-the-artstudyreportedinDiSIEMDeliverable2.1[D21].Evenrestrictingtheamountofextensibility,itisstillpossiblethattheintegrationofsomeoftheimprovementsdevelopedonDiSIEMwillnotbepossibleinsomeof the target SIEMs. However, these components need to respect some basicprinciples to be considered part of the DiSIEM ecosystem, as defined by thefollowingprinciple:

Theaimof thisprinciple is to ensuredevised componentsoffer enhancementsover the existing systems, complementing existing functionalities, and reusinginformation (e.g., configurations, assets,data sources)alreadyprovided for thebasesystem.Our last architectural principle rules out components that require extensiveconfigurationormanualinterventionduringoperation.Intheageofbigdataandmachine learning, the DiSIEM components should either exploit informationalreadyavailableintheSIEMorOSINTfreelyavailableontheweb.

Thisprinciple is importantas it forces thecomponentdevelopers to figureoutwaystogatherinformationfromthenetworkortheSIEM,withoutdefiningitasaconfigurationparameterofthesystem.

2.3 SIEMIntegrationPatternsHavingdiscussedthemainarchitecturalprinciplesoftheDiSIEMmodel,wearenowequippedtodescribehowtheseprinciplescanbeappliedtoexistingSIEMs.Inthissection,wediscussthemainintegrationpatternsforextendingasubsetofSIEMswiththecomponentsdevisedintheproject.

Principle2:ExtensionscanbemadeeitherbyfeedingneweventstotheSIEM,bytakinginfofromtheSIEMdatabase,and/orbycreatingnewwaystovisualiseSIEMdata.

Principle3:Non-integratedcomponentscanbeusediftheyoperateside-by-sidewiththeSIEM,withoutreplicatinganyexistingfunctionality.

Principle 4: No significant manual work should be required to setup andoperateDiSIEMcomponents.

D2.2

15

Althoughbasedon thesameconceptual framework,SIEMsare implemented indifferent ways, using different technologies. Nonetheless, all SIEMs withprominent position in the market support some form of extensibility. ThecomponentsdevelopedinDiSIEMaimstobeintegrated,individuallyortogether,inmostSIEMs,soweconstrainttheintegrationpatternstowhatissupportedbymostavailablesystems.Undertheseconstraints,thethreebasictypesofcomponentsthatcanbebuiltforintegrationwithexistingSIEMsconsideredintheprojectare:

• EventGenerator:ThecomponentisaneventgeneratorfortheSIEM,i.e.,theproducedoutputs fromthesystemmustbe ingestedby theSIEMasIoC, and can be visualised and correlated using standard SIEMcapabilities. This is especially important for new intrusion detectorsystemsorOSINTanalysers,whichfeedSIEMswithnewinformation;

• EventInspector:Thecomponentneedstoaccessthedatabaseassociatedwith the SIEM to inspect its configuration data (e.g., assets and theirproperties) and the collected events. This is especially important forvisualisationandanalysistools;

• EventCollector:Thecomponentneedstohaveaccesstoallevents(oratleast all events for a given type) sent to theSIEM.This is important forside-systems that do additional processing of events, beyond what theSIEMiscapableof.9

WeanticipatethatsomeSIEMswillnotprovideexternalAPIsforimplementingan Event Inspector (e.g., HPE ArcSight), neither considers the notion ofconnectors (e.g., Splunk) for implementing Event Collectors. However, in bothcasesthecomponentscanstillbeintegrated,aslongastheSIEMssupportoneofthe two integration patterns. In the former case, by collecting all events ofpotentialinterestsenttotheSIEMandstoringthemonaseparateddatabasetobe consulted when required by the Event Inspector. In the latter case, theintegrationcanbedonebyinspectingtheSIEMdatabaseperiodicallylookingfortheeventsofinterestcollectedinthelasttimeinterval.It’sworthremarkingthatitmightbepossibletoimplementdeeperintegrationsthan theonesprovidedby thesepatterns. For instance, newdata visualisationand analysis dashboards can be integrated in the visualisation panel of SIEMs,however,itisimportanttonotassumethatwhendevelopingsuchcomponents,asnoteverySIEMsupportextensibilityatthislevel.Table 1 shows how the SIEMs that will be used to validate the DiSIEMcomponentscansupporttheseintegrationpatterns.

9 Notice that making every sensor send the event to multiple sites is not a solution, as the configuration/management burden will be too high in large infrastructures.

D2.2

16

SIEM EventGenerator EventInspector EventCollectorHPEArcSight

UseofFlexconnectorstoparsecustomformatsorthegenerationofeventsinanalreadysupportedformat(e.g.,CEF,syslog)

TheinformationstoredinthisSIEMcanonlybeaccessedthoughtitsownproducts.Anexternalagentcan’taccessit

TheconnectorscanbeconfiguredtosendreceivedeventstotheSIEMandother(multiple)destinations

XL-SIEM EventscanbegeneratedandencapsulatedinthesyslogformattobeprocessedbyXL-SIEMAgents

ThroughaStormDRPCserver,XL-SIEMwillprovideanAPIforsupportingdiversetypesofqueriesoverthestoredinformation

Noeventconsolidatordefined.However,XL-SIEMAgentscanbeconfiguredtosendeventstomultipledestinations

Splunk Eventscanbegeneratedandencapsulatedinawellknowformat(e.g.,syslog),thatcanbeunderstoodbythesystem.SplunkcaningestdatainarbitraryformatsbyusingtheSplunkDataAPI

SplunkeventscanbeconsumedbyotherapplicationsusingtheSplunkSearchAPIorleveragingSplunkcapabilitytoforwarddatato3rdpartysystemviaTCPsocketsorSyslog

Nostandardwayfordoingthat.Nonetheless,heavyforwarderscanbeusedforcollectingandconsolidatingeventsfrommultiplesensors(thatrununiversalforwarders)

ElasticStack

Logstashsupportmostdataformatsusedtoday,butcanalsobeextendedwithnewpluginstosupportcustomformats

TheElasticStackdatastoresupportsaRESTAPIforimplementingseveraltypesofqueriesoverthestoredinformation

Logstashcanbeconfiguredand/orextendedtosendreceivedinformationtomultiplerecipients

Table1-IntegrationPatternsandthefourSIEMsconsideredinDiSIEM.

The table shows that it is possible to seamless implement most componentintegrationpatternswiththeSIEM.TheonlyexceptionsaretheEventInspectorin HPE ArcSight and the Event Collector in XL-SIEM and Splunk. In all thesecases, it is possible to circumvent these limitations by using alternativeAPIs/mechanisms,asexplainedbefore.

2.4 DiSIEMComponentsOverviewTakingintoconsiderationtheintegrationpatternsdescribedinprevioussectionsand the architectural principles defined in Section 2.2, we devised a set ofcomponents to enhance the existing SIEMs with respect to all the limitationsdiscussedintheintroductionofthisreport.Figure2presentsanoverviewoftheconsolidatedarchitectureenvisionedfortheenhancedSIEMs.Thefigureillustratesadeploymentwithallcomponentsbeingusedand(insomecases, working together in an integrated way). In practice, we expect thatdifferentSOCswillpickdifferentsubsetsofcomponentsthatfittheirneedsandpriorities.

D2.2

17

Figure2-DiSIEMarchitectureandcomponents.

The figure showsnine components,divided in fourgroups, inaccordancewiththe type of enhanced provided by the component. TheOSINTData Fusion&Analysis group contains the components related with the acquisition,processing, and integration of OSINT data, mostly developed in WP4. TheDiversity-enhancedMonitoring components comprise the enhanced sensorsdevelopedintheproject,mostlyinWP6.AsforInfrastructureEnhancements,itisagroupwithasinglecomponent(alsodevelopedinWP6),thatimprovethecapacity of the SIEM to archive events for long time. The last group,VisualisationandAnalysisTools,containsthecomponentsforvisualanalysisand forecast of collected SIEM data. Most of these components are created inWP5, implementing also techniques developed in WP3. All the componentspresentinthisfigurewillbedescribedinthenextchapter.Table 2 shows which integration pattern corresponds to each of the ninecomponents presented in Figure 2. For each component, we mark Yes if thecomponentneedsthe integrationpattern,Possibly if thepatterncanbeusedasanalternative,andNoifthecomponentdoesnotrequirethepattern.Ascanbeseen,allthecomponentscanbeintegratedtoSIEMsinsomeway.

8/28/17 2

EnhancedApplicationMonitoring

Network-basedBehavior Anomaly

Detector

PublicCloudStorage

Services

OSINTDataSources

Cloud-backedLong-term

EventArchival

Listening247Threat

Predictor

SIEMDatabase

DiversityAssessmentandForecasting

ActionSequenceAnalysisforUser

BehaviorUnderstanding

VisualAnalysisofDiverse

ModellingofSIEMData

Visualization andAnalysis Tools

Diversity-enhanced Monitoring

OSINTDataFusion &Analysis

SIEM

OSINTThreatAnalyser

InfrastructureEnhancements

Context-aware

IntelligenceIntegrator

ManagedInfrastructure

D2.2

18

DiSIEMComponent EventGenerator

EventInspector

EventCollector

EnhancedApplicationMonitoring Yes No NoNetwork-basedBehaviourAnomalyDetector Yes No NoListening247ThreatPredictor Yes No NoOSINTThreatAnalyser Yes No NoContext-awareIntelligenceIntegrator Yes Possibly YesActionSequenceAnalysisforUserBehaviourUnderstanding No Yes Possibly

VisualAnalysisofDiverseSIEMData No Yes PossiblyDiversityAssessmentandForecasting Yes Possibly YesCloud-backedLong-termEventArchive No Possibly Yes

Table2–DiSIEMcomponentsandtheirintegrationpatterns.

2.5 FinalRemarksThischapterdescribedthereferencearchitectureofDiSIEM,withitsunderlyingprinciples andmain components. Our exposition focuses on showing how thecomponents devised in the project can be integrated in SIEMs already inproduction.Wealsoprovideexamplesofhowthe integrationpatternsadoptedintheprojectcanbeimplementedinthefourSIEMsselectedintheproject:HPEArcSight,XL-SIEM,Splunk,andElasticStack.

D2.2

19

3 DiSIEMComponentsIn this chapter, we present the components being developed in the DiSIEMproject. The descriptions are necessarily brief, just to position the componentwithintheoverallDiSIEMreferencearchitecture.Furthermore,itisimportanttoremarkthatthesecomponentsarebasedonongoingworkinDiSIEMWPs3-6,soitisreasonabletoexpectthatsomeofthedesignspresentedherewillbesubjecttoupdatesinthenexttwoyearsoftheproject.TheseupdateswillbereportedinfuturedeliverablesandpapersproducedinWPs3-7.

3.1 Diversity-enhancedMonitoringThese components comprise a set of advanced sensors built either forapplications or on top of existing basic network data collector services. Theobjectiveofthesecomponentsistofeedinformation-richeventstotheSIEMforimprovingitscapabilitiesfordiscoveringthreatsinthemonitoredinfrastructure.Thisaddressesthe4thlimitationofcurrentSIEMs,asdescribedinSection1.1.

3.1.1 EnhancedApplicationMonitoringThe Enhanced Application Monitoring component is an application-basedintrusion/anomaly detector that aims at enhancing application securityusing a User/entity Behaviour Analytics (UBA) approach. This componentwillleveragebothsupervisedandunsupervisedmachinelearningtechniquestocomputeananomalyscoreforapplicationsessions.Theanomalyscorewillthenbe used to take appropriate action (e.g., warn, block, suspend). The extensiondesign is modular to support different applications and to allow betterinteractionwiththeSIEMandotherDiSIEMcomponents.ThearchitectureofthecomponentsisdescribedinFigure3,andiscomposedbythefollowingmodules:

• A configuration module: The extension will rely on a singleconfiguration file that will define its expected behaviour and itsinteractionwiththeSIEMandtherestoftheSIEMdiversity-enhancementcomponents. The configuration will define the application log inputsources,theanalyticalmodels,thescoringschemeandtheoutputlocation(e.g.,SIEM,Application);

• An inputmodule: Thismodule is used to parse and ingest applicationlogs in addition to OSINT data, being useful to improve the anomalydetectionperformance;

• Aggregation/Correlation/Normalizationmodule:Thismodulewillusetheparsedapplicationlogscomingfromtheinputmoduletoperformtheneeded record correlation and normalization and create applicationsessions;10

10 Notice this is not a general event correlation engine, as used in SIEMs, but rather a module for correlating application records to build a user session.

D2.2

20

• Behaviouralengine:Thismodule isusedtobuildthestatisticalmodelsas a learning phase and to evaluate the anomaly score of applicationsessionsagainstthepreviouslybuiltmodels.Thebehaviouralenginewillfirst be an implementation of the Skeptic framework [D61], however,othersupervisedorunsupervisedanalyticalmodelscanbeused;

• Rule engine: Alongside the behavioural engine, the Rule Engine willleveragefunctionalandexpertknowledgebyapplyinguserdefinedruleson theapplicationsessions.Therulesconsumedby theRuleEnginecaneither be “Fuzzy” (Membership functions, Fuzzy Logic, etc.) or “Static”(Yes/No).TheresultsfromboththeRuleandBehaviouralenginesaretheoutputoftheEnhancedApplicationMonitoringcomponent;

• Outputmodule: Thismodule is responsible for forwarding the resultsfrom the Behavioural and Rule engines to the SIEM. Depending on theconfiguration,theoutputmodulecanalsosendrequeststothemonitoredapplicationitselftotakecorrectiveactionsagainstusersessionsifneeded(e.g.,logging-outand/orblockingauser).

Figure3-EnhancedApplicationMonitoringextensionarchitecture.

The Enhanced Application monitoring component will get feedback from thevisualisation and analysis tool (described in Section 3.3.1) to improve theanalytical models for anomaly detection by tweaking the configurationparameters.MoredetailsabouttheenhancedapplicationmonitoringcomponentcanbefoundintheDeliverable6.1[D61].

8/28/17 3

SIEMVisualizationandAnalysis

Tools

MonitoredApplications

Inputmodule

Outputmodule

BehavioralEngine

RuleEngineConfigurationMod

ule

Alerts/Sessions

Logs

CorrectiveAction

OSINT

Feed

back

ApplicationSessions

Aggregation/Correlation/NormalizationModule

Internet

D2.2

21

3.1.2 Network-basedBehaviourAnomalyDetectorThe Network-based Behaviour Anomaly Detector is a DiSIEM component thatcan be used tomonitor network traffic that takes place in the managedinfrastructure togetherwith logs generatedby applications of interest todetect, in real time, anomalies that might represent attacks to theinfrastructure. A common example of such anomalies is the observation of asignificantdeviationofregulartrafficpatternsinanetworksegmentorcomingfromaspecificserver.This component works in two phases: (i) learning phase, and (ii) operatingphase.Duringthelearningphase,thesensorusesmachinelearningalgorithmstobuildamodelforthebehaviourofthenetworktrafficbasedonthenormalusageofapplicationsbythedifferentusers,whereasintheoperatingphasethesensordetects deviations or anomalous behaviours. The detection is based oncomparingtheactualtrafficagainstthebuiltmodel.Thesensorreceivesasinputadatasetcontaininglegitimateandmalicioustrafficwithsomeofthefollowingfeatures:

• Number or incoming/outgoing connections from, to or betweenserversrunningtheapplications;

• Sizeofthepacketssent/received;• Duration of the connections established between servers or between

clientsandservers;• Source/destinationIPaddressesandportsoftheconnections;• Information related to the application relevant for modelling its

behaviour(e.g.,URLorfunctioninvokedbytheuser)recoveredfromthelogsoftheapplicationsbeingmonitored;

Theproposed sensorusesone class support vectormachines (One-Class SVM)[CorinnaandVapnik1995]asanunsupervisedalgorithmthatlearnsadecisionfunctionfornoveltydetection(i.e.,classifyingnewdataassimilarordifferenttothetrainingset).Asthisisatypeofunsupervisedlearningprocesswithnoclasslabels,themethodtakesasinputanarrayXanddetectsthesoftboundaryofthatset to classify new instances as belonging or not to that set. The sensor willconsider,forinstance,thenumberofconnectionsbetweentheapplicationserverandthedifferentservers inthemonitoredinfrastructurewheretheapplicationis running, the connections and traffic between the ports used by theapplications,etc.After learning the normal behaviour of the application, the sensor performsevaluations of a valid dataset containing legitimate and malicious traffic toclassifyasabnormaleverysetofpacketsthatfallsoutsidetheboundariesofthedeveloped model. The output of this component is therefore an indication ofwhethertheanalysedtrafficisnormaloranomalous.ThearchitectureofthiscomponentisdepictedFigure4,andmoredetailsaboutitcanbefoundinDeliverable6.1[D61].

D2.2

22

Figure4-Network-basedbehaviouranomalydetectorarchitecture.

3.2 OSINTDataFusionandAnalysisOne of the pillars ofDiSIEM is the use of OSINT for enriching the informationprovided by SIEMs. This is done through two components for gathering andanalysing OSINT frommultiple sources,11and a component that integrates theprocessedOSINTwithinternaleventsandprioritizesthreats.Thesecomponents,whichwillbefullydescribedinDeliverable4.1[D41],aimtoaddressthe1stand2ndlimitationsofcurrentSIEMs,asdescribedinSection1.1.

3.2.1 Listening247ThreatPredictorThe Listening247 threat predictor will gather OSINT data from varioussources includingTwitter,boards,news,forums,andInstagram,amongothers,using the DigitalMR listening247 platform.12This information will then be11 Listening247 Threat Predictor is based on the adaptation of an existing digital market research service for analysing cyber-security related OSINT and, therefore, is firmly based on existing ready-to-market tools. In accordance with the project Description of Action, in DiSIEM we are also investigating other recent machine learning techniques to extract threat information from OSINT data. These techniques were encapsulated in a second component, the OSINT Threat Analyser. 12 https://www.digital-mr.com/solutions/social-media-listening

D2.2

23

passed through a custom machine learning pipeline, which will apply noisefiltering,natural languageprocessing,andthreat likelihoodprediction, asshowninFigure5.

Figure5–Listening247OSINTthreatpredictorproposal.

Thekeyinputsforthelistening247willbethekeywordsrepresentingthetargetmonitored infrastructure.This is the first step tonoise filteringwhich involvesformingqueriesthatdisambiguatekeywordsthatmightbehomonymstootherwords. For example, “Windows” (the operating system),will yield informationaboutwindowswhichareusedinbuildings,amongothers.Formingspecializedqueries for theserelevant infrastructureaskeywordssuchas for theWindowsoperatingsystemwillnarrowdownthevastamountsofOSINTdataavailableonthe internet, thereby making the amount of data to be processed moremanageable.To process all these information, the OSINT data needs to be aggregatedwithrespect to timeso that informationoccurringwithinasimilar time intervalgetgrouped together.Thisallows for timeseriesanalysisof thedata fromvarioussources. Information could be aggregated by week, making it more likely tocontaindatawithsimilarinformation.Morespecifically,varioussourcesofdatahavedifferentvelocitiesandaggregatingdatabyweekmakesitmorelikelyforinformation related to each other to be grouped together. By velocity, we arereferringtotherateatwhichdataisbeingproduced.Forexample,whenthereisa breakdown ofWhatspApp, socialmedia sources like Twitter, Instagram, and

8/30/17 8

OSINTDataSources

Preprocessing

NoiseFiltering

ThreatAnalysis

API Crawler

listening247

SIEM

OSINT

STIX2.0

D2.2

24

Facebook are usually the first ones to report the news. Followed by newsagenciesontheirsites,andblogarticlesthatfollowupontheevent.After this pre-processing, the system filters out noise, specifically informationfound to not be useful for the infrastructures of interest. This noise filteringmodelwillneedtobetrainedonmanyOSINTdataannotatedwithrelevancebyhuman curators with domain knowledge. Filtered OSINT data will then beallowed to pass through to the secondmodel,whichwill use natural languageprocessing to obtain meta-data such as the infrastructure and possibly thelocations involved, which will be added to the data in STIX v2.0 (JSON).Prediction of threat likelihood will require annotating data with threat forsupervised training.As a result, the STIXpayload could include informationofthreat likelihood to the infrastructure, and possibly a prediction confidence toavoidfalsealarms.Thiswill likelyinvolvetheuseofrecurrentneuralnetworkssuchasLongShortTermMemoryNeuralNetworksorGatedRecurrentNeuralNetworks [Alpaydin 2014], which remember information over time and usesthat to influence their next prediction. This is essential for events that unfoldovertime.Finally,thethreatpredictorwillsendovertheinformation,eitherdirectlytotheSIEMortotheContext-awareIntelligenceIntegratorcomponent(Section3.2.3),intheSTIX2.0format,whichfollowstheJSONsyntax.

3.2.2 OSINTThreatAnalyserIna similarway to thepreviouscomponent, theobjectiveof theOSINTThreatAnalyser is to gather and analyse security-wise web data relevant to theinfrastructuremonitoredbyaSIEM.WeconsiderthattheSOCoperatorshavea limited time budget to get in touch with the latest news. Therefore, thiscomponent aims at maximizing the amount of information gathered, whileminimizingtheamountoftimerequiredtoviewit.Toachievethisobjective,wepropose aprocessingpipeline composedof anOSINT information gatherer, anautomatic method for selecting the relevant information, and a summarizingfunction. More specifically, we use an automated tool to gather tweets fromsecurity-relevant accounts, a supervised machine learning technique (SupportVector Machines, Multi-Layer Perceptron Neural Networks, Deep LearningNeuralNetworkArchitectures [Alpaydin2014]) to select the relevant ones forthe specified infrastructure, and a clustering method (k-means) to avoidpresentingrepeatedand/orunnecessaryinformation.Figure6showsthearchitectureofourinitialsolution,whichfocusesonTwitter.Thedatacollectorretrievestweetsconcerningaspecifickeywordset(ideallytheinfrastructure to be monitored, but other relevant inputs can be given). Thetweetsarefilteredastheymentionedthekeywords(infrastructureelements)ornot. Then, these are pre-processed into a normalized form before passing afeatureextractionstep.Thefeaturevectorsarefedtoasupervisedclassifierthatclassifiesthetweetsassecurityrelevantornot.Finally,thereisaclusteringstepthatgatherssimilartweets,sothatrepeatedinformationisnotshown.

D2.2

25

Figure6–OSINTThreatAnalyserarchitectureofourframework.

Differently from the previous component, our framework requires a set ofaccounts fromwhere to gather the tweets. The accounts shouldbe focusedoncybersecurity (e.g., @USCERT_gov, @SecurityWeek) or related with themonitored infrastructure (e.g.,@Cisco,@WordPress) to reduce the amount ofnon-relevant tweets gathered. It is possible to update the data collector toadd/removeOSINTsources,aswellaschangingtheassetsmonitored.Furthermore, the component requires akeyword setdescribing themonitoredinfrastructure. Ideally, the frameworkcould receivedirectly from theSIEM thevariouselementsmonitored,butthispossibilityisstillbeingstudied.Although most of the current work on this component comprises Twittermonitoring, the component can also be used tomonitor IP blacklists availableonline, through adifferent set ofmodules. This component aims to reduce thelarge false positive rate observed by industrial partners using these lists byintegratingthe50+datafeedsavailableontheinternet,improvingtheaccuracyofthedetectedIoC.

3.2.3 Context-awareIntelligenceIntegratorThe Context-aware Intelligence Integrator aims to integrate OSINT securityinformation(possiblyfrompreviouscomponents)withthreatintelligencecollected from the local infrastructure and assign a threat score for thegenerated indicators of compromise. Therefore, despite its name, thiscomponentnotonlyintegrateinformationfrommultipleOSINTdatasources,butalsoenrichesthisdatawithinformationfromthemonitoredinfrastructure.Thiscomponentwilluseheuristics tocomplementthestatic informationaboutthemonitoredinfrastructurewithdynamicandreal-timethreatintelligencedatagenerated by the SIEMs from inside their monitored infrastructure. Someexamplesofdynamic informationcoming fromthe infrastructure thatcouldbeusedintheevaluationare:

• VulnerabilitiesdetectedintheinfrastructurethatcouldappearastargetofsomeIoCreportedfromOSINTdatasources;

• EntitiesorthreatactorsinvolvedinapreviouslydetectedincidentinanincomingIoC;

D2.2

26

• ToolsorintrusionsetsdetectedasinstalledinsidetheinfrastructureareusedassourceinsomeincomingIoC;

• IPs or domains included in patterns provided by incoming IoCs (e.g.,blacklistedIPs,suchas theonescollectedbytheOSINTThreatAnalysercomponent), which have already been reported as the source of anincidentdetectedintheinfrastructure.

The key feature of this component is to provide the SIEMs with a betterevaluation of the priority and relevance of incoming security informationreceived from OSINT data sources. Furthermore, it aims to provide a moremeaningful threatpredictionbybeingawareofwhat ishappeningorhasbeenalready reported from inside the infrastructure.With this purpose, itwill alsoidentify, analyse, and validate which are the sources of information and theattributes that better allows performing this classification of incoming OSINTdata to assign a priority to the security information received. This componentwill focus on threat intelligence data provided from OSINT data sources andgeneratedbytheinfrastructuresfollowingthestandardSTIX2.0format.SecurityinformationcomingfromOSINTdatasourceswillbeprovidedbyotherDiSIEMOSINTDataFusionandAnalysiscomponents(seeprevioussections).AdraftofthearchitectureofthiscomponentispresentedinFigure7.

Figure7-Context-awareIntelligenceIntegratorarchitecture.

D2.2

27

AthreatscoreforthesecurityinformationreceivedfromOSINTdatasourceswillbecalculatedbythiscomponent.ThefinaloutputwillbeanewSTIXobjectequalto the original received from the other OSINT data Fusion and Analysiscomponents,butnowenrichedwiththethreatscoreandthedatarelatedtotheinfrastructureusedin itsgeneration,aswellasareferencetotheoriginalSTIXobject.TheresultingJSONeventwiththisSTIXobjectwillbesentusingsyslogtobe integrated by the SIEMs or to be used by other DiSIEM OSINT-basedcomponentstorefinetheirmachinelearningalgorithms.

3.3 VisualisationandAnalysisToolsThe Visualisation and Analysis Tools group comprises a set of components toenrich the dashboard of existing SIEMs, addressing thus the 2nd and 3rdlimitationsofcurrentSIEMs,asdescribedinSection1.1.Moredetailsabout thecomponentsheredescribedcanbe found inDeliverable5.1[D51]andintheupcomingDeliverable3.1.

3.3.1 ActionSequenceAnalysisforUserBehaviourUnderstandingThis component aims to provide a comprehensive understanding of userbehaviourthroughthevisualanalysisoftheiractionsequences.Understandinguser behaviour from action logs posesmany challenges. First, the raw actionscontain little semantics but they represent multiple levels of higher semanticconcepts such as user tasks and roles. The number of actions is high and thesequences contain noise. Also, multiple facets exist in the data such as time,semanticsandusers.Toaddressthesechallenges,weproposeavisualanalyticsapproach that combines the strength of both automatic and visual analysismethods to gain knowledge from data. Automated analysis techniques areappliedtoderiveorminehigh-levelfeaturesfromtherawdata.Theresultsarethen visualised in a set of linked views enabling operators to interactivelyexplore and gain deep understanding. One feature provides an overview ofsessions(separationofactionsequencessuchasbasedonuserloginandlogout)accordingtomultiplecriteria.Anotherfeatureallowsin-depthinvestigationofasubset of sessions throughmulti-scalar summaries of action types and theirtemporal distributions. The component also supports interactivecomparisonstobetterunderstandanomalies,overlaps,anddifferencesbetweensessions or users. If the actions contain anomalymodel scores, the scores andassociateduncertaintiesfromunderlyingmodelswillalsobevisuallyexplored.Figure 8 illustrates the Enhanced Application Monitoring component (Section3.1.1) and how it interacts with the remaining of the system. The componentreadsrawlogdatafromtheapplication,extractssessionsandcomputesanomalyscores.ItsavestheresulttotheSIEMandthiscomponenttakesthosesessionsastheinput.Twotypesofoutputareproducedinthiscomponent.Thefirstoneistheunderstandingaboutuserbehaviourgainedbyoperatorswhiletheyinteractwiththevisualisationsandthefeedback.ThesecondoneisthefeedbacktotheEnhancedApplicationMonitoringcomponenttoimprovetheperformanceofitsanalyticalmodels.

D2.2

28

Figure8-Conceptualdiagramofthecomponent.

3.3.2 VisualAnalysisofDiverseSIEMDataAlargeorganisationusuallymakesuseofmultipledetectionormonitoringtoolsto protect their network infrastructure. The generated data from those tools(stored in a SIEM) provides a chance for exploration and assessment of theperformanceofdifferentcombinationof tools.This componentsupportsvisualanalysisofsuchdatacomingfromdiversesecurityconfigurations.First, the component provides an overall understanding of the monitoringnetworkthroughvisualsummariesofhighnumbersofnetworkactivitiesandevents, highlighting those alerted by security tools. It allows data explorationbased on different perspectives such as alerts and security tools. Through aninteractivenetworktopologygraph,theuserwillbeabletovisualisenotonlythe alarms generated but also the anomalies detected in the traffic as well asadditionalthreatintelligenceinformationreceivedfromOSINTdatasourcesandrelevant to the different nodes. Besides, the user will have the possibility ofgeneratingnewIoCassociatedtoaspecificnodeorthemonitorednetwork.Thecomponentalsoprovidesvisualrepresentationand interactionmechanismto define what-if scenarios to compare and evaluate alternativeconfigurationsofsecuritytools.BesidesclassicgraphicalmeanssuchasROCcurve[Collins2014],novelvisualrepresentationswillbedevelopedtofacilitateperformanceassessment.Thecomponentwillbedesignedtohandlebothstaticand dynamic data. More importantly, time-varied performance could helpoperators assess and predict future configurations more effectively. ExternaldatasourcessuchasOSINTdatacouldalsobeused to identifyanycorrelationwithalerts.Figure9presentsanoverviewofthecomponent.TheDiversityAssessmentandForecastingcomponent(describedinnextsection)readsrawlogdatafromSIEMandlabelsthemwithadditional fieldsprovidinginformationaboutthesystemsraising alerts. This component takes those alerts and events as input andproduces interactivevisual representations supporting thedesigned tasks.TheContext-awareIntelligenceIntegratorcomponent(Section3.2.3)providesthreatintelligencefromOSINTdatasourcesafterdatafusionandanalysisandincludinga threat score based on IoC provided from the monitored infrastructure. The

D2.2

29

Network-basedBehaviourAnomalyDetectorcomponent(Section3.1.2)analysesthe network traffic to detect anomalies in the normal learnt behaviour of themonitored applications. This component takes as inputs the output of thosecomponents to provide additional information to the situational awarenessofferedtotheuserthroughaninteractivevisualnetworktopologygraph.

Figure9–VisualAnalysisofDiverseSIEMData

3.3.3 DiversityAssessmentandForecastingTheDiversityAssessment and Forecasting component aids data collection andlabelling, allowing assessment of diversity, and provides forecasting andprediction. The managed infrastructure will send data from more than onedetectionormonitoringtoolforthesamenetworkorsystemtotheirSIEM.Thiswill allow comparison or filtering of the alerts or events from the givennumber,N,oftools.Ahigh-levelmetricshowinghowmanytimesallthesystemshave “fired”/alerted togetherwill be provided. Furthermore, different filteringruleswillbeavailable,allowinganarchitecture tobechosen fromthepossibleconfigurations:oneoutofN(1ooN),majority,agivennumber,k(1<k<N),outofN (kooN), and everything (NooN). This feature can be used to select themostappropriatearchitectureofthegivensetup.Thealertsandeventsfromeachtoolwillhavefieldsincommontofacilitatetheircomparison.Sofar,wehavebuiltaworkingprototypeusingtheElasticStackfortwoIDSs,SuricataandSNORT,whichrunsonasavedpacketcapture(pcap)file.TheoutputfrombothwerethensentviaLogstashtoElasticsearch.Thedata(or“documents”inElasticsearchparlance)isenhancedwithafieldtoindicatefromwhich system the alert came from. Our prototype shows the number of timesbothhavealertedtogetherwiththenumberoftimesjustonesystemhasalerted.This data tells us nothing about potential problems that have beenmissed, sothistoolallowsanysavedsystemornetworktraffictobelabelled(offline)intomalicious and non-malicious. The labelling could be done by various networktraffic identifiers (protocol, src IP, dst IP, srcport, dstport, timestamp, etc), orrulescouldbedefinedsolabellingisdonewithoutfurtherinterventionfromthe

D2.2

30

users of the assessment and forecastingmodule (e.g., always label traffic fromthis IP address on this port asmalicious if IDS A has labelled it as such). Thelabels are values 1 or 0, indicating whether the traffic was malicious or not.Alternatively,amaliciousness/suspiciousnessratingcouldbegiven(e.g.,0to1,with0beingnotmaliciouswithhighcertainty,and1beingmaliciouswithhighcertainty). Using a numeric value allows a rating or cost to be assigned theproblematictrafficthatcanbeusedinthemetricplugin,describedlater.The Diversity Assessment and Forecasting component will use the data toproduce various statistics related to risk. For example, an assessment ofsensitivity and specificity [Alpaydin 2014] of single protection systems anddiversecombinationsofprotectionsystemsbasedontheirFalsePositive/FalseNegative rates can be made for labelled data. Even without labelled inputs,statistics and visualisations using the tools described in Section 3.3.2, can beinformative.Theoutputswillindicateuncertainty,forexampleoftheaccuracyofa1ooNconfigurationof IDSs. Inadditiontoassessment, thepackagewillmakeforecastsandpredictionsofeventsofinterestusingdatafromthelabelledtrafficandstatisticsfromthediversityassessmentsteps. Itwillreportthetimetothenext expected false positive (an alert which can safely be ignored) or falsenegative (an alert hasnotbeen generatedbut somethinguntoward is likely tohappen).Itwillreportthefrequencyoffalsepositiveandfalsenegativesoverthenexttimeinterval.Othermetricsofinterestsuchasaccuracyandfrequencyofalerts for one ormore systems, or similar, can also be reported. Furthermore,these results will aid a SOC in answering questions such as “givenachoiceofdiverseconfigurations,andasetofcriteria(e.g.,minimisefalsepositives,minimisefalsenegatives),whichdiverse configurations should I use”? Finally, this featurewill assess the predictive accuracy and address calibration of the modelsdeveloped.Figure10depictstheflowfromthesaved,labelleddatatotheSIEMfrontend,viaaseparatetoolforanalysis,labelling,forecastingandprediction.

Figure10-Thediversityassessmentandforecastingtoolforalertfiltering,labellingandanalysis.

8/31/17 7

Visualisation Tools

Alertsfrommultipletools

AnalysisandForecasting

Alertfilter

Labelling

Forecasting&Prediction

1ooN,kooN,NooN

visualisationROC

Timeseries

Diversity monitoring

1ooN,kooN,NooN

assessment

True/falsepositive/negative

report

SIEM

D2.2

31

3.4 InfrastructureEnhancementsThe last component group is currently composed by a single component, thatintegrates SIEMs event archival capabilities with cloud storage services. Thisaddressesthe5thlimitationofcurrentSIEMs,asdescribedinSection1.1.

3.4.1 Cloud-backedLong-termEventsArchiveThepurposeoftheCloud-backedLong-termEventArchivecomponentistostorethe events generatedby sensors in themonitored infrastructure.These eventsaresenttoSIEM,buttheystayinthesystemforalimitedamountoftimeduetostorageconstraints(generally,lessthan6months).Thiscomponentaimstousecloud storage services such as Amazon S313and Azure Blob Storage14tosecurely archive these events for a longer period. Increasing the retentionperiod is important because many threats, vulnerabilities, and zero-days arediscoveredlongaftertheyentertheinfrastructure.Somezero-dayvulnerabilitiestake 3 years or more to be disclosed [Ablon and Bogart, 2017]. Anotherimportant reason is to support long-term forensic analysis, which requiresaccesstoeventscollectedatthetimetheincidentoccurred.Thecomponentwillorganizeandstoretheeventsinacloud-of-clouds[Bessanietal.2013,Oliveiraetal.2016],ensuringsecurity,cloudfaulttolerance,andcostefficiency.Thecloudsused forstoringtheeventsshouldbecompliantwith theEU General Data Protection Regulation,15or any other legislation applicable tothepersonalorsensitivedata.Intheend,theobjectiveistoeliminatetheneedforSIEMstohavealocalarchivalinfrastructure.Figure11illustrateshowtheproposedcomponentinteractswithexistingSIEMs.In the figure, the Sensors (S) generate events and send them to Connectors,whichnormalizeandforwardthemtoSIEMandtoaLocalArchive, ifavailable.Onepossibility is to transfer thearchivedevents fromtheLocalArchive to thecloudstoreafteraninitialstorageperiod(Figure1a).IftheLocalArchiveisnotavailable,theeventscanbesentdirectlyfromtheConnectorstoourcomponentat the same time they are sent to the SIEM (Figure 1b). The Event Archiverreceives the events, groups them during a time interval (e.g., one hour), andcreatesaneventblockthatiswritteninthecloudsusing,forinstance,acloud-of-cloudsystemdesignedinapreviousproject[Bessanietal,2014].

13 https://aws.amazon.com/s3/ 14 https://azure.microsoft.com/en-us/services/storage/blobs/ 15 http://www.eugdpr.org/

D2.2

32

Figure11-Long-termCloudEventArchivalarchitecture.

Tominimize the costs of using cloud storage services, it is important that thecomponentorganizestheeventsinsuchawaythattypicalsearchesontheeventarchivalavoiddownloadinghighvolumesofdata.Thisrequirementcomesfromthefactthatdownloadbandwidthisthemostexpensiveresourceonmostcloudstorage services. To do that, we employ a strategy to efficiently divide eventtypesintoblockstominimizereadsfromthecloudandmaximizetheefficiencyof cloud data retrieval. This is done by generating index files for the blocksconsideringeventpropertiesnormallyusedforperformingsearches(e.g.,theIPaddressoftheeventsource).Archival searches are executed in the component by using a REST interface,providedbytheQueryManager.Readeventblocksarecachedtobereadlocally,as archival searches are usually part of a group of many operations, usuallyduringaforensicsanalysis.MoredetailsaboutthiscomponentcanbefoundinDeliverable6.1[D61].

3.5 FinalRemarksIn this chapter, we presented an overview of the nine main components thatintegrate the innovations being developed in DiSIEM. The objective was todiscussthemainproblemsthesecomponentssolveandhowthey interactwiththe SIEM ecosystem. In the next chapter, we discuss the integration of thesecomponentswiththefourtargetSIEMsbeingusedintheproject.

14-06-2017 6

Connector Connector Connector

SS S S S S S S S

S

Cache

SIEM

Connector Connector Connector

SS S S S S S S S

S

a) b)

SIEMLocalArchive

EventArchiver

QueryManager

PublicCloudStorageServices

Cache

EventArchiver

QueryManager

PublicCloudStorageServices

D2.2

33

4 PreliminaryIntegrationPlanThischapterdefinesandvalidatestherequiredinterfaces(inputandoutput)forthe selected SIEMs to support the integration of each of the nine componentsdescribedintheprevioussection.The“integrationplan”discussedinthischapterfocusonthetechnicalaspectsofthe integration. Our main objective is to assess, as early as possible, thefeasibilityoftheintegrationoftheDiSIEMcomponentswiththeSIEMstobeusedinthevalidationscenarios.Otheraspectsoftheintegrationtasks(tobestartedinmonth 22) will be described in Deliverable 7.1 (month 24). Furthermore, therisksassociatedwiththeintegrationarediscussedinDeliverable9.2[D92].In the following, we first describe the plans for integrating some componentsbetweenthemselves(Section4.1),andlaterweproceedtodescribehoweachofthecomponentscanbeintegratedwiththefourselectedSIEM(Section4.2).

4.1 IntegrationbetweenComponentsCertaincomponentsweredesignedtoworktogether.Inthissection,wedescribehowthesecomponentensemblescanbebuilt.

4.1.1 OSINTDataFusionandAnalysisContext-awareIntelligenceIntegratorcanreceiverelevantOSINTinfofrombothDigitalMR OSINT Threat Predictor and OSINT Threat Analyser, as well as IoCprovidedfromthemonitoredinfrastructure,enrichthemwithathreatscoreandthensendtheinformationtotheSIEMortoothervisualization/analysistool.All this data will be exchanged using the STIX 2.0 format and a REST APIprovidedbytheContext-awareIntelligenceIntegratortoreceiveIoC,suchas:

• Threat intelligence generated from inside the infrastructure (e.g., fromanti-virussoftwareorfirewallsdeployedthroughoutthenetwork);

• OSINT data, possibly already processed by DISIEM components likeListening247ThreatPredictorandOSINTThreatAnalyser.

TheenrichedIoCgeneratedbytheContext-awareIntelligenceIntegratorwillbethenforwardedtotheSIEMs,asdescribedinSection3.2.3.In terms of integration with the visualization components, the STIX 2.0 datafollowsaJSONformat,whichcanbeunderstoodbypopularvisualizationstoolssuch as D3.js or Kibana. It could also be used with Logstash for furtherprocessingtosummarizethedatapriortovisualizationaswell.

4.1.2 ApplicationMonitoringandUserBehaviorAnalysisTheActionSequenceAnalysisforUserBehaviorUnderstandingcomponentmaytake as input the output of the Enhanced Application Monitoring component:action sequences extracted into sessions. Each session contains meta

D2.2

34

information such as the user and the machine’s IP address, and the maininformationaboutthetimeandtypeofeachactionperformedinthesession.

4.1.3 DiversityAnalysisandVisualisationThiscomponentcanbeusedtovisualizemultipletypesofdata.For instance, itcanreceivediversityanalysissuchasassessmentoftrue/falsepositive/negativerates and predictions frommodels produced by the Diversity Assessment andForecast component. The rates could be generated by a stand-alone processgeneratingadailyCSV fileorsending JSONondemand.Similarly,assessments,forecasts,orpredictionscanbesent to fileoras JSONviaaRESTAPI fromtheDiversityAssessmentandForecastcomponenttothevisualisationcomponent.AnotherpossibledatasourceforvisualisationcouldbeprovidedbytheOSINT-relatedcomponents,followingtheSTIX2.0format.

4.2 IntegrationwithSelectedSIEMsIn the following we discuss the preliminary integration plans for each of theenvisioned components being developed in the project, with respect to fourSIEMs:HPEArcSight,XL-SIEM,Splunk,andElasticStack.Ascanbeseenbytheprovideddescription,ourpreliminaryanalysisindicatesthatallcomponentscanbeintegratedwithalltheseSIEMs.

4.2.1 EnhancedApplicationMonitoringThis component enhances the SIEM capability for monitoring existingapplications,providingmeansforUserBehaviorAnalysisoftheseapplications.ExecutionEnvironment.ThiscomponentisdevelopedinJavausingtheApacheSpark platform. Therefore, the ML computations executed may be CPU/RAMintensive depending on the volume of logs generated by the monitoredapplication.Integration with SIEMs. This component is an Event Generator, it uses thesyslogprotocol[Lonvick2001]tocommunicatethegeneratedalarms/eventstotheSIEM.All SIEMsconsidered inDiSIEMcanworkwith syslog.HPEArcSight,Splunk, and Elastic Stack have connectors/components that natively interpretthis protocol, namely, Smart Connector Syslog [HPE 2017], Syslog-ng [Splunk2016], Logstash syslog module [Elastic 2017], respectively. For XL-SIEM, twosolutionsareavailable: (1) theeventsare sent to thersyslog daemon,which isavailable forallUNIX-likeOS,whichwrites themtoa file that canbe readandparsedbytheXL-SIEMAgent;orencapsulatetheeventsinJSONandsendthemtotheRabbitMQserverofXL-SIEM.

4.2.2 Network-basedBehaviorAnomalyDetectorThiscomponentmonitorsthenetworkandappliesmachinelearningtechniquesto detect traffic anomalies thatmay indicate attacks or other problems in theapplicationhost.

D2.2

35

ExecutionEnvironment. Real-time analysis of the network traffic against theapplication behavior models obtained after training. To implement thismonitoring for high-speed networks it may be necessary to have a cluster ofservers(e.g.,[Viegasetal.2017]).Integration with SIEMs. This component is an Event Generator. The eventsgeneratedbythecomponentwiththeresultoftheanalysisofthetrafficwillbesent to the SIEM using the syslog protocol, in the same way as described inprevioussection.

4.2.3 Listening247ThreatPredictorTheDigitalMRplatformprovides a concise summaryof theOSINT informationrelatedwiththecybersecurityofthemonitoredenvironment.Thiscomponentisasubscriptionservicethatcanbepersonalizedforeachclient.Execution Environment. For components or applications connecting to thethreat predictor; themain requirement for execution will be the capability ofmakingRESTAPIcallsandprocessingtheresultingdata.ItrequiressupportforprocessingtheresultingJSON(i.e.,inSTIXformat).TheOSINTwillbestoredandprocessedinthecloud.Theprocessingwillinvolvefiltering noise, aggregating data, andmaking decisions about the nature of thepresent data (i.e., threat or no threat) andpotentially the likelihood of threatsbasedonpreviousdata.IntegrationwithSIEMs.ThiscomponentisalsoanEventGeneratorthatcanbeused by following a subscriptionmodel. The IoC generated by this componentwillbesenttotheContext-awareIntelligenceIntegrator,ifavailable(seeSection4.1.1),ordirectlytotheSIEM.Inthelattercase,thesyslogprotocolwillbeused,inthesamewayasdescribedinSection4.2.1.

4.2.4 OSINTThreatAnalyserThiscomponentis likethepreviousone,butincorporatesseveralexperimentaltechniquesforprocessingdifferentkindsofOSINTdatasources.ExecutionEnvironment. The execution environment consists in one ormoreservers running the Apache Spark framework and some data collectors forobtaining OSINT from registered data sources. If a large amount of OSINT isanalysed, this(ese) server(s) needs to have a high processing and storagecapabilities.Integration with SIEMs. The OSINT Thread Analysed component is also anEventGenerator.Theintegrationcanbedoneasinthepreviouscomponent.

D2.2

36

4.2.5 Context-awareIntelligenceIntegratorTheoutputSTIXobjectsincludingthethreatscorecalculatedbythiscomponentwillbesenttotheSIEMsusingtheSYSLOG/CEFformat.TheycanbealsostoredlocallyinadatabasetobeaccesseddirectlybytheSIEMsorothervisualizationcomponents.Additionally,thecomponentcouldprovideaninterfacewheretheSIEMsorotherDiSIEM components can subscribe to a message queue (e.g., RabbitMQ) tosend/receive IoC if this would be more suitable for the DiSIEM validationenvironments.BesidesthedirectintegrationwithSIEMs,theIoCcanalsoberetrieveddirectlyfromthecomponentdatabasebycustomconnectorsormodulesavailableintheSIEMsystem.ExecutionEnvironment.Thiscomponentrequiresalocalstorageservice(e.g.,arelational database) and decent processing capabilities to be able to correlateandattributeathreatscoretoeachenrichedIoCcomputed.Integration with SIEMs. This component is both an Event Generator and anEventCollector.TheenrichedIoCgeneratedbythecomponentwillbesenttotheSIEMusingthesyslogprotocol,inthesamewayasdescribedinSection4.2.1.Forcollectingevents,selectedsensorscanbeconfiguredtosendmessagesdirectlytothe component or the methods for implementing Event Collectors or EventInspectorsinTable1canbeused.

4.2.6 ActionSequenceAnalysisforUserBehaviorUnderstandingThis component comprises a visualization dashboard that provide acomprehensive understanding of user behavior through the analysis of theiraction sequences during a session (i.e., user actions between its login andlogout).Execution Environment. This component is implemented using webtechnology,suchasJavaScript,CSS,andJava,byusinglibrariessuchasD3.js.IntegrationwithSIEMs. This component is anEventInspector. Its integrationwithexistingSIEMsshould,inprinciple,bebasedonthedashboardlevel,whichmightrequiredifferentmeasurementsfordifferentSIEMs.

• HPEArcSight:Dashboards inArcSight are implementedusing Java, butthiscomponentwillbe implementedusingweb-basedtechnologies(e.g.,JavaScript, HTML, CSS). Therefore, it will not be integrated tightly as adashboardviewinArcSight.Instead,itreadsandvisualisesdatafromthesame sources that ArcSight ingests data. In that way, ArcSight and thecomponentworkonthesamedataset.

D2.2

37

• XL-SIEM: Dashboard in XL-SIEM are implemented using web-basedtechnologies like this component. Therefore, it will be possible tointegrateitasaviewinXL-SIEMdashboard.However,minorchangesinthedashboardsourcecodemightbeneededtoenablethat.ForthisSIEM,thisisnotaproblemasATOScontrolsthesourcecodeofthesystem.

• Elastic Stack and Splunk: This will be the most straightforward

integration among the considered SIEMs because both systems useKibana for visualisation purposes. Kibana supports custom plugins thatcan easily accommodate the visualisation component. The idea is toconvert thecomponenttoaKibanaplugin,whichmaybepracticalsinceKibanaalsousesD3.jsasinthecomponent.

If some of these integrations prove to be too difficult, we can implement thedashboard as an independent application that follows the Event InspectorintegrationpatterndefinedinTable1.

4.2.7 VisualAnalysisofDiverseSIEMDataThe data generated from sensors and sent to SIEMs provide a chance forexplorationandassessmentoftheperformanceofdifferentcombinationoftools.This component supports such exploratory visual analysis of configurationsusingproductiondata.Execution Environment. Like the previous component, this one is also anadvanceddashboard,sotheenvironmentandtechnologyemployedisthesame.IntegrationwithSIEMs.TheprovideddashboardcanbeintegratedintheSIEMsdashboardsbythesamemeansdescribedforthepreviouscomponent.

4.2.8 DiversityAssessmentandForecastingTheDiversityAssessment and Forecasting component aids data collection andlabelling,allowingassessmentofdiversity,andforecastingandprediction.Execution Environment. This component requires a local database and anapplication server that runs a set of scripts for analysing the data, generatingforecasts.IntegrationwithSIEMs.ThiscomponentisbothanEventCollectorandanEventGenerator. Each SIEM already collects events. The alert filtering, labelling andassessmentstagewillreadthedatausingthecollectorsandinspectorsprovided.The forecastingandprediction featurecansenddataviasyslog to theSIEM,asdescribed in Section 4.2.6, and to the visualisation component (see precioussection)usingCSVsorJSON,describedabove.

• HPEArcSight:This canbe configured to read forecasts andpredictionssentviasyslog;

D2.2

38

• XL-SIEM:Extra fields forthestatusofeventswillbeaddedand labelleddirectlyinthefrontend.Theassessments,forecasts,andpredictionswillbesenttothevisualisationcomponent;

• Elastic Stackand Splunk: Logstashwill be used to add data to ElasticSearch,usingtheupdate_by_queryAPItoenrichpreviouseventswithdatafrom diverse sensors. CSV and JSON data can be sent to these SIEMSallowing uploading of labelled data and predictions provided as JSONcouldbevisualizeddirectlyorsenttothevisualisationcomponent.

4.2.9 Cloud-backedLong-termEventsArchiveThis component can be used to store SIEM-collected events in the cloud in asecure,dependable,andcost-efficientmanner.Theobjective is toarchivetheseeventsformoretime,enablinglong-termforensicanalysis.Execution Environment. The component is being developed in Java, andrequires a certain amount of local storage for caching data. The amount ofrequired processing should be moderate as the data need to be compressed,encrypted,andencodedbeforebeingwrittentothecloudstorageproviders.IntegrationwithSIEMs.Thiscomponent isanEventCollector (although it canalsobedeployedasanEventInspector,asdescribedinFigure11).Therefore,theintegrationofthecomponentwiththeSIEMsrequiresthereceptionoftheeventssent to the SIEM. This is done in different ways for different systems, asdescribedinTable1.

4.3 FinalRemarksThis chapter presented an overview of the integration plan for the ninecomponents currently devised for the DiSIEM project. This integration plancomprisesboththeintegrationofcomponentsthatcanworktogetherandtheirintegration with the four SIEMs employed for validation in the project: HPEArcSight,XL-SIEM,Splunk,andElasticStack.

D2.2

39

5 SummaryandConclusionsThis deliverable defined the reference architecture adopted in the DiSIEMproject, discussed all the components (also called extensions) underdevelopment in the project, and presented a preliminary version of theintegration work plan of these components (between themselves and with asubsetofselectedSIEMs).Building upon the main limitations of existing SIEMs (initially identified inDeliverable 2.1 [D21]), we defined four high-level architectural principles forSIEM extensions. These principles lead to a general reference architecture(Figure 2) in which the technical contributions being developed in workpackages 3-6 are grouped in nine components. All these components can beintegratedtoSIEMs following threedifferentpatterns -EventGenerator,EventInspector,andEventCollector–thataresupportedbythefourSIEMsselectedintheproject(Table1).Thedeliverable also containsadiscussionabout the technical feasibilityof theintegration of each componentwith existing SIEMs.Within this discussion,weassess the requirements for the components execution environments anddiscuss how they can be integrated with the SIEMs being considered in theproject. These drafted plans are necessarily preliminary as most of thecomponents are still in their early prototype stage, and the componentsintegration tasks (T4.4, T5.4, and T6.4) start only onmonth 22. Therefore, allthese plans will be re-evaluated and further refined at this point, when thecomponentsareexpected tobe ready for initialdeployment.These integrationactivitieswillbereportedinDeliverable7.1(“ValidationPlan”),inthemonth24of theproject.Foradiscussionabout the risksassociatedwith the integration,wereferthereadertoDeliverable9.2[D92].Theresultspresentedinthisreport,togetherwiththeanalysisofthestateoftheart in SIEM technology presented inDeliverable 2.1 [D21], corresponds to themain results of the DiSIEM WP2 (“Requirements and Architecture for SIEMIntegration”). These results will guide the design and implementation of thetechnicalinnovationsdevelopedinthenexttwoyearsoftheproject.

D2.2

40

References[AblonandBogart2017]L.AblonandA.Bogart.ZeroDays,ThousandsofNights:The Life and Times of Zero-Day Vulnerabilities and Their Exploits. RandCorporation,2017.[Alpaydin2014]E.Alpaydin.IntroductiontoMachineLearning.3rdEdition.TheMITPress.2014.[Bessanietal.2013]A.Bessanietal.DepSky:dependableandsecurestorageinacloud-of-clouds.ACMTransactionsonStorage(TOS),9(4),12.2013.[Bessanietal.2014]A.Bessanietal.SCFS:aSharedCloud-backedFileSystem.USENIXAnnualTechnicalConference.2014.[Collins 2014] M. Collins. Network Security Through Data Analysis. O’Reilly.2014.[CorinnaandVapnik1995]CorinnaCortesandVladimirVapnik.Support-vectornetworks.Machinelearning20,3,273–297.1995.[D21] DiSIEM Consortium. In-depth Analysis of SIEMs Extensibility. DiSIEMProjectDeliverable2.1.February2017.[D41] DiSIEM Consortium. Techniques and Tools for OSINT-based ThreatAnalysis.DiSIEMProjectDeliverable4.1.August2017.[D51]DiSIEMConsortium.VisualisationSystemInfrastructureandRequirementAnalysis.DiSIEMProjectDeliverable5.1.August2017.[D61] DiSIEM Consortium. Preliminary Architecture and Service Model ofInfrastructureEnhancements.DiSIEMProjectDeliverable6.1.August2017.[D92]DiSIEMConsortium.RiskAssessmentPlan.DiSIEMProjectDeliverable9.2.September2017.[Elastic2017]ElasticCo.Sysloginputplugin. https://www.elastic.co/guide/en/logstash/current/plugins-inputs-syslog.html.AccessedinAugust2017.[Gartner 2016] Gartner Group, 2016Magic Quadrant for Security InformationandEventManagement.August2016[HPE 2017] HP Enterprise. ArcSight Connectors Documentation.https://community.saas.hpe.com/t5/ArcSight-Connectors/tkb-p/connector-documentation.AccessedinAugust2017.[HPDC2012]Hewlett-PackardDevelopmentCompany. (2012).Get scalable logcollectiontoday.RetrievedAugust26,2015,from

D2.2

41

www.hp.com/hpinfo/newsroom/press_kits/2013/HPDiscover2013/Datasheet_ArcSight_Connectors.pdf[IBM2014]IBM®.(2014).QRadarConnector.RetrievedAugust26,2017, fromhttp://www-01.ibm.com/support/knowledgecenter/SSCQGF_7.2.0.2/com.ibm.IBMDI.doc_7.2.0.2/rg_conn_qradar_intro.html[IBM2014b]IBM®.(2014).RESTfulAPIsinQRadarV7.2.3.RetrievedAugust26,2017, from http://www-01.ibm.com/support/knowledgecenter/SS42VS_7.2.3/com.ibm.qradar.doc_7.2.3/c_rest_api_reference.html?cp=SS42VS_7.2.3%2F8-1-1[Lonvick2001]C.Lonvick.TheBSDSyslogProtocol.IETFRFC3164.2001.[Oliveiraetal.2016]T.Oliveiraetal.ExploringKey-ValueStoresinMulti-WriterByzantine-Resilient Register Emulations.Proceedings of the 20thInternationalConferenceOnPrinciplesOfDIstributedSystems(OPODIS’16).2016.[Splunk2016]UsingSyslog-ngwithSplunk. https://www.splunk.com/blog/2016/03/11/using-syslog-ng-with-splunk.html.2016.[Swift2010]D.Swift.SuccessfulSIEMandLogManagementStrategiesforAuditandCompliance.RetrievedAugust26,2015,fromhttp://www.sans.org/reading-room/whitepapers/auditing/successful-siem-log-management-strategies-audit-compliance-33528[Viegas et al. 2017] E. Viegas et al. A Resilient Stream Learning IntrusionDetectionMechanismforReal-timeAnalysisofNetworkTraffic.IEEEGlobecom2017.2017.