1

AWS Summit 2017
Coordinating External Data Importer Services using AWS Step Functions

Andre Vella, Director of Data
Marcos Rebelo, Principal Data Engineer



2

home24 “Zuhause ist, was dir gefällt” (“Home is whatever you like”)

THE EUROPEAN MARKET LEADER AND GO-TO DESTINATION FOR HOME & LIVING ONLINE SHOPPING

Dynamic Growth: +49% Y-o-Y sales growth in 2015

Consumer Destination: >100,000 articles

International Reach: 8 countries, 2 continents

Significant Scale: Started in 2012, €234m net sales in 2015

3

home24 “code sweet code”

100+ in tech department

8 teams

16 in Data Team, 14 nationalities

Data Team working with Scala, Spark, R, ...

home24.tech.blog
home24.de/jobs
LinkedIn - Home24 AG

4

home24 Data Platform

5

External Data Sources

Import GBs of data into S3 every day from multiple services

6

Evaluating Options

Potential Buy and Build Options

Apache Airflow

Amazon Simple Workflow

Data Virtuality

funnel.io

and some others...

7

Behold … AWS Step Functions

“State Machine” (noun)

1. A concept used by Computer Science professors for torturing undergrads, full of arcane math.

2. A practical way to build and manage modern Serverless Cloud apps.

8

Core Principles of our External Data Importer

SIMPLE

λ SERVERLESS

RESILIENT

9

Working with AWS Step Functions

Define in JSON

{
  "StartAt" : "DispatcherState",
  "Comment" : "An example of the ASF.",
  "States" : {
    "DispatcherState" : { ... },
    ...,
    "FinalState" : {
      "Type" : "Pass",
      "End" : true
    }
  }
}

Visualize in Console

Monitor Executions

10

Our Approach … as an ideal scenario

λ1 Starter: AWS Lambda function starting the Step Function

λ2 Downloader: Cycle of an AWS Lambda function downloading files from the remote service to the S3 “Raw” bucket

λ3 Refine: Cycle of an AWS Lambda function processing each file that arrives in the S3 “Raw” bucket and storing it in the S3 “Refined” bucket
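A minimal sketch of what a Starter function could look like in Python, assuming boto3's Step Functions client. The event fields (`stateMachineArn`, `service`) and the initial state input are hypothetical and chosen only for illustration; the slides do not show the actual implementation.

```python
import json
import uuid


def start_import(sfn_client, state_machine_arn, service_name):
    """Start one execution of the importer state machine for a service."""
    # Initial state input; these field names are illustrative assumptions.
    state_input = {"service": service_name, "downloaderFinished": False}
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        # Execution names must be unique per state machine.
        name=f"{service_name}-{uuid.uuid4()}",
        input=json.dumps(state_input),
    )
    return response["executionArn"]


def handler(event, context):
    """Lambda entry point (λ1), for example triggered on a schedule."""
    import boto3  # bundled with the AWS Lambda Python runtime

    return start_import(
        boto3.client("stepfunctions"),
        event["stateMachineArn"],  # hypothetical event field
        event["service"],  # hypothetical event field
    )
```

Passing the client into `start_import` keeps the logic testable without AWS credentials.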

11

Solution Design

Challenge #1: 5-minute Lambda limit

λ4 Dispatcher: One AWS Lambda function that splits the work into smaller loads
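The Dispatcher idea can be sketched as follows: split the full list of work items into batches small enough for a single Lambda invocation to finish within the time limit. The event fields `files` and `batchSize` and the output field `pendingBatches` are assumptions made for this example, not names taken from the talk.

```python
def split_into_batches(items, batch_size):
    """Split a list of work items into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def dispatcher_handler(event, context):
    """Lambda entry point (λ4): turn one big job into small pending batches."""
    # "files" and "batchSize" are hypothetical event fields.
    batches = split_into_batches(event["files"], event.get("batchSize", 10))
    return {
        "pendingBatches": batches,
        # Nothing to download means the downloader cycle can be skipped.
        "downloaderFinished": len(batches) == 0,
    }
```

For instance, `split_into_batches(["a", "b", "c", "d", "e"], 2)` yields three batches of sizes 2, 2 and 1.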

12

Solution Design

Challenge #2: Maximum input of 32,768 characters
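One common way to stay under such a limit is to pass a pointer to the data rather than the data itself. The sketch below is an assumption about how this could be handled, not necessarily the approach taken in the talk; `manifestKey` is a hypothetical field name, and the caller is expected to have stored the full payload (for example in S3) under that key.

```python
import json

MAX_INPUT_CHARS = 32768  # limit quoted on the slide


def serialized_length(state_input):
    """Length of the state input once serialized as compact JSON."""
    return len(json.dumps(state_input, separators=(",", ":")))


def compact_state(state_input, manifest_key):
    """Return the input unchanged if it fits, otherwise a small pointer to it.

    The caller is responsible for storing the full payload under
    manifest_key before passing the pointer on to the state machine.
    """
    if serialized_length(state_input) <= MAX_INPUT_CHARS:
        return state_input
    return {"manifestKey": manifest_key}
```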

13

Solution Design

14

Solution Design

15

Solution Design

16

Solution Design

17

Our take on AWS Step Functions

18

Defining AWS Step Function “States”

Task: single unit of work performed by a state machine

Choice: adds branching logic

Parallel: can be used to create parallel branches of execution in your state machine

Wait: delays the state machine from continuing for a specified time

Pass: simply passes its input to its output, performing no work

Succeed: stops an execution successfully

Fail: stops the execution of the state machine and marks it as a failure
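As an illustration of how these state types fit together, here is a small Python sketch that checks a state machine definition (loaded as a dict) for two simple mistakes: an unknown `Type`, or a transition to a state that is not defined. It is a hypothetical helper, not part of any AWS SDK, and it does not descend into the branches of Parallel states.

```python
STATE_TYPES = {"Task", "Choice", "Parallel", "Wait", "Pass", "Succeed", "Fail"}


def find_definition_problems(definition):
    """Return a list of human-readable problems; an empty list means none found."""
    states = definition.get("States", {})
    problems = []
    if definition.get("StartAt") not in states:
        problems.append("StartAt does not name a defined state")
    for name, state in states.items():
        if state.get("Type") not in STATE_TYPES:
            problems.append(f"{name}: unknown Type {state.get('Type')!r}")
        # Collect every transition target this state can jump to.
        targets = [state.get("Next"), state.get("Default")]
        targets += [rule.get("Next") for rule in state.get("Choices", [])]
        for target in targets:
            if target is not None and target not in states:
                problems.append(f"{name}: transition to undefined state {target!r}")
    return problems
```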

19

Anatomy of the Template - Defining in JSON

{
  "StartAt" : "DispatcherState",
  "Comment" : "An example of the ASF.",
  "States" : {
    "DispatcherState" : { ... },
    ...,
    "FinalState" : {
      "Type" : "Pass",
      "End" : true
    }
  }
}

21

Cycle on Step Functions

"DownloaderChoiceState" : {
  "Type" : "Choice",
  "Choices" : [ {
    "Variable" : "$.downloaderFinished",
    "BooleanEquals" : false,
    "Next" : "DownloaderState"
  } ],
  "Default" : "RefinerChoiceState"
},

"DownloaderState" : {
  "Type" : "Task",
  "Resource" : "arn:aws:lambda:eu-wes...",
  "Next" : "DownloaderChoiceState"
}
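The cycle only terminates if the Task's Lambda function eventually sets `downloaderFinished` to true in its output. A minimal Python sketch of such a function follows; `pendingBatches`, `downloadedFiles` and `download_batch` are hypothetical names used only for this example, and the real download logic is replaced by a placeholder.

```python
def download_batch(batch):
    """Placeholder for the real work: fetch files and write them to the Raw bucket."""
    return len(batch)


def downloader_handler(event, context):
    """One pass of the cycle: process one batch, then report whether work remains."""
    pending = list(event.get("pendingBatches", []))
    downloaded = event.get("downloadedFiles", 0)
    if pending:
        downloaded += download_batch(pending.pop(0))
    return {
        **event,
        "pendingBatches": pending,
        "downloadedFiles": downloaded,
        # The Choice state reads this flag via "$.downloaderFinished".
        "downloaderFinished": not pending,
    }
```

Each invocation handles a bounded amount of work, so no single Lambda run has to cover the whole download.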

23

Retry on AWS Lambda Function Error

"DownloaderState" : {
  "Type" : "Task",
  "Resource" : "arn:aws:lambda:eu-...",
  "Retry" : [ {
    "ErrorEquals" : [ "States.ALL" ],
    "IntervalSeconds" : 60,
    "MaxAttempts" : 5,
    "BackoffRate" : 2
  } ],
  "Next" : "DownloaderChoiceState"
}
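To see what these retry settings mean in practice, the following Python sketch computes the wait before each retry: the first retry waits `IntervalSeconds`, and each later wait is multiplied by `BackoffRate`, up to `MaxAttempts` retries. The function name is made up for this example.

```python
def retry_waits(interval_seconds, max_attempts, backoff_rate):
    """Seconds waited before each retry attempt, in order."""
    return [interval_seconds * backoff_rate ** attempt for attempt in range(max_attempts)]


# Settings from the slide: IntervalSeconds 60, MaxAttempts 5, BackoffRate 2.
waits = retry_waits(60, 5, 2)
total_minutes = sum(waits) / 60
```

With the values from the slide, the waits are 60, 120, 240, 480 and 960 seconds, so a persistently failing task is retried over roughly 31 minutes of waiting before the execution fails.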

24

Visualizing in Console

25

Monitoring Execution in Console

26

Monitoring Errors in Console

27

Monitoring in Amazon CloudWatch

28

AWS Step Functions + CloudFormation

29

Fun Facts

20+ services and increasing

~5 man-days full dev-cycle effort per service

~50 GB of GZIP external data every day

30

Price Facts

4,000 state transitions are free each month

$0.025 per 1,000 state transitions thereafter ($0.000025 per state transition)

We are doing ~1,000 state transitions a day

… coming from a SaaS costing us $5,000/month to ~$45/month for Step Functions + Lambda
→ $44 Lambda
→ <$1 Step Functions
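The Step Functions share of the bill can be checked with a few lines of Python using only the prices quoted on this slide. The 30-day month is an assumption.

```python
FREE_TRANSITIONS_PER_MONTH = 4000
PRICE_PER_TRANSITION = 0.000025  # USD, i.e. $0.025 per 1,000 transitions


def monthly_step_functions_cost(transitions_per_day, days=30):
    """Monthly Step Functions cost in USD after the free tier is used up."""
    billable = max(0, transitions_per_day * days - FREE_TRANSITIONS_PER_MONTH)
    return billable * PRICE_PER_TRANSITION


cost = monthly_step_functions_cost(1000)
```

At ~1,000 transitions a day, 26,000 transitions per month are billable, which comes to about $0.65 and matches the "<$1" figure.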

31

Key Takeaways

Focus on building products and not on operations. The world of serverless is developing and changing fast

Take time to try new AWS services

Step Functions is a great tool for building effective workflows and state machines

32

Get Started

aws.amazon.com/documentation/step-functions/

33

34

ANY QUESTIONS?

Please come meet us at the home24 booth