Hipster batch - making batch processing cool again

Page 1

Hipster batch - making batch processing cool again

Page 2

HTTP Microservices

One solution does not fit all problems

Page 3

Batch processing

Page 4

batch processing + REST with hypermedia + metadata = hipster batch

Page 5

REST

● Representational State Transfer
● Modelled on HTTP resource verbs (GET, POST, PUT etc.)
● No reason for resources to be dynamic

Page 6

Hipster batch

Step 1: Scheduled scale from 0 to 1 (using “autoscaling” feature)

Page 7

Hipster batch

Step 2: Process data (CSV files, SQL query)
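A minimal sketch of what Step 2 could look like for a CSV input, assuming the job turns each CSV into gzipped JSON (the data files in this deck end in `.csv.json.gz`). The method name and the sample columns are hypothetical, not taken from the original job.

```ruby
require 'csv'
require 'json'
require 'zlib'

# Hypothetical helper: parses CSV text with a header row and returns the
# rows as a gzip-compressed JSON array of objects.
def csv_to_gzipped_json(csv_text)
  rows = CSV.parse(csv_text, headers: true).map(&:to_h)
  Zlib.gzip(JSON.generate(rows))
end

# Example input with made-up columns.
SAMPLE_CSV = "id,name\n1,Acme\n2,Globex\n"
GZIPPED = csv_to_gzipped_json(SAMPLE_CSV)
```

All values come out as strings because no CSV converters are applied; a real job would probably cast numbers and dates.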

Page 8

Hipster batch

Step 3: Write output to S3 with hypermedia links so that a consumer can navigate the files

HTTP: open("http://some-service") { |file| file.read }

S3: open("s3://some-bucket/index.json") { |file| file.read }
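The two one-liners above suggest that a consumer reads an S3 document the same way it reads an HTTP resource. One way to get that uniform call is to dispatch on the URI scheme. The sketch below is an assumption about how that could be wired up: `ResourceReader` is a hypothetical class, and the S3 handler is an in-memory stand-in rather than a real S3 client.

```ruby
require 'uri'

# Hypothetical helper: picks a handler by URI scheme so that consumer code
# looks the same for every kind of resource.
class ResourceReader
  # handlers: Hash of scheme name => callable that takes a parsed URI and
  # returns the resource body as a String.
  def initialize(handlers)
    @handlers = handlers
  end

  def read(href)
    uri = URI.parse(href)
    handler = @handlers.fetch(uri.scheme) do
      raise ArgumentError, "no handler registered for #{href}"
    end
    handler.call(uri)
  end
end

# In-memory stand-in for bucket contents (assumption: a real handler would
# call an S3 client here instead).
FAKE_BUCKETS = {
  'some-bucket' => { 'index.json' => '{"_links":{}}' }
}.freeze

READER = ResourceReader.new(
  's3' => lambda do |uri|
    FAKE_BUCKETS.fetch(uri.host).fetch(uri.path.delete_prefix('/'))
  end
)

INDEX_BODY = READER.read('s3://some-bucket/index.json')
```

An `http` handler could be registered in the same hash, which keeps the consumer unaware of where a link points.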

Page 9

Index

{
  "_links": {
    "self": { "href": "s3://rea-reporting/revenue/customer/index.json" },
    "all": [
      { "href": "s3://rea-reporting/revenue/customer/ancestry-latest-by-month/2015-02.json" },
      { "href": "s3://rea-reporting/revenue/customer/ancestry-latest-by-month/2015-03.json" }
    ],
    "latest": { "href": "s3://rea-reporting/revenue/customer/ancestry/20150312_052251_2015-03.json" }
  }
}
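A consumer can navigate this index by link relation rather than by hard-coded paths. The sketch below parses the index shown on this slide and collects the hrefs for a relation, whether it holds one link or a list. `link_hrefs` is a hypothetical helper name.

```ruby
require 'json'

# Returns every href under a link relation. A relation may hold a single
# link object ("latest") or an array of them ("all").
def link_hrefs(document, rel)
  links = document.fetch('_links', {})[rel]
  list = links.is_a?(Array) ? links : [links].compact
  list.map { |link| link.fetch('href') }
end

# The index document from the slide.
INDEX = JSON.parse(<<~JSON)
  {
    "_links": {
      "self": { "href": "s3://rea-reporting/revenue/customer/index.json" },
      "all": [
        { "href": "s3://rea-reporting/revenue/customer/ancestry-latest-by-month/2015-02.json" },
        { "href": "s3://rea-reporting/revenue/customer/ancestry-latest-by-month/2015-03.json" }
      ],
      "latest": { "href": "s3://rea-reporting/revenue/customer/ancestry/20150312_052251_2015-03.json" }
    }
  }
JSON

LATEST = link_hrefs(INDEX, 'latest').first
MONTHLY = link_hrefs(INDEX, 'all')
```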

Page 10

Ancestry

● JSON
● Metadata about job (time, duration, name, status)
● Links to data files
● Links to logs
● Links to (or copy of) source data
● Includes ancestry of source data

Page 11

{
  "id": "20150312_050856",
  "status": "success",
  "name": "customer-collector",
  "month": "2015-02",
  "startTime": "2015-03-12T16:08:56+11:00",
  "duration": 16,
  "_links": {
    "self": { "href": "s3://rea-reporting/revenue/customer/ancestry/20150312_050856_2015-02.json" },
    "git": { "href": "https://git.realestate.com.au/mad-dart/customer-collector/commit/git-hash-not-set" },
    "splunk": { "href": "https://splunk:8000/en-US/app/search/flashtimeline?q=search%20index=mad-reporting%20service=customer-collector%2092c056cc5996d8dad547d077cebb5c25" },
    "data": [
      { "href": "s3://rea-reporting/revenue/customer/data/20150312_050856/Customer-25-2-2015.csv.json.gz" }
    ]
  },
  "ancestry": {
    "name": "customer",
    "uploader": "[email protected]",
    "_links": {
      "data": { "href": "s3://rea-reporting/revenue/customer/source/20150312_050856/Customer-25-2-2015.csv.gz" },
      "box": { "href": "reference/customer/Customer-25-2-2015.csv" }
    }
  }
}
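A rough sketch of how a job might assemble the core of such an ancestry document. It assumes the run id is the start time in UTC formatted as `YYYYMMDD_HHMMSS`, which is consistent with the example above (16:08:56+11:00 is 05:08:56 UTC) but is not stated in the deck. `build_ancestry` and its parameters are hypothetical, and the git, splunk and source-ancestry sections are left out for brevity.

```ruby
require 'json'
require 'time'

# Hypothetical builder; field names follow the example document.
# Assumption: the run id is the UTC start time as YYYYMMDD_HHMMSS.
def build_ancestry(name:, month:, started_at:, duration:, status:, base:, data_files:)
  run_id = started_at.getutc.strftime('%Y%m%d_%H%M%S')
  {
    'id' => run_id,
    'status' => status,
    'name' => name,
    'month' => month,
    'startTime' => started_at.iso8601,
    'duration' => duration,
    '_links' => {
      'self' => { 'href' => "#{base}/ancestry/#{run_id}_#{month}.json" },
      'data' => data_files.map { |file| { 'href' => "#{base}/data/#{run_id}/#{file}" } }
    }
  }
end

# Rebuilds the id, self link and data link of the example above.
EXAMPLE_ANCESTRY = build_ancestry(
  name: 'customer-collector',
  month: '2015-02',
  started_at: Time.new(2015, 3, 12, 16, 8, 56, '+11:00'),
  duration: 16,
  status: 'success',
  base: 's3://rea-reporting/revenue/customer',
  data_files: ['Customer-25-2-2015.csv.json.gz']
)

EXAMPLE_ANCESTRY_JSON = JSON.pretty_generate(EXAMPLE_ANCESTRY)
```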

Page 12

File structure

index.json

ancestry/timestamp_1_2015-02.json

data/timestamp_1/Customers-2015-02.csv.json.gz

source/timestamp_1/Customers-2015-02.csv.gz

ancestry-latest-by-month/2015-02.json

Page 13

File structure

index.json

ancestry/timestamp_1_2015-02.json

ancestry/timestamp_2_2015-02.json

data/timestamp_1/Customers-2015-02.csv.json.gz

data/timestamp_2/Customers-2015-02.csv.json.gz

source/timestamp_1/Customers-2015-02.csv.gz

source/timestamp_2/Customers-2015-02.csv.gz

ancestry-latest-by-month/2015-02.json
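The two listings show that a re-run adds new ancestry, data and source files under its own timestamp, while `index.json` and the `ancestry-latest-by-month` file keep the same names. The sketch below derives those keys for a run; `run_keys` and `keys_after_runs` are hypothetical helpers, and the naming rules are inferred from the listings.

```ruby
# Hypothetical helper: the object keys written by a single run.
def run_keys(run_id, month, source_name)
  {
    ancestry: "ancestry/#{run_id}_#{month}.json",
    data: "data/#{run_id}/#{source_name}.json.gz",
    source: "source/#{run_id}/#{source_name}.gz",
    latest_by_month: "ancestry-latest-by-month/#{month}.json"
  }
end

# All distinct keys present after the given runs for one month.
# index.json and the latest-by-month key are shared, so they appear once.
def keys_after_runs(run_ids, month, source_name)
  keys = ['index.json']
  run_ids.each { |run_id| keys.concat(run_keys(run_id, month, source_name).values) }
  keys.uniq.sort
end

AFTER_ONE_RUN = keys_after_runs(%w[timestamp_1], '2015-02', 'Customers-2015-02.csv')
AFTER_TWO_RUNS = keys_after_runs(%w[timestamp_1 timestamp_2], '2015-02', 'Customers-2015-02.csv')
```

Because earlier runs are never overwritten, a consumer holding an old ancestry link can still read the data it points to.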

Page 14

Why is it cool?

● Security
● Flexibility
● Auditability

Page 15

You didn’t think I could do a presentation without mentioning Pact, did you?

Page 16

Contracts with non-HTTP Pact

Contract (example data structures)

Consumer: correctly handles expected data

Provider: can produce expected data
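The idea can be illustrated in plain Ruby without the Pact library: the contract is an example data structure, the consumer test runs the consumer's code against it, and the provider test checks that real output has the same shape. This is only a sketch of the concept and does not use or mirror Pact's API; `same_shape?` and `data_hrefs` are hypothetical names.

```ruby
# The contract: an example of the document the provider writes.
CONTRACT_EXAMPLE = {
  'status' => 'success',
  'month' => '2015-02',
  '_links' => {
    'data' => [
      { 'href' => 's3://rea-reporting/revenue/customer/data/20150312_050856/Customer-25-2-2015.csv.json.gz' }
    ]
  }
}.freeze

# True when actual contains every key of expected with a value of the same
# type, recursively. Extra keys in actual are allowed. Array items are
# compared against the first example item.
def same_shape?(expected, actual)
  case expected
  when Hash
    actual.is_a?(Hash) &&
      expected.all? { |key, value| actual.key?(key) && same_shape?(value, actual[key]) }
  when Array
    actual.is_a?(Array) &&
      (expected.empty? || actual.all? { |item| same_shape?(expected.first, item) })
  else
    actual.is_a?(expected.class)
  end
end

# Consumer code under test: extracts the data file links.
def data_hrefs(document)
  document['_links']['data'].map { |link| link['href'] }
end

# Consumer side: the consumer handles the example.
CONSUMER_RESULT = data_hrefs(CONTRACT_EXAMPLE)

# Provider side: a made-up provider output is checked against the example.
PROVIDER_OUTPUT = {
  'status' => 'failed',
  'month' => '2015-03',
  'duration' => 3,
  '_links' => { 'data' => [] }
}.freeze
PROVIDER_OK = same_shape?(CONTRACT_EXAMPLE, PROVIDER_OUTPUT)
```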