As fast as a grid, as safe as a database

From the Gaming Scalability event, June 2009 in London. In this talk, Matthew Fowler from NT/e looks at the persistence issues on computing clouds. He discusses the architectural principles and problems that cloud persistence presents to application developers, then presents a possible solution, focusing on the key ideas, the tooling and the deployment options.

Matthew Fowler runs the Java business unit of New Technology/enterprise. Matthew received a BSc in Computer Science from MIT. He has developed and marketed products in many areas of software: LANs, WANs, software tools, language processors and generation of enterprise applications. His current interests are system generation and grid/cloud applications.


<ul><li> 1. As fast as the grid, as safe as a database. Matthew Fowler, NT/e. CloudSave </li></ul> <p> 2. Agenda: 1. Introduction; 2. CloudSave; 3. ACID assessment; 4. Architecture shift.
3. NT/e: New Technology / enterprise. Formed 1996. 1998: server-side Java, BEA Partners. 2007-8: 2.5m turnover, 100k profit. JeeWiz Version 1.0: March 2002. JeeWiz Version 5: 2008, engine open-sourced. Key customers: Deutsche Post/SOPERA, UBS, BEA, GigaSpaces.
4. Products. [Diagram: J2EE, Struts, Hibernate, Spring, JSF; CloudSave; Engine.]
5. CloudSave Hopes: Big Data. Blinding Speed. Scalable: small to huge. Rock Solid.
6. CloudSave Fears: Oracle. Distributed Transactions. Low Reliability. Complicated Programming. ACID has been neutralised.
7. CloudSave Features: GigaSpaces XAP Platform. Spaces ... JavaSpaces. Scalable Architecture: Data Scalability; Processor Scalability; appropriate for single-space deployment. IMDG == In-Memory Database, mapped back to persistence sources. Distributed Transactions.
8. As fast as poss 1: IMDG (database in memory). [Diagram: Partitions 1 to 3 plus Backups, holding Cust/Order, Stock and Utils data. Tables: Customer, Order, OrderLine, Part, StockItem, Bin. Sequences.]
9. In-Memory Databases: Are You Crazy? What's it worth: loss of sales, down by 7% going from 5 to 6 seconds. Business justification for a $100m/year company: $7m/year for performance. Cost of Amazon 8GB AMI: $2,400/yr; *4 for software costs? Say $10,000/yr. PAYS FOR 700 SERVERS, 500GB in-memory DB.
10. IMDG == SOR: The grid is the System Of Record. Need to allocate PKs in grid; no database auto allocation. Same problem for unique transaction IDs and anything else. Grouping: 50 at once. [Diagram: Partition 3, Utils, Sequences.]
11. As fast as poss 2: Locality. Bulk up your entities: Customer is the main entity; subsidiary entities: Order; OrderLine. Locality of reference: same node. Avoid network calls (as far as poss.). Routing from master PK. "Rows" ... IMDG entries like DB rows.
12. As fast as poss 3: TxB, the Transaction Buffer. It's the Gear Box! Holds transactions in memory. Yes to client, then commit. Critical path speed-up. TxB commit threads: 1. save to local tx log; 2. commit tx in grid; 3. send to persistence. Pluggable targets, even XA!
13. Persistence Management. Multi-threading: necessary for performance; order must be preserved; mustn't start a save before an overlapping save is complete. Back-end resilience: persistence targets are slaves; the show must go on! Continuous efforts to persist; roll out tx's to disk if no DB.
14. Atomicity: 100% or 0%, commit or abort completely. TxB and IMDG must be kept in sync. TxB controls rollback and timeout.
15. Consistency: Changes must be database-like: observe DB constraints; commit transactionally (sounds like distributed transactions to me). Nodes in different partitions kept in sync, e.g. indexes must be consistent with the transaction.
16. Isolation / Durability. Isolation: Repeatable_Read isolation; transactions can choose to wait or return busy immediately. Durability: running system survives n-1 failures; transactions logged to disk off the critical path; on the critical path, TxB survives n-1 failures.
17. Failures and Timeouts: All transactions should have a timeout. Killed by TxB if timeout exceeded; grid nodes informed too. Grid nodes can query TxB for status. Design of failure handling... Where did Q1 go?
18. The Layers. [Diagram: Client, Transaction Buffer, Partitions, Data. Labels include: GigaSpaces-aware generated classes; Java API; transaction management (start/commit/abort); GigaSpaces proxy (CRUD/commit/abort); start/buffer/commit/abort; save/load; generic plug-in committer; user targets (e.g. queues); DB server.]
19. Java/DB Viewpoint: Some config to distribute/size grids. Some architectural understanding. Basically SOA: business objects represented as services. GigaSpaces as embedded product.
20. Cloud-Private DB Link-up: Web-app + IMDG in cloud. Real DB in Data Centre.
21. In-Cloud Federated Applications: Airways. Federated Transaction Buffer.
22. DIY: Don't. The Darwin Award? Out-of-the-box solution makes more sense.
23. Why not start here? Appropriate for a small architecture. IMDG: faster to read. CloudSave: faster to commit. Scalable. Greener: average 10-15% utilisation on servers; use virtual machines or small machines, then scale out to real/big.
24. Questions? More information: </p>
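
Slide 10 says that, with the grid as the system of record, primary keys have to be allocated in the grid and are grouped 50 at once. A minimal Java sketch of that block-allocation idea follows. It is an illustration under stated assumptions, not CloudSave's implementation: `GridSequence` and `BlockKeyAllocator` are hypothetical names, and an in-process `AtomicLong` stands in for the shared sequence entry that would really live in the grid.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a shared sequence held in the grid (assumption: one counter per table). */
final class GridSequence {
    private final AtomicLong next = new AtomicLong(1);

    /** Atomically reserves blockSize consecutive keys and returns the first of them. */
    long reserveBlock(int blockSize) {
        return next.getAndAdd(blockSize);
    }
}

/** Hands out keys locally and only returns to the shared sequence when its block is used up. */
final class BlockKeyAllocator {
    private final GridSequence sequence;
    private final int blockSize;
    private long nextKey;   // next key to hand out
    private long blockEnd;  // first key beyond the current block (exclusive)

    BlockKeyAllocator(GridSequence sequence, int blockSize) {
        this.sequence = sequence;
        this.blockSize = blockSize;
    }

    synchronized long nextKey() {
        if (nextKey == blockEnd) {
            // Block exhausted (or never fetched): one round trip reserves the next blockSize keys.
            nextKey = sequence.reserveBlock(blockSize);
            blockEnd = nextKey + blockSize;
        }
        return nextKey++;
    }
}
```

With a block size of 50, each allocator makes one call to the shared sequence per 50 keys. Two allocators never hand out the same key, though keys are no longer strictly ordered across allocators, and unused keys in a block are lost if its allocator goes away.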

