At Telefonica PDI we are developing an internal messaging service to be used by our own products. Sprayer is a low latency, reliable messaging system supporting delivery of messages to a single receiver, a predefined group of receivers or a specific list of receivers over different channels (SMS, HTTP, WebSockets, Email, Android, iOS and Firefox OS native push…). We use Redis, MongoDB and RabbitMQ to implement Sprayer. In this talk we review Sprayer's architecture and, for each of these technologies, explain why, where and for what it is used, along with some tips. Talk given with Pablo Enfedaque (@pablitoev56) at NoSQL Matters Barcelona 2013.
1. Sprayer: low latency, reliable multichannel messaging for Telefonica Digital
2. who are we? Pablo Enfedaque (@pablitoev56) and Javier Arias (@javier_arilos). Javier is a Software Architect and developer who has worked in different sectors such as M2M, Telcos, Finance and Airports. Pablo is a SW R&D engineer with a strong background in high performance computing, big data and distributed systems.
3. some context: Telefónica is the 4th largest telco in the world; 2 years ago Telefonica Digital was established to spread our business to the digital world; the former Telefonica R&D / PDI was merged into this new company
4. overview: we are developing an internal messaging service to be used by our own products; we have polyglot persistence using different NoSQL technologies; in this talk we will review Sprayer's architecture and, for each technology, how it is used
5. why sprayer? a common push messaging service. why? each project with messaging needs was implementing its own server its own way; 5 push messaging systems in the company; none of them supporting a wide variety of transports; independent deployment and operations
6. the problem: cross technology push: iOS, Android, Websockets, eMail, SMS, HTTP, FirefoxOS; point to point and pubsub: 1 to 1, 1 to N, 1 to Group; PaaS, multitenant
7. inspiration: Google's Thialfi: http://research.google.com/pubs/pub37474.html; Twitter Timeline: http://www.infoq.com/presentations/Twitter-Timeline-Scalability; Pusher: http://www.pusher.com; Pubnub: http://www.pubnub.com; Amazon SNS: http://aws.amazon.com/sns/
8. the proposal: SPRAYER! Sprayer is a low latency, reliable messaging system supporting delivery of messages to a single receiver, to a predefined group of receivers or to a specific list of receivers over different channels (WebSockets, SMS, Email, HTTP and iOS, Android or Firefox OS native push)
9. the proposal: SPRAYER! our motto: you care about business, we deliver your messages
10. server side API
11. ? server side API
12. server side API challenges: common interface for all channels; reliable, consistent, idempotent; route messages efficiently; simple and user oriented; manage subscriptions; send messages: to list or group (topic); get delivery feedback; standards based (HTTP + JSON)
13. architecture (diagram: sprayer backend with application, accepter REST API, messages dispatching, GCM, APNs, sms gateway, email gateway, operational storage)
14. messages dispatching
15. ? messages dispatching
16. message dispatching challenges: scaling horizontally; reliability; different channels: HTTP (outbound), Websockets (inbound), iOS push (APNs), Android push (GCM), SMS, eMail
17. architecture (diagram: sprayer backend with application, accepter REST API, messages routing, websockets, Android, iOS, HTTP, SMS and email dispatchers, GCM, APNs, sms gateway, email gateway, operational storage)
18. outbound-stateless dispatchers: simple dispatchers: HTTP, iOS, Android...; take message, get msg subscribers, dispatch to receiver, report feedback; completely stateless (diagram: accepter REST API, Android dispatcher, operational storage, GCM)
19. connection aware dispatchers: clients (websockets, HTTP long poll...); messages are stored until clients connect; client inits a persistent connection; potentially, millions of clients (diagram: accepter REST API, websockets dispatcher with router, deliverer and inboxes, operational storage)
20. message routing
21. ? message routing
22. message routing challenges: routing (two-steps): API routes messages to N dispatchers; each dispatcher routes message to M receivers (subscribers of a group); both steps must be decoupled; the number of receivers could be thousands
23. architecture (diagram: sprayer backend with application, accepter REST API, websockets, Android, iOS, HTTP, SMS and email dispatchers, GCM, APNs, sms gateway, email gateway, subscriptions storage, operational storage, feedback)
24. async message delivery feedback
25. ? async message delivery feedback
26. async delivery feedback challenges: make msg feedback available through API to clients; feedback must not compromise message delivery or API; the number of updates could be millions; feedback: msg delivery, connections, push
27. architecture (diagram: sprayer backend as in slide 23, plus status feeder and feedback)
28. technology stack
29. subscriptions storage (diagram: sprayer backend, subscriptions storage marked with "?")
30. subscriptions storage (diagram: sprayer backend)
31. dispatcher receiver inboxes (diagram: accepter REST API, websockets dispatcher with router, deliverer and inboxes; inboxes marked with "?")
32. dispatcher receiver inboxes (diagram: accepter REST API, websockets dispatcher with router, deliverer and inboxes)
33. redis: "Redis is an open source, advanced key-value store. It is often referred to as a data structure server (...)" (redis.io); why redis? amazingly fast; easy to use; usage patterns: shared cache, queues, pubsub, distributed lock, counting things
34. redis use cases: use cases in Sprayer: group subscribers x channel; channels x group; websockets channel queues (potentially a million receivers); limitations for our use cases: memory bound; queries and pagination; high throughput queues
35. redis concerns: what happens when the dataset does not fit in memory? two strategies: partition datasets to different redis clusters; sharding: based on tenant would be easy. FT and HA: easy way: master-slave with virtual IPs, switch the slave's IP when the master is out, home made daemon; sentinel based: some tests done, needs to be supported by client library; redis cluster: being implemented, limited features
36. operational storage (diagram: sprayer backend, operational storage marked with "?")
37. operational storage (diagram: sprayer backend)
38. mongodb: "mongoDB (from "humongous") is a document database (...) features: full index support, replication & HA, autosharding..." (mongodb.org); why mongoDB? scaling & HA; great performance; dynamic schemas; versatile
39. mongodb use cases: use cases in Sprayer: operational DB, administrative data; message delivery feedback updates (potentially millions of records); limitations for our use cases: operations with sets of subscribers; high throughput queues
40. mongodb concerns: no concerns about mongodb for our use case. maybe, in the long term: can it handle the huge amount of feedback write operations without affecting the API?
41. async queues (diagram: sprayer backend, queues between accepter REST API, dispatchers and status feeder marked with "?")
42. async queues (diagram: sprayer backend)
43. rabbitmq: "robust messaging for applications, easy to use" (www.rabbitmq.com); why rabbitmq? very fast; reliable; builtin clustering
44. rabbitmq use cases: use cases in Sprayer: jobs for dispatchers (API => dispatchers); feedback status updates: message delivery, connections, device status (dispatchers => API); limitations for our use cases: not scaling well to millions of queues (websocket receiver inboxes)
45. rabbitmq concerns: no concerns! rabbitmq is best suited to very high throughput messaging
46. full tech stack (diagram: sprayer backend with application, accepter REST API, websockets, Android, iOS, HTTP, SMS and email dispatchers, GCM, APNs, status feeder, sms gateway, email gateway)
47. sum up
48. design threats
49. design threats: related data in different places: redis, rabbitmq and mongo; we are not transactional; our components remain sane in case of a DB failure; idempotent operations help here; light implementation of the Unit of Work architectural pattern
50. architecture guidelines
51. architecture guidelines: asynchronous processing / queues everywhere; dedicated dispatchers for each transport; common API interface; used the best tool for each responsibility: polyglot persistence; processes as stateless as possible
52. YES, SPRAYER DOES! thanks for coming
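The design threats slide says that, without transactions, idempotent operations help components stay sane after a failure. The slides do not show how Sprayer does this, so the following is only a minimal sketch of the general idea applied to delivery feedback. Every name here (`FeedbackStore`, `apply_update`, `status_of`, the status values and their ordering) is a hypothetical choice for illustration, and an in-memory dict stands in for the operational storage.

```python
# Hypothetical sketch: idempotent delivery-feedback updates.
# Not Sprayer's code; an in-memory dict stands in for the operational storage.

# Assumed status ordering: a status only ever moves forward.
STATUS_ORDER = {"accepted": 0, "dispatched": 1, "delivered": 2}


class FeedbackStore:
    def __init__(self):
        # Maps (message_id, receiver_id) -> latest known status.
        self._status = {}

    def apply_update(self, message_id, receiver_id, status):
        """Apply a feedback update.

        Re-applying the same update, or applying an older one, is a no-op,
        so a redelivered queue message cannot corrupt the stored state.
        Returns True if the stored status changed.
        """
        key = (message_id, receiver_id)
        current = self._status.get(key)
        if current is not None and STATUS_ORDER[status] <= STATUS_ORDER[current]:
            return False  # duplicate or stale update: safe to ignore
        self._status[key] = status
        return True

    def status_of(self, message_id, receiver_id):
        return self._status.get((message_id, receiver_id))


store = FeedbackStore()
store.apply_update("msg-1", "alice", "dispatched")
# A redelivery after a failure repeats the same update; state is unchanged.
store.apply_update("msg-1", "alice", "dispatched")
store.apply_update("msg-1", "alice", "delivered")
# A late, out-of-order update does not roll the status back.
store.apply_update("msg-1", "alice", "accepted")
print(store.status_of("msg-1", "alice"))  # prints "delivered"
```

Because each update can be applied any number of times with the same result, a worker that crashes halfway through a batch can simply replay the whole batch from the queue, which is one way to cope with related data living in several stores without a shared transaction.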