Building a Scalable Platform for Sharing 500 Million Photos

Published on 16-Apr-2017

Transcript

Slide 1 (Ruben): Building a Scalable Platform for Sharing 500 Million Photos
  Wouter Crooy & Ruben Heusinkveld, Solution Architect & Technical Lead, Albumprinter

Slide 2 (Ruben): Wouter Crooy, Solution Architect, Albumprinter, @wcrooy

Slide 3 (Ruben): Ruben Heusinkveld, Technical Lead, Albumprinter, @rheusinkveld

Slide 4 (Ruben): Who are we
  • Wouter Crooy, Solution Architect
  • Ruben Heusinkveld, Technical Lead
  • Neo4j Certified Professionals
  Notes: At Albelli we want to inspire people to relive and share life's moments by easily creating beautiful personalized photo products. Vision: to brighten up the world by bringing people's moments to life. Albumprinter is a Cimpress company; the best-known Cimpress brand here in the US is Vistaprint, which I'm sure you all know. Albumprinter is based in Amsterdam, The Netherlands, and has multiple consumer brands serving the European market. Albumprinter acquired FotoKnudsen in June 2014.

Slide 5 (Ruben): The photo organizer
  • Deliver well-organized, easy-to-use and secure storage for all your images
  • Ease the process of selecting photos for creating photo products
  • Started as part of an R&D skunkworks project
  Notes: Goal: deliver well-organized, easy-to-use and secure storage for all your images. Built by a team of 5 (1 designer, 1 frontend developer, 1 quality engineer, and Wouter and myself focusing on the backend).

Slide 6 (Ruben): The photo organizer
  Notes: Launched in June of this year. Available on all devices.

Slide 7 (Ruben): The photo organizer
  Notes: Photos are automatically grouped together into events.

Slide 8 (Ruben): The photo organizer
  Notes: Easy to share photos with friends, or publicly if you want; privately via invites.

Slide 9 (Ruben): The photo organizer, from photos to products
  Notes: The photos can be used to create any product, like a photo book, calendar or wall decor.

Slide 10 (Ruben): The photo organizer, demo
  https://minnebanken.no

Slide 11 (Wouter): The challenge

Slide 12 (Wouter): The challenge
  • Replace the legacy system with the new photo organizer
  • Move 1.3 PB of photos from on-premises to cloud storage
  • Analyze & organize all photos (511 million)
  • Data cleansing while importing
  • Use the same technology and architecture during the import and afterwards
  • Ability to add features while importing
  • Core of the systems is built in .NET
  Notes: No uploading of duplicates.

Slide 13 (Wouter): The import
  • Hard deadline: the factory that holds the data center with all the photos was closing
  • Started on the 1st of April
  • Minimum processing of 150 images/second
  • ~500 queries/second to Neo4j
  • Up to 700 EC2 instances on AWS

Slide 14 (Wouter): How we did it
  • Microservices
  • Command Query Responsibility Segregation (CQRS)
  • Cluster: multiple write nodes, single master, read-only nodes
  • HAProxy
  • Cypher only, via the REST interface
  • .NET Neo4jClient

Slide 15 (Wouter): Architecture
  Notes: In Neo4j we only store the metadata; the actual photos are stored in Amazon Simple Storage Service (S3).
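
To make the split on slide 15 concrete (metadata in Neo4j, the image bytes in S3, written through the .NET Neo4jClient over the REST Cypher endpoint), here is a minimal sketch. It is not the Albumprinter code: the Photo label, the OWNS relationship, the S3Key property, the class name and the credentials are illustrative assumptions; only the User label with its Id property and the SecondsSinceEpoch property appear later in the slides.

    // Hypothetical sketch: store photo metadata in Neo4j while the binary lives in S3.
    // Schema names other than User.Id and SecondsSinceEpoch are assumptions.
    using System;
    using Neo4jClient;

    public static class PhotoMetadataWriter
    {
        public static void StorePhoto(string userId, string photoId,
                                      string s3Key, long secondsSinceEpoch)
        {
            // The REST Cypher endpoint (pre-Bolt), as used in the talk.
            var client = new GraphClient(new Uri("http://localhost:7474/db/data"),
                                         "neo4j", "secret");
            client.Connect();

            // MERGE the owner so re-imports do not create duplicate users,
            // then attach the photo metadata; the image itself is only
            // referenced by its S3 key.
            client.Cypher
                .Merge("(u:User { Id: {userId} })")
                .Create("(u)-[:OWNS]->(p:Photo { Id: {photoId}, S3Key: {s3Key}, " +
                        "SecondsSinceEpoch: {taken} })")
                .WithParams(new { userId, photoId, s3Key, taken = secondsSinceEpoch })
                .ExecuteWithoutResults();
        }
    }

In the real setup such writes would also pass through HAProxy with the per-user routing headers described on slide 30.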
Slide 16 (Wouter): Why we chose Neo4j
  • Close to the domain model
  • Not an ordinary (relational) database
  • We are looking for relations between photos and users
  • Scalable
  • Flexible schema
  • Natural, fluent queries
  • ACID / data consistency

Slide 17 (Ruben): The design

Slides 18-20 (Ruben): Graph model (diagrams)

Slide 21 (Ruben): Our Neo4j database
  • More than 1 billion nodes
  • 4.1 billion properties
  • 2.6 billion relationships
  • Total store size of 863 GB
  Notes: For all those photos, this is what it resulted in.

Slide 22 (Ruben): Command Query Responsibility Segregation
  • Separation between writing and reading data
  • Different models for the Query and Command APIs
  • Independent scaling
  Notes: I know it's really ambitious to explain CQRS in two slides, but I would still like to explain why and how it can work with Neo4j. Event sourcing: double update to the database and the cache. In our case we used a cache update/flush based on certain rules. Pro: less work, and the database is too large to cache in full. Con: the cache is not always a reliable source.

Slide 23 (Wouter): Bumps and solutions

Slide 24 (Wouter): CQRS, separate reads & writes
  • No active event publishing in place
  • Specific scenarios for updating/writing data
  • Ability to create separate models for read and write
  • Updates (pieces of) the user graph
  • Requires reliable and consistent reads
  • Scale-out => overloading the locking of the (user) graph
  • After the import: low-performance scenarios => cache with a lower update priority
  Notes: Neo4j at its core is very capable of handling CQRS interfaces, since you are not updating a table but (parts of) the graph. Due to its ACID nature it should also be able to make sure there are no race conditions. But because this architecture allows you to scale out massively, that does not always match the capabilities of an ACID database, especially in cases where writes occur more often than reads. Make sure the read is consistent. In our situation CQRS is extra complex, since we have an ordered crawler (5+ steps) which also does the writes, while the crawler(s) and the query API are still allowed to do reads. See also the consistent-read solution below; in cases where we do not need a consistent read we can use the cache.
  References:
  https://www.infoq.com/news/2015/05/cqrs-advantages
  http://udidahan.com/2011/04/22/when-to-avoid-cqrs/
  http://udidahan.com/2009/12/09/clarified-cqrs/
  http://udidahan.com/2010/08/31/race-conditions-dont-exist/

Slide 25 (Wouter): Read-after-write consistency
  • All reads should contain the very latest and most accurate data
  • Replication delay between servers
  • Split on consistency
  • Article by Aseem Kishore: https://neo4j.com/blog/advanced-neo4j-fiftythree-reading-writing-scaling/
  Notes: Reads vastly outnumber writes in our application, as in many applications. Split on consistency, not on read vs. write. Track the user's last write time for read-after-write consistency. Monitor and tune slave lag via the push/pull configuration. Stick slaves by user for read-after-read consistency. Credits to Aseem Kishore and his team at FiftyThree for sharing this at the conference last year.
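
To make "split on consistency" concrete, the sketch below shows the routing idea from the notes above: remember each user's last write time, send that user's reads to the master while a slave could still be lagging, and otherwise pin the user to one slave. The class, the in-memory dictionary and the five-second lag window are illustrative assumptions, not the Albumprinter implementation; a multi-instance API would need a shared store for the last-write times, and the window should come from the slave-lag monitoring mentioned above.

    // Hypothetical sketch of read routing with read-after-write consistency.
    using System;
    using System.Collections.Concurrent;

    public class ConsistencyRouter
    {
        private readonly ConcurrentDictionary<string, DateTime> _lastWriteUtc =
            new ConcurrentDictionary<string, DateTime>();

        // Assumed upper bound on slave replication lag; tune it from monitoring.
        private readonly TimeSpan _maxSlaveLag = TimeSpan.FromSeconds(5);

        private readonly Uri _master;
        private readonly Uri[] _readSlaves;

        public ConsistencyRouter(Uri master, Uri[] readSlaves)
        {
            _master = master;
            _readSlaves = readSlaves;
        }

        // Record that this user just wrote, so their next reads see their own write.
        public void TrackWrite(string userId)
        {
            _lastWriteUtc[userId] = DateTime.UtcNow;
        }

        // Master if the user wrote recently (read-after-write consistency),
        // otherwise a slave chosen by a stable hash of the user id
        // (sticky slave, for read-after-read consistency).
        public Uri PickReadEndpoint(string userId)
        {
            DateTime lastWrite;
            if (_lastWriteUtc.TryGetValue(userId, out lastWrite) &&
                DateTime.UtcNow - lastWrite < _maxSlaveLag)
            {
                return _master;
            }

            var index = (userId.GetHashCode() & 0x7fffffff) % _readSlaves.Length;
            return _readSlaves[index];
        }
    }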
Slide 26 (Wouter): Graph locking
  • Concurrency challenge
  • Scale-out => more images from the same user at the same time
  • Manage the input: a high spread of user/image combinations
  • Prevent concurrent analysis of multiple images from the same user
  • Inspect locks via JMX:
    GET /db/manage/server/jmx/domain/org.neo4j/instance%3Dkernel%230%2Cname%3DLocking
  Notes: Mainly during the importing of photos. Example response:
    {
      "description": "org.neo4j.kernel.info.LockInfo",
      "type": "org.neo4j.kernel.info.LockInfo",
      "value": [
        { "name": "description", "description": "description", "value": "ExclusiveLock[\nClient[1] waits for []]" },
        { "name": "resourceId", "description": "resourceId", "value": "2612184871" },
        { "name": "resourceType", "description": "resourceType", "value": "RELATIONSHIP" }
      ]
    }

Slide 27 (Wouter): Batch insert vs. single insert
  • Cypher CSV import per 1000 records
  • Prevents locking caused by concurrency issues

Slide 28 (Wouter): No infinite scale-out
  • Find the sweet spot for the number of cluster nodes
  • +1 node => more replication updates => higher load on the write master

Slide 29 (Wouter): Timeline
  • We are looking for photos that belong together based on their date taken
  • Moved from a full property scan to graph walking via the timeline; for large collections this means 75% fewer DB hits
  • Walk the timeline when looking for photos within a certain timeframe
  • Fewer photos to evaluate with the property scan (SecondsSinceEpoch)
  • Works perfectly for year, month and day selections (see the sketch at the end of this transcript)

Slide 30 (Wouter): .NET & REST interface
  • Custom headers on the REST Cypher endpoint (filtered by HAProxy) to route to multiple write servers, with a sticky session per user
  • Custom additions to the .NET Neo4jClient
  • Managing the JSON result set

Slide 31 (Wouter): Graph design considerations
  • Property scan
  • (User) full-graph scan
  • Differentiating property
  • Create node
  • No path/clustered indexes (yet...)
  Notes: Making changes to the schema for 550+ million nodes.

Slide 32 (Wouter): Graph design improvements
  • Property search:
    MATCH (u:User { Id: "001" })
    2812 db hits
  • Node/relationship search:
    MATCH (u:User { Id: "001" })-[:HasFavourites]-(f:Favourites)
    13 db hits
  • dbms.logs.query.* (don't forget to enable parameter resolving)
  • Our alternative: integrate with Kibana / Elasticsearch
  • https://neo4j.com/docs/operations-manual/current/reference/
  Notes: With the property search, DB hits increase as the number of photos increases.

Slide 33 (Wouter): The future

Slide 34 (Wouter): The future
  • Neo4j 3.x
  • Bolt
  • Data mining
  • Procedures / APOC

Slide 35 (Wouter): That's a wrap
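
As a closing illustration of the timeline walk on slide 29, here is a hedged sketch of what such a query could look like through the .NET Neo4jClient. The deck does not show the actual timeline model: the Year/Month/Day nodes and the HAS_TIMELINE/HAS_MONTH/HAS_DAY/TAKEN_ON relationships below are a common timeline-tree pattern assumed purely for illustration. Only the User label, its Id property and the SecondsSinceEpoch property come from the slides.

    // Hypothetical sketch: walk a timeline tree instead of scanning every photo's
    // SecondsSinceEpoch property. Timeline labels and relationships are assumed.
    using System.Collections.Generic;
    using System.Linq;
    using Neo4jClient;

    public static class TimelineQueries
    {
        public class Photo
        {
            public string Id { get; set; }
            public long SecondsSinceEpoch { get; set; }
        }

        public static IList<Photo> PhotosForDay(IGraphClient client, string userId,
                                                int year, int month, int day)
        {
            // Only the photos hanging off the matching Day node are touched,
            // which is why the db hits stay low for large collections.
            return client.Cypher
                .Match("(u:User { Id: {userId} })-[:HAS_TIMELINE]->(:Year { Value: {year} })" +
                       "-[:HAS_MONTH]->(:Month { Value: {month} })" +
                       "-[:HAS_DAY]->(:Day { Value: {day} })<-[:TAKEN_ON]-(p:Photo)")
                .WithParams(new { userId, year, month, day })
                .Return(p => p.As<Photo>())
                .Results
                .ToList();
        }
    }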