Real-time personal trainer on the SMACK stack
@honzam399 Jan Machacek
@anirvan_c Anirvan Chakraborty
© 2016 Cake Solutions Limited CC BY-NC-SA 4.0
Automated personal trainer - muvr
• Suggests the sequence of exercise sessions
• Suggests exercises in a session, including exercise parameters (e.g. weight, repetitions, …)
• Provides tips on proper exercise form
• With additional hardware (smartwatch, smart clothes), muvr provides
  • Completely unobtrusive exercise experience
  • More accurate tips on proper exercise form
  • With over-fitting, it is usable for physiotherapy
Architecture
Privacy
The technologies: iOS
• Learns the users’ behaviour
  • Exercise sessions
  • Exercises within exercise session
  • Short-term prediction of [scalar] labels for the exercises
• Performs the real-time analysis of the incoming sensor data
  • Advised by the expected behaviour
  • Signal processing to compute repetitions / strokes
  • Forward-propagation to label the exercise
• Submits all recorded sensor data and confirmed (!) labels per session
• Handles offline / travel modes
• Synchronises the data across the user’s devices using iCloud
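The "signal processing to compute repetitions" step could, for example, estimate the dominant period of one accelerometer axis by autocorrelation and divide the window length by it. The slides do not say which method muvr uses; the function name, the lag bounds, and the synthetic signal below are illustrative assumptions.

```python
import math

def estimate_repetitions(samples, min_lag, max_lag):
    """Estimate the number of repetitions in a window of one sensor axis.

    Picks the lag in [min_lag, max_lag] with the highest normalised
    autocorrelation and treats it as the period of one repetition.
    """
    n = len(samples)
    mean = sum(samples) / n
    centred = [s - mean for s in samples]
    energy = sum(c * c for c in centred)
    if energy == 0:
        return 0  # flat signal: no movement detected

    best_lag, best_score = None, 0.0
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        score = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / energy
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return 0
    return round(n / best_lag)

# Synthetic example: 10 s at 50 Hz, one repetition every 2 s (period = 100 samples).
signal = [math.sin(2 * math.pi * i / 100) for i in range(500)]
reps = estimate_repetitions(signal, min_lag=25, max_lag=250)
```

The lag bounds encode the "advised by the expected behaviour" idea: if the expected exercise is known, the plausible repetition period can be narrowed before searching.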
The technologies: Akka
• Reactive services for user profiles, model parameters, and sensor data
• CQRS/ES implementation, which helps to
  • Handle peaks in load
  • Handle failures of individual nodes
  • Reason about the scope of the mutable state we keep
• Uses Cassandra for its journal and snapshot stores
  • The written values are binary “blobs”
• Writes the sensor data to Cassandra
  • Writes the sensor data in “readable” form; it can be read outside the Akka / Scala world
• Reads the model and exercise parameters from Cassandra
  • It selects the best / newest model parameters to serve to the mobile app
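A minimal, framework-free sketch of the event-sourcing idea behind the journal and snapshot stores: state is rebuilt from the newest snapshot plus the events journalled after it. This is not Akka Persistence code; the class, the event names, and the in-memory lists standing in for Cassandra tables are all hypothetical.

```python
class UserProfile:
    """Event-sourced entity: state changes only by applying events."""

    def __init__(self):
        self.journal = []       # append-only list of (sequence_nr, event)
        self.snapshots = []     # list of (sequence_nr, copy of state)
        self.state = {"sessions": 0, "weight_kg": None}
        self.sequence_nr = 0

    @staticmethod
    def apply(state, event):
        """Pure function: old state + event -> new state."""
        kind, value = event
        new_state = dict(state)
        if kind == "SessionCompleted":
            new_state["sessions"] += 1
        elif kind == "WeightRecorded":
            new_state["weight_kg"] = value
        return new_state

    def persist(self, event, snapshot_every=3):
        """Journal the event, update state, and snapshot periodically."""
        self.sequence_nr += 1
        self.journal.append((self.sequence_nr, event))
        self.state = self.apply(self.state, event)
        if self.sequence_nr % snapshot_every == 0:
            self.snapshots.append((self.sequence_nr, dict(self.state)))

    def recover(self):
        """Rebuild state from the newest snapshot and the events after it."""
        if self.snapshots:
            from_nr, state = self.snapshots[-1]
        else:
            from_nr, state = 0, {"sessions": 0, "weight_kg": None}
        state = dict(state)
        for nr, event in self.journal:
            if nr > from_nr:
                state = self.apply(state, event)
        return state


profile = UserProfile()
for e in [("SessionCompleted", None), ("WeightRecorded", 82.5),
          ("SessionCompleted", None), ("SessionCompleted", None)]:
    profile.persist(e)
recovered = profile.recover()
```

Because recovery needs only the journal and snapshots, a failed node's entities can be restarted elsewhere, which is the failure-handling property the slide refers to.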
The technologies: Spark
• Distributed computation framework
  • “Big data” tasks
  • Integrates extremely well with Cassandra
• Reads and processes the profiles and sensor data
  • Identifies clusters of users based on their profile information
  • Slices the sensor inputs by sensor type
  • Writes the results to another store
• Runs in batches
  • Executes on a schedule (typically once a day)
The technologies: neon
• A machine learning framework, including
  • “The usual” suspects in tensor algebra
  • Signal processing
  • Different ML approaches
• Training and evaluation programs
  • Both programs terminate either upon discovering the perfect model or when their budget is up
  • Read clustered training and testing data from the Spark job
  • Write the model parameters and evaluation result to Cassandra
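The termination rule on this slide (stop on a perfect model or when the budget is spent) can be sketched as a plain loop. The `train_one_epoch` and `evaluate` callables are hypothetical stand-ins, not neon APIs, and the budget here is counted in epochs, although wall-clock time or cost would work the same way.

```python
def fit_with_budget(train_one_epoch, evaluate, max_epochs, target_accuracy=1.0):
    """Train until the model reaches target_accuracy or the epoch budget is spent.

    Returns (best_accuracy, epochs_used).
    """
    best_accuracy = 0.0
    epochs_used = 0
    while epochs_used < max_epochs:
        train_one_epoch()
        epochs_used += 1
        best_accuracy = max(best_accuracy, evaluate())
        if best_accuracy >= target_accuracy:
            break  # "perfect" model found before the budget ran out
    return best_accuracy, epochs_used


# Toy stand-ins: accuracy improves by 0.25 per epoch.
progress = {"accuracy": 0.0}

def toy_epoch():
    progress["accuracy"] = min(1.0, progress["accuracy"] + 0.25)

def toy_evaluate():
    return progress["accuracy"]

result = fit_with_budget(toy_epoch, toy_evaluate, max_epochs=10)
```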
The technologies: Cassandra
• Underpins the entire platform
  • Journal and snapshot store for Akka
  • Sensor data store
  • Model parameter store
  • “Summary” store
• High availability
• No single point of failure
• High read and write throughput
• Replication factor
• Tuneable consistency level
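Replication factor and tuneable consistency interact through a simple rule: a read is guaranteed to overlap the latest acknowledged write when the read and write replica counts together exceed the replication factor. The helper names below are illustrative; the quorum formula, floor(RF / 2) + 1, is Cassandra's standard definition.

```python
def quorum(replication_factor):
    """Number of replicas that make up a QUORUM for a given replication factor."""
    return replication_factor // 2 + 1

def is_strongly_consistent(read_replicas, write_replicas, replication_factor):
    """True when every read overlaps the latest acknowledged write (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor

rf = 3
q = quorum(rf)
quorum_both = is_strongly_consistent(q, q, rf)   # QUORUM reads + QUORUM writes
one_both = is_strongly_consistent(1, 1, rf)      # ONE reads + ONE writes
```

With a replication factor of 3, QUORUM on both sides tolerates one unavailable replica while staying consistent, whereas ONE on both sides trades that guarantee for lower latency.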
Spark & Cassandra
• Group the sensor data into n clusters by user profile with biometric ID
• Expand the sensor data
  • Slices of the sensor data by combinations of accelerometer, gyroscope, heart rate, targeted muscle group strain gauges, …
  • 1 user = 1 MiB from one sensor per hour; but 4 sensors expand into 4! MiB
• Trivial tasks
  • The most popular user-contributed exercises
  • The most popular exercise sessions and exercises within the sessions
  • The most effective (by overall fitness improvement, weight loss, muscle mass gain, …) exercise sessions
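To make the expansion step concrete, the sketch below enumerates the non-empty, unordered combinations of four sensors and the data volume if every combination is materialised. It assumes 1 MiB per sensor per user-hour (the figure on the slide) and that each slice stores a full copy of every sensor it contains; the sensor names are shortened from the slide. Under those assumptions there are 15 slices totalling 32 MiB, whereas 4! = 24 would count orderings of the four sensors, so either figure is best read as an order-of-magnitude estimate.

```python
from itertools import combinations

SENSORS = ["accelerometer", "gyroscope", "heart_rate", "strain_gauge"]
MIB_PER_SENSOR_HOUR = 1  # figure quoted on the slide

def sensor_slices(sensors):
    """All non-empty unordered combinations of the given sensors."""
    return [combo
            for size in range(1, len(sensors) + 1)
            for combo in combinations(sensors, size)]

slices = sensor_slices(SENSORS)
slice_count = len(slices)  # 2**4 - 1 combinations
# Assumption: a slice of k sensors holds k MiB per user-hour.
total_mib = sum(len(combo) * MIB_PER_SENSOR_HOUR for combo in slices)
```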
Production ML
Take the data from Cassandra (written there by the Spark jobs) and:
• Split into training and test datasets
• Fit models for various sensor types
• Save model parameters
• Evaluate the newly fitted models, and re-evaluate old data
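One common way to keep the split reproducible across daily runs is to assign each recording to a set by hashing a stable key, so a given session never moves between training and test data. The slides do not describe how muvr splits its data; the key format and the 80/20 ratio below are assumptions.

```python
import hashlib

def assign_split(record_key, test_fraction=0.2):
    """Deterministically map a record key to 'train' or 'test'."""
    digest = hashlib.sha256(record_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64   # uniform in [0, 1)
    return "test" if bucket < test_fraction else "train"

# Hypothetical keys: one per user and exercise session.
keys = [f"user-{u}/session-{s}" for u in range(20) for s in range(10)]
split = {key: assign_split(key) for key in keys}
test_share = sum(1 for v in split.values() if v == "test") / len(split)
```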
Production ML
• We are using a convolutional network
  • 2 seconds of sensor data input (e.g. a @ 50 Hz for accelerometer; a, g @ 50 Hz for accelerometer + gyroscope; u, l @ 10 Hz for smart clothes)
  • The exercise classes as the outputs
• The training program
  • CNN in neon
  • Loads the mini-batches from Cassandra
  • Fits the model; evaluates the fitted model
  • Saves the model parameters into Cassandra
• The re-evaluation program
  • Re-evaluates past n models against the latest training dataset, computing accuracy, precision, recall, F1
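The re-evaluation metrics can be computed directly from true and predicted labels. The sketch below macro-averages over exercise classes, which is an assumption (the slides do not say how per-class scores are combined), and the exercise labels are made up.

```python
def evaluate(true_labels, predicted_labels):
    """Accuracy plus macro-averaged precision, recall and F1."""
    classes = sorted(set(true_labels) | set(predicted_labels))
    pairs = list(zip(true_labels, predicted_labels))
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(classes)
    return {"accuracy": accuracy,
            "precision": sum(precisions) / n,
            "recall": sum(recalls) / n,
            "f1": sum(f1s) / n}

truth     = ["curl", "curl", "squat", "squat", "press", "press"]
predicted = ["curl", "squat", "squat", "squat", "press", "curl"]
scores = evaluate(truth, predicted)
```

Running the same function over the past n models against one fixed dataset gives comparable numbers, which is what lets the service pick the best model to serve.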
Having code is jolly good
Running it
• Simplicity
• Ease of orchestration
• Ease of development
• Support for polyglot frameworks and components
• Cost-effective resource utilisation
Docker
• Deploy reliably & consistently
• Execution is fast and lightweight
• Simplicity
• Developer-friendly workflow
• Fantastic community
Dockerize Cassandra Dev Environment
• Super low memory settings in cassandra-env.sh
  • MAX_HEAP_SIZE="128M"
  • HEAP_NEWSIZE="24M"
• Remove caches in dev mode in cassandra.yaml
  • key_cache_size_in_mb: 0
  • reduce_cache_sizes_at: 0
  • reduce_cache_capacity_to: 0
Dockerize Cassandra Production
• Use host networking (--net=host) for better network performance
• Put data, commitlog and saved_caches on volumes mounted from the underlying host
• Run Cassandra in the foreground (-f)
• Tune JVM heap for optimal size
• Tune JVM garbage collector for your workload
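Put together, the production checklist translates into a `docker run` invocation along these lines. The sketch only assembles the argument list and does not execute it; the image tag, container name, host directory, and heap sizes are placeholders, and passing `MAX_HEAP_SIZE` / `HEAP_NEWSIZE` as environment variables assumes an image whose `cassandra-env.sh` honours them.

```python
import shlex

def cassandra_run_command(image, host_dir, max_heap, heap_new):
    """Build (but do not execute) a docker run command for one Cassandra node."""
    args = ["docker", "run", "-d", "--name", "cassandra", "--net=host"]
    # Keep data, commit log and saved caches on the host's disks.
    for sub in ("data", "commitlog", "saved_caches"):
        args += ["-v", f"{host_dir}/{sub}:/var/lib/cassandra/{sub}"]
    # Heap settings read by cassandra-env.sh (assumption: set both together).
    args += ["-e", f"MAX_HEAP_SIZE={max_heap}", "-e", f"HEAP_NEWSIZE={heap_new}"]
    # -f keeps Cassandra in the foreground so Docker can supervise the process.
    args += [image, "cassandra", "-f"]
    return args

# Placeholder values for illustration only.
command = cassandra_run_command("cassandra:2.2", "/mnt/cassandra", "8G", "800M")
command_line = shlex.join(command)
```

Running in the foreground matters because Docker treats the container as stopped once its main process exits or daemonises.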
Mesos
• Distributed systems kernel
• Scales to 10,000s of nodes
• Depends on ZooKeeper for fault tolerance and high availability
• Creates a highly available, scalable single resource pool
• Automatic failover
• Ease of management
• Simple to operate
• Support for Docker containers
Mesos architecture
image source: https://assets.digitalocean.com/articles/mesosphere/mesos_architecture.png
Cassandra on Mesos
• Running Cassandra as Docker containers
• Custom Dockerfile and entry-point script to control Cassandra configuration
• Marathon to initialize and control
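Marathon is driven by JSON app definitions submitted to its REST API. A minimal sketch of what such a definition could look like for one Cassandra node follows; the app id, image name, resource figures, and host path are hypothetical, and the definition actually used in the talk is not shown in the slides.

```python
import json

def cassandra_app(app_id, image, cpus, mem_mb, host_data_dir):
    """Marathon app definition (as a dict) for a Dockerised Cassandra node."""
    return {
        "id": app_id,
        "cpus": cpus,
        "mem": mem_mb,
        "instances": 1,
        "container": {
            "type": "DOCKER",
            "docker": {"image": image, "network": "HOST"},
            "volumes": [{
                "containerPath": "/var/lib/cassandra",
                "hostPath": host_data_dir,
                "mode": "RW",
            }],
        },
        # One node per agent, so a host failure takes out a single replica.
        "constraints": [["hostname", "UNIQUE"]],
    }

# Placeholder values for illustration only.
app = cassandra_app("/muvr/cassandra", "example/cassandra:latest",
                    cpus=2.0, mem_mb=4096, host_data_dir="/mnt/cassandra")
payload = json.dumps(app, indent=2)
```

The custom entry-point script mentioned on the slide would live inside the image, so the definition itself stays small.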
Cost-effective resources in AWS
• Embrace AWS spot instances
  • About 50-60% cheaper than on-demand instances
  • Can be reclaimed without notice if outbid
• Run dev and staging on spot instances
• Run Spark jobs on spot instances
Thanks!
Twitter: @cakesolutions
Tel: 0845 617 1200
Email: [email protected]
Jobs: http://www.cakesolutions.net/careers