URBANESIA Development History. Business Connect, 29 October 2012. Prepared by: Batista Harahap


DESCRIPTION

Urbanesia's brief development history for Business Connect - 29 October 2012

Page 1: Urbanesia - Development History

URBANESIA

Development History

Business Connect, 29 October 2012

Prepared by: Batista Harahap

URBANESIA BETA V0

The first public iteration of Urbanesia

Page 8: Urbanesia - Development History

PROS

• Data structures in MySQL

• Effective memory caching implementations

• Effective SEO implementations

• Effective search server implementations

• Urbanesia is successfully consumed as a Directory

Page 9: Urbanesia - Development History

CONS

• No effective separation of Backend & Frontend web applications

• Source code = spaghetti code

• Storing low value, high volume data in MySQL

• Many queries using GROUP BY on highly populated tables

• After a warm boot, any page takes more than 20 seconds to generate

• Difficult to scale horizontally & vertically

• Very low concurrency

• The product's identity is weak

• Many features left unused by users

Page 10: Urbanesia - Development History

WHAT WE LEARNED

• Do NOT use MySQL as session storage

• Use a NoSQL database for low value, high volume data

• Separate backend & frontend web applications, create APIs for backends

• Use output caching where available

• When using PHP-APC, make sure apc.stat = 0

• Increase concurrency by reverse proxying requests to Apache
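
The "use output caching where available" lesson can be illustrated with a minimal sketch. It is written in Python rather than the PHP/CodeIgniter stack the deck describes, and it is not Urbanesia's implementation: the `OutputCache` class, the 60-second TTL, and `render_home` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class OutputCache:
    """Tiny in-memory output cache keyed by request path (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        # path -> (expiry timestamp, rendered page)
        self._store: Dict[str, Tuple[float, str]] = {}

    def get_or_render(self, path: str, render: Callable[[], str]) -> str:
        now = self.clock()
        hit = self._store.get(path)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip queries and templating entirely
        html = render()  # expensive path: database queries plus templating
        self._store[path] = (now + self.ttl, html)
        return html


render_calls: List[int] = []  # records how often the expensive render ran


def render_home() -> str:
    # Hypothetical stand-in for a slow page build.
    render_calls.append(1)
    return "<h1>Directory</h1>"


cache = OutputCache(ttl_seconds=60)
first = cache.get_or_render("/", render_home)
second = cache.get_or_render("/", render_home)  # served from the cache
```

In production the cache would more likely live in a shared store such as Memcache so that every web process benefits; the in-process dictionary here only keeps the sketch self-contained.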

Page 11: Urbanesia - Development History

CHALLENGES

• Handle Google Bots traffic of over 1 TB/month with only 2 servers

• Do output caching with CodeIgniter

• Achieving sub second page generation even in warm boots

• Redesign backend by creating an API for our native apps

Page 12: Urbanesia - Development History

URBANESIA V1

The second iteration, based on refined code and infrastructure design

Page 13: Urbanesia - Development History

PROS

• Achieved sub-second page generation in warm boots

• Aggressive & effective caching mechanism

• Optimized MY_Controller

• Session storage handled by Memcache

• MySQL read/write access lowered from ~400 qps to only 1 qps

• Lean memory usage in database server

• Created an OAuth-enabled API

• Concurrency increased by using nginx as reverse proxy

• The same server setup can theoretically handle 10x the current traffic without scaling horizontally

• Google bots are now limited only by bandwidth, not by the efficiency of our code

• Index properly with MySQL

• Don't use stock MySQL; we used a custom-built MySQL alternative: Percona Server

Page 14: Urbanesia - Development History

CONS

• Source code = spaghetti code

• Unpredictable code behavior inherited from V0; as tables fill with more rows, queries become bottlenecks

• Subqueries still exist

• Everything is still synchronous, no message queue yet

• The end product fails to give users the illusion of speed

• New hires face a steeper learning curve because of the inherited complexity plus V1's own complexity

• Still difficult to scale horizontally & vertically

Page 15: Urbanesia - Development History

WHAT WE LEARNED

• CodeIgniter enables fast product delivery, but the optimization & efficiency of the resulting code are questionable at best

• Need to enable an asynchronous architecture

• Do not do things in realtime; instead, offload them to message queues

• To impress users with the illusion of speed, JavaScript must be thoroughly implemented

• Emails should not be handled by ourselves; use third party email solutions like AWS SES

• Offload server side international bandwidth to clients; for Facebook, use the Facebook JS SDK instead of the PHP SDK

• The product gains more engagement with content that is more focused (thematic)

• Speed of content delivery is important to engagement metrics
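
The "offload to message queues" lesson can be sketched as follows. This is a hedged illustration using Python's standard-library `queue` and `threading` modules as a stand-in for a real message broker; `handle_signup`, the job format, and the `sent` list are hypothetical and do not come from the deck.

```python
import queue
import threading
from typing import List, Optional

jobs: "queue.Queue[Optional[dict]]" = queue.Queue()
sent: List[str] = []  # stands in for a slow side effect such as sending email


def worker() -> None:
    """Drain the queue in the background until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        sent.append(f"email to {job['to']}")  # the slow work happens here
        jobs.task_done()


def handle_signup(email: str) -> str:
    """Request handler: enqueue the slow work and answer immediately."""
    jobs.put({"to": email})
    return "202 Accepted"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

response = handle_signup("user@example.com")  # returns without waiting

jobs.join()     # only for this demo: wait until the queued job is processed
jobs.put(None)  # ask the worker to stop
thread.join()
```

The request handler never touches the slow path, so its response time no longer depends on the email provider. With a real broker the worker would run in a separate process or on a separate machine.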

Page 16: Urbanesia - Development History

CHALLENGES

• Build a third iteration with a strong identity based on users’ personas

• Focus more on verticals, create the illusion of a discovery/recommendation platform

• Progressive disclosure of content

• A JavaScript framework that is light, fast and has minimal dependencies

• Make everything asynchronous and message/event based

• Redefine Urbanesia's atomic data structure

• Do MySQL JOINs on the server side

• Get the data first FAST, compute later

Page 17: Urbanesia - Development History

PRODUCTS & TECHNOLOGIES

Does the product make the technology, or does the technology make the product?

Page 18: Urbanesia - Development History

THE PRODUCT MAKES THE TECHNOLOGY!

Page 19: Urbanesia - Development History

REAL WORLD EXAMPLES

• We need to know which part of Urbanesia will really work for users

• Store the preferences for each user's dynamic activity

• Calculate which other content a user might consume

• Present the content unobtrusively

• Do it fast and almost realtime

Page 20: Urbanesia - Development History

TECHNICAL SPEAK

We need to know which part of Urbanesia will really work for users

• Mine every user's data on each visit, including anonymous users

• Log everything FAST and asynchronously

• Low value & high volume data

• Avoid MySQL at all costs

• Model data based on the chosen NoSQL database model

Page 21: Urbanesia - Development History

TECHNICAL SPEAK

Introducing Redis

• Reads/writes data from memory

• Stores data on disk

• Key/value model similar to Memcache

• Ability to perform atomic operations without worrying about state

• Redis' primitive data types are very simple

• Ideal for low value/high volume data

• Less is more!

Page 22: Urbanesia - Development History

TECHNICAL SPEAK

Store the preferences for each user's dynamic activity

• Simple increments

• Perfect for sorted sets ("sorted hashmaps") in Redis

• They need to be sorted so that analytics functions are supported natively by Redis == high performance

• Fire & forget: consider using async frameworks like Node.js & trigger using JavaScript

• Why trigger with JavaScript? To make sure, at the very least, that it's actually users accessing the page
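
The "simple increments on a sorted structure" idea can be sketched as below. To stay runnable without a Redis server, this uses a small in-memory stand-in; `SortedCounters`, its `top` method, and the key `user:42:categories` are hypothetical. In Redis itself the corresponding commands are `ZINCRBY` (increment a member's score) and `ZREVRANGE` (read members from highest score down).

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class SortedCounters:
    """In-memory stand-in for Redis sorted sets used as preference counters."""

    def __init__(self) -> None:
        # key -> {member: score}
        self._sets: Dict[str, Dict[str, float]] = defaultdict(dict)

    def zincrby(self, key: str, amount: float, member: str) -> float:
        """Add `amount` to the member's score, like the ZINCRBY command."""
        scores = self._sets[key]
        scores[member] = scores.get(member, 0.0) + amount
        return scores[member]

    def top(self, key: str, n: int) -> List[Tuple[str, float]]:
        """Return the n highest-scored members, highest first.

        Ties are broken alphabetically here, which is a choice made for this
        sketch and not a claim about Redis' tie ordering.
        """
        scores = self._sets.get(key, {})
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


prefs = SortedCounters()
for category in ["coffee", "sushi", "coffee", "bars", "coffee", "sushi"]:
    # One fire-and-forget increment per page view.
    prefs.zincrby("user:42:categories", 1, category)

top_two = prefs.top("user:42:categories", 2)
```

Because each page view is a single increment and the ranking is read straight from the structure, no GROUP BY style aggregation is needed at read time.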

Page 23: Urbanesia - Development History

TECHNICAL SPEAK

Node.js & Socket.io

• Node.js is a Network ready daemon with Chrome’s V8 JavaScript engine inside

• Node.js is asynchronous by default (event based)

• Socket.io is the transport used for data

• Socket.io abstracts the transport and falls back gracefully between WebSocket, Flash and plain AJAX

• JavaScript clients should only subscribe to onFailed events to minimize overhead

Page 24: Urbanesia - Development History

TECHNICAL SPEAK

Calculate which other content a user might consume

• Use Machine Learning algorithms to learn users' behavior

• Naïve Bayes Classifier to the rescue

• Assumes each keyword is independent of the others

• Proven algorithm used by many big websites

Page 25: Urbanesia - Development History

TECHNICAL SPEAK

Naïve Bayes Classifier

• There are no right or wrong assumptions, only accuracy

• Accuracy is increased with more data and better classifications

• Relatively easy to code

• Lots of libraries out there in different languages
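
To show why the classifier is "relatively easy to code", here is a minimal multinomial Naïve Bayes sketch in Python. It is not the PHP implementation from Urbanesia's repository: the `NaiveBayes` class, whitespace tokenization, and add-one (Laplace) smoothing are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import Dict


class NaiveBayes:
    """Minimal multinomial Naive Bayes over whitespace-separated keywords."""

    def __init__(self) -> None:
        self.token_counts: Dict[str, Counter] = defaultdict(Counter)
        self.doc_counts: Counter = Counter()

    def train(self, label: str, text: str) -> None:
        self.doc_counts[label] += 1
        self.token_counts[label].update(text.lower().split())

    def classify(self, text: str) -> str:
        vocab = {tok for counts in self.token_counts.values() for tok in counts}
        total_docs = sum(self.doc_counts.values())
        if total_docs == 0 or not vocab:
            return ""  # nothing learned yet
        best_label, best_score = "", -math.inf
        for label in sorted(self.doc_counts):
            counts = self.token_counts[label]
            total_tokens = sum(counts.values())
            # Log prior: share of training examples carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            for token in text.lower().split():
                # Independence assumption: each keyword adds its own
                # log-likelihood, with add-one smoothing for unseen keywords.
                score += math.log(
                    (counts[token] + 1) / (total_tokens + len(vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


nb = NaiveBayes()
nb.train("food", "sushi ramen coffee")
nb.train("food", "pizza coffee")
nb.train("nightlife", "bar club cocktail")

coffee_label = nb.classify("coffee")
drinks_label = nb.classify("cocktail bar")
```

Working in log space avoids numeric underflow when many keywords are multiplied together. As the slide says, accuracy then depends mostly on how much training data there is and how well it is labelled.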

Page 26: Urbanesia - Development History

TECHNICAL SPEAK

Present the content unobtrusively

• Giving users the illusion that we understand them

• Do not make this feature dominant

• Show it where you want the content to look smart

Page 27: Urbanesia - Development History

TECHNICAL SPEAK

Do it fast and almost realtime

• Fast is an illusion

• Realtime is overrated

• If you don't have enough resources to do so, schedule it and pre-generate content

• Scale vertically

Page 28: Urbanesia - Development History

Talk is cheap, show me the CODES!

Page 29: Urbanesia - Development History

URBANESIA @ Github

https://github.com/Urbanesia

Page 30: Urbanesia - Development History

URBANESIA @ Github

https://github.com/Urbanesia/Simple-Naive-Bayes-Classifier-for-PHP

Page 31: Urbanesia - Development History

NAÏVE BAYES CLASSIFIER

First Iteration:

• Took ~1000 seconds to classify 1 keyword

• MySQL as storage

• No micro optimizations

Page 32: Urbanesia - Development History

NAÏVE BAYES CLASSIFIER

Second Iteration:

• Took ~400 seconds to classify 1 keyword

• MongoDB as storage

• Macro optimization trimmed 600 of 1000 seconds

• No micro optimizations

Page 33: Urbanesia - Development History

NAÏVE BAYES CLASSIFIER

Third Iteration:

• Took ~1 second to classify 1 keyword

• Redis as storage

• Insane macro optimization boost

• No micro optimizations

Page 34: Urbanesia - Development History

NAÏVE BAYES CLASSIFIER

Fourth Iteration:

• Took 0.01428 seconds to classify 1 keyword

• Redis as storage

• Reworked classification algorithm

• Get the data first and compute later

• More memory usage, faster execution time
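
"Get the data first and compute later" can be read as: fetch every value the computation needs in one batched call, then do the arithmetic in memory. The sketch below illustrates that reading only and is not the repository's code. `CountingStore` is a hypothetical stand-in for a remote key/value store that counts round trips; its `mget` mirrors the idea of Redis' `MGET` command.

```python
from typing import Dict, List, Optional


class CountingStore:
    """Stand-in for a remote key/value store that counts round trips."""

    def __init__(self, data: Dict[str, int]) -> None:
        self._data = data
        self.round_trips = 0

    def get(self, key: str) -> Optional[int]:
        self.round_trips += 1  # one network round trip per key
        return self._data.get(key)

    def mget(self, keys: List[str]) -> List[Optional[int]]:
        self.round_trips += 1  # one round trip for the whole batch
        return [self._data.get(k) for k in keys]


def score_interleaved(store: CountingStore, tokens: List[str]) -> int:
    """Fetch and compute token by token: N round trips for N tokens."""
    total = 0
    for token in tokens:
        total += store.get(token) or 0
    return total


def score_batched(store: CountingStore, tokens: List[str]) -> int:
    """Fetch everything first, then compute in memory: one round trip."""
    values = store.mget(tokens)
    return sum(v or 0 for v in values)


counts = {"coffee": 3, "sushi": 2}
tokens = ["coffee", "sushi", "tea"]

slow_store = CountingStore(counts)
slow_total = score_interleaved(slow_store, tokens)

fast_store = CountingStore(counts)
fast_total = score_batched(fast_store, tokens)
```

Both functions return the same total, but the batched version holds all values in memory at once, which matches the slide's trade-off of more memory usage for faster execution.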

Page 35: Urbanesia - Development History

NAÏVE BAYES CLASSIFIER

Fifth Iteration:

• Reworked the trainer methods

• Created deTrain method to update data

• Created helpers to do keyword blacklists

• Consistent performance from CLI or HTTP

Page 36: Urbanesia - Development History

NAÏVE BAYES CLASSIFIER

What we learned:

• Always be open to new things

• Geek Talk with peers from the industry

• Very talented people will always come up with smarter and better ways to do something

• Decide: get smart or get smarter?

• Algorithms are the engine, but they mean nothing without implementation

• Consider opening up source code for others to examine; the smarter the population, the better the products we create

• Focus on USERS instead of technology

Page 38: Urbanesia - Development History

Geekball

Every Tuesday, 17.00 – 19.00

Basket Hall C, Senayan

Page 39: Urbanesia - Development History

OUR PRODUCTS

Urbanesia's product lineup

Page 40: Urbanesia - Development History

URBANESIA.COM

Page 41: Urbanesia - Development History

URBANESIA.COM SEARCH

Page 42: Urbanesia - Development History

M.URBANESIA.COM

Page 43: Urbanesia - Development History

URBAN’S NOTES

Page 44: Urbanesia - Development History

URBANESIA WINDOWS 8

http://urho.me/vkND6

Page 45: Urbanesia - Development History

URBANESIA ANDROID

http://urho.me/BSsqR

Page 46: Urbanesia - Development History

JAJAN

Page 47: Urbanesia - Development History

JAJAN

Page 48: Urbanesia - Development History

JAJAN

Jajan is Open Source, get the source code:

• Blackberry - https://github.com/Urbanesia/Jajan-Blackberry

• Android - https://github.com/Urbanesia/Jajan

• HTML5 - https://github.com/Urbanesia/jajan-html5

Platforms:

• Blackberry - https://appworld.blackberry.com/webstore/content/54742/

• Android - https://play.google.com/store/apps/details?id=com.bango.jajan

• iOS - https://itunes.apple.com/us/app/jajan/id527278768?mt=8

• HTML5 - https://jajan5.urbanesia.com/

Page 49: Urbanesia - Development History

URBANESIA BALI

http://urho.me/HPLT9

Page 50: Urbanesia - Development History

WHAT'S NEXT

Our third iteration of Urbanesia.com

Page 51: Urbanesia - Development History

WHAT’S NEXT

• A rework from scratch both in Product Design and Technical Implementation

• Focusing more on users and our RICH content

• A social network useful for everyday city life

• Machine learning implementation for our recommendation engine

Page 52: Urbanesia - Development History

WHAT’S NEXT

Live Beta opening soon!

Email to [email protected] for access

Page 53: Urbanesia - Development History

KEY TAKEAWAYS

Summary

Page 54: Urbanesia - Development History

KEY TAKEAWAYS

• Empower people working with you

• Invest in company culture

• Focus on USERS, not technology

• Macro to Micro optimizations & scaling

• Be open to new ideas (things)

• Geek Talks over anything, like basketball or beer

• Good is not Great

• Whatever WORKS

Page 55: Urbanesia - Development History

Hi! From Urbanesia