Lenddo - Data Driven NYC (27)

Empowering the Emerging Market Middle Class

Big Data is not Big Database

Jeff Stewart - CEO
Naveen Agnihotri, PhD - CTO

“If you look 5 years out, every industry is going to be rethought in a social way.”

-Mark Zuckerberg, 2010

● Founded in January 2011
● Over 500K members around the world
● Integrated with Facebook, LinkedIn, Google, Yahoo, Twitter
● Service-oriented architecture (LAMP)
  ○ Front end (clients) in PHP
  ○ Services in PHP and Python
● Technical team based in NY and PH
● Data science team based in NY

LENDDO TECH FACTS

Finance in the Age of Social Networks

Lenddo maintains the world's largest opt-in TrustGraph for trustworthiness and risk management

Lenddo is…

Social

Social sourcing & screening
Peer enforcement

New data sets

Algorithms

Unprecedented processing power
Real-time / ongoing risk management
Targeting, underwriting & collections

Rich risk analytic data set
Unprecedented processing power

Global

Mobile

New datasets
24/7 engagement
New cost structures

Why Finance Works Better with Lenddo

Traditional

DEMAND GENERATION
• Negative selection bias
• Costly

UNDERWRITING
• Fact verification is time consuming
• Scores incomplete or unavailable

COLLECTIONS
• No peer enforcement
• Labor intensive
• Hard to maintain contact

With Lenddo

DEMAND GENERATION
• Digital, fast and potentially viral
• Less expensive
• Social nature causes positive selection bias

UNDERWRITING
• Reduced fraud and default
• Big data and powerful algorithms
• Larger addressable market
• Easily automatable

COLLECTIONS
• Potential for peer enforcement
• Lower cost
• More points of contact

Source: http://www.kpcb.com/insights/2013-internet-trends

ID Verification is easier online

● 100% infrastructure on AWS
● Store social data from all online social networks
● Opt-in social data storage grows about 10 times faster than member data
● Social data currently about 3.5 TB
● Largest table (dataset) is > 2 billion records

LENDDO SOCIAL DATA

GOOD AND BAD BORROWERS

n=1347

CLUSTERS

LOAN SCORE IMPROVEMENT

No NLP or network
With NLP and network

WORD CLUSTERS

Words cluster closely together, and each cluster can be associated with good or bad loans.

WORDS AND LOAN QUALITY

% Association with BAD loans

% Association with GOOD loans
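The two percentages above can be computed with a simple count. The sketch below is one possible way to do it, not Lenddo's actual method: it assumes each loan comes with some text and a good/bad label, and it counts a word at most once per loan. The function name `word_loan_association` and the sample tokens are hypothetical, invented only for illustration.

```python
from collections import Counter


def word_loan_association(loans):
    """Return {word: (pct_good, pct_bad)} across the loans that contain the word.

    `loans` is an iterable of (text, is_good) pairs. This input shape is an
    assumption made for this sketch, not something stated in the deck.
    """
    good, bad = Counter(), Counter()
    for text, is_good in loans:
        # Count each word once per loan so that long texts do not dominate.
        words = set(text.lower().split())
        (good if is_good else bad).update(words)

    result = {}
    for word in set(good) | set(bad):
        total = good[word] + bad[word]
        result[word] = (100.0 * good[word] / total, 100.0 * bad[word] / total)
    return result


# Hypothetical toy data with neutral placeholder tokens (not real findings).
sample = [
    ("alpha beta gamma", True),
    ("gamma alpha", True),
    ("delta epsilon", False),
    ("epsilon alpha", False),
]

assoc = word_loan_association(sample)
for word in sorted(assoc):
    pct_good, pct_bad = assoc[word]
    print(f"{word}: {pct_good:.0f}% good, {pct_bad:.0f}% bad")
```

A production version would need smoothing for rare words and a proper tokenizer; both are left out here to keep the counting logic visible.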

● Started with MongoDB for social data storage
● As use cases grew, we added indexes

SOCIAL DATA STORAGE HISTORY

SOCIAL DATA STORAGE

User data

Social data


● We moved to larger and larger servers
  ○ At last iteration, used cr1.8xlarge server
  ○ 32 CPUs, 244 GB RAM
  ○ Still couldn’t keep up with index size
● Data acquisition speeds increased
  ○ Provisioned IOPS to the rescue!
● Total cost of social data storage: > $10,000 per month
● And we want to grow faster!
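A back-of-envelope calculation shows why adding indexes can outgrow even a 244 GB server. The record count and RAM size below come from the deck; the per-entry size is an assumption chosen only to make the arithmetic concrete, and real index sizes depend on key types and storage engine. GB is treated loosely as 10^9 bytes.

```python
def index_footprint_gb(records, bytes_per_entry, num_indexes):
    """Rough total index size in GB (10**9 bytes), ignoring all overhead."""
    return records * bytes_per_entry * num_indexes / 1e9


def fits_in_ram(footprint_gb, ram_gb):
    """True when the estimated indexes would fit entirely in memory."""
    return footprint_gb <= ram_gb


RECORDS = 2_000_000_000  # from the deck: largest dataset is > 2 billion records
RAM_GB = 244             # from the deck: cr1.8xlarge
BYTES_PER_ENTRY = 40     # ASSUMPTION: hypothetical average index entry size

for num_indexes in (1, 2, 4, 8):
    size = index_footprint_gb(RECORDS, BYTES_PER_ENTRY, num_indexes)
    verdict = "fits" if fits_in_ram(size, RAM_GB) else "exceeds RAM"
    print(f"{num_indexes} index(es): ~{size:.0f} GB, {verdict}")
```

Under this assumed entry size, a handful of indexes on the largest collection alone would no longer fit in memory, which is consistent with the "still couldn't keep up" experience described above.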

● Simple queries (by index)
● Complex queries (by multiple indexes)
● Pull out all data for a member
● Aggregate all data for a member
● Calculate score for a member
● Aggregate all data for all members
● Calculate score for all members

REVELATION: 2013

It’s “BIG DATA”

not “BIG DATABASE”

REVELATION: 2013

● Moved all data to Amazon S3
● Data model remains largely unchanged
● Hadoop-compatible storage format
  ○ Avro format
  ○ Snappy compressed, chunked
● Created a small ‘cache’ type MongoDB
  ○ Stores recent data temporarily
● Using DynamoDB for longer-lived data that needs to be queried all the time

SOCIAL DATA REVAMP - 2013
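The cache-then-archive pattern on this slide can be sketched with the standard library alone. This is an illustration under stated assumptions, not the deck's implementation: a Python list stands in for the MongoDB cache, a local directory stands in for S3, JSON lines stand in for Avro, and `zlib` stands in for Snappy. The function names and the `part-NNNNN` file naming are hypothetical.

```python
import json
import tempfile
import zlib
from pathlib import Path


def archive_cache(cache, out_dir, chunk_size=2):
    """Write cached records to compressed chunk files, then empty the cache.

    Stand-ins (assumptions): JSON lines instead of Avro, zlib instead of
    Snappy, a local directory instead of an S3 bucket.
    """
    out_dir = Path(out_dir)
    paths = []
    for start in range(0, len(cache), chunk_size):
        chunk = cache[start:start + chunk_size]
        payload = "\n".join(json.dumps(r, sort_keys=True) for r in chunk)
        path = out_dir / f"part-{start // chunk_size:05d}.jsonl.z"
        path.write_bytes(zlib.compress(payload.encode("utf-8")))
        paths.append(path)
    cache.clear()  # the cache only holds recent data temporarily
    return paths


def read_chunk(path):
    """Decompress one chunk file back into a list of records."""
    text = zlib.decompress(Path(path).read_bytes()).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]


# Hypothetical records arriving during the day.
cache = [{"member_id": i % 2, "event": f"e{i}"} for i in range(5)]

with tempfile.TemporaryDirectory() as tmp:
    paths = archive_cache(cache, tmp)
    restored = [record for p in paths for record in read_chunk(p)]

print(len(paths), "chunks,", len(restored), "records archived,", len(cache), "left in cache")
```

Chunking keeps each file independently readable, which is what lets a batch framework split the work across many workers.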

● Use the cache for data when it first arrives
  ○ Data is available for quick computations and
● Move data from cache to S3 at the end of the day
● Use EMR over S3 data for all aggregations
● Created an EMR-based map-reduce framework for the data science team
● Standard EMR jobs for common queries:
  ○ All social data for a member
  ○ Score one member
  ○ Score all members

NEW SOCIAL DATA USAGE
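The standard jobs listed on this slide share one shape: group records by member, then reduce each group. The sketch below runs that shape in a single process to show the idea. It is not the EMR framework described in the deck; `run_job`, `by_member`, `collect`, and `toy_score` are hypothetical names, and the "score" is a placeholder (number of distinct networks), not a real credit score.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Minimal in-process map -> group-by-key -> reduce pipeline."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def by_member(record):
    """Mapper: emit each record keyed by its member id."""
    yield record["member_id"], record


def collect(member_id, values):
    """Reducer for 'all social data for a member': keep the records as-is."""
    return values


def toy_score(member_id, values):
    """Reducer with a PLACEHOLDER score: count of distinct networks."""
    return len({v["network"] for v in values})


# Hypothetical records; the field names are assumptions for this sketch.
records = [
    {"member_id": "m1", "network": "facebook"},
    {"member_id": "m1", "network": "twitter"},
    {"member_id": "m2", "network": "facebook"},
    {"member_id": "m1", "network": "facebook"},
]

all_scores = run_job(records, by_member, toy_score)           # score all members
m1_data = run_job(records, by_member, collect)["m1"]          # all data for one member
print(all_scores, len(m1_data))
```

Because the reducer only ever sees one member's records, the same job scales from scoring one member to scoring all of them without changing the code, only the input.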

● Peace of mind
  ○ No more database maintenance
  ○ No more periodic server upgrades
● Scalability
  ○ Storage and access remain identical for the next 10x growth
● $$$
  ○ New cost: < $3,000 per month: 70% less!
  ○ Includes EMR clusters running routine jobs

WHAT DID WE GAIN?
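The 70% figure follows directly from the two monthly costs quoted in the deck. A quick check, using the stated bounds as if they were exact (the old cost was above $10,000 and the new one below $3,000, so the real saving is at least this large):

```python
def percent_reduction(old_cost, new_cost):
    """Percentage saved when moving from old_cost to new_cost."""
    return 100.0 * (old_cost - new_cost) / old_cost


OLD_MONTHLY = 10_000  # deck: "> $10,000 per month", treated here as exact
NEW_MONTHLY = 3_000   # deck: "< $3000 per month", treated here as exact

saving_pct = percent_reduction(OLD_MONTHLY, NEW_MONTHLY)
saving_per_year = (OLD_MONTHLY - NEW_MONTHLY) * 12

print(f"Monthly saving: {saving_pct:.0f}%")
print(f"Yearly saving at these bounds: ${saving_per_year:,}")
```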

Thanks!
