A Behind the Scenes Look at the Force.com Platform
Walter Macklem, salesforce.com, CTO of Platform
Safe Harbor
Safe harbor statement under the Private Securities Litigation Reform Act of 1995:
This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties
materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results
expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be
deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other
financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any
statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new
functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our
operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of
intellectual property and other litigation, risks associated with possible mergers and acquisitions, the immature market in which we
operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new
releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization
and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of
salesforce.com, inc. is included in our quarterly report on Form 10-Q for the most recent fiscal quarter ended July 31, 2012. These
documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of
our Web site.
Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently
available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based
upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-
looking statements.
Key Takeaways
Internal & External Data Constructs
Multitenancy
Data Infrastructure
Key Takeaways
Data Infrastructure
• Building blocks of the Force.com service. Relational database,
Distributed File System, Search. High Availability, Backups, and
Disaster Recovery.
Multitenant Data Management
• Platformize the raw data infrastructure to make it work for the cloud.
Enable multiple customers to utilize a shared resource pool.
Internal Development with Data
• Dogfooding. How do internal Salesforce engineers build on top of this
multitenant data platform?
Pod == Hardware Topology
Pod
• Self-contained set of hardware*
• Each customer is in one pod
• Each pod services many customers
• Data persistence and System of Record
• Data processing
• Hardware mirroring
* Exceptions being: Edge router and a few other services
Pod
[Diagram: Salesforce users connect to Pod #1, which contains application servers, a relational database, a distributed file system, and search.]
Pod
Horizontal Scalability
[Diagram: pods POD #1 through POD #N side by side. Example instance names: NA1, NA7, EU1, CS8, AP0.]
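The pod model above can be read as a lookup from a customer organization to its single home pod. The following is an illustrative sketch only: the instance names NA1, EU1, and AP0 come from the slide, while the organization IDs, the `ORG_TO_POD` table, and both functions are hypothetical.

```python
# Hypothetical mapping: each organization lives in exactly one pod,
# while each pod hosts many organizations.
ORG_TO_POD = {
    "org-acme": "NA1",
    "org-globex": "NA1",
    "org-initech": "EU1",
    "org-umbrella": "AP0",
}


def route_request(org_id: str) -> str:
    """Return the pod that holds all data and processing for this org."""
    try:
        return ORG_TO_POD[org_id]
    except KeyError:
        raise LookupError(f"unknown organization: {org_id}") from None


def orgs_in_pod(pod: str) -> list[str]:
    """List the organizations served by one pod (many customers per pod)."""
    return sorted(org for org, home in ORG_TO_POD.items() if home == pod)
```

Under this reading, scaling horizontally means adding a pod and assigning new organizations to it, without touching the pods that already exist.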
Data Infrastructure Building Blocks
• Relational Database
• Distributed File System
• Search
Relational Database
• Sharding / Partitioning
• 32-way
• Shard based on customer
• High availability
• 8 machine database cluster
• Automatic failover
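Sharding by customer can be sketched as a stable function from organization ID to one of 32 partitions. The slide states only that the shard is based on the customer; the CRC32 hash and the `shard_for_org` name below are assumptions made for illustration.

```python
import zlib

NUM_SHARDS = 32  # "32-way" partitioning from the slide


def shard_for_org(org_id: str) -> int:
    """Map an organization ID to one of NUM_SHARDS partitions.

    Assumption: a stable CRC32 hash modulo the shard count. The slide does
    not say how the shard is actually derived from the customer.
    """
    return zlib.crc32(org_id.encode("utf-8")) % NUM_SHARDS
```

Because the function depends only on the organization ID, every row belonging to one customer lands in the same partition.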
Relational Database
• Backups
• 3 lag databases
• Near Realtime
• 2 Hour
• 48 Hour
• Tape / Disk
• Disaster Recovery
• Hardware block-level replication
• 6 logical copies of all bits
• >> 6 physical copies of all bits
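One way to read the three lag databases is as a ladder of progressively older copies: after a bad change, the copy to recover from is the freshest one that has not yet replayed that change. The sketch below assumes that usage; the slide lists the lag intervals but not the recovery procedure, and `pick_lag_database` is a hypothetical name.

```python
from typing import Optional

# Lag copies named on the slide; near-realtime lag is approximated as 0 hours.
LAG_DATABASES_HOURS = {"near-realtime": 0.0, "2-hour": 2.0, "48-hour": 48.0}


def pick_lag_database(hours_since_bad_change: float) -> Optional[str]:
    """Return the freshest lag copy that has not yet replayed the bad change.

    A copy lagging by L hours still holds the pre-change state only if the
    bad change happened less than L hours ago. Returns None when no lag copy
    qualifies, in which case recovery would fall back to tape/disk backups.
    """
    candidates = [
        (lag, name)
        for name, lag in LAG_DATABASES_HOURS.items()
        if lag > hours_since_bad_change
    ]
    return min(candidates)[1] if candidates else None
```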
Distributed File System
• Binary Object Store
• Homegrown Technology called FileForce
• Optimized for High Availability
Distributed File System
• File Handles are stored in a HA relational database
• Block stores:
• High density cheap machines
• Dumb
• RAID10
• Deployed in “buddy” pairs
• Buddy
• Leader election
• Backup
• DR
Distributed File System
File Handles
Small File Block
Store
Block Store 1
Coordination Service
File API
Block Store 2
Block Store N
Distributed File System
• Small files are a problem
• Examples
• #1. One 10MB file = 10MB
• #2. 10 million one-byte files = 10MB
• Stored initially in a HA database
• Bundled with other small files into a big file
• File handles reference an offset into the big file
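The bundling step above can be sketched as concatenating small files into one blob and recording, for each file, where it starts and how long it is. This is a minimal illustration, not FileForce itself; `FileHandle`, `bundle_small_files`, and `read_file` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    bundle_id: int  # which big file holds the bytes
    offset: int     # byte offset of the small file inside the bundle
    length: int     # number of bytes to read


def bundle_small_files(
    files: dict[str, bytes], bundle_id: int
) -> tuple[bytes, dict[str, FileHandle]]:
    """Concatenate small files into one big blob and build handles into it."""
    blob = bytearray()
    handles: dict[str, FileHandle] = {}
    for name, data in files.items():
        handles[name] = FileHandle(bundle_id, len(blob), len(data))
        blob.extend(data)
    return bytes(blob), handles


def read_file(blob: bytes, handle: FileHandle) -> bytes:
    """Resolve a handle back to the original bytes."""
    return blob[handle.offset : handle.offset + handle.length]
```

With this layout, ten million one-byte files cost the block store one large object instead of ten million tiny ones, and the per-file bookkeeping stays in the file handle table.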
Search
• Full-text search capability
• Wide variety of data to support:
• Structured data: id, email, phone number
• Unstructured data: long documents, short chatter posts
• Real-time indexing and querying
• 90% of events indexed in < 3 mins
• Lucene & Solr
Original architecture
[Diagram labels: DB; Java application servers (x8); indexing tier with primary and secondary indexer; storage with SAN, NFS, and backups (marked "?"); query tier with search servers and query hosts (x4); callout "Next-gen Querying".]
Current architecture
[Diagram labels: DB; Java application servers (x8); indexing tier with primary and secondary indexer; storage with SAN, NFS, and backups (marked "?"); query tier with search servers and query hosts; callouts "Next-gen Querying" and "Query Performance: Enabled In-Memory Querying".]
Solr Next Generation Architecture
[Diagram labels: DB; Java application servers (x8); search servers; query hosts; search hosts; FFX; production, DR, and backup, linked by replication.]
Concludes Data Infrastructure.
On to Multitenancy.
Multitenancy
• Condominium Complex = Data Infrastructure
• Tenant = Organization (aka Company)
• Each Organization has many sub-tenants (aka Users)
How do we take a plain old relational database and
make it multitenant?
Multitenant Database
Customize standard schema
Create columns
Add new schema
Create new tables & columns
Scale
Create indexes and materialized views
Statistics gathering
Ad hoc querying with optimized query plans
Multitenant Database
Customers have created 2 million database tables
Tens of millions of columns on those tables
Tens of billions of rows in those tables
Sharing Relational Data Structures is Hard
Your Definitions
Your Data
• Burberry’s Clothing Data
• Your Payroll Data
• Dell’s Product Data
Your Optimizations
• Indexes: pivot table for non-unique indexes
• UniqueFields: pivot table for unique indexes
• Relationships: pivot table for foreign keys
• MRUIndex: pivot table for most-recently-used
• …others…
Flex Schema on Steroids: Everyone’s Data
Flex Column: Multiple Data Types
ID Tenant Data 1 Data 2 Data N
1000001 You $190
1000002 You $250
1000003 You $680
1000004 Burberry True
1000005 Burberry False
1000006 Burberry True
1000007 Dell Monitor
1000008 Dell Laptop
1000009 Dell Server
Flex Schema: Everyone’s Optimizations
[Graphic: a shared table with columns ID, Data 1, and Data 2, filled with placeholder text.]
Multi-Tenant Table
ID       Tenant    Data 2
1000001  You       $190
1000002  You       $250
1000003  You       $680
1000004  Burberry  True
1000005  Burberry  False
1000006  Burberry  True
1000007  Dell      Monitor
1000008  Dell      Laptop
1000009  Dell      Server
Multi-Tenant Index (redundant storage)
Tenant    Text     Number  Boolean
You                $190
You                $250
You                $680
Burberry                   True
Burberry                   False
Burberry                   True
Dell      Monitor
Dell      Laptop
Dell      Server
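The relationship between the shared table and the typed index can be sketched as follows: every tenant's values sit in the same untyped flex column, per-tenant metadata says how to interpret them, and the index stores a second, typed copy. All names here (`FIELD_DEFS`, `read_field`, `index_entry`, and the field names Salary, InStock, and Product) are hypothetical, and values are stored without the currency symbol for simplicity.

```python
from decimal import Decimal

# Hypothetical per-tenant metadata: logical field -> (flex column, declared type).
FIELD_DEFS = {
    ("You", "Salary"): ("data_2", "number"),
    ("Burberry", "InStock"): ("data_2", "boolean"),
    ("Dell", "Product"): ("data_2", "text"),
}

# Shared data table: every tenant's values sit in the same untyped flex column.
ROWS = [
    {"id": 1000001, "tenant": "You", "data_2": "190"},
    {"id": 1000004, "tenant": "Burberry", "data_2": "True"},
    {"id": 1000007, "tenant": "Dell", "data_2": "Monitor"},
]

# Converters from the stored string to a typed value.
CASTS = {"number": Decimal, "boolean": lambda s: s == "True", "text": str}


def read_field(row: dict, field: str):
    """Interpret a flex column using the tenant's own field definition."""
    column, kind = FIELD_DEFS[(row["tenant"], field)]
    return CASTS[kind](row[column])


def index_entry(row: dict, field: str) -> dict:
    """Build a typed index row: the value is copied into the column for its type."""
    _, kind = FIELD_DEFS[(row["tenant"], field)]
    entry = {"tenant": row["tenant"], "id": row["id"],
             "text": None, "number": None, "boolean": None}
    entry[kind] = read_field(row, field)
    return entry
```

The index row duplicates the value, which is the "redundant storage" trade-off on the slide: extra space in exchange for columns the database can sort and compare by type.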
Multitenant Database
• To support Custom Objects, we use:
• Arbitrary Transaction Support
• Locking
• Row caching
• To support Custom Objects, we don’t use:
• Native data typing
• Native indexing
• Foreign Key Constraints
• Query Optimization
• Stats Collection
A Real World Question
Michael Dell wants to know if Servers are
selling well in the West
SELECT SUM(Amount)
FROM Opportunities
WHERE Product = 'Servers'
AND Region = 'West'
How will we answer this question quickly?
[Graphic: a shared table of placeholder rows labeled "Millions of Sales Line Items". Callouts: "Visibility", "Indexes", "The fastest path to the answer", "M. Dell", "Servers", "West".]
Multi-tenant Query Optimizer
• Run pre-queries
• Check user visibility
• Check filter selectivity
• Write query based on results of pre-queries
• Execute query
User visibility = # of rows that the user can access
Filter selectivity = How specific is this filter?
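The pre-query step can be sketched as comparing two counts and letting the smaller one drive the query. This is a minimal illustration under stated assumptions: the 10% `threshold`, the three strategy names, and the `choose_driving_path` function are hypothetical, and the slides do not describe the actual decision rule.

```python
def choose_driving_path(visible_rows: int, filter_matches: int, total_rows: int,
                        threshold: float = 0.1) -> str:
    """Pick what should drive the query, based on two pre-query counts.

    visible_rows   -- rows the user is allowed to see (user visibility)
    filter_matches -- rows estimated to match the filter (filter selectivity)
    threshold      -- hypothetical cutoff: a fraction at or below it counts as selective
    """
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    visibility = visible_rows / total_rows
    selectivity = filter_matches / total_rows
    if min(visibility, selectivity) > threshold:
        return "full scan"           # neither pre-query narrows the work much
    if visibility <= selectivity:
        return "drive from sharing"  # user sees few rows: start from visibility
    return "drive from index"        # filter is specific: start from the index
```

For the Dell question, a user who can see most opportunities but filters on a specific product and region would start from the index, while a user with narrow visibility would start from the rows they are allowed to see.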
Multi-tenant Query Optimizer
[Diagram: two shared tables of placeholder rows labeled "Shared Visibility" and "Shared Indexes", with "Stop" and "Go" signals and a "Multi-tenant Optimizer Statistics" label.]
The Machine is Alive!!!!
Automatic creation of indexes
Watches queries, logs certain behaviors, selects potential candidates,
tests and ranks best candidates, and then builds indexes for
candidates
Runtime predictor for long-running queries
Factors in selectivity, cardinality, # of joins, presence of indexes,
current server conditions
Machine Learning via Decision Forest
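The "select and rank candidates" part of automatic index creation can be sketched as aggregating a query log by filtered field and ordering fields by the time they cost. This covers only logging, selection, and ranking; testing candidates and building indexes are left out. The log format, the `min_hits` floor, and the `rank_index_candidates` name are assumptions, not details from the talk.

```python
from collections import Counter


def rank_index_candidates(query_log, min_hits=2, top_n=3):
    """Rank filtered fields by how much query time they account for.

    query_log -- iterable of (field, elapsed_ms) pairs, one per logged filter
    min_hits  -- hypothetical floor: ignore fields seen fewer times than this
    top_n     -- how many candidates to return
    """
    hits = Counter()
    total_ms = Counter()
    for field, elapsed_ms in query_log:
        hits[field] += 1
        total_ms[field] += elapsed_ms
    candidates = [f for f in hits if hits[f] >= min_hits]
    # Highest cumulative cost first; ties broken by field name for stability.
    candidates.sort(key=lambda f: (-total_ms[f], f))
    return candidates[:top_n]
```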
How can we make internal salesforce developers
more efficient?
We want to create the Quote standard object
• DDL scripts
• Hand-coded SQL
• ORM
• Sharing
• Workflow
• Apex
• Visualforce
• Validation Rules
• API
Why can Force.com developers create a Custom
Object in about 30 secs, but it takes me 30 days?
Our Solution for the Quote Standard Object
Base Platform Objects (BPOs)
Exactly the same as Custom Objects, but exposed to internal
salesforce.com developers.
Schema defined in XML
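Defining an object's schema in XML means the platform can load it as metadata instead of requiring DDL scripts and hand-coded SQL. The sketch below shows the idea with Python's standard `xml.etree.ElementTree`; the XML layout, the field names on Quote, and `load_object_schema` are hypothetical, since the slides do not show the real BPO format.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema document; the real BPO XML format is not shown in the slides.
QUOTE_XML = """
<object name="Quote">
  <field name="Name" type="text" required="true"/>
  <field name="Amount" type="currency"/>
  <field name="ExpirationDate" type="date"/>
</object>
"""


def load_object_schema(xml_text: str) -> dict:
    """Parse an object definition into a plain dict of field metadata."""
    root = ET.fromstring(xml_text.strip())
    fields = {
        f.get("name"): {
            "type": f.get("type"),
            "required": f.get("required", "false") == "true",
        }
        for f in root.findall("field")
    }
    return {"name": root.get("name"), "fields": fields}
```

Once the definition is plain metadata, features that already work generically over Custom Objects can in principle be applied to the new object without per-object code.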
Our Solution for the Quote Standard Object
Customers are getting the benefit of it! Zero downtime for major
releases and shorter maintenance windows.
Conclusion - Key Takeaways
Data Infrastructure
• Building blocks of the Force.com service. Relational database,
Distributed File System, Search. High Availability, Backups, and
Disaster Recovery.
Multitenant Data Management
• Platformize the raw data infrastructure to make it work for the cloud.
Enable multiple customers to utilize a shared resource pool.
Internal Development with Data
• Dogfooding. How do internal Salesforce engineers build on top of this
multitenant data platform?