Transcript
Page 1: BDAM: Big Data Asset Management

BDAM: Big Data Asset Management

Mark Harrison, Mike Sundy
{mh,msundy}@pixar.com

Page 2: BDAM: Big Data Asset Management

No Recording

Page 3: BDAM: Big Data Asset Management

What is Asset Management?

• Long-Lived Data
  – 50 year charter
• Large Data
  – Many TB
• Tight Data/Metadata Integration
  – Shot lists, assignments, rights management
• Scalable Data Services
  – Human, Render Farm, Build Farm Scale

Page 4: BDAM: Big Data Asset Management

Long Lived Data

• How Templar Project was Started
• Things Change
  – Vendors
  – Software
  – File formats
  – Hardware, OS, Storage
• Your Own Requirements Change
  – How flexible, “hackable” can you be?

Page 5: BDAM: Big Data Asset Management

Large Data

• Expanding Expectations (include)
• Harrison’s Law of 1 Terabyte (include)
• Harrison’s Time Scale of Data (include)
• Harrison’s law of mentioning Harrison
• Basic Drivers:
  – Storage: cheaper
  – Expectations: higher
  – Time: stays constant

Page 6: BDAM: Big Data Asset Management

Tight Data/Metadata Integration

• Over Time, you lose information about files
• Important Information:
  – Assignments, shot lists, rights clearances
• Don’t let data disappear into proprietary hole

Page 7: BDAM: Big Data Asset Management

Scalable Data Services

• Picture of single server
• Applications need to scale appropriately
• Avoid bottleneck of single server (if possible)
• Infrastructure should handle data bandwidth
• Note: Bottlenecks will always move, but always exist

Page 8: BDAM: Big Data Asset Management

Templar

• Pixar’s Proprietary Asset Management System
• Handles all studio data and metadata
  – feature films, shorts, special projects
  – artwork, scripts, movie frames, simulation data, project management data
• 50 year Timeframe
  – All metadata, data can be accessed and used through 2053

Page 9: BDAM: Big Data Asset Management

Templar Asset Management

• Long-Lived Data
  – 50 year charter
• Large Data
  – Many TB
• Tight Data/Metadata Integration
  – Shot lists, assignments, rights management
• Scalable Data Services
  – Human, Render Farm, Build Farm Scale

Page 10: BDAM: Big Data Asset Management

Templar: Long Lived Data

• Federated Architecture
  – Loosely Coupled
  – Software hooks into pipeline
• Pieces can be upgraded incrementally
  – Software, file formats
• Exit Strategy Orientation
  – Standards, access to internals

Page 11: BDAM: Big Data Asset Management

Templar Large Data

• Large, Fast Storage
  – File system caching, etc.
• Scalable Storage Software
  – proprietary system for non-revisioned files
  – Perforce
• Both horizontal and vertical scalability

Page 12: BDAM: Big Data Asset Management

Templar Data/Metadata Integration

• “Federated” System
  – No monolithic application that “does everything”
• Instead, “best in class” programs that interoperate
  – modeling, rendering, storage, etc.
• Lightly Coupled Applications to Metadata
• Metadata in Relational DB, e.g. Oracle
• Expandable Metadata Schema
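
One way to picture an expandable metadata schema in a relational database is a key/value table hanging off an asset table, so that a new kind of metadata is a new row rather than a schema change. This is only a sketch, not Templar's actual schema: the table names, column names and sample paths are hypothetical, and sqlite3 stands in for a production database such as Oracle.

```python
import sqlite3


def open_catalog():
    """Create an in-memory catalog (sqlite3 is a stand-in for a production RDBMS)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE asset (
            id   INTEGER PRIMARY KEY,
            path TEXT NOT NULL UNIQUE
        );
        -- One row per (asset, key): new metadata kinds need no ALTER TABLE.
        CREATE TABLE asset_metadata (
            asset_id INTEGER NOT NULL REFERENCES asset(id),
            key      TEXT NOT NULL,
            value    TEXT NOT NULL,
            PRIMARY KEY (asset_id, key)
        );
    """)
    return db


def add_asset(db, path, **metadata):
    """Register a file path and attach arbitrary key/value metadata to it."""
    cur = db.execute("INSERT INTO asset (path) VALUES (?)", (path,))
    asset_id = cur.lastrowid
    db.executemany(
        "INSERT INTO asset_metadata (asset_id, key, value) VALUES (?, ?, ?)",
        [(asset_id, key, str(value)) for key, value in metadata.items()],
    )
    return asset_id


def find_assets(db, key, value):
    """Return the paths of all assets whose metadata has key == value."""
    rows = db.execute(
        "SELECT a.path FROM asset a "
        "JOIN asset_metadata m ON m.asset_id = a.id "
        "WHERE m.key = ? AND m.value = ? ORDER BY a.path",
        (key, value),
    )
    return [row[0] for row in rows]
```

Because the applications only see paths and key/value pairs, a modeling or rendering tool can attach shot lists, assignments or rights information without knowing anything about the other tools' fields.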

Page 13: BDAM: Big Data Asset Management

Templar: Scalable Data Services

• Multiple Access Methods for Assets
  – File system, HTTP, direct Perforce
• Load Balancer, multiple servers (e.g. HTTP)
• File System optimizations (clusters, caching)
• Perforce: use LINKATRON
• Asynchronous Queuing
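
The slide does not say how the asynchronous queuing is built, so the following is only a generic illustration of the idea: callers enqueue work and return immediately, while a small pool of background workers drains the queue. The class name `JobQueue` and the handler are hypothetical, and jobs are assumed to be hashable (e.g. file names).

```python
import queue
import threading


class JobQueue:
    """Minimal asynchronous work queue backed by daemon worker threads."""

    def __init__(self, handler, workers=2):
        self._pending = queue.Queue()
        self._handler = handler
        self._lock = threading.Lock()
        self.results = {}
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, job):
        """Enqueue a job and return at once; the caller does not wait for it."""
        self._pending.put(job)

    def _work(self):
        while True:
            job = self._pending.get()
            try:
                outcome = self._handler(job)
            except Exception as exc:  # record the failure, keep the worker alive
                outcome = exc
            with self._lock:
                self.results[job] = outcome
            self._pending.task_done()

    def drain(self):
        """Block until every submitted job has been handled."""
        self._pending.join()
```

A caller would typically `submit()` many jobs during a request and call `drain()` only where it actually needs the results, which keeps slow back-end work off the interactive path.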

Page 14: BDAM: Big Data Asset Management

Perforce

• In use at Pixar since 2000 for code only
• File revision history goes back to 1983
• First Perforce-managed film: Toy Story 3

Page 15: BDAM: Big Data Asset Management

Perforce: Long Lived Data

• Matches “exit strategy” requirements
  – All data, metadata extractable, hackable
  – ,d magic – direct flat file storage access on back-end
• Types of Data – not just code!
  – art – reference and concept art – inspirational art for film
  – tech – show-specific data. e.g. models, textures, pipeline
  – studio – company-wide reference libraries. e.g. animation reference, configuration files, Flickr-like company photo site
  – tools – code for our central tools team, software projects
  – dept – department-specific files. e.g. marketing images
  – exotics – patent data, casting audio, data for live action shorts, story gags, theme park concepts, intern art show

Page 16: BDAM: Big Data Asset Management

Perforce: Large Data

• Vertical Scalability
  – 900 GB single file
  – 6.5 TB checkin
  – 47 TB largest single depot
  – 160 TB total Perforce storage across all depots
• Leverage Perforce features to reduce data:
  – Used +S auto-purge filetype to save 40% of storage on Toy Story 3 (1.2 TB)
  – Wrote a script to de-duplicate files, using p4 checksum data. Saved 1 million files and 1 TB
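
The de-duplication script itself is not shown in the slides. As a rough sketch of the idea, the function below groups files by checksum and size and reports how many bytes the extra copies occupy. It assumes the checksum data has already been exported as `(path, digest, size)` records (for instance from the digest and fileSize fields that `p4 fstat -Ol` reports); the function name and record layout are hypothetical.

```python
from collections import defaultdict


def find_duplicates(records):
    """Group (path, digest, size) records by content.

    Returns (groups, reclaimable): groups is a sorted list of sorted path
    lists that share the same digest and size; reclaimable is the number of
    bytes used by every copy beyond the first in each group.
    """
    by_content = defaultdict(list)
    for path, digest, size in records:
        # Size is part of the key as a cheap guard against digest collisions.
        by_content[(digest, size)].append(path)

    groups = []
    reclaimable = 0
    for (_digest, size), paths in by_content.items():
        if len(paths) > 1:
            groups.append(sorted(paths))
            reclaimable += size * (len(paths) - 1)
    groups.sort()
    return groups, reclaimable
```

What to do with each group (for example, replacing copies with lazy copies or links) is a separate decision that this sketch leaves out.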

Page 17: BDAM: Big Data Asset Management

Perforce: Data/Metadata Integration

• How does it integrate with Templar?
  – stores the files
  – version control
  – the “authority” for source writes
  – triggers for synchronous operations (e.g. LINKATRON)

Page 18: BDAM: Big Data Asset Management

Perforce: Scalable Data Services

• Horizontal Scalability
  – 190+ depots
  – 58 VMWare servers
  – 26 million submitted changelists
• Server architecture
  – Scale out
    • Performance on one depot won’t affect another
    • Easier administration/downtime scheduling
  – Virtualization
    • 95% of physical hardware performance with greater flexibility
    • 15 minutes to build new server
• Automated p4 server setup (squire)
  – 8 seconds to run script to create new p4 instance

Page 19: BDAM: Big Data Asset Management

Conclusion

• Templar and Perforce met our four requirements:
  – Long-Lived Data
    • 50 year charter
    • confidence in retrieving data due to access to internals
  – Large Data
    • Hundreds of TB
    • 500 TB depot on horizon
  – Tight Data/Metadata Integration
    • Rock solid file management
    • users trust it
  – Scalable Data Services
    • 190 depots
    • hundreds more to come – we keep finding new uses

