SharePoint Content Lifecycle Management Presented by: Mary Leigh Mackie

Best Practices for Content Lifecycle Management with MS SharePoint


Page 1: Best Practices for Content Lifecycle Management with MS SharePoint

SharePoint Content Lifecycle Management

Presented by: Mary Leigh Mackie

Page 2: Best Practices for Content Lifecycle Management with MS SharePoint

Content Lifecycle Management

[Diagram: content lifecycle stages (Organization, Workflow, Creation, Repository, Versioning, Publishing, Archives), grouped into Design, Produce, and Consume phases and mapped to SharePoint features: Sites, Workflow, Office Doc, Library, Version, Publishing Site, Record Center]

Page 3: Best Practices for Content Lifecycle Management with MS SharePoint

Agenda

Content Organization & Storage

Storage Optimization

Content Access

Archiving

Page 4: Best Practices for Content Lifecycle Management with MS SharePoint

Content Organization & Storage

Page 5: Best Practices for Content Lifecycle Management with MS SharePoint

Information Architecture

• Determine the business goals
• What will your site structure and taxonomy look like?
• Standardize branding with templates and master pages

Other considerations

• Accountability of published content using workflows or approvals
• Managing search scopes, security trimming, federation
• Isolate intranet content from extranets
• Testing for consistency and performance
• Training your site/content owners and end users

Source: Governance Resource Centre on Microsoft TechNet
http://technet.microsoft.com/en-us/library/cc262873.aspx#section2

Page 6: Best Practices for Content Lifecycle Management with MS SharePoint

Storage in a Content Repository

Increase in % of inactive data over time

[Chart: Active Data and Total Data plotted against time in years (0 to 4); vertical axis: data in SQL]

Page 7: Best Practices for Content Lifecycle Management with MS SharePoint

Planning for SharePoint Storage

• Recycle bin
• Versioning
• Search and index information
• Growth

Good rule of thumb for initial planning is: 3.5 x file system
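The rule of thumb above can be turned into a quick planning calculation. This is only a sketch: the function name and the 200 GB example are assumptions for illustration, and the 3.5 multiplier is the slide's starting estimate, not a measured value.

```python
def estimate_sharepoint_storage_gb(file_system_gb: float, multiplier: float = 3.5) -> float:
    """Estimate initial SharePoint storage from the size of the source file system.

    The default multiplier of 3.5 is the slide's rule of thumb; it is meant to
    leave room for the recycle bin, versioning, search/index data, and growth.
    """
    if file_system_gb < 0:
        raise ValueError("file_system_gb must be non-negative")
    return file_system_gb * multiplier


# Example: 200 GB of files on the file share (hypothetical figure).
print(estimate_sharepoint_storage_gb(200))  # 700.0
```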

Page 8: Best Practices for Content Lifecycle Management with MS SharePoint

Basic Storage Management Methods

• Set site quotas and alerts!
– 10 GB quota, 8 GB alert is my favorite

• Monitor growth trends
– Sites: slow over time or large jump in size?
– Overall content DB size

• Split Content DBs if they get “too big”
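A minimal sketch of the quota and growth checks described above. The 10 GB and 8 GB defaults come from the slide; the function names and the "1.5x between samples" definition of a large jump are assumptions chosen for illustration, not SharePoint settings.

```python
def quota_status(site_size_gb: float, quota_gb: float = 10.0, alert_gb: float = 8.0) -> str:
    """Classify a site against its quota and alert thresholds."""
    if site_size_gb >= quota_gb:
        return "over quota"
    if site_size_gb >= alert_gb:
        return "alert"
    return "ok"


def has_sudden_jump(sizes_gb: list[float], jump_ratio: float = 1.5) -> bool:
    """Return True if any sample is more than jump_ratio times the previous one.

    sizes_gb is a chronological list of site size samples (hypothetical data).
    """
    pairs = zip(sizes_gb, sizes_gb[1:])
    return any(later > earlier * jump_ratio for earlier, later in pairs if earlier > 0)


print(quota_status(8.5))                 # alert
print(has_sudden_jump([2.0, 2.2, 6.0]))  # True
```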

Page 9: Best Practices for Content Lifecycle Management with MS SharePoint

How SharePoint “chooses” a Content DB for a site

• Highest remaining allotment rule

Example 1:
– Content DB 1: 100 sites max
– Content DB 2: 100 sites max

Example 2:
– Content DB 1: 100 sites max
– Content DB 2: 200 sites max

SharePoint Site Content DB selection process: http://blog.jesskim.com/kb/293
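The "highest remaining allotment" rule can be sketched as picking the database with the most free site slots. This is an illustration under assumptions: the `ContentDb` class, the current site counts, and the tie-breaking (first database wins) are hypothetical and not taken from the slide or from SharePoint's object model.

```python
from dataclasses import dataclass


@dataclass
class ContentDb:
    name: str
    max_sites: int
    current_sites: int

    @property
    def remaining(self) -> int:
        """Site slots still available in this content database."""
        return self.max_sites - self.current_sites


def choose_content_db(dbs: list[ContentDb]) -> ContentDb:
    """Pick the database with the highest remaining allotment."""
    candidates = [db for db in dbs if db.remaining > 0]
    if not candidates:
        raise ValueError("no content database has room for another site")
    # max() returns the first maximum, so ties go to the earlier database
    # (an assumption made for this sketch).
    return max(candidates, key=lambda db: db.remaining)


# Second example from the slide, assuming both databases already hold 50 sites.
dbs = [ContentDb("Content DB 1", 100, 50), ContentDb("Content DB 2", 200, 50)]
print(choose_content_db(dbs).name)  # Content DB 2
```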

Page 10: Best Practices for Content Lifecycle Management with MS SharePoint

Optimal Content DB Sizing

• Backup & Recovery operations (<50-100 GB)

• Performance (<500 GB… nervous at 300 GB)
– # of objects
– Size of objects
– Hardware (servers and storage)

• Storage Cost (as small as possible!)

So what is too big?

Page 11: Best Practices for Content Lifecycle Management with MS SharePoint

BLOBs-- What’s the Issue?

• BLOBs = Binary Large Objects
• SharePoint Content = BLOB + Metadata
• Content DB = database of … BLOBs + Metadata
• SQL DB storage needs high IOPS (input/output operations per second) and low latency
• High IOPS + low latency storage = $$$$
• BLOBs do not participate in query operations, so no real reason to have BLOBs in a DB
• DB full of BLOBs = wasted $$$

Page 12: Best Practices for Content Lifecycle Management with MS SharePoint

Default SharePoint Storage

[Diagram: the SharePoint WFE and SharePoint Object Model send BLOBs & Metadata to SQL Server, which holds the Content DB and Config DB]

Page 13: Best Practices for Content Lifecycle Management with MS SharePoint

Database Size Implications

BLOBs increase DB size, creating issues with:

• Backup & Recovery operations

• Performance

• Storage Costs

Page 14: Best Practices for Content Lifecycle Management with MS SharePoint

Issues with BLOBs get much worse over time…

Increase in % of inactive data over time

Inactive sites, documents, lists, and libraries take up SQL storage, hindering performance

[Chart: Active Data and Total Data plotted against time in years (0 to 4); vertical axis: data in SQL]

Page 15: Best Practices for Content Lifecycle Management with MS SharePoint

Storage Optimization

Page 16: Best Practices for Content Lifecycle Management with MS SharePoint

SharePoint Storage Optimization Methods

• Move the BLOBs out of the database

• Archive content

Page 17: Best Practices for Content Lifecycle Management with MS SharePoint

Planning for Data Use & Growth

What does SharePoint 2010 offer OOTB?

• No native archiving tools
• EBS extended to include RBS
– Available only in SQL Server 2008 SP2
– Only accessible via API
• BCS (BDC in 2007) extended to allow for easier connectivity with legacy data systems

Page 18: Best Practices for Content Lifecycle Management with MS SharePoint

Storage Optimization

Extending BLOBs out of the database

Page 19: Best Practices for Content Lifecycle Management with MS SharePoint

Available APIs for Extending

SQL Remote BLOB Service (RBS)

SharePoint External BLOB Service (EBS)

Page 20: Best Practices for Content Lifecycle Management with MS SharePoint

EBS/RBS Overview

BLOB services to change BLOB storage locations

• EBS = External BLOB Service
– SharePoint 2007 SP1+ API

• RBS = Remote BLOB Service
– SQL Server 2008 R2 Feature Pack API, with SharePoint 2010 support

• Both are interface specifications
– Need a provider to actually work

• Cannot have both providers

Page 21: Best Practices for Content Lifecycle Management with MS SharePoint

EBS

• EBS provider can take ownership of the BLOB

• Provider gives SharePoint a token or a stub so SharePoint knows how to retrieve the object (context)

• Transparent to the end-user

[Diagram: the SharePoint WFE and SharePoint Object Model pass the BLOB to the EBS Provider, which writes it to a BLOB Store; Metadata goes to SQL Server (Content DB, Config DB)]

Page 22: Best Practices for Content Lifecycle Management with MS SharePoint

EBS

• Implemented by SharePoint
• Only 1 EBS Provider per SharePoint farm

• Orphaned BLOBs: no direct method to compare BLOB store and Content DB

• Compliance: what if I don’t want to allow SharePoint to delete the object?

Page 23: Best Practices for Content Lifecycle Management with MS SharePoint

RBS

• Not unique to SharePoint, available to any application

• A Provider Library can be associated with each database

[Diagram: the SharePoint WFE and SharePoint Object Model use Relational Access and the RBS Client Library to send BLOB & Metadata to SQL Server. Content DB X and Content DB Y hold the BLOB Metadata; Provider Library X and Provider Library Y each write to their own BLOB Store]

Page 24: Best Practices for Content Lifecycle Management with MS SharePoint

RBS

• Implemented by SQL
• Only 1 RBS Provider per Content DB

• Orphaned BLOBs much less of an issue
• Can lock down operations, from a unified storage perspective
• Can be managed via PowerShell

Page 25: Best Practices for Content Lifecycle Management with MS SharePoint

RBS: SQL Server 2008 Feature Pack API

Handled natively by database

Default Provider: FILESTREAM
1. Enable FILESTREAM provider on SQL
2. Provision data store and set storage location
3. Install RBS on all SP Web and App servers
4. Enable RBS

Page 26: Best Practices for Content Lifecycle Management with MS SharePoint

RBS versus SQL Filestream

• Filestream storage must be file system locally attached to the SQL server

• RBS is an API set that allows storage on external stores - physically separate machines that may be running custom storage code, for instance EMC Centera

Page 27: Best Practices for Content Lifecycle Management with MS SharePoint

EBS versus RBS, which is better?

EBS: Tighter integration with application, allows for more rules and settings

Page 28: Best Practices for Content Lifecycle Management with MS SharePoint

EBS versus RBS, which is better?

EBS: Tighter integration with application, allows for more rules and settings

RBS: Simpler, allows unified storage architecture across applications

http://www.codeplex.com/sqlrbs

Page 29: Best Practices for Content Lifecycle Management with MS SharePoint

It looks like RBS has won…

[Diagram: support for SharePoint External BLOB Service (EBS) and SQL Remote BLOB Service (RBS) across SharePoint releases (SharePoint 2007, SharePoint 2010, future SharePoint release "SPS 5?") and SQL Server releases (SQL Server 2005, SQL Server 2008, future SQL releases)]

Microsoft will provide a PowerShell solution to migrate from EBS to RBS

Page 30: Best Practices for Content Lifecycle Management with MS SharePoint

Benefits of Extending BLOBs

• Backup & Recovery operations
– Databases are 60-80% smaller
– Need a method to back up BLOBs synchronously

• Performance
– Databases are 60-80% smaller
– Performance improvement increases as the file/BLOB size increases. Microsoft research indicates:
   • <256 KB, SQL better
   • 256 KB to 1 MB, SQL and file system comparable
   • >1 MB, file system better

• Storage Cost
– “Not as expensive” storage
– Archiving still needed for true savings
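The size thresholds quoted above can be expressed as a simple lookup. This is a sketch only: the function name, the return labels, and the treatment of the exact boundary values (256 KB and 1 MB counted as "comparable") are assumptions, since the slide gives only the three ranges.

```python
KB = 1024
MB = 1024 * KB


def preferred_blob_store(size_bytes: int) -> str:
    """Map a BLOB size to the store the slide's thresholds favor."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < 256 * KB:
        return "sql"            # small BLOBs: SQL performs better
    if size_bytes <= 1 * MB:
        return "comparable"     # 256 KB to 1 MB: roughly equal
    return "file system"        # large BLOBs: file system performs better


print(preferred_blob_store(100 * KB))  # sql
print(preferred_blob_store(5 * MB))    # file system
```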

Page 31: Best Practices for Content Lifecycle Management with MS SharePoint

RBS is Completely Seamless for Users


• Users can access contents by:
– Clicking and downloading directly through SharePoint
– Opening the file using their Office client
– Referencing the URL
– Searching for contents natively in SharePoint

• Users can interact with contents by:
– Modifying metadata and content types
– Modifying permissions
– Applying alerts
– Using workflows or publishing templates
– Using site quotas and locks

Page 32: Best Practices for Content Lifecycle Management with MS SharePoint

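Cloud Storage Use Case

SharePoint “Overdraft Protection”

DB alert set at 80 GB, limit at 100 GB

[Diagram: database size scale from 0 to 100 GB. At 80 GB an alert is sent to the admin; if no action is taken and the 100 GB limit is reached, content goes to Cloud Storage]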

• Could be any storage

• Cloud is ideal “insurance”--cheap to setup, expensive to use
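One way to read the "overdraft protection" idea is as a routing decision made when new content arrives. The sketch below is an interpretation, not a description of any product: the function name, the return values, and the rule that content spills to the overflow store only when it would push the database past its limit are all assumptions. Only the 80 GB alert and 100 GB limit come from the slide.

```python
def route_new_blob(db_size_gb: float, blob_size_gb: float,
                   alert_gb: float = 80.0, limit_gb: float = 100.0) -> tuple[str, bool]:
    """Decide where a new BLOB goes and whether the admin should be alerted.

    Returns (target, alert_admin), where target is "primary" or "overflow".
    "overflow" stands for the fallback store (cloud or any other storage).
    """
    projected = db_size_gb + blob_size_gb
    target = "primary" if projected <= limit_gb else "overflow"
    alert_admin = projected >= alert_gb
    return target, alert_admin


print(route_new_blob(50.0, 1.0))   # ('primary', False)
print(route_new_blob(85.0, 1.0))   # ('primary', True)
print(route_new_blob(99.5, 1.0))   # ('overflow', True)
```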

Page 33: Best Practices for Content Lifecycle Management with MS SharePoint

Content Access

Page 34: Best Practices for Content Lifecycle Management with MS SharePoint

Where is it in its lifecycle? Do you want to expose it in SharePoint?

• BCS is intended for connecting LOB systems (databases, Windows Communication Foundation (WCF) or Web services, .NET connectivity assemblies, custom data sources) into SharePoint, without migrating the data

• No OOTB solutions for getting content out of users' desktops, file shares, or other ECM systems

Connecting Legacy Data

SharePoint 2010 Support

Page 35: Best Practices for Content Lifecycle Management with MS SharePoint

Options for Exposing Legacy Data

(File Shares, Notes, Exchange Public Folders, eRoom Documentum, LiveLink… etc?)

• Migrate
– Manually download/upload, losing author, time, security, history, other metadata
– 3rd Party Tool

• Connect
– BCS Mechanisms
– Most major ECM Vendors
– AvePoint’s DocAve Connector

EBS/RBS APIs preferred

Page 36: Best Practices for Content Lifecycle Management with MS SharePoint

Which option is better?

Connecting vs. Migrating

– Value add of legacy system
– Maintenance costs
   • Hardware
   • Licensing and support
   • Knowledge
– Migration costs
   • Migration process
   • Tools
   • Training

Page 37: Best Practices for Content Lifecycle Management with MS SharePoint

Migrating vs. Connecting

Migrating
• Data is available in SharePoint
• Data is moved into SharePoint
• SharePoint replaced legacy system
• Burden of storage is on SharePoint
• Changes saved in SharePoint
• Migrate and decommission

Connecting
• Data is available through SharePoint
• Data is left in source (legacy) system
• Give legacy system second life by increasing its value
• Burden of storage is on legacy system
• Changes propagate to source
• Connect and forget

Page 38: Best Practices for Content Lifecycle Management with MS SharePoint

Connect to SharePoint: BCS Mechanisms

• .NET Assembly Connector
– Provided with Microsoft Business Connectivity Services (BCS)
– Each .NET connectivity assembly is specific to an external content type
– Provides no Administration interface integration

• Custom Connector
– Connect to external systems not directly supported by Business Connectivity Services
– Agnostic of external content types that connect to a kind of external system (all databases or all Web services)
– Provides an Administration UI integration

http://msdn.microsoft.com/en-us/library/ee554911.aspx

Page 39: Best Practices for Content Lifecycle Management with MS SharePoint

Which BCS Mechanism Should I Use?

• The .NET Assembly Connector approach is recommended if the external system is static. Otherwise, for every change in the back end, you must make changes to the .NET connectivity assembly DLL. This, in turn, requires recompilation and redeployment of the assembly and the models.

• Custom connector approach is recommended if the back-end interfaces frequently change. By using this approach, only changes to the model are required.

http://msdn.microsoft.com/en-us/library/ee554911.aspx

Page 40: Best Practices for Content Lifecycle Management with MS SharePoint

Connecting: 3rd Parties


(File Shares, Notes, Exchange Public Folders, eRoom Documentum, LiveLink… etc?)

• Most major ECM Vendors
• Other 3rd Parties

EBS/RBS APIs preferred

Page 41: Best Practices for Content Lifecycle Management with MS SharePoint

Options for Exposing Legacy Data: Migration

Questions to ask yourself…

• How much content needs to be migrated?
• How long will this take? How much downtime can you tolerate?
• How much customization do you have?
• Is this a “big bang” migration or can you migrate in a scaled/phased approach?
• Can you accept loss of metadata and securities?
• Can you engage other members to assist in the process and arrange for proper training?
• What minimal requirements do you have for this migration?
• Can you properly map non-SharePoint related assets into SharePoint?
• etc…

Page 42: Best Practices for Content Lifecycle Management with MS SharePoint

SharePoint Migration Strategies

User-Powered Manual Migration
• SharePoint Administrator installs the new version on separate hardware or a separate farm and allows Power Users to manually recreate content

Best For
• Environments retaining ample amounts of outdated information
• Moving to new hardware or new architecture

Pros
• Puts Power Users in charge to recreate and manage sites
• Migrate relevant content to avoid import of old data
• Completely retains old environment
• Virtually no downtime – requires user switch to new environment

Cons
• Manual process, very resource intensive
• Requires willing participants and intensive training
• Requires additional steps to retain original URLs
• Requires new server farm and additional SQL Server storage space for new content

Page 43: Best Practices for Content Lifecycle Management with MS SharePoint

SharePoint Migration Strategies

Migration via 3rd Party Tool
• SharePoint Administrator installs the new version on separate hardware or a separate farm, and migrates content and users using 3rd Party Tool

Best For
• Any size environment, from single server environments to large, distributed farms

Pros
• Granular migration
• Retains all metadata
• Virtually no downtime
• Applicable to non-SharePoint repositories

Cons
• Costs associated with purchasing of additional software
• Requires new server farm

Page 44: Best Practices for Content Lifecycle Management with MS SharePoint

What About Access for Geo-Dispersed Users?

• Centralized environment, accessed globally
• Centralized environment plus local content (sites, etc)
• Fully distributed, replicated architecture accessed locally
– Centralized or cloud storage backup for high redundancy

Page 45: Best Practices for Content Lifecycle Management with MS SharePoint

Global Architectures

Single Centralized Environment

• Out of the box SharePoint

• Lowest complexity, least costly

• Varied User Experience

• Evaluate bandwidth and usage patterns

Page 46: Best Practices for Content Lifecycle Management with MS SharePoint

Global Architectures

Centralized plus local content

• Local services and sites, in addition to main farm

• Increased infrastructure complexity

• Governance can be an issue

• Relocating teams/users is a pain

Page 47: Best Practices for Content Lifecycle Management with MS SharePoint

Global Architectures

Fully distributed

• Fast local access to SharePoint content

• Replicate only what is relevant

• Ability to handle remote locations

Page 48: Best Practices for Content Lifecycle Management with MS SharePoint

Global Architectures

Distributed w/ Centralized Backup

• Back up locally or to alternative sites

• Consider cloud storage

• Can be used for high redundancy

[Diagram label: Cloud Storage]

Page 49: Best Practices for Content Lifecycle Management with MS SharePoint

Archiving

Adding Lifecycle Management to the picture

Page 50: Best Practices for Content Lifecycle Management with MS SharePoint

Lifecycle of a Typical Item

[Chart: Access / SLA Requirements (Low to High) plotted against Time; labeled points: Initial content creation, Moderate content retrieval]

Page 51: Best Practices for Content Lifecycle Management with MS SharePoint

Storage in a Content Repository

Increase in % of inactive data over time

[Chart: Active Data and Total Data plotted against time in years (0 to 4); vertical axis: data in SQL]

Page 52: Best Practices for Content Lifecycle Management with MS SharePoint

Data Lifecycle Management

• Records Center
– Another SharePoint site
– Higher % inactive content
– Consider separate Content DB, with an RBS provider implemented for this DB

• Archiving
– Backup and delete
– Workflow (Expirations)
– 3rd party tools

Page 53: Best Practices for Content Lifecycle Management with MS SharePoint

3rd Party Archiving Tools

• What rules are available?
– Last modified time
– Author
– Versions

• What scope can I apply rules to? (farm to item)
• Does it use RBS/EBS APIs?
• Does it integrate with other infrastructure management tools? (backup, replication, etc.)
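To make the rule types above concrete, here is a minimal sketch of an archiving rule that looks at last modified time, author, and version count. Everything in it is hypothetical: the `Item` class, the field names, and the default thresholds are invented for illustration and do not describe any particular archiving product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Item:
    path: str
    author: str
    last_modified: datetime
    version_count: int


def should_archive(item: Item, now: datetime, max_age_days: int = 365,
                   max_versions: int = 10,
                   archived_authors: frozenset[str] = frozenset()) -> bool:
    """Return True if the item matches any of the three example rules."""
    too_old = now - item.last_modified > timedelta(days=max_age_days)
    too_many_versions = item.version_count > max_versions
    by_author = item.author in archived_authors
    return too_old or too_many_versions or by_author


now = datetime(2011, 1, 1)
stale = Item("/sites/hr/policy.doc", "alice", datetime(2009, 6, 1), 3)
fresh = Item("/sites/hr/memo.doc", "bob", datetime(2010, 12, 1), 2)
print(should_archive(stale, now))  # True
print(should_archive(fresh, now))  # False
```

A real tool would also need a scope (farm, web application, site, list, or item) to decide which items the rule is evaluated against.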

Page 54: Best Practices for Content Lifecycle Management with MS SharePoint

Summary

1. Think carefully about organization and storage
Consider where content will be stored and how it will grow over time

2. Leverage BLOB Services APIs to optimize SharePoint storage
EBS/RBS APIs can be leveraged to store BLOBs outside of SQL with little impact on end-users, to save $$ and optimize storage

3. Content access is key
Develop strategies to handle access to legacy data and content access from remote locations

4. Archive content
Plan for long term growth and optimal system performance

Page 55: Best Practices for Content Lifecycle Management with MS SharePoint

AvePoint – Who we are

Global Leader in SharePoint Infrastructure Management
Backup & Recovery, Administration, Replication, Migration, Compliance, Storage Optimization

• Founded in 2001
• Headquartered in Jersey City, NJ, with global offices in:
– USA: Chicago, San Jose, Houston, Washington D.C., Redmond
– International: UK, Germany, Australia, Japan, Singapore, Canada
• R&D team of 350+: largest SharePoint team outside of Microsoft
• Winner of 2008 Best of Tech Ed Award for Best SharePoint Product
• Exclusive OEM relationships with IBM and NetApp
• A Depth Managed Microsoft Gold Certified ISV Partner
– MTC Alliance Member; Notes Transition Partner; Office TAP 14 Member; BPOS TAP Member

Page 56: Best Practices for Content Lifecycle Management with MS SharePoint

Applicable Features of AvePoint Tools

• DocAve Report Center
– Storage growth and trending
– Server performance and monitoring

• DocAve Administrator
– Manage site quotas and alerts
– Move sites between Content DBs

• DocAve Replicator
– Fully mapped, live or scheduled replication of all SharePoint contents

Page 57: Best Practices for Content Lifecycle Management with MS SharePoint

Applicable Features of AvePoint Tools

Connecting
• DocAve Connectors
– Leverage EBS/RBS APIs to expose File Share content as fully functional SharePoint objects
– Content works with Office applications, alerts, workflows, 3rd party applications, etc…

Migrating
• DocAve Migrators for SharePoint
– From previous versions of SharePoint, File Shares, Exchange Public Folders, Lotus Notes, Documentum eRoom, EMC Documentum, Livelink, Oracle/Stellant, Vignette
– Offers granular selection of content, full graphical user/domain/properties mapping
• DocAve Content Manager
– Consolidates existing SharePoint instances (other sites or farms that are the same SharePoint version) into a single SharePoint instance, while maintaining all metadata
– Offers granular selection of content, full graphical user/domain/properties mapping

Page 58: Best Practices for Content Lifecycle Management with MS SharePoint

Demo?

Page 59: Best Practices for Content Lifecycle Management with MS SharePoint
Page 60: Best Practices for Content Lifecycle Management with MS SharePoint

Thank You!

Q&A

Page 61: Best Practices for Content Lifecycle Management with MS SharePoint

Resources - www.AvePoint.com


Visit us: http://www.AvePoint.com

Email us: [email protected]@avepoint.com

Follow us: @AvePoint_Inc, @mlmackie

Download a FREE, fully-enabled 30 Day trial of DocAve at www.avepoint.com/download

Page 62: Best Practices for Content Lifecycle Management with MS SharePoint

Additional Resources

• Storage Optimization for SharePoint Whitepaper: http://www.avepoint.com/assets/pdf/sharepoint_whitepapers/Storage_Optimization_Technical_Advisor.pdf

• Configure Content Database for RBS: http://technet.microsoft.com/en-us/library/ee748641(office.14).aspx

• FILESTREAM RBS: http://blogs.msdn.com/opal/archive/2009/12/07/sharepoint-2010-beta-with-filestream-rbs-provider.aspx

• Whitepaper about FILESTREAM: http://msdn.microsoft.com/en-us/library/cc949109.aspx

Page 63: Best Practices for Content Lifecycle Management with MS SharePoint

Backup Slides

Page 64: Best Practices for Content Lifecycle Management with MS SharePoint

SharePoint Migration Strategies

Engage Power Users in Content Migration:

• Create a dedicated Power Users group: have a Power Users SharePoint site so that all the power users can share best practices and lessons learned with one another
• Provide extensive training on SharePoint to all Power Users
• Request Power Users to migrate content: they should be empowered and proactive about content migration and administration
• Request Power Users to train new SharePoint users to properly use their specific sites: provide training materials, videos, etc. to new users to lower TCO for IT training

TIP: A Power User should be very familiar with SharePoint and have either Full Control or Design permissions (or their equivalent) for the site they will manage. (Restrict Site Deletion Permission)

Page 65: Best Practices for Content Lifecycle Management with MS SharePoint

Connecting to SharePoint: .NET Assembly


• Write code as Microsoft .NET Framework classes and compile the classes into a primary DLL and multiple dependent DLLs.

• Publish the DLLs into the Business Data Connectivity (BDC) service database.

• Use Microsoft SharePoint Designer to discover the .NET Connectivity Assembly and create a model.

• Map each entity to a class in the DLL, and map each BDC operation in that entity to a method inside that "Class".

At run time, when a user executes a BDC operation, the corresponding method in the primary DLL is executed.

http://msdn.microsoft.com/en-us/library/ee554911.aspx

Page 66: Best Practices for Content Lifecycle Management with MS SharePoint

Connecting to SharePoint: Custom


• Implement ISystemUtility, IConnectionManager, and ITypeReflector interfaces.

• Implementing IAdministrableSystem provides Administration UI property management support and implementing ISystemPropertyValidator provides import time validation of LobSystem properties (not on the Microsoft Office client).

• Compile the code into a DLL and place it in the global assembly cache (GAC) on the server and clients.

• Author the model XML for the custom data source (SharePoint Designer 2010 does not support a model authoring experience for custom connectors).

At run time when a user executes a BDC operation, this invokes the Execute method in the ISystemUtility class. The responsibility of executing the back-end method is given to the Execute method.

http://msdn.microsoft.com/en-us/library/ee554911.aspx