© 2010 IBM Corporation
The Core Principles of Information Governance
Brian Kordelski – WW Sales Executive – IBM InfoSphere 12/07/2010
Governance is no longer an option
“By 2013, 25% of the companies in highly regulated industries will create and staff positions in accounting, human resources, compliance and audit and law that deal explicitly with the management of information via technology.”
– Gartner, Inc., “Organizing for Information Governance,” Debra Logan, November 2009
“[A]n [information management] strategy should incorporate life-cycle information governance practices [to ensure] consistent execution of ... business optimization, agility, and transformation [initiatives].”
– Forrester Research, Inc., “Refresh Your Information Management Strategy to Deliver Business Results,” Rob Karel & James G. Kobielus, August 2009
“If you are going to protect your company's most valuable asset—your data—you will begin to view data security as a component of a more comprehensive information governance strategy.”
– Hurwitz & Associates, “Why you need an information governance strategy for 2010,” Marcia Kaufman, December 2009
Information Governance Council Maturity Model
[Figure: maturity model diagram with relationship labels Enhances, Requires, Supports]
If we don’t proactively manage quality
Increased costs and missed revenue opportunities, impacting both financials and customer relationships, due to lack of data quality
– Incomplete and inaccurate master data created problems in receiving and/or shipping products, marketing literature and regulatory mailings, and 360-degree customer visibility.
A small error in the quality of rating data leads to negative impact for the company and unhappy customers
– For a large telecom provider with a massive volume of telephone calls and customers, even a small error in the rating data can mean significant revenue loss or customer turnover.
Data quality issues plague BI initiatives, creating a lack of trust in the data
– Several attempts at implementing a data warehouse and analytics application at a major retailer stalled due to data quality issues, which frustrated the project team and left business users distrusting the data.
Requirements to manage the quality of data
Understand & Define: Discover your data across systems; Define common vocabulary; Design your data structures
Develop & Test: Develop database structures; Create & refresh test data; Validate test results
Cleanse & Manage Continuously: Define Rules & Cleanse Data; Actively Monitor & Manage Data; Remediate Inconsistencies
Understand your information
Data can be distributed over multiple applications, databases and platforms
– Where are those databases located?
Complex, poorly documented data relationships
– Which data is sensitive, and which can be shared?
– Whole and partial sensitive data elements can be found in hundreds of tables and fields
Data relationships not understood because:
– Corporate memory is poor
– Documentation is poor or nonexistent
– Logical relationships (enforced through application logic or business rules) are hidden
[Figure: Distributed Data Landscape]
Gain consistent terminology
How does each user define “Active Subscriber”?
Users: IT Architect, Marketing Manager, Support Rep, CRM Project Manager, Business Intelligence Manager, ERP Project Manager, Business Analyst, Financial Officer, Compliance Officer, Sales Lead
– Mobile user who has used “any” service in the mobile network
– User who paid for the service at least 1 time in the past 90 days
– Mobile user who has a phone plan, but not SMS
– Only post-paid customers, not pre-paid customers
– User who makes at least 1 call over the period of 90 days
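To make the cost of inconsistent terminology concrete, the sketch below encodes three of the “Active Subscriber” definitions as rules and applies them to the same small data set. The record layout, the sample subscribers and the reference date are all invented for illustration; only the definitions come from the slide.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class Subscriber:
    # Hypothetical record layout, invented for this illustration.
    name: str
    plan: str                      # "post-paid" or "pre-paid"
    last_call: Optional[date]      # most recent call, if any
    last_payment: Optional[date]   # most recent payment, if any

TODAY = date(2010, 12, 7)          # assumed reference date
WINDOW = timedelta(days=90)

def paid_recently(s):
    return s.last_payment is not None and TODAY - s.last_payment <= WINDOW

def called_recently(s):
    return s.last_call is not None and TODAY - s.last_call <= WINDOW

def is_post_paid(s):
    return s.plan == "post-paid"

# Three competing definitions of the same business term.
DEFINITIONS = {
    "paid in past 90 days": paid_recently,
    "called in past 90 days": called_recently,
    "post-paid only": is_post_paid,
}

def count_active(subscribers):
    """Count 'active subscribers' under each definition."""
    return {label: sum(1 for s in subscribers if rule(s))
            for label, rule in DEFINITIONS.items()}

SUBSCRIBERS = [
    Subscriber("A", "post-paid", date(2010, 11, 20), date(2010, 11, 1)),
    Subscriber("B", "pre-paid", date(2010, 12, 1), date(2010, 6, 1)),
    Subscriber("C", "post-paid", None, date(2010, 5, 1)),
    Subscriber("D", "post-paid", date(2010, 3, 1), None),
]

print(count_active(SUBSCRIBERS))
```

With this sample data the three definitions report one, two and three active subscribers respectively, which is the kind of disagreement a shared business vocabulary is meant to remove.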
Cleanse and continuously manage your data
1. Create reusable quality rules & cleanse your data
– Leverage the knowledge gained during the understand & define steps
– Define what quality means to you
– Design your data quality rules and matching logic
2. Actively monitor & manage your data
– Standardize data formats
– Leverage precisely calibrated matching rules and remove duplicates
– Develop rules & quality metrics for monitoring
– Manage duplicate data, when required
3. Remediate inconsistencies in your data
– Monitor for problems or trends
– Investigate data lineage to find source of problem
– Repair data and source of problem
– Maintain monitoring to capture future problems
Make sure there is an owner of data quality AND management sponsorship
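The “standardize data formats” and “remove duplicates” steps can be sketched as follows. This is a minimal illustration that matches on exact equality of the standardized values; real matching rules are usually fuzzier and carefully calibrated. The field names and phone numbers are assumptions made up for the example.

```python
import re

def standardize_phone(raw):
    """Keep digits only so differently formatted numbers compare equal."""
    return re.sub(r"\D", "", raw)

def standardize_name(raw):
    """Collapse repeated whitespace and normalise capitalisation."""
    return " ".join(raw.split()).title()

def cleanse(records):
    """Standardize each record, then keep one record per (name, phone) pair."""
    unique = {}
    for rec in records:
        clean = {
            "name": standardize_name(rec["name"]),
            "phone": standardize_phone(rec["phone"]),
        }
        # The first occurrence of each match key wins; later ones are duplicates.
        unique.setdefault((clean["name"], clean["phone"]), clean)
    return list(unique.values())

# Hypothetical input: the first two rows describe the same person.
RAW = [
    {"name": "bob  smith", "phone": "(555) 010-2000"},
    {"name": "Bob Smith", "phone": "555-010-2000"},
    {"name": "Jim Johnson", "phone": "555 010 3000"},
]

print(cleanse(RAW))
```

Standardizing before matching is the point of the ordering in step 2: the two “Bob Smith” rows only become recognisable as duplicates once their formats agree.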
Monitor quality with integrated data rules
Create “Checks & Balances” to proactively identify quality concerns throughout the lifecycle
– Build & test rules for common or complex conditions
– Extend profiling through targeted analysis of specific data conditions or conformance to expected rules
– Establish benchmarks and baselines to help track data quality: is it deteriorating or remaining constant?
– Flag bad data for audit
Examples of Rules:
– The Gender field must be populated and must be in the list of accepted values
– The Social Security Number must be numeric and in the format 999-99-9999
– If Date of Birth Exists AND Date of Birth > 1900-01-01 and < TODAY Then Customer Type Equals ‘P’
– The Bank Account Branch ID is valid in the Branch Reference master list
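The four example rules can be expressed as a small validation routine. This is a sketch under stated assumptions, not any product's rule syntax: the field names, the accepted gender codes and the branch reference list are placeholders.

```python
import re
from datetime import date

ACCEPTED_GENDERS = {"M", "F", "U"}    # assumed list of accepted values
VALID_BRANCH_IDS = {"B001", "B002"}   # stand-in for the Branch Reference master list
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")   # the 999-99-9999 format

def check_record(rec, today):
    """Return the names of the rules that the record fails."""
    failures = []
    # Rule 1: Gender must be populated and in the accepted list.
    if rec.get("gender") not in ACCEPTED_GENDERS:
        failures.append("gender")
    # Rule 2: SSN must be digits in the 999-99-9999 layout.
    if not SSN_PATTERN.fullmatch(rec.get("ssn") or ""):
        failures.append("ssn")
    # Rule 3: a plausible date of birth implies customer type 'P'.
    dob = rec.get("date_of_birth")
    if dob is not None and date(1900, 1, 1) < dob < today:
        if rec.get("customer_type") != "P":
            failures.append("customer_type")
    # Rule 4: Branch ID must exist in the reference list.
    if rec.get("branch_id") not in VALID_BRANCH_IDS:
        failures.append("branch_id")
    return failures

GOOD = {"gender": "F", "ssn": "123-45-6789", "date_of_birth": date(1980, 5, 17),
        "customer_type": "P", "branch_id": "B001"}
BAD = {"gender": "", "ssn": "123456789", "date_of_birth": date(1980, 5, 17),
       "customer_type": "C", "branch_id": "B999"}

print(check_record(GOOD, date(2010, 12, 7)))
print(check_record(BAD, date(2010, 12, 7)))
```

Returning the list of failed rules, rather than a single pass/fail flag, supports the “flag bad data for audit” and baseline-tracking points above, since failures can be counted per rule over time.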
IBM provides the solutions required to create high quality information
Understand & Define: Discover your data across systems; Define common vocabulary; Design your data structures
Develop & Test: Develop database structures; Create & refresh test data; Validate test results
Cleanse & Manage Continuously: Define Rules & Cleanse Data; Actively Monitor & Manage Data; Remediate Inconsistencies
Organizational challenges from lack of data lifecycle management
New application functionality to meet business needs is not deployed on schedule
– No understanding of relationships between data objects repeatedly delays projects
– Greater data volumes take longer to clone, test, validate and deploy, which equates to longer test cycles
Increased operational and infrastructure costs impact IT budget
– Cloning databases requires more storage hardware
– Larger databases impact staff productivity and could mean additional license costs
Application defects are discovered after deployment
– Costs to resolve defects in production can be 10 to 100 times greater than those caught in the development environment
Unintentional disclosure of confidential data kept in test/development environments
“Forrester estimates that 85% of data stored in databases is inactive.”
Source: Noel Yuhanna, Forrester Research, Database Archiving Remains An Important Part Of Enterprise DBMS Strategy, 8/13/07
The data multiplier effect
Actual Data Burden = Size of production database + all replicated clones
Production: 1 TB
Development: 1 TB
Test: 1 TB
User Acceptance: 1 TB
Backup: 1 TB
Disaster Recovery: 1 TB
Total: 6 TB
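The arithmetic behind the multiplier can be written out directly. The function name is made up for this sketch, and it assumes every clone is a full-size copy of production, as on the slide.

```python
def actual_data_burden(production_tb, clone_count):
    """Production database plus every full-size clone of it, in TB."""
    return production_tb * (1 + clone_count)

# The five copies listed on the slide, each assumed to be a full clone.
CLONES = ["Development", "Test", "User Acceptance", "Backup", "Disaster Recovery"]

total_tb = actual_data_burden(1.0, len(CLONES))
print(total_tb)  # 6.0
```

Because the clone count multiplies the production size, every terabyte of growth in production adds six terabytes of total storage in this scenario.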
Requirements to manage data across its lifecycle
Discover & Define: Discover where data resides; Classify & define data and relationships; Define policies
Develop & Test: Develop database structures & code; Create & refresh test data; Validate test results
Optimize, Archive & Access: Enhance performance; Manage data growth; Report & retrieve archived data
Consolidate & Retire: Enable compliance with retention & e-discovery; Rationalize application portfolio; Move only the needed information
[Figure labels: Production or Production Clone; Development Environment; QA Environment; Test Environment; Training Environment]
Implement test data management with masking
Create targeted, right-sized test environments instead of cloning entire production environments
Mask data to protect privacy
Compare data pre/post test to identify quality issues
Archive to manage data growth
Archiving is an intelligent process for moving inactive or infrequently accessed data that still has value, while providing the ability to search and retrieve the data
[Figure labels: Production (Current, Historical); Archive; Archives (Historical Data, Reference Data, Reporting Data); Retrieve; Retrieved; Universal Access to Application Data (Application, XML, ODBC / JDBC, Mashup)]
Diagnose and solve performance problems
Identify problems before they impact business
Diagnose performance problems quickly & easily
Implement a permanent solution, not a temporary workaround
Plan for the future while avoiding past mistakes
When you retire or consolidate applications don’t move all of the data
Application portfolio has redundant systems acquired via mergers and acquisitions
Line of business divested; application is no longer needed
Legacy technologies not compatible with current IT direction
– Old database and/or application versions no longer supported by manufacturer
Required technical skills or application knowledge no longer available
Budget pressures – do more with less
In almost ALL cases, access to legacy data MUST be retained while the application and database are eliminated
IBM provides the solutions required to manage information throughout its lifecycle from requirement to retirement
Discover & Define: Discover where data resides; Classify & define data and relationships; Define policies
Develop & Test: Develop database structures & code; Create & refresh test data; Validate test results
Optimize, Archive & Access: Enhance performance; Manage data growth; Report & retrieve archived data
Consolidate & Retire: Enable compliance with retention & e-discovery; Rationalize application portfolio; Move only the needed information
The data privacy and protection risk continues
Confidential data that should be redacted can be hidden or embedded
– April 2010: A PDF of a subpoena in the case of “United States vs. Rob Blagojevich” was posted to a public website. However, the “redacted” text simply had a black box placed on top to hide the content; the actual text was still available.
Unprotected test data sent to and used by test/development teams as well as third-party consultants
– February 2009: An FAA server used for application development & testing was breached, exposing the personally identifiable information of 45,000+ employees.
Confidential data inadvertently exposed or otherwise available to unauthorized viewers
– February 2010: About 600,000 customers of a major NYC bank received their annual tax documents with their Social Security numbers (combined with other numbers & letters) printed on the outside of the envelope.
SQL injection is fast becoming one of the biggest & most high-profile web security threats
– July 2010: Hackers obtained access to the user database and administration panel of a popular website by exploiting several SQL injection vulnerabilities. The exposed data included user names, passwords, e-mail addresses and IPs.
“Larry Ponemon, founder of the group that bears his name, said that survey shows a shift in the way C-level executives think about security software. Investing in data protection, he said, is now seen as less expensive than recovering from a data breach.”
-- InformationWeek
Can today’s organizations successfully protect their information?
Where does your sensitive data reside across the enterprise?
How can your data be protected from both authorized and unauthorized access?
Can your confidential data in documents be safeguarded while still enabling the necessary business data to be shared?
How can access to your enterprise databases be protected, monitored and audited?
Can data in your non-production environments be protected, yet still be usable for training, application development and testing?
Requirements to manage the security and protection of data
Discover & Define: Discover where sensitive data resides; Classify & define data types; Define policies & metrics
Secure & Protect: Protect enterprise data from both authorized & unauthorized access; Safeguard sensitive data in documents; De-identify confidential data in non-production environments
Monitor & Audit: Audit and report for compliance; Monitor and enforce database access; Assess database vulnerabilities
Discover where sensitive data may be hidden
Relationships and sensitive data can’t always be found just by a simple data scan
– Sensitive data can be embedded within a field
– Sensitive data could be revealed through relationships across fields & systems
When dealing with hundreds of tables and millions of rows, this search is complex – you need the right solution
Sensitive Relationship Discovery
System A Table 15
Patient  Result  Test
3802468  N       53
4182715  N       53
4600986  N       32
5061085  N       53
5567193  N       72
6123913  Y       47
6736304  N       34
7409934  N       34
8150928  N       47
8966020  N       34
System Z Table 25
Code  Name
53    Streptococcus pyogenes
72    Pregnancy
32    Alzheimer Disease
47    H1N1
34    Dermatamycoses
System A Table 1
Number   Name
4600986  AlexFulltheim
8150928  BarneySolo
6736304  BillAlexander
3802468  BobSmith
5567193  EileenKratchman
7409934  FredSimpson
6123913  GregLougainis
5061085  JamieSlattery
4182715  JimJohnson
8966020  MartinAston
Patient ID # embedded within another field
Compound sensitive data: Test results could potentially be revealed.
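A rough sketch of both discovery problems, using two rows from each of the slide's tables. The helper names and the sample field value `"REF-3802468-A"` are hypothetical; the sketch only shows the idea of searching inside field values and joining across systems, not how a discovery product works internally.

```python
# A subset of the rows shown on the slide.
PATIENTS = {"3802468": "BobSmith", "6123913": "GregLougainis"}   # System A Table 1
RESULTS = [("3802468", "N", "53"), ("6123913", "Y", "47")]       # System A Table 15
TEST_CODES = {"53": "Streptococcus pyogenes", "47": "H1N1"}      # System Z Table 25

def find_embedded_ids(field_value, known_ids):
    """Return known patient IDs that occur anywhere inside a field value."""
    return sorted(pid for pid in known_ids if pid in field_value)

def compound_exposure(patients, results, test_codes):
    """Join the three tables to show what a reader of all of them could learn."""
    return [
        (patients[pid], test_codes[code], result)
        for pid, result, code in results
        if pid in patients and code in test_codes
    ]

# Hypothetical free-form field that happens to contain a patient ID.
print(find_embedded_ids("REF-3802468-A", PATIENTS))
print(compound_exposure(PATIENTS, RESULTS, TEST_CODES))
```

No single table here links a name to a diagnosis, but the join does, which is why a scan of individual columns is not enough to classify the data as sensitive.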
Protecting data is both an external and internal issue
Prevent “power users” from abusing their access to sensitive data (separation of duties)
– DBA and power users
Prevent authorized users from misusing sensitive data
– For example, third-party or off-shore developers
Prevent intrusion and theft of data
– For example, someone walking off with a back-up tape
– Hacker
– Database vulnerabilities (user id with no password or default password)
Protection of data requires a 360-degree strategy
Secure sensitive data values
– Across both structured and unstructured data
De-identify data
– Restricted data sharing with 3rd parties
– Generation of fictionalized test data for non-production
– Support off-shore deployment model
Stop unauthorized data access
– Render data useless via encryption
– Lock down SQL to prevent SQL injection
– Block suspicious network traffic
Security makes it possible for us to take risk, and innovate confidently.
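For the SQL injection item, one widely used application-side defense is parameter binding. The sketch below uses Python's standard `sqlite3` module with an invented `users` table. It illustrates the vulnerability and the fix in general terms and is not necessarily the database-level control the slide has in mind.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Vulnerable: user input is concatenated into the SQL text.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # Parameter binding keeps the input as data, never as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

# A classic injection payload that turns the WHERE clause into a tautology.
payload = "x' OR '1'='1"

print(len(find_user_unsafe(conn, payload)))  # every row leaks
print(len(find_user_safe(conn, payload)))    # no rows match
```

The unsafe version returns the whole table because the payload rewrites the query; the bound version treats the same string as an ordinary, non-matching name.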
Protect sensitive data values within documents
Redact (or remove) sensitive unstructured data found in documents and forms, protecting confidential information while supporting the need to share critical business information
– Support compliance with industry-specific and global data privacy requirements or mandates
Leverage an automated redaction process for speed, accuracy and efficiency
– Ensure hidden source data (or metadata) within documents is redacted as well
Prevent unintentional disclosure by using role-based masking to confidently share data
Ensure multiple file formats are supported, including PDF, text, TIFF and Microsoft Word documents
[Figure: Redact Full Name & Street Address]
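A minimal sketch of what “redact (or remove)” means for plain text: the sensitive characters are replaced in the content itself rather than covered by an overlay, so nothing remains to be recovered. It handles only Social Security numbers in the 999-99-9999 format; names and street addresses, as in the figure, need more than a regular expression. The function name and mask text are assumptions.

```python
import re

# Matches the 999-99-9999 layout only.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact(text, pattern=SSN, mask="[REDACTED]"):
    """Replace every match so the original characters no longer exist."""
    return pattern.sub(mask, text)

original = "Applicant SSN 123-45-6789 is on file."
cleaned = redact(original)
print(cleaned)
```

Because the replacement happens in the text itself, copying or searching the output cannot reveal the number, unlike a black box drawn on top of a PDF.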
De-identify data without impacting test & development
Mask or de-identify sensitive data elements that could be used to identify an individual
Ensure masked data is contextually appropriate to the data it replaced, so as not to impede testing
– Data is realistic but fictional
– Masked data is within permissible range of values
Support referential integrity of the masked data elements to prevent errors in testing
Personally identifiable information is masked with realistic but fictional data for testing & development purposes.
Example: JASON MICHAELS → ROBERT SMITH
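One way to keep referential integrity while masking is to make the mask deterministic, so the same real value always maps to the same fictional value in every table. The sketch below does this with a keyed hash from Python's standard `hmac` module. The substitute name lists, the secret and the table layouts are invented, and with lists this small different people can collide on the same fictional name.

```python
import hashlib
import hmac

# Tiny illustrative pools of realistic but fictional substitutes.
FIRST_NAMES = ["ROBERT", "MARIA", "DAVID", "LINDA"]
LAST_NAMES = ["SMITH", "JONES", "GARCIA", "LEE"]

def mask_name(real_name, secret=b"demo-secret"):
    """Map a real name to a fictional one; the same input gives the same output."""
    digest = hmac.new(secret, real_name.encode("utf-8"), hashlib.sha256).digest()
    first = FIRST_NAMES[digest[0] % len(FIRST_NAMES)]
    last = LAST_NAMES[digest[1] % len(LAST_NAMES)]
    return f"{first} {last}"

# Two hypothetical tables that refer to the same person by name.
customers = [{"name": "JASON MICHAELS"}]
orders = [{"customer_name": "JASON MICHAELS", "item": "phone"}]

masked_customers = [{"name": mask_name(c["name"])} for c in customers]
masked_orders = [{**o, "customer_name": mask_name(o["customer_name"])} for o in orders]

# The masked tables still join on the masked name.
print(masked_customers[0]["name"] == masked_orders[0]["customer_name"])
```

Keeping the secret out of the test environment matters: anyone who holds it can recompute the mapping for a guessed name.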
“Most organizations do not have mechanisms in place to prevent database administrators and other privileged database users from reading or tampering with sensitive information [in business applications]…Fewer than two out of five respondents said they could prevent such tampering by super users.”
-- Independent User Group
What happens with security complacency
Not being able to report compliance can lead to regulatory fines
– No audit report mechanism
– No fine-grained audit trail of database activities
Don’t know if there is a data breach until it’s too late
– Lack of awareness of suspicious access patterns
– On-going vs. single-event: problems identifying patterns of unauthorized use
Not able to monitor super user activity to ensure data security standards
– Unable to detect intentional and unintentional events
Streamline and simplify compliance processes
Alerts of suspicious activity
Audit reporting and sign-offs
– User activity
– Object creation
– Database configuration
– Entitlements
Separation of duties – creation of policies vs. reporting on application of policies
Trace users between applications, databases
Fine-grained policies
Sign-off and escalation procedures
Integration with enterprise security systems (SIEM)
IBM provides the solutions required to secure and protect data privacy
Discover & Define: Discover where sensitive data resides; Classify & define data types; Define policies & metrics
Secure & Protect: Protect enterprise data from both authorized & unauthorized access; Safeguard sensitive data in documents; De-identify confidential data in non-production environments
Monitor & Audit: Audit and report for compliance; Monitor and enforce database access; Assess database vulnerabilities
The IBM security strategy: Make security, by design, an enabler of innovative change
IBM as a trusted partner, delivering secure products and services
IBM as a trusted security vendor, providing key solutions across all security domains
15,000 researchers, developers and SMEs on security initiatives
– Data Security Steering Committee
– Security Architecture Board
– Secure Engineering Framework
3,000+ security & risk management patents
200+ security customer references and 50+ published case studies
40+ years of proven success securing the zSeries environment
Managing more than 7 billion security events per day for clients
Delivering trusted information for smarter business decisions across your entire information supply chain
[Figure: information supply chain. Sources: Transactional & Collaborative Applications; Business Analytics Applications; External Information Sources. Stages: Manage; Integrate; Analyze. Information types: Data; Content; Master Data; Big Data; Streaming Information; Streams; Cubes; Data Warehouses. Govern: Quality; Security & Privacy; Lifecycle]
Enabling success: IBM Information Governance Unified Process
– Define Business Problem
– Obtain Executive Sponsorship
– Conduct Maturity Assessment
– Build Roadmap
– Establish Organization Blueprint
– Build Data Dictionary
– Understand Data
– Create Metadata Repository
– Define Metrics
– Appoint Data Stewards
– Manage Data Quality
– Implement Master Data Management
– Create Specialized Centers of Excellence (COE)
– Manage Security & Privacy
– Manage Life-cycle
– Measure Results
Legend: each step is marked as either Enable through Process or Enable through Technology
What can you do next …
Start small with a project, don’t try to do it all at once
– Free workshops and assessments
– Best-of-breed solutions to help you succeed
Join a movement: www.infogovcommunity.com
– Benchmark your organization online
– Work with others on the Maturity Model
– Compare best practices in online peer reviews
– Be recognized for what you contribute on the leaderboard
Read the book:
– The IBM Data Governance Unified Process: Driving Business Value with IBM Software and Best Practices
Visit our web page:
– ibm.com/informationgovernance
Thank you