Hippocratic Databases

Paper by Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu

CS 681, presented by Xi Hua, March 1st, Spring 2005



- Introduction of Current Database Systems
- Concept of Hippocratic Database
- Principles of Hippocratic Database
- Strawman Design
- Problems
- Conclusion

Fundamental Properties and Capabilities of Current Databases

1. Managing persistent data.
2. Accessing a large amount of data efficiently.

In addition, the following capabilities are found universally:
1. Support for at least one data model.
2. Support for certain high-level languages.
3. Transaction management.
4. Access control.
5. Resiliency.

Statistical Databases
- Goal: providing statistical information without compromising sensitive information about individuals.
- Broadly classified techniques:
  - Query restriction
  - Data perturbation
- Common characteristic with Hippocratic databases: preventing disclosure of private information.

Secure Databases
- Goal: sensitive information must be transmitted over a secure channel and stored securely.
- Comparison with Hippocratic databases: Hippocratic databases benefit from secure databases and draw much of their inspiration from that work.

Principles of a Hippocratic Database

Privacy Regulations and Guidelines
- OECD Guidelines (Organisation for Economic Co-operation and Development)
  - The most well known.
  - Set out 8 principles for data protection: collection limitation, data quality, purpose specification, use limitation, security safeguards, openness, individual participation, and accountability.

Ten Principles
Rooted in the privacy regulations and guidelines.
1. Purpose Specification
2. Consent
3. Limited Collection
4. Limited Use
5. Limited Disclosure
6. Limited Retention
7. Accuracy
8. Safety
9. Openness
10. Compliance

Strawman Design: A Use Scenario

Strawman Design: Architecture
[Figure: architecture of the strawman design, not reproduced here]

Strawman Design

Privacy Metadata
Defines, for each purpose and for each piece of information collected for that purpose:
- external-recipients
- retention-period
- authorized-users
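As a rough illustration, the privacy metadata can be pictured as one row per (purpose, attribute) pair. The sketch below is only an assumption about how such rows might look: the class name `PolicyEntry`, the helper `lookup`, the user names, and the 30-day approximation of "1 month" are hypothetical. The recipients and retention period follow the purchase example given in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyEntry:
    """One privacy-metadata row: a (purpose, attribute) pair and its rules."""
    purpose: str
    attribute: str
    external_recipients: frozenset
    retention_period_days: int
    authorized_users: frozenset


# Illustrative rows only. The authorized user names are made up for this
# sketch, and "1 month" is approximated as 30 days.
PRIVACY_METADATA = [
    PolicyEntry("purchase", "name",
                frozenset({"delivery-company", "credit-card-company"}),
                30, frozenset({"shipping", "charge"})),
    PolicyEntry("purchase", "shipping-address",
                frozenset({"delivery-company"}),
                30, frozenset({"shipping"})),
    PolicyEntry("purchase", "credit-card-info",
                frozenset({"credit-card-company"}),
                30, frozenset({"charge"})),
]


def lookup(purpose, attribute):
    """Return the metadata row for a (purpose, attribute) pair, or None."""
    for entry in PRIVACY_METADATA:
        if entry.purpose == purpose and entry.attribute == attribute:
            return entry
    return None
```

A lookup that returns None means the attribute is not collected for that purpose at all, so any access for that purpose should be refused.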


Strawman Design: Data Collection
- Matching Privacy Policy with User Preference
- Data Insertion
- Data Preprocessing
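The matching step can be thought of as comparing what the policy states against the most the user is willing to allow. The following is a minimal sketch under that assumption. The dictionary layout and the function name `policy_matches_preferences` are hypothetical and are not taken from the paper.

```python
def policy_matches_preferences(policy, preferences):
    """Return (ok, violations) for one purpose.

    policy:      {attribute: {"recipients": set, "retention_days": int}}
    preferences: same shape, giving the most the user is willing to allow.
    """
    violations = []
    for attribute, rule in policy.items():
        allowed = preferences.get(attribute)
        if allowed is None:
            violations.append(f"{attribute}: user did not consent to collection")
            continue
        extra = rule["recipients"] - allowed["recipients"]
        if extra:
            violations.append(f"{attribute}: unexpected recipients {sorted(extra)}")
        if rule["retention_days"] > allowed["retention_days"]:
            violations.append(f"{attribute}: retention too long")
    return (not violations, violations)


# Hypothetical policy and preferences for a single attribute.
policy = {"name": {"recipients": {"delivery-company"}, "retention_days": 30}}
accepting = {"name": {"recipients": {"delivery-company"}, "retention_days": 365}}
stricter = {"name": {"recipients": set(), "retention_days": 365}}
```

With these inputs, the policy is accepted against `accepting` and rejected against `stricter`, because the policy discloses the name to a recipient the stricter user did not allow. Data would be inserted only when the match succeeds.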

Strawman Design: Queries
- Before Query Execution
- During Query Execution
- After Query Execution
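The slide lists the three stages without detail. The sketch below is one plausible reading, assuming an attribute-level authorization check before execution, record-level purpose filtering during execution, and an audit entry afterwards. All names (`AUTHORIZED`, `AUDIT_TRAIL`, `run_query`) and the sample users are hypothetical.

```python
# (purpose, attribute) -> users allowed to read it. Hypothetical sample data.
AUTHORIZED = {
    ("purchase", "name"): {"shipping", "charge"},
    ("purchase", "shipping-address"): {"shipping"},
}

AUDIT_TRAIL = []


def run_query(user, purpose, attributes, records):
    """Run a purpose-tagged query over a list of record dictionaries."""
    # Before execution: reject the query unless the user may read every
    # requested attribute for the stated purpose.
    for attribute in attributes:
        if user not in AUTHORIZED.get((purpose, attribute), set()):
            raise PermissionError(f"{user} may not read {attribute} for {purpose}")

    # During execution: only records tagged with the query's purpose are visible.
    visible = [r for r in records if purpose in r["purposes"]]
    result = [{a: r[a] for a in attributes} for r in visible]

    # After execution: record what was accessed so it can be audited later.
    AUDIT_TRAIL.append({"user": user, "purpose": purpose,
                        "attributes": list(attributes), "rows": len(result)})
    return result
```

Each record is assumed to carry a `purposes` set alongside its data, so a query issued for the purchase purpose never sees records whose owners did not agree to that purpose.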

Strawman Design: Retention

Deletes data items that have outlived their purpose. If a data item was collected for more than one purpose, it is kept for the longest of the retention periods. For example, Alice’s information in the order table will be deleted after 1 month, while Bob’s information will be kept for 10 years.
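The longest-retention rule can be sketched as follows. The purpose names, the day counts (30 days for 1 month, 3650 days for 10 years), and the assumption that Alice agreed to one purpose while Bob agreed to two are illustrative only. The slide gives only the resulting retention periods.

```python
from datetime import date, timedelta

# Hypothetical retention periods per purpose, in days.
RETENTION_DAYS = {"purchase": 30, "recommendations": 3650}


def expiry_date(collected_on, purposes):
    """Data is kept for the longest retention period among its purposes."""
    longest = max(RETENTION_DAYS[p] for p in purposes)
    return collected_on + timedelta(days=longest)


def purge(records, today):
    """Return only the records that have not yet outlived their purposes."""
    return [r for r in records
            if expiry_date(r["collected_on"], r["purposes"]) >= today]


alice = {"owner": "Alice", "collected_on": date(2005, 1, 1),
         "purposes": {"purchase"}}
bob = {"owner": "Bob", "collected_on": date(2005, 1, 1),
       "purposes": {"purchase", "recommendations"}}
```

Calling `purge([alice, bob], date(2005, 3, 1))` keeps only Bob's record, since Alice's single purpose expired 30 days after collection.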

Strawman Design
For the purchase purpose:
- All the attributes have a retention period of 1 month.
- The name and shipping-address are given to the delivery company.
- The name and credit-card-info are given to the credit-card company.

P3P
Platform for Privacy Preferences
- Developed by the World Wide Web Consortium.
- Motivation: enable users to gain more control over their personal information.
- Technology: a site's data-collection practices are encoded in an XML format known as a P3P policy, which can be programmatically compared against a user’s privacy preferences.
- Problem: no mechanism for making sure sites act according to their stated policies.

P3P and Hippocratic Databases
- Similarity: Hippocratic databases share P3P’s concepts of purpose and retention.
- How to implement P3P in Hippocratic databases? Take P3P policies, process them through the privacy metadata processor, and generate the corresponding data structures in the Hippocratic database system.

Problems: Language
- Are P3P formats sufficient for specifying policies and preferences in Hippocratic databases? P3P was designed for web shopping, but Hippocratic databases will be used in many fields, e.g. finance, insurance, etc. Hence we need to develop a policy specification language, using the work done for P3P as the starting point.
- Tradeoff between expressibility and usability.

Problems: Efficiency
- Cost of privacy checking: techniques are needed for reducing the cost of each check. E.g. encode the set of purposes associated with each record by setting one bit per purpose in a word. The record access control check then requires only a bit-wise AND of two words and a check of the result.
- Impact on disk space and the complexity of adding checks. E.g. the strawman design chose an alternate implementation in which only the records in the customer table are tagged with purposes. When scanning records in the order table, we do a join on customer-id to get the purposes for those records.
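The bit-wise check can be illustrated in a few lines. The purpose names and bit assignments below are hypothetical. The only point is that set membership reduces to a single AND followed by a comparison with zero.

```python
# One bit per purpose. The names and positions are made up for this sketch.
PURPOSE_BITS = {
    "purchase": 1 << 0,
    "registration": 1 << 1,
    "recommendations": 1 << 2,
}


def encode(purposes):
    """Pack a set of purpose names into a single integer bitmask."""
    mask = 0
    for purpose in purposes:
        mask |= PURPOSE_BITS[purpose]
    return mask


def record_visible(record_mask, query_mask):
    """A record is visible if it shares at least one purpose with the query."""
    return (record_mask & query_mask) != 0


record_mask = encode({"purchase", "recommendations"})  # bits 0 and 2 set
```

Here `record_visible(record_mask, encode({"purchase"}))` is true and `record_visible(record_mask, encode({"registration"}))` is false, each at the cost of one AND per record.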

Problems: Limited Collection
- Principle: a query accesses only the data values needed to fulfill its purpose, and the database stores the minimal information necessary to fulfill all the purposes.
- Problems:
  - Access analysis
  - Granularity analysis
  - Minimal query generation


Limited Disclosure
- Dynamically determining the set of recipients makes limited disclosure a challenge.
- Solution: borrow from public-private key technology.


Limited Retention
We can delete a record from a Hippocratic database when there is no longer any purpose associated with it. But how do we delete a record or field from the logs and past checkpoints without affecting recovery?


Safety
- The storage media on which the tables are stored might suffer from attacks.
- Solution: encryption of database files on disk, or selective encryption of fields, might help.


Openness
How does a user access the information he needs? How does the database know he is really that user and not someone else?


Compliance
- Universal logging
- Tracking privacy breaches


Conclusion
- Enunciated the key privacy principles that Hippocratic databases should support.
- Presented a strawman design for a Hippocratic database.
- Identified the technical challenges and problems.