Razi Mokatren Golan Salman Privacy in a Demographic Database

Page 1: Razi Mokatren Golan Salman Privacy in a Demographic Database

Razi Mokatren Golan Salman

Privacy in a Demographic Database

Page 2: Razi Mokatren Golan Salman Privacy in a Demographic Database

Israel Central Bureau of Statistics (CBS, הלמ"ס)

1. A governmental unit that operates under the auspices of the Prime Minister's Office.

2. Conducts a comprehensive annual survey that provides information on the welfare and living conditions of Israel's population.

3. The survey covers about 7,000 people.

4. The survey results are published online on the CBS website.

Page 3: Razi Mokatren Golan Salman Privacy in a Demographic Database

How the data is presented: the website lets users view the results divided and filtered by different categories.

Up to two filters and four variables can be selected.
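A table of this kind can be sketched in Python as follows. This is only an illustration of the idea, not the CBS implementation: the function `filtered_table`, the field names, and the toy records are all assumptions made for the example.

```python
from collections import Counter

def filtered_table(records, filters, variables):
    """Count the records that match every filter, grouped by the chosen variables."""
    counts = Counter()
    for rec in records:
        if all(rec[name] == value for name, value in filters.items()):
            counts[tuple(rec[var] for var in variables)] += 1
    return counts

# Hypothetical toy records; the real survey has many more questions.
records = [
    {"status": "widower", "military": "yes", "sex": "male", "children": 0},
    {"status": "widower", "military": "yes", "sex": "female", "children": 2},
    {"status": "widower", "military": "yes", "sex": "female", "children": 2},
    {"status": "married", "military": "no", "sex": "male", "children": 1},
]

# Two filters and two variables, as in the website's selection form.
table = filtered_table(
    records,
    filters={"status": "widower", "military": "yes"},
    variables=["sex", "children"],
)
for cell, count in table.items():
    print(cell, count)
```

Each key of `table` is one cell (a combination of variable values) and each value is the number of participants in that cell.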

Page 4: Razi Mokatren Golan Salman Privacy in a Demographic Database

The question we address in this project:

Given the data on the website, is it possible to restore the answers of one of the participants?

Let's look at the table produced by the following selection:

Filter A - Status / Widower

Filter B - Military service / Yes

Variable A - Sex

Variable B - Number of children

Page 5: Razi Mokatren Golan Salman Privacy in a Demographic Database

We found the following rule: every 1 in a table created by choosing two filters and two variables represents a single participant in the survey. Now we can restore that participant's record. How?

Recall that the table on the previous page was obtained by choosing two filters and two variables.

Now we create a table with the same two filters and the same two variables, plus one new variable.

A male widower who served in the army and has no children.
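The rule about cells that contain a 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the counts in `table` are made up, and `known_attributes` and the field names are hypothetical, not taken from the CBS website or from the authors' code.

```python
def known_attributes(table, filters, variables):
    """For every cell whose count is exactly 1, list what is known about that participant."""
    found = []
    for cell, count in table.items():
        if count == 1:
            attrs = dict(filters)               # the two filter values
            attrs.update(zip(variables, cell))  # the two variable values of the cell
            found.append(attrs)
    return found

# Hypothetical published counts for (sex, number of children).
table = {("male", 0): 1, ("male", 2): 3, ("female", 1): 4}
filters = {"status": "widower", "military_service": "yes"}

found = known_attributes(table, filters, ["sex", "children"])
print(found)
```

A cell with a count of 1 pins down four attributes of one person: the two filter values and the two variable values.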

Page 6: Razi Mokatren Golan Salman Privacy in a Demographic Database

Let's look at the following table: the same as the previous one, with a third variable (employment status) added.

From the previous table: a widower, served in the army, without children, and male.

From this table: employed. That makes 5 things.

Page 7: Razi Mokatren Golan Salman Privacy in a Demographic Database

At this point we know 5 things:

How can we get the others? In a loop, we switch the third variable.

How can we make sure that the records are correct?
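The loop over the third variable can be sketched in Python as follows. This is a minimal illustration, not the authors' code: `make_query` is a hypothetical stand-in for the website that returns only counts, and the field names and toy records are assumptions.

```python
from collections import Counter

def make_query(records):
    """Stand-in for the website: answers with a table of counts, never with raw rows."""
    def query(filters, variables):
        return Counter(
            tuple(r[v] for v in variables)
            for r in records
            if all(r[k] == x for k, x in filters.items())
        )
    return query

def restore_record(query, filters, base_vars, cell, other_vars):
    """Start from a cell whose count is 1 and recover one more attribute per query."""
    record = dict(filters)
    record.update(zip(base_vars, cell))
    for var in other_vars:
        table = query(filters, base_vars + [var])
        # Only one person sits in the base cell, so exactly one extended cell is non-empty.
        matches = [c for c, n in table.items()
                   if c[:len(base_vars)] == tuple(cell) and n > 0]
        if len(matches) == 1:
            record[var] = matches[0][-1]
    return record

# Hypothetical toy data with two extra attributes to recover.
records = [
    {"status": "widower", "military": "yes", "sex": "male", "children": 0,
     "employment": "employed", "commute": "60-89"},
    {"status": "widower", "military": "yes", "sex": "female", "children": 2,
     "employment": "unemployed", "commute": "under 30"},
    {"status": "widower", "military": "yes", "sex": "female", "children": 2,
     "employment": "employed", "commute": "30-59"},
]
query = make_query(records)
restored = restore_record(
    query,
    filters={"status": "widower", "military": "yes"},
    base_vars=["sex", "children"],
    cell=("male", 0),
    other_vars=["employment", "commute"],
)
print(restored)
```

Each pass through the loop uses the same two filters and two base variables and changes only the third variable, so every query adds one attribute to the record.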

Page 8: Razi Mokatren Golan Salman Privacy in a Demographic Database

- We created many random samples of size 7064.

- Our algorithm ran on those samples and extracted records from them.

- Because we generated the samples ourselves, we can check that the extracted records are correct.

After getting the records from the CBS, we cannot compare them to any database to make sure they are correct.

So how can we know that the records extracted from the CBS website reflect the real data?
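The validation on random samples described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' algorithm: the attribute domains in `DOMAINS` are invented, `query` is a hypothetical stand-in for the website, and the choice of filters and base variables is arbitrary.

```python
import itertools
import random
from collections import Counter

# Hypothetical attribute domains; the real survey's variables and categories differ.
DOMAINS = {
    "status": ["single", "married", "divorced", "widowed"],
    "military": ["yes", "no"],
    "age": list(range(18, 91)),
    "children": [0, 1, 2, 3, 4, 5],
    "employment": ["employed", "unemployed", "not in labor force"],
    "commute": ["under 30", "30-59", "60-89", "90 or more"],
}

def random_sample(n, seed):
    """Generate n random records, so the ground truth is known."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in DOMAINS.items()} for _ in range(n)]

def query(records, filters, variables):
    """Stand-in for the website: a table of counts, never raw rows."""
    return Counter(
        tuple(r[v] for v in variables)
        for r in records
        if all(r[k] == x for k, x in filters.items())
    )

def extract(records, filter_names, base_vars):
    """Restore every record that appears as a 1 in some two-filter, two-variable table."""
    others = [k for k in DOMAINS if k not in filter_names and k not in base_vars]
    restored = []
    for values in itertools.product(*(DOMAINS[f] for f in filter_names)):
        filters = dict(zip(filter_names, values))
        base = query(records, filters, base_vars)
        singles = {cell: dict(filters, **dict(zip(base_vars, cell)))
                   for cell, n in base.items() if n == 1}
        for var in others:  # switch the third variable
            for cell in query(records, filters, base_vars + [var]):
                if cell[:-1] in singles:
                    singles[cell[:-1]][var] = cell[-1]
        restored.extend(singles.values())
    return restored

sample = random_sample(7064, seed=1)
restored = extract(sample, ["status", "military"], ["age", "children"])

# Compare the restored records with the sample they were extracted from.
truth = {tuple(sorted(r.items())) for r in sample}
correct = sum(tuple(sorted(r.items())) in truth for r in restored)
print(f"restored {len(restored)} records, {correct} match the sample")
```

Because the sample is generated locally, every restored record can be checked against it, which is not possible with the records restored from the CBS website.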

Page 9: Razi Mokatren Golan Salman Privacy in a Demographic Database

The Results:

The random samples helped us confirm that the algorithm works. Let's look at the real result:

Of the 7064 records in the real survey, we managed to restore XXX records.

Each record includes personal information about a person who was promised confidentiality.

Page 10: Razi Mokatren Golan Salman Privacy in a Demographic Database

Once we realized that we could restore the records, we moved on to the next goal:

Attempting to find one of the people who took part in the 2011 survey.

Friends and family.

Internet forums.

The data we extracted.

Facebook.

Page 11: Razi Mokatren Golan Salman Privacy in a Demographic Database

At the same time, we went through the records looking for specific details that would help us find a survey participant.

Then we noticed the following information:

1. A Muslim woman, 30 years old and single.

2. Monthly salary of over 21,000 ₪.

3. Lives in a small locality in the north (probably a village).

4. Commute time to work: an hour and a half.

Razi suspected he knew the woman. To make sure, he contacted her and asked whether she had participated in the CBS survey. She said yes.

Page 12: Razi Mokatren Golan Salman Privacy in a Demographic Database

Nada Shaladi, 30 years old, lives in Kpar Eicsel.

On the morning of Yom Kippur eve, 7.10.2011, a CBS representative knocked on Nada's door. He emphasized that the survey is anonymous.

When we presented Nada with the information we had, she couldn't believe it.

Page 13: Razi Mokatren Golan Salman Privacy in a Demographic Database

Did you serve in the army?

No

What was your gross salary last month, before deductions, from all places of work? (in NIS)

More than 21,000 NIS

What is your religion?

Muslim

How long does it usually take you to get to your workplace?

60-89 minutes

Some of the details we managed to find out about Nada.

Page 14: Razi Mokatren Golan Salman Privacy in a Demographic Database

Returning to the question we tried to answer: is it possible to restore the answers of a particular person from the data that appears on the website?

This project proves that the answer to the question is YES.

What does this say about security on the CBS website?

Page 15: Razi Mokatren Golan Salman Privacy in a Demographic Database

On the CBS website we found a severe breach of the privacy of the survey participants.

1. Even though it is not immediate, a person who wants to find out personal details of a specific participant could easily do so.

2. Most of the participants are not aware that their personal data is exposed to everyone on the website.

3. It is not clear whether the CBS staff are aware of the failure.

Page 16: Razi Mokatren Golan Salman Privacy in a Demographic Database

The question we address in this project:

Given the data on the website, is it possible to restore the answers of one of the participants?