PARALLEL PROCESSING tools for today's optiker

Electronic Information

Metadata: What You Don't Know Might Hurt You

BY ROBERT M. JOPSON

Metadata are data about data. Here we discuss metadata that describe the contents of a word processor document. Because a document can contain information of which you are unaware, you may, if you are not careful, hand that information to others when you give them the document. The most notorious case involves the hacker who wrote the Melissa virus, which spread as a Microsoft Word document. The metadata it contained were used to track down the unsuspecting suspect.

We concentrate in this article on metadata produced by Microsoft Word 97 Service Release 2. Although Microsoft Word generates documents in a proprietary format, that format is commonly accepted for technical publications and is used extensively. Documents written with any popular word processor are likely to contain metadata similar to the types described here.

Document metadata often include the name of the author and perhaps an address. A virus writer would want to eliminate this information, for obvious reasons, but anonymity is also desirable in more innocent pursuits, such as refereeing a paper. OSA does not forward referee reports by e-mail, but some organizations do. If the organization sends your report to the author without modification, the author can find out who wrote it.

You can eliminate this "feature" from future documents by clicking "Tools," then "Options..." When you click on "User Information" you will see boxes containing the information the word processor has for your name, initials and address. The name information seems to be populated automatically, perhaps when Word is installed, so it is usually present. The other information is often absent. Modify or delete any information as desired and then click "OK." All newly created documents will carry the new information. However, existing documents retain the "User" information with which they were created, even when modified and saved, unless you save them under a new file name.
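A rough way to check whether a saved file still carries your name is to search its raw bytes for it. The sketch below is an illustration and not a parser of the Word file format: it assumes the name is stored either as 8-bit text or as UTF-16LE, and it will miss anything stored in another form. The function name `find_identifying_strings` and the sample names are hypothetical.

```python
def find_identifying_strings(data: bytes, needles: list[str]) -> dict[str, list[int]]:
    """Return the byte offsets at which each needle occurs in data.

    Each needle is searched for, case-sensitively, in two encodings:
    8-bit (cp1252) and UTF-16LE. Assumption: the file stores the text
    in one of these two forms; other encodings are not detected.
    """
    hits: dict[str, list[int]] = {}
    for needle in needles:
        if not needle:
            continue  # an empty needle would match at every offset
        offsets: set[int] = set()
        for encoding in ("cp1252", "utf-16-le"):
            try:
                pattern = needle.encode(encoding)
            except UnicodeEncodeError:
                continue  # needle cannot be represented in this encoding
            start = data.find(pattern)
            while start != -1:
                offsets.add(start)
                start = data.find(pattern, start + 1)
        if offsets:
            hits[needle] = sorted(offsets)
    return hits
```

To use it on a real file, read the file as bytes first, for example `find_identifying_strings(Path("report.doc").read_bytes(), ["Jane Doe"])` with `Path` imported from `pathlib`; `report.doc` and the name are placeholders. An empty result does not prove that the file is anonymous, only that the name was not found in these two encodings.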

A comment contains the name and initials of the user who created it. Comments in your document may or may not be visible, depending on your view options (click "Tools," then "Options...," then "View"). To see whether any comments are present, make sure the boxes next to "Hidden Text" and "Screen Tips" are checked and then click "OK." A yellow comment mark will appear at the location of each comment. You can write anonymous comments by modifying your User Information as described above. Remove an existing comment by right-clicking on the associated comment mark and clicking "Delete Comment."

A record is also created when you work on your document with revision tracking enabled (click "Tools," then "Track Changes," then "Highlight Changes...," put a check mark next to "Track changes while editing" by clicking the box, and click "OK"). Again, your "User Information" is used, so if you have "anonymized" it, your name will not appear. However, a record will still be made of the revision history if you allow changes to be tracked. To avoid this, remove the check mark next to "Track changes while editing" using the procedure described above. This will put a halt to future revision tracking.

To see whether a document contains a revision history, click "Tools," then "Track Changes," then "Highlight Changes...," and put a check mark next to "Highlight changes on screen." To cleanse any revision history you find, click "Tools," then "Track Changes," then "Accept or Reject Changes..." and follow the prompts for each separate revision.

Undesired information may be saved as a "property" of the document. To view and modify this information, click "File" and then "Properties." Information contained in the "Summary" tab can be changed. This can become difficult when you are connected to a network, because the system may insert your network user name into the document when you open or save it. If your system allows you to do so, modify the "Summary" information as desired while not logged into the network. Then log on and transfer the file without opening it.

A simple technique for stripping off some types of metadata is to save the file in Rich Text Format (RTF), open the RTF file, and then save it again as a Microsoft Word document.

Microsoft Word documents contain more types of metadata than can be described here. A more complete description of them, together with methods of elimination, can be found at http://support.microsoft.com/support/kb/articles/Q223/7/90.asp.

However, if you wish to know exactly what you are sending, convert your document to a text file and send the text file.
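If you must send the word processor file itself, you can get a rough idea of what it carries by listing the readable text buried in its bytes, much as the Unix `strings` utility does. The following is a minimal sketch under the assumption that hidden text is stored as printable ASCII, in either 8-bit or UTF-16LE form; non-ASCII or otherwise encoded text will not show up. The function name `printable_runs` and the `min_len` parameter are illustrative choices.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII characters in data.

    8-bit runs are listed first, followed by UTF-16LE runs (each
    character followed by a zero byte). This is a heuristic scan of
    raw bytes, not a reading of the document's actual structure.
    """
    ascii_run = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    utf16_run = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_run.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_run.finditer(data)]
    return found
```

Reading through the output for names, addresses, file paths or deleted sentences shows what a recipient with similar tools could see. A plain text file, by contrast, contains nothing beyond what you can read in it.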

Bob Jopson works on lightwave systems at Bell Laboratories. He can be reached at [email protected].

40 Optics & Photonics News / October 2000