Image Tagging

Attaching textual meta-information or semantic linkages to images

By Perry Rajnovic

What is a Digital Image?

Digital images are usually defined as an organized grid of pixels, often called a bitmap.

Each pixel is a numeric representation of the color intensities of that point.

Each pixel may be explicitly defined or be the result rendered by a vector or graphics package functionality.

These representations include no inherent textual elements or semantic description.

Due to the above, images are not easily machine-readable.
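
As a minimal sketch of the points above, the following Python snippet models a bitmap as a grid of RGB triples. The 2×2 image and the `average_color` helper are illustrative assumptions, not part of any particular image format. The sketch only shows that the data holds numbers, with nothing that says what is depicted.

```python
# A hypothetical 2x2 bitmap: each pixel is an (R, G, B) triple of 0-255 intensities.
bitmap = [
    [(0, 0, 128), (0, 0, 128)],
    [(255, 255, 255), (0, 0, 128)],
]

def average_color(pixels):
    """Return the mean (R, G, B) of all pixels, using integer division."""
    flat = [p for row in pixels for p in row]
    n = len(flat)
    return tuple(sum(p[i] for p in flat) // n for i in range(3))

# A program can compute statistics like this, but nothing in the data
# names the subject of the picture.
print(average_color(bitmap))
```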

Why machine-readability?

Most searches are done via textual queries, thus there must be a mechanism to link applicable keywords or phrases to images.

For blind users, conveying information about an image in another medium, such as text, improves accessibility.
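
One possible mechanism for linking keywords to images is an inverted index from tags to image names. The sketch below is an assumption for illustration: the file names, tags, and the `build_index` and `search` helpers are hypothetical.

```python
from collections import defaultdict

def build_index(tagged_images):
    """Map each lowercase tag to the set of image names that carry it."""
    index = defaultdict(set)
    for name, tags in tagged_images.items():
        for tag in tags:
            index[tag.lower()].add(name)
    return index

def search(index, query):
    """Return the images that match every word of the query, sorted by name."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

# Hypothetical tagged images.
images = {
    "beach.jpg": ["sunset", "ocean", "calm"],
    "city.jpg": ["sunset", "skyline"],
}
index = build_index(images)
print(search(index, "sunset ocean"))
```

A textual query can then be answered without inspecting any pixel data.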

Image Contents

The contents of an image can be described in full prose (as in the adage “A picture is worth 1000 words”), or simply with a few keywords describing spatial, temporal, or emotional aspects.

In many cases, accurately identifying the content of images requires human intervention.

Identifying Image Contents

Many good pattern recognition algorithms exist; however, few are able to interpret the patterns they extract.

Artificial Intelligence algorithms can learn recognized patterns, but such a system’s flexibility is limited by its predefined knowledge base.

Identifying Text in Images

CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart) are images that contain a distorted rendering of some text.

Their goal is to pose a task that is easy for humans but extremely hard for computer programs.

For this task, OCR is generally not sufficient to extract the text.

This is a good example of why machine-readable information should be available.

Example Tag Contents

As an example of what might be provided to tag an image, to the right is a list of words and phrases to describe this slide’s header.

Navy Blue, Squares, Fade-out, Horizontal Bar, Minimalist, Decorative

User Applications

Many applications take advantage of image tagging. A few examples are Apple iPhoto, Google Picasa, and Adobe Photoshop Elements.

Generally these programs use tagging for organization and user-defined searching.

Web Applications

Several Web-based applications now include tagging for images, as well as other non-image-based features. Examples include Google ImageLabeler, Flickr.com, Facebook.com, 23hq.com, and Fotki.com.

Google Images

Luis von Ahn developed the “ESP Game” which could be used to tag images.

He presented a Google tech talk about the game as a form of human computation.

Google later licensed the technology to create a similar web application called the Google Images ImageLabeler.

Google ImageLabeler

The ImageLabeler game allows random users to generate tags that accurately describe images.

The tags should be accurate due to game constraints, and gain specificity after several rounds.

The computed tags can improve searches.
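
The game constraints can be sketched as follows, assuming the agreement rule of the ESP Game: a label is accepted only when two players who cannot communicate both enter it, and labels accepted in earlier rounds become off-limits ("taboo") words. The function name and sample guesses are hypothetical, and this is an illustration rather than Google's implementation.

```python
def agreed_label(guesses_a, guesses_b, taboo=()):
    """Return the first guess of player A that player B also made, skipping taboo words."""
    blocked = {w.lower() for w in taboo}
    seen_b = {g.lower() for g in guesses_b}
    for guess in guesses_a:
        g = guess.lower()
        if g in seen_b and g not in blocked:
            return g
    return None  # the players never agreed, so no tag is recorded

# Round 1: both players type "dog", so it becomes a tag.
round1 = agreed_label(["dog", "grass"], ["animal", "dog"])
# Round 2: "dog" is now taboo, which pushes the players toward a more specific tag.
round2 = agreed_label(["dog", "puppy", "beagle"], ["beagle", "dog"], taboo=[round1])
print(round1, round2)
```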

Flickr

Flickr is a “Web 2.0” photo hosting and sharing site.

Users are encouraged to upload photos, then to name, describe, tag, annotate, geotag, comment on, and group their photos in collaborative ways.

Flickr - Tags

Tags are words or phrases meant to act as keywords.

They are searchable within the site, and can show popular topics.

They improve search relevance.

Flickr - Geotagging

Geotagging is a term for adding geospatial metadata to images, such as the latitude, longitude, and other directional indications of where a photo was taken.
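
A geotag can be pictured as a small record attached to the image. The sketch below is only an assumption of what such a record might hold. The `GeoTag` class, its field names, and the sample coordinates are hypothetical and do not reflect Flickr's actual data model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class GeoTag:
    """Hypothetical geotag record; field names are illustrative."""
    latitude: float                   # degrees, -90 (south) to 90 (north)
    longitude: float                  # degrees, -180 (west) to 180 (east)
    heading: Optional[float] = None   # compass direction the camera faced, 0 to 360

    def __post_init__(self):
        # Reject coordinates that cannot exist on the globe.
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude must be between -90 and 90")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude must be between -180 and 180")
        if self.heading is not None and not 0 <= self.heading <= 360:
            raise ValueError("heading must be between 0 and 360")

photo_tag = GeoTag(latitude=41.15, longitude=-81.36, heading=90.0)
print(photo_tag)
```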

What are annotations?

Wikipedia defines them as: Extra information associated with a particular point in a document or other piece of information.

The US DoD defines them as: A marking placed on imagery or drawings for explanatory purposes or to indicate items or areas of special importance.

Annotating Images

The use of annotations with images can provide several useful functions. Below are some examples:

Point out a specific piece of content.

Explain some icon or graphic.

Summarize the meaning of some region.

Provide additional information via text.

Flickr - Notes

Flickr provides a feature called Notes. It uses a Flash-based implementation of an annotation system.

You can dynamically size a rectangular region over a portion of the image, then attach a snippet of text to describe it.
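
A rectangular note of this kind can be modeled as a region plus a text snippet. The following is a minimal sketch under that assumption. The `Note` class, pixel coordinates, and `notes_at` helper are hypothetical and are not Flickr's implementation.

```python
from dataclasses import dataclass

@dataclass
class Note:
    """A rectangle over part of an image with a text snippet attached."""
    x: int        # left edge, in pixels
    y: int        # top edge, in pixels
    width: int
    height: int
    text: str

    def contains(self, px, py):
        """True if the point (px, py) lies inside the rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

def notes_at(notes, px, py):
    """Return the text of every note whose region covers the point."""
    return [n.text for n in notes if n.contains(px, py)]

notes = [
    Note(10, 10, 100, 50, "Header bar"),
    Note(60, 20, 200, 200, "Group photo"),
]
print(notes_at(notes, 70, 30))
```

A viewer could call `notes_at` with the mouse position to decide which snippets to show.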

FotoNotes

FotoNotes is a data format for annotating images.

It allows you to embed the metadata directly into the image files for portability.

Flickr’s Notes feature is inspired by this standard and its accompanying visualization implementation.

FotoNotes - More

It was developed by Greg Elin.

The homepage provides links to groups working with the standard.

Additionally, an implementation which works in most browsers is provided as-is for customization.

Facebook

Facebook.com has a tagging feature that is integrated with “My Photos”.

It allows you to add a textual descriptor (tag or person’s name) to a specific point in the image.

This allows the module to describe who or what is included in a specific album.

Facebook – Tag Display

When the images are viewed, placing the mouse over a tag displays a fixed-size square indicating where the tag (person) is located within the image.

This enables users to identify objects by visual inspection or by matching the list of contained objects with their tag displays.
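
The fixed-size square can be derived from the single tagged point. The sketch below assumes a 100-pixel square centred on the point and shifted inward when the point is near an edge. The size, the shifting rule, and the `tag_square` name are all assumptions for illustration, not Facebook's actual behavior.

```python
def tag_square(cx, cy, image_w, image_h, size=100):
    """Return (left, top, right, bottom) of a square centred on the tag point.

    The square is shifted, not shrunk, so that it stays inside the image
    whenever the image is at least `size` pixels in each dimension.
    """
    half = size // 2
    left = min(max(cx - half, 0), max(image_w - size, 0))
    top = min(max(cy - half, 0), max(image_h - size, 0))
    return (left, top, min(left + size, image_w), min(top + size, image_h))

print(tag_square(320, 240, 640, 480))  # tag in the middle of the image
print(tag_square(10, 300, 640, 480))   # tag near the left edge
```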

Facebook – Links

Another capability incorporates the site’s concept of friends. If the person you tag is identified as your friend on the site, their name will link to their profile.

The site will also count this image in the “photos of” feature on their profile, allowing inclusion of photos added by other users.

Other Image Metadata: MPEG

The MPEG-7 standard is a “Multimedia Content Description Interface”.

“MPEG-7 is not aimed at any one application in particular; rather, the elements that MPEG-7 standardises shall support as broad a range of applications as possible.”

Other Image Metadata: Adobe

Adobe Systems created a new metadata framework for images called XMP (Extensible Metadata Platform).

It is publicly documented, based on W3C standards, built on XML, and is designed to eliminate growing incompatibility for metadata storage.
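
Because XMP is built on XML, keywords stored in it can be read with an ordinary XML parser. The sketch below uses a simplified, hand-written packet (real packets are normally wrapped in `xpacket` processing instructions and embedded in the image file) and reads the Dublin Core `dc:subject` keyword bag. The sample keywords and the `read_keywords` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by RDF and Dublin Core inside XMP.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A simplified XMP packet holding two keywords.
XMP_PACKET = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:subject>
        <rdf:Bag>
          <rdf:li>navy blue</rdf:li>
          <rdf:li>minimalist</rdf:li>
        </rdf:Bag>
      </dc:subject>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

def read_keywords(xmp_text):
    """Return the keywords listed in the packet's dc:subject bag."""
    root = ET.fromstring(xmp_text)
    return [li.text for li in root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)]

print(read_keywords(XMP_PACKET))
```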

Other Image Metadata: IPTC

The International Press Telecommunications Council created standards for the interchange of news data over a decade ago.

These standards persist in the IPTC’s IIM (Information Interchange Model) standard, and they are also usable in the newer XMP framework.

Improving Clustering Search Interfaces

Joint Term Project

By Perry Rajnovic

and Mark Zalar

Term Project

For my term project, I will be working with Mark Zalar to develop a new search engine interface.

It will draw inspiration from all of the top search engines today, along with the enhancements now possible using emerging technologies.

Project Goal

The goal of the project is to implement a search site that provides a highly usable interface for query refinement.

Our backend will use clustering mechanisms to allow for easy refocusing of search topics.

Our frontend will use AJAX for flexibility.

Frontend Design

The frontend will be designed with a technology known as Asynchronous JavaScript and XML (AJAX).

This technology allows the site designer to run unseen requests to the server and parse XML-based results in the scripting language for interactivity.
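
The asynchronous request itself runs in browser JavaScript, so the sketch below covers only the XML exchanged in such a request, with Python standing in for both the server and the client-side parsing. The `<clusters>` element layout, the attribute names, and the counts are assumptions for illustration, not the project's actual format.

```python
import xml.etree.ElementTree as ET

def build_response(clusters):
    """Serialize {cluster name: result count} into an XML string (server side)."""
    root = ET.Element("clusters")
    for name, count in clusters.items():
        ET.SubElement(root, "cluster", name=name, count=str(count))
    return ET.tostring(root, encoding="unicode")

def parse_response(xml_text):
    """Recover the {cluster name: result count} mapping (what the page script would do)."""
    root = ET.fromstring(xml_text)
    return {c.get("name"): int(c.get("count")) for c in root.findall("cluster")}

payload = build_response({"animal": 120, "being": 45})
print(payload)
print(parse_response(payload))
```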

Frontend Theory

Most clustering-based search solutions available today offer minimal interactivity.

Our theory is that letting users harness clustering dynamically as they refine a search will improve results and reduce the time needed to finish the search.

AJAX Functionality

Our site will use AJAX to dynamically reconfigure the clustering menu. This allows quicker browsing of clusters to identify the optimal range of pages to search within.

The menu will also use a novel interface that shows sibling and parent clusters.
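
A menu that shows sibling and parent clusters implies a tree of clusters. The following is a minimal sketch of such a structure. The `Cluster` class, the `menu_for` helper, and the sample cluster names are hypothetical and are not taken from the project.

```python
class Cluster:
    """A node in a cluster hierarchy that knows its parent and children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def siblings(self):
        """Return the other clusters that share this cluster's parent."""
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]

def menu_for(cluster):
    """Collect the names a menu would show around the selected cluster."""
    return {
        "parent": cluster.parent.name if cluster.parent else None,
        "siblings": [c.name for c in cluster.siblings()],
        "children": [c.name for c in cluster.children],
    }

animal = Cluster("animal")
mammal = Cluster("mammal", animal)
bird = Cluster("bird", animal)
reptile = Cluster("reptile", animal)
print(menu_for(mammal))
```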

AJAX Functionality 2

The results will be displayed to the user with some animation.

This will help to alert users when changes are made to the order or set of results.

Another advantage of this is that users will be more aware of the difference between clusters as they browse them.
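
To animate changes, the interface first has to know what changed. One way, sketched here as an assumption, is to compare the previous and the new result lists and report which entries were added, removed, or moved. The `diff_results` helper and the sample lists are hypothetical.

```python
def diff_results(old, new):
    """Compare two ordered result lists of unique items.

    Returns the items that appeared, disappeared, or changed position,
    each list in the order the items occur in `new` (or `old` for removals).
    """
    old_set, new_set = set(old), set(new)
    added = [r for r in new if r not in old_set]
    removed = [r for r in old if r not in new_set]
    moved = [r for r in new if r in old_set and old.index(r) != new.index(r)]
    return {"added": added, "removed": removed, "moved": moved}

before = ["a.html", "b.html", "c.html"]
after = ["b.html", "a.html", "d.html"]
print(diff_results(before, after))
```

Each of the three lists could then drive a different animation, such as a fade-in, a fade-out, or a slide.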

Search Target

This search engine could target both websites and images.

Valid keywords, such as accurate image tags, improve the engine’s knowledge of the content.

Clustering would be highly useful in finding an image with a desired scene or set of objects.

Example: search “creature”

The engine might identify a general cluster of “animal” or “being”.

“Animal” might have more results, so its mid-level clusters are shown.

The user wants a general discussion of mammals and selects that cluster.

The results change to focus on those related to mammals, both as a group and as specific kinds.
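
The refinement in this example can be sketched as a filter over results that each carry a cluster path. The result titles, the paths, and the `refine` helper below are hypothetical. Listing shallower paths first is an assumption about how "group" results might be placed ahead of "specific" ones.

```python
# Hypothetical results for the query "creature", each labeled with its cluster path.
results = [
    {"title": "Whale migration", "path": ["animal", "mammal", "whale"]},
    {"title": "Parrot care", "path": ["animal", "bird"]},
    {"title": "Mammals overview", "path": ["animal", "mammal"]},
    {"title": "Mythical beings", "path": ["being"]},
]

def refine(results, cluster):
    """Keep results under the chosen cluster, general ones before specific ones."""
    selected = [r for r in results if cluster in r["path"]]
    selected.sort(key=lambda r: len(r["path"]))  # shorter path means more general
    return [r["title"] for r in selected]

print(refine(results, "mammal"))
```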

Results Display

To provide animation, a technology similar to that found in “TiddlyWiki” will be used.

This interface allows topics to be added and removed dynamically with animation.

Additionally, extra links can be attached to each topic for more functionality (open in new window, similar items, etc.).

Future Enhancements

Our implementation will provide a basic mockup of the interface and of the refinement techniques it makes available.

Several enhancements could be made to this interface that would improve its usability or functionality.

Enhancements in Search

Taking advantage of meta-search would give the clustering algorithm a higher volume of data from which to generate the data topologies to be explored.

Using adaptive search (by userID or global optimization) would improve clustering by favoring the clusters that are used most often.
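
Adaptive ordering of clusters could be as simple as counting selections. The sketch below is an assumption of how per-user and global counts might be kept. The `ClusterUsage` class and the sample user IDs are hypothetical.

```python
from collections import Counter

class ClusterUsage:
    """Track how often each cluster is selected, globally and per user."""

    def __init__(self):
        self.global_counts = Counter()
        self.user_counts = {}

    def record(self, user_id, cluster):
        """Note that a user selected a cluster."""
        self.global_counts[cluster] += 1
        self.user_counts.setdefault(user_id, Counter())[cluster] += 1

    def rank(self, clusters, user_id=None):
        """Order clusters by use: per user if user_id is given, else globally.

        The sort is stable, so clusters with equal counts keep their input order.
        """
        if user_id is not None:
            counts = self.user_counts.get(user_id, Counter())
        else:
            counts = self.global_counts
        return sorted(clusters, key=lambda c: -counts[c])

usage = ClusterUsage()
for _ in range(2):
    usage.record("u1", "mammal")
for _ in range(3):
    usage.record("u2", "bird")

print(usage.rank(["mammal", "bird", "reptile"]))
print(usage.rank(["mammal", "bird", "reptile"], user_id="u1"))
```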

Enhancements in Interface

Because the site will be AJAX-based, a large amount of flexibility is possible with respect to changes in the interface.

The browser window is similar to a canvas, with all of the site’s underlying Document Object Model available for addition, modification or deletion.
