11
Recommender Systems Recom Syst High volume and personal taste makes Usenet news an ideal candidate for collaborative filtering techniques. Recommender Systems Joseph A. Konstan, Bradley N. Miller, David Maltz, Jonathan L. Herlocker, Lee R. Gordon, and John Riedl Applying Collaborative Filtering to Usenet News T HE GROUPLENS PROJECT DESIGNED, IMPLEMENTED, AND EVALUATED a collaborative filtering system for Usenet news—a high-vol- ume, high-turnover discussion list service on the Internet. Usenet newsgroups—the individual discussion lists—may carry hundreds of messages each day. While in theory the newsgroup organization allows readers to select the content that most interests them, in practice most newsgroups carry a wide enough spread of messages to make most individuals consider Usenet news to be a high noise information resource. Furthermore, each user values a different set of messages. Both taste and prior knowledge are major factors in evaluating news articles. For example, readers of the rec.humor news- group, a group designed for jokes and other humor- ous postings, value articles based on whether they perceive them to be funny. Readers of technical groups, such as comp.lang.c11 value articles based on interest and usefulness to them—introductory questions and answers may be uninteresting to an expert C11 programmer just as debates over subtle and advanced language features may be useless to the novice. The combination of high volume and personal taste made Usenet news a promising candidate for collaborative filtering. More formally, we determined the potential predictive utility for Usenet news was very high. The GroupLens project started in 1992 and completed a pilot study at two sites to establish the feasibility of using collaborative filtering for Usenet news [8]. Several critical design decisions were made as part of that pilot study, including: • The requirement that GroupLens integrate with GroupLens: COMMUNICATIONS OF THE ACM March 1997/Vol. 40, No. 3 77

GroupLens: applying collaborative filtering to Usenet news

  • Upload
    john

  • View
    219

  • Download
    0

Embed Size (px)

Citation preview

Page 1: GroupLens: applying collaborative filtering to Usenet news

Reco

mm

ende

rSy

stem

sRecomSyst

High volume and personal taste makes Usenet news an ideal candidate for collaborative filtering techniques.

Rec

omm

ende

r Sy

stem

s

Joseph A. Konstan, Bradley N. Miller, David Maltz,

Jonathan L. Herlocker, Lee R. Gordon, and John Riedl

Applying Collaborative Filtering

to Usenet News

THE GROUPLENS PROJECT DESIGNED, IMPLEMENTED, AND EVALUATED

a collaborative filtering system for Usenet news—a high-vol-

ume, high-turnover discussion list service on the Internet. Usenet

newsgroups—the individual discussion lists—may carry hundreds of

messages each day. While in theory the newsgroup organization allows

readers to select the content that most interests them, in practice most

newsgroups carry a wide enough spread of messagesto make most individuals consider Usenet news to bea high noise information resource. Furthermore, eachuser values a different set of messages. Both taste andprior knowledge are major factors in evaluating newsarticles. For example, readers of the rec.humor news-group, a group designed for jokes and other humor-ous postings, value articles based on whether theyperceive them to be funny. Readers of technicalgroups, such as comp.lang.c11 value articles basedon interest and usefulness to them—introductoryquestions and answers may be uninteresting to anexpert C11 programmer just as debates over subtle

and advanced language features may be useless to thenovice.

The combination of high volume and personaltaste made Usenet news a promising candidate forcollaborative filtering. More formally, we determinedthe potential predictive utility for Usenet news was veryhigh. The GroupLens project started in 1992 andcompleted a pilot study at two sites to establish thefeasibility of using collaborative filtering for Usenetnews [8]. Several critical design decisions were madeas part of that pilot study, including:

• The requirement that GroupLens integrate with

GroupLens:

COMMUNICATIONS OF THE ACM March 1997/Vol. 40, No. 3 77

Page 2: GroupLens: applying collaborative filtering to Usenet news

78 March 1997/Vol. 40, No. 3 COMMUNICATIONS OF THE ACM

existing news reading applications, since users are extremely reluctant to change news readerprograms.

• The requirement that GroupLens support a singlekeystroke rating input (or, when possible, replac-ing an existing keystroke) since users typicallyspend very little time or attentionon any particular article. (Otherresearch has shown that more exten-sive textual ratings can be effectivein close-knit communities [2, 4].

• The requirement that GroupLensprovide predictions of the rating thesystem expects the user will giveeach article, rather than only winnowing down the list of articles.We consider it very important toprovide advice rather than exercisecensorship.

The pilot study, successful yet lim-ited in scope, demonstrated that collaborative filter-ing could be implemented for Usenet news. Sincethen, the project has continued forward to undertakethe challenge of applying collaborative filtering to alarger set of users and on a larger scale. Moreover, wehave focused our efforts on overcoming some of thechallenges of applying collaborative fil-tering to Usenet news, including:

• Integration of collaborative filter-ing into an information systemwith existing users, existing appli-cations and interfaces, and an openarchitecture that supports manynews reader applications.

• Addressing the dynamic, distrib-uted nature of Usenet news. Arti-cles have short lifetimes and thereis no central repository of news articles.

• Working with extremely sparse sets of ratings.Typical users read only a tiny fraction of Usenetnews articles.

• Delivering acceptable performance to users andproviding mechanisms to scale the system as thenumber of users and articles grows.

This article discusses the challenges involved increating a collaborative filtering system for Usenetnews. The public trial of GroupLens invited usersfrom over a dozen newsgroups selected to represent across-section of Usenet (listed in Table 1) to apply ournews reader software to enter ratings and receive pre-

dictions (we provided GroupLens-adapted versions ofGnus, xrn, and tin). Over a seven-week trial startingFebruary 8, 1996, we registered 250 users who sub-mitted a total of 47,569 ratings and received over600,000 predictions for 22,862 different articles.These users were volunteers who saw our announce-

ment postings or our Web page.They downloaded specially modifiednews browsers that accepted ratingsand displayed predictions on a 1–5scale where 1 was described as “thisitem is really bad! a waste ofnet.bandwidth” and 5 as “this articleis great, I would like to see more likeit.” For privacy reasons, users wereknown to us only by pseudonyms.

Qualitative results are therefore thecompilation of feedback from the

GroupLens mailing list and private email rather thana comprehensive survey. In [5] we present a moredetailed summary of the trial results, along withcomparisons with noncollaborative approaches tomanaging Usenet news.

Assessing Predictive UtilityPredictive utility refers generally to the value of hav-ing predictions for an item before deciding whetherto invest time or money in consuming that item. ForUsenet, the items are news articles, but the conceptis general enough to include physical items such asbooks or videotapes as well as other informationitems. In each domain predictive utility is not sim-

Movie:Legal Cite:Sci. Art.:Restaurant:

+high+high+high+med

Movie:Legal Cite:Sci. Art.:Restaurant:

– low–very high– low– low

Movie:Legal Cite:Sci. Art.:Restaurant:

–$7+30 min–med–5 min–high

Movie:Legal Cite:Sci. Art.:Restaurant:

+med/high+low/med+med/high+high

HIT MISS

False Positive Correct Rejection

Predict Good Predict Bad

Desirable

Undesirable

rec.humorrec.food.recipesrec.arts.movies.current-filmscomp.lang.c++comp.lang.javacomp.groupwarecomp.human-factormn.generalall groups in comp.os.linux.*

▲ Figure 1. Predictive utility cost/benefit analyses for fourselected tasks. Different domains have different values for correctand incorrect predictions. Missing a desirable legal citation can beextremely costly, while missing a good movie is not since thereare many desirable movies. Similarly, the cost of mistakenly pick-ing an undesirable restaurant is higher than the cost of picking anundesirable science article due to the time and money invested.

Table 1. Newsgroups supported in the public trial

Page 3: GroupLens: applying collaborative filtering to Usenet news

ply a measure of accu-racy; it is a measure ofhow effectively predic-tions influence user con-sumption decisions. Adomain with high pre-dictive utility is onewhere users will adjusttheir decisions a greatdeal based on predic-tions. A domain withlow predictive utility isone where predictionswill have little effect onuser decisions.

Predictive utility is afunction of the relativequantity of desirable andundesirable items and thequality of predictions.The desirability of anitem is a measure of a par-ticular user’s personalvalue for that item. Itemsare not intrinsically goodor bad.

The cost-benefit analy-sis for a consumptiondecision compares thevalue of consuming adesirable item (a hit), thecost of missing a desirableitem (a miss), the value ofskipping over an undesir-able item (a correct rejec-tion), and the cost ofconsuming an undesir-able item (a false posi-tive). Figure 1 shows fourcost-benefit analyses. Forwatching a movie, thevalue of finding desirablemovies is high to moviefans, but the cost of miss-ing some good ones is lowsince there are manydesirable movies for mostmovie fans. The cost offalse positives is the priceof the ticket plus theamount of time before thewatcher decides to leave.The value of correct rejec-tions is high because there

150

100

50

0–1 0 +1

num

ber

of o

ccur

renc

es

Rec.Humor

Pearson r correlation coefficient

60

50

40

30

20

10

0num

ber

of o

ccur

renc

es

–1 0 +1Pearson r correlation coefficient

Rec.food.recipes30

25

20

15

10

5

0num

ber

of o

ccur

renc

es

–1 0 +1Pearson r correlation coefficient

Comp.os.linux.development.system

COMMUNICATIONS OF THE ACM March 1997/Vol. 40, No. 3 79

1

0.8

0.6

0,4

0.2

00

1 2 3 4 5 6

1

0.8

0.6

0,4

0.2

0

1

0.8

0.6

0,4

0.2

0

1

0.8

0.6

0,4

0.2

00 1 2 3 4 5 6 0 1 2 3 4 5 6

0 1 2 3 4 5 6Rating Rating

Rating Rating

Perc

ent

artic

les

Perc

ent

artic

les

Perc

ent

artic

les

Perc

ent

artic

les

All Groups Rec.Humor

Comp.os.linux.development.system

Rec.food.recipes

Figure 2. Ratings profiles for four Usenet news groups The percentage of articles assigned each rating varies significantly from newsgroup to newsgroup. Most

articles in rec.humor were given the worst rating (1 out of a possible), while the ratings in comp.os.linux.development.system were distributed more uniformly.

Figure 3. User pair correlations for three newsgroups. One way to compare the similarity

of users is to compute the Pearson coefficient between their ratings. Here, the number of user pairs with

each Pearson coefficient is plotted for three different newsgroups. The presence of many high

correlations in the rec.humor newsgroup indicates general agreement about quality in that domain.

In the moderated newsgroup rec.food.recipes correlations are nearly evenly distributed about the

origin, suggesting that individual taste matters more in this domain.

Page 4: GroupLens: applying collaborative filtering to Usenet news

are so many undesirable movies that it would beimpractical to see movies at all without rejectingmany of them.1 Similarly, finding desirable general-interest scientific articles benefits from predictionssince there are so many toselect from (even though manyare good thanks to peer reviewand editors). Restaurant selec-tion follows a similar patternthough the risk of going to anundesirable restaurant ishigher since you typically stillhave the meal and the bill.Legal research is very different.The cost of missing a relevantand important precedent isvery high, and may outweighthe cost of sifting through allof the potentially relevant cases(especially when that cost isbeing billed to the client andserves as protection against malpractice).

The costs of misses and false positives representthe risk involved in making a prediction. The valuesof hits and correct rejection represent the potentialbenefit of making predictions. Predictive utility is thedifference between the potential benefit and the risk.Thus, the risk of mistakes is lowest for movies or sci-entific articles, and the potential benefit is highestfor movies, articles, and restaurants.

One important component of the cost-benefitanalysis is the total number of desirable and undesir-able items. If 90% of the items being considered aredesirable, filtering will generally not add much valueover simply predicting that all items are desirablebecause there are few correct rejections and the prob-ability of a hit is high even without a prediction. Ofcourse when there are many desirable items, usersmay refine their desires to select only the most inter-esting of the interesting ones given their limitedtime. On the other hand, if there are many items andonly 1% are good, then filtering can add significantvalue because the aggregate value of correct rejectionsbecomes high requiring a very high miss cost beforeit becomes preferable to predict that all items aredesirable.

Usenet news is a domain with extremely high pre-dictive utility. While statistics vary by newsgroup, we

have found that users generally consider only 5% to30% of articles in typical newsgroups to be desirable.(Figure 2 shows the distribution of ratings for themost widely rated technical, recreational, and moder-

ated newsgroups from the trial.) Because of the highvolume of news, the value of correct rejections is high(in many groups it is infeasible to read the entiregroup). At the same time, the fact that so many usersread Usenet articles implies the value of a hit is alsomoderately high. Thus, Usenet has a high potentialbenefit. It also has low risk. False positives are cer-tainly annoying, but it takes only a few seconds for auser to dismiss an unwanted article. And misses turnout to be low cost as well since truly valuable articlestend to reappear in follow-up discussion, reducing thechance of missing something particularly important.Later, we show the effect of predictions on user behav-ior to confirm high-predictive utility.

We should point out that high-predictive utilityimplies that any accurate prediction system will addsignificant value—why then do we need a personal-ized collaborative filtering system? Would it not beeasier to simply calculate average ratings across allusers as was done by Maltz [3] and reap the benefitsof high-predictive utility? We have found that per-sonalized predictions are significantly more accuratethan nonpersonalized averages. In general, users donot agree on which articles are desirable. Figure 3shows that users do not agree overall. The grouprec.humor has unusually high agreement, primarilydue to a large number of cross-posted articles that donot even attempt to be funny, but there are a substan-tial number of low and negative user-pair correlations.Rec.food.recipes, a group in which agreement literallyis based on taste, has a large number of near-zero cor-relations that we believe represent people with over-lapping but different tastes, such as a vegetarian and a

80 March 1997/Vol. 40, No. 3 COMMUNICATIONS OF THE ACM

NNTPServer

Xrnreader

ClientLibrary

Tin newsreader

ClientLibrary

GroupLensServer

GeneratePredictions

Process Ratings

Database

Figure 4. GroupLens architecture overview. Usenet clients connect to the GroupLens serverthrough the GroupLens client library, and to a separate NNTP server as usual. The GroupLens

Server accepts ratings and provides predictions for articles delivered by the NNTP server.

1Our analysis includes the effect of frequency of occurrence in the cost orbenefit. Hence, correct rejections are worth more when there are many un-desirable items. If we isolate frequency of occurrence, then the benefit ofa correct rejection is zero since its value is simply the absence of the costof a false positive. We find the combined analysis more intuitive, thoughseparating the frequency from the per-item cost can be useful for someanalyses.

Page 5: GroupLens: applying collaborative filtering to Usenet news

meat-eater who both enjoy chocolate desserts. Hence,it is better not to lump all votes together since thereare systematic differences in taste. Moreover, even inan area where users agree overall, such as rec.humor,Table 2 shows that correlation between ratings andpredictions is dramatically higher for personalizedpredictions than for all-user average ratings.

GroupLens Architecture OverviewThe GroupLens system architecture is designed toblend into the existing Usenet client-server architec-ture. At a high level, Figure 4 shows that a newsreader such as xrn, tin, or Gnus connects to twoservers: The NNTP serverthat holds Usenet news arti-cles and the GroupLensserver that holds ratingsand generates predictions.The GroupLens clientlibrary encapsulates theinterface to the server. Thetypical usage pattern is for anews reader to request a set of headers for unreadarticles from the NNTP server and pass the articleidentifiers to the GroupLens client library to obtainpredictions. As the user reads articles in the news-group, the news readerrecords ratings with theclient library whichsends them back to theserver. The server usesthese ratings both toprovide predictions toother users and to bettercapture this user’s tastes.

One of the major chal-lenges distinguishingUsenet news from otherdomains that have beenused to demonstrate thevalue of collaborative fil-tering is that Usenet is areal, preexisting system

with millions of users and hundreds of software com-ponents already written. In some ways, building col-laborative filtering into an existing domain providedus with significant benefits. We already knew theinformation resource was useful, as attested to by themillions of users already reading Usenet news. We alsodid not have to worry about content creation, sincetens of thousands of articles are posted daily. Wealready had a natural partitioning of content into hier-archical newsgroups that evolved through a democra-tic voting process and were likely to represent realclusters of content and interest.

In other ways, however, working with Usenet newsraised research problems. Twoimportant problems were theneed to integrate into preexist-ing clients and the integrationof predictions with differentnews presentation models.

The problem of integratingwith the sheer volume anddiversity of news readers led us

toward the client library and an open architecturemodel [6]. A quick survey showed over a dozen widelyused news readers, and typically several versions ofeach in active use. These news readers ranged from

COMMUNICATIONS OF THE ACM March 1997/Vol. 40, No. 3 81

Newsgrouprec.humorrec.food.recipescomp.os.linux.development.system

Avg0.490.050.41

Pers0.620.330.55

Table 2. Correlations between ratings and predictionsfor average and personalized predictions

-- Gnus rec.food.recipes/17026 (143 more) 2:48pm (Summary

Heat the oil over medium heat in a deep fryer or in a wide, deep panon the stove. In a large, shallow bowl, combine the flour, salt, pepper,and paprika. Break the eggs into a separate shallow bowl and beatuntil blended. Check the oil by dropping in a pinch of the flour mixture.If the oil bubbles rapidly around the flour, it is ready. Dip each pieceof chicken into the eggs, then coat generously with the flour mixture.Drop each piece into the hot oil and fry for 15 to 25 minutes, or untilit is a dark golden brown. Remove the chicken to paper towels or a rackto drain.

>From the book written by Todd Wilbur.

6 cups vegetable oil2/3 cup all-purpose flour1 tbls salt2 tbls white pepper1 tsp cayenne pepper2 tsp paprika3 eggs1 frying chicken w/skin, cut up

From: The Ripper <[email protected]>Subject: Popeye's Famous Fried ChickenNewsgroups: rec.food.recipesDate: 1 Jan 1996 07:28:51 -0700Organization: OnrampReply-To: The Ripper <[email protected]>Followup-To: rec.food.cooking

[[[[[[[

55:123:97:58:46:40:57:

Art PoeArt PoeArt PoeArt PoeArt PoeRobyn WaltonRobyn Walton

]]]]]]]

Quiche LorraineCOLLECTION (4) Persimmon DessertsCOLLECTION (2) BriocheCorn TortillasFlour Tortillas*Czechoslovakian Cabbage SoupCollection (2) Rice Pudding

| ***| *****| ***| NA| NA| **| **

|

|

|

|

|

|

|

-- Cnus rev.food.recipes! 17026 Popeye's Famous Fried Chicken1:48pm (Article

] Popeye's Famous Fried Chicken| ***** |[ 30: The Ripper

File Edit Apps Options Buffers Tools Article Threads Misc Post Score Mascrypt Help

nnn mmmmmm ooooo mmmmmmm mmm nnnnnnn mmmm nnnnnn mmmmmmm nnnnnnn nnnnn

Figure 5. The Gnus interface with GroupLens

predictions are shown here. Predictions are indicated as

an ASCII bar-chart on the left edge of the summary part

of the interface. The longer bars indicate articles that

are predicted to be of greater interest.

Page 6: GroupLens: applying collaborative filtering to Usenet news

text-only to graphical toweb-based and ran onevery platform includingMacintosh, DOS, Win-dows, and Unix platforms.We quickly determined itwas infeasible for us toupdate and maintain afleet of news readers.Instead, we would need tomake it easy for newsreader authors to incorpo-rate GroupLens into theirown code. Since there wasno standard protocol forexchanging ratings andpredictions, we defined anopen protocol for commu-nication between newsreaders and the Group-Lens server. To furthersimplify the task of caching data and following theprotocol, we implemented and distributed clientlibraries written in C and in Perl.

The client libraries define a simple API that newsreaders can use to request predictions and to transmitratings. They also define utility functions to manage auser’s initialization file and to provide user-selectabledisplay formats for predictions. We consider the clientlibrary and its API to be a substantial success as we’vefound several news reader authors and one user will-ing to use it to provide GroupLens support. (Group-Lens support is provided or forthcoming in Gnus 5.2and SLRN 0.8.8.5.) One of our test users in Polandwrote a proxy GroupLens server to download ratingsand predictions each evening to help him deal withnetwork throughput as low as 10bps. This type ofuser participation can only come about with an openand protocol and a usable API.

The problem of integrating predictions into differ-ent presentation models was more formidable. Theoriginal GroupLens system was designed for newsreaders in which the user selected a newsgroup andwas then given a split screen with one part containinga list of unread articles (in either chronological or dis-cussion-thread order) and the other part showing thetext of the currently selected article. In this presenta-tion model, it is simple and effective to display pre-dictions along with other header information to helpusers choose which articles to read and which to skip.An example of this interface is the Gnus interfaceshown in Figure 5. Several news readers have adoptedother interface models that are more difficult to inte-grate predictions into. Some discussion-thread news

readers, for example,show only a singleentry for each thread.It is not clear what pre-diction value should beshown for this entry:the average prediction,the first prediction, themaximum, the range,or some other value.This problem requiresfurther research.

In part, the chal-lenge of effectivelyintegrating predic-tions into differentpresentation modelsstems from competinggoals of users readingnews. Users typicallywant to read news in

roughly chronological order, grouped by discussionthread. When predictions are provided, users add thegoal of reading news in order of decreasing quality, sothey can read the good things first and then bail outof the newsgroup. We found that a new interface com-ponent added to one news reader, a keystroke to moveto the highest-predicted unread article, was extremelypopular in the rec.humor newsgroup where discussionthreads were rarely rated highly and chronologicalorder was less important.

The diversity and sheer number of installed newsreaders led us to adopt a library and open protocolapproach. With this approach the implementers ofeach news reader could easily add access to the Group-Lens server and could also use the returned predictionsin whatever manner they found to be most consistentwith their news reader interface.

A Dynamic and Fast-Paced Information SystemItem volume and lifetimes are another way in whichUsenet news differs from other domains where col-laborative filtering has been applied. Across allnewsgroups, users will see 50,000 to 180,000 newmessages each day, and the volume of postings isdoubling each year. The useful lifetime of a Usenetmessage is short; most sites expire messages afterapproximately one week. Furthermore, there is nocentral authority or official repository of Usenetnews articles. Usenet is a truly distributed systemwhere articles appear at different sites at differenttimes, and there is no unique timestamp orsequence.

82 March 1997/Vol. 40, No. 3 COMMUNICATIONS OF THE ACM

Figure 6. Correlation between time spent reading and explicitratings. Readers who spend a long time with an article are morelikely to rate it highly. The points mark the average time spent

reading an article for each rating, while the ranges span the 95%confidence interval from that mean.

80

70

60

50

40

30

20

10

800 1 2 3 4 5 6

Rating entered by user

Ave

rage

tim

e sp

ent

read

ing

( in s

econ

ds)

All groupsrec.humorcomp.os.linux. development.system

Page 7: GroupLens: applying collaborative filtering to Usenet news

The implications of the high volume and fast paceof Usenet news include:

• The need for GroupLens to discover new contentwhen it first learns about it, that is at the firstrating or request for predictions.

• The need for ratings to affect subsequent predic-tions almost immediately—a delay of a full daywould result in no predictions for as many ashalf of the system users, and even a delay of 5-10minutes would result in large prediction gapsduring the “morning rush” when many usersread news.

To address these implications, the GroupLensserver has a two-part database (shown in Figure 6).The ratings database stores all ratings that users havegiven to messages. The correlations database storesinformation about the historical agreement of pairsof users. The GroupLens architecture has three sepa-rate process pools that access these databases. Theprediction processes always have the highest priority.They read both correlations and ratings and generatepredictions in real time based on the latest availabledata. The ratings processes have the next highest pri-ority. They write ratings into the ratings databaseand are expected to do so quickly to ensure that cur-rent data is available for generating predictions. Theratings processes are also responsible for identifyingnew articles and adding them into thedatabase. Ratings, for both existing andnew articles, are almost always storedinto the database within 60 seconds ofthe time they are received. Finally, thecorrelation process reads the ratingsdatabase to update the correlations data-base. This process is scheduled so thateach user pair’s correlation is updatedapproximately every 24 hours. Since cor-relations are measures of historicalagreement, they should not changerapidly. New users can be correlatedindividually after their first batch of rat-ings to make it possible for them to usethe system quickly.

Ratings SparsityUsers of Usenet news read only a smallfraction of the articles posted to the sys-tem. Our studies found that users take anaverage of 10- to 60-seconds to read anarticle. Even using the conservative esti-mate of 10-seconds, users can read only360 articles in an hour. Even a user read-

ing news several hours each day will struggle to read1% of all articles posted. Of course, we are heartenedby this fact because it points to the value of filtering.But sparsity also poses a problem for collaborative fil-tering:

• When each user has read a tiny percentage of thetotal number of articles, it becomes more diffi-cult to find other users with whom to correlate,since the overlap between users is small on aver-age and we devalue correlations with too fewcommon ratings to avoid spurious correlations.Worse yet, there is not a set of very popular newsarticles, unlike box office hits for the moviedomain or best-sellers in the book domain.

• A consequence of sparsity is that an enormousnumber of raters is needed to cover all of thearticles. Until then, many users will experiencethe “first-rater problem” of finding articles withno prediction whatsoever.GroupLens addresses the challenge of sparsity:

algorithmically and at the user interface. The primaryalgorithmic technique for attacking sparsity is parti-tioning the set of Usenet news articles into clustersthat are commonly read together. The newsgrouphierarchy provides a natural partitioning that suc-cessfully identifies clusters of articles. We partitionour ratings database by newsgroup and therebyimprove the local density of ratings. We also parti-

tion our correlations database bynewsgroup to ensure that users can beclustered with other users who haveread and rated the same articles.Essentially, we have created a subset ofUsenet news where users are known toread a greater percentage of content,compared with Usenet overall, andtherefore where there are likely to beenough common ratings to computemeaningful correlations. Partitioningthe database by newsgroup also pro-vides more accurate predictions. Theuser pair correlations shown in Figure3 provide sufficient agreement to gen-erate meaningful predictions. Retro-spectively, using the data to makepredictions based on correlation acrossall newsgroups provided lower corre-lations and less accurate predictions.This data confirms our hypothesis thatagreement in one domain (such ashumor) is not necessarily predictive ofagreement in a different domain (suchas recipes) and suggests that

One otherapproach tosparsitythat we are examining is theincorporation ofagent-stylefilter-botsinto theGroupLensframework.

COMMUNICATIONS OF THE ACM March 1997/Vol. 40, No. 3 83

Page 8: GroupLens: applying collaborative filtering to Usenet news

approaches that simply model users uniformly acrossdomains are diluting their predictive power.

Even partitioning articles into newsgroup clustersdoesn’t fully address the sparsity concerns. During ourtrial, it was still the case that users rated as few as 1%to 2% of articles in high-volume newsgroups. While

a Pearson correlation coefficient-based predictionalgorithm was able to generate useful predictions (asshown in Table 2), we identified opportunities forincreased accuracy if the ratings density could beimproved. We identified two causes for this sparsity:

• Efficiently reading high-volume groups requiresbeing highly selective. We apply collaborative fil-tering specifically to help users be selective, butthe result is they skip over articles that don’tinterest them, either due to topic or low prediction.

• Even users who have read articles often do notrate them, even though the ratings interfaceinvolves at most one additional keystroke. Infor-mal feedback suggests users are “lazy” in thatthey would prefer not to even think about theappropriate rating. Being advised that each ratinghelps perfect their own profile motivates someusers, but others will avoid rating nonetheless.

Some researchers have proposed compensation sys-tems that reward users for entering ratings. While theeconomic consequences of this solution are interest-ing, we wonder whether compensation would be nec-essary if ratings could be captured without any efforton the part of the user.2 (We believe an ideal solution

is to improve the user interface to acquire implicit rat-ings by watching user behaviors. Implicit ratingsinclude measures of interest such as whether the userread an article and, if so, how much time the userspent reading it. Our initial studies show that we canobtain substantially more ratings by using implicitratings and that predictions based on time spent read-ing are nearly as accurate as predictions based onexplicit numerical ratings. Figure 6 shows an analysisof the relationship between time spend reading andexplicit ratings. Our results also provide large-scaleconfirmation of the work of Morita and Shinoda [7] infinding the relationship between time and ratingholds true without regard for the length of the article.We are continuing to explore further implicit ratings

84 March 1997/Vol. 40, No. 3 COMMUNICATIONS OF THE ACM

Figure 7. GroupLens server architecture. The beige box encloses the GroupLens server. The ratings broker serves as a single point of contact for clients to the server.

2Indeed, in this issue Avery speculates that even no-cost rating may not becheap enough, since there is a positive benefit to waiting long enough forothers to filter information for you. While we have observed this phenom-enon, we expect that other factors, including the desire of many readers toread the most current articles at specific times of the day, will mitigate thisdesire to wait for predictions.

correlations

predictionprocess

pool

ratingprocess

poolUsenet

news clients

correlationprogram

ratings

databases

datamanager

ratingbroker

GroupLensserver

Page 9: GroupLens: applying collaborative filtering to Usenet news

for Usenet including using actions such as printing,saving, forwarding, replying to, and posting a follow-up message to an article. Of course, other domains alsohave their own implicit ratings (for example, a librarymay record borrowing a book as an implicit rating infavor of the book).

One other approach to sparsity that we are examin-ing is the incorporation of agent-style filter-bots intothe GroupLens framework. Filter-bots are programsthat read all articles and follow an algorithm to ratethem systematically. Since they are auto-mated, they can read and rate each articleas soon as it is visible at their location. InGroupLens, they are treated as just anotherset of ordinary users; if a user correlateswell with a filter-bot, then the filter-botwill contribute to predictions for that user.We are experimenting with a range of sim-ple filter-bots that examine syntactic prop-erties such as whether an article is a replyor an original message, degree of cross-posting to different newsgroups, thelength and reading level of an article,among others.

Performance ChallengesThe final set of challenges inherent in theUsenet news domain are the severedemands for low latency and highthroughput to make it feasible to attractand serve a large number of users. The crit-ical performance measures are the latencyfor handling prediction requests and rat-ings submissions and the throughput ofthe system measured by the number ofusers and articles that a GroupLens servercan handle before performance degradesunacceptably. After examining the critical path at theuser interface, we discovered that most news readerswould be unable to request predictions or send ratingsasynchronously. Accordingly, we established theseperformance goals based on the assumption thatrequesting predictions would delay the appearance ofthe articles in a newsgroup and that transmitting rat-ings would delay the return to newsgroup selectionmode:

• A request for predictions for 100 articles in anewsgroup should complete in under two seconds(end to end) at least 95% of the time.

• A transmission of ratings for 100 articles (includ-ing any implicit ratings) should complete inunder one second (end to end) at least 95% of thetime.

There are several techniques that we were able toemploy to help improve latency. A newsgroup canhave several ratings and prediction processes active somultiple requests can be handled concurrently. TheGroupLens ratings broker assigns each incomingrequest to a free process which can then fulfill therequest as shown in Figure 7. Ratings processes releasethe client as soon as the ratings are received and writethe ratings to the database afterwards, allowing theuser to return to reading news as quickly as possible.

Finally, we organized our database tostore ratings so the correlation andprediction processes can efficientlyretrieve either all ratings from a givenuser or all ratings for a given message.

Using a Sun Sparcstation 5 work-station as the server, we were able tosurpass the ratings latency goal (100ratings required approximately 250ms) during the trial. We did not meetour prediction latency goal, however,as 100 predictions averaged just overfour seconds. Later performance tun-ing, including the use of more mem-ory, has allowed us to reduce thelatency to approximately 150 ms for100 ratings and below 500 ms for100 predictions.

The primary throughput goal forthe trial was to be able to handle10,000 users for up to 20 Usenetgroups. While we never had activeusage at that level, we ran severalexperiments with simulated users(that interacted through the standardclient library interface) and foundthat 10,000 users was realistic even if

users concentrated their news reading into only 1/3 ofthe day. Obviously 10,000 users and 20 newsgroupsare only a tiny fraction of Usenet. To achieve the scaleneeded for Usenet as a whole requires applying addi-tional throughput enhancements:

• Partitioning the server by newsgroup. Sepa-rate servers can handle different newsgroups withnearly perfect parallel speed-up (only log-in costsare replicated).

• Partitioning the server by user. Different clus-ters of users can be assigned to different servers.Partitioning would be particularly effective ifuser clusters are based on historical agreement,but our trial suggests that even random assign-ment within a newsgroup would provide enoughagreement to obtain useful predictions.

COMMUNICATIONS OF THE ACM March 1997/Vol. 40, No. 3 85

Once usersinvesttime inGroupLens,we have found

they likethe systemand arelikely tocontinueusing it.

Page 10: GroupLens: applying collaborative filtering to Usenet news

• Use of composite users. When millions ofusers are involved, even replication may beimpractical. In that case, prototype users can bedefined and users can be defined as combinationsof those prototypes. Readers would obtain pre-dictions based on the prototypes and their rat-ings would feed back into the prototypes toupdate the prediction profile. Users would stillreceive personalized predictions, but these pre-dictions would be based on a personal combina-tion of composite user opinions rather than acombination of individual user ratings.

Discussion andConclusionsUsenet news is a domainthat can greatly benefitfrom collaborative filter-ing, but it poses manychallenges that will helpus build more efficientand effective collaborativefiltering systems. TheGroupLens project is anotable success in collab-orative filtering. Usagedata gathered during aseven- week public trialshows that predictions aremeaningful and valuableto users. To verify thatthis success was notcaused by the bias of aprediction on a user’s rat-ing, we repeated ouranalysis retrospectivelyon users who did not seepredictions before enter-ing a ratings, and found the same results. Mostnotably, however, we found that users valued predic-tion because they tended to read and rate articleswith high predictions more than those with low pre-dictions as shown in Figure 8.

In addition to quantitative results, we gatheredsubstantial anecdotal evidence about the challengesand successes of providing collaborative filtering forUsenet news. The start-up problem is composed oftwo parts:

• Users need to rate several articles before they canreceive predictions. Accordingly, many usersabandon the system before ever receiving benefitsfrom it because they perceive effort withoutreward.

• Early adopters find there are not many otherraters and therefore they receive predictions foronly a fraction of the articles that they read.

We can address these problems in three ways. First,we can provide some predictions, if only the averagerating for all users, so new users see some value in thesystem. Second, the use of implicit ratings reduces oreliminates the perceived effort, making it more likelythat users will continue using the system. Third, wecan combine the use of implicit ratings and the use offilter-bots to create faster perceived payback for

reduced effort. Weare experimentingwith theseapproaches now.

Once users investtime in GroupLens,we have found theylike the system andare likely to continueusing it. While wefound that more thanhalf of the users whosigned up for Group-Lens discontinuedactive rating after acouple of weeks,many of the trialusers were still usingthe system sixmonths after the trialended. Users oftencommented theywould like Group-Lens for all of theirUsenet newsgroups,though we do not

have the resources to serve that large a population anddata set except perhaps with an overall average pre-diction rather than personalized predictions.

Usenet presents a different set of challenges to col-laborative filtering than domains such as music [12]or movies [4] where new items are relatively infre-quent and lifetimes are relatively long. In addition toaddressing critical performance issues, the Group-Lens system continues to address several key prob-lems involving ratings sparsity and start-up usage byapplying techniques including partitioning the sys-tem by newsgroup (which provides more accuratepredictions), using implicit ratings, and exploringthe use of filter-bot rating agents. We still have sev-eral interface challenges to address, including filter-ing and display interfaces that handle threads,

86 March 1997/Vol. 40, No. 3 COMMUNICATIONS OF THE ACM

5

4

3

2

1

01 2 3 4 5

rating given by at least one person

num

ber

of p

eopl

e to

rea

d an

d ra

te a

rtic

le

Average Number of Ratings per Article

Figure 8. Number of people who read an article based on the rating it was given by some other user. For each

rating of an article, the number of users who read and rate it iscounted. The totals show that highly rated articles are read

more often than less highly rated articles.

Page 11: GroupLens: applying collaborative filtering to Usenet news

integration with search engines such as InReferenceand DejaNews,3 and other ways of making predic-tions more useful to users. We also are very interestedin comparing GroupLens with, and exploring theintegration of collaborative filtering with, informa-tion retrieval approaches to filtering information suchas the SIFT system [10].

We are often asked “What would it take to makeall of Usenet use GroupLens?” The answer involvesperformance, availability, and convincing users to usethe system. Our current architecture and implemen-tations support 10,000 users for 10 to 20 newsgroupson a single economical workstation. Partitioningcould allow us to economically expand to cover all ofUsenet for tens of thousands of users, or to cover spe-cific newsgroups for all users, but probably will notallow us to support all groups for all users. Consider-ing that Usenet news already relies upon a wide net-work of servers, we believe that creating a worldwidenetwork of GroupLens servers is a practical and feasi-ble approach to collaborative filtering for all ofUsenet.

Availability is determined almost entirely by thewillingness of news reader authors to incorporateGroupLens into their systems. We have received verypositive feedback on both our client library and ouropen architecture. These tools make adding Group-Lens quite easy, especially compared with the effortundertaken to communicate with the NNTP (news)server. The remaining hurdle is to provide the groundswell of support that requires the existence of serverssupporting most or all Usenet newsgroups. Webelieve most users will prefer having GroupLens pre-dictions, though they may prefer not to have to do anywork to enter ratings. For this reason, we believeimplicit ratings are critical for convincing users to usethe system.

In conclusion, GroupLens collaborative filtering forUsenet news is an experimental success and it showspromise as a viable service for all Usenet news users.We are currently conducting a second public trial.This trial will test the effect of providing predictionsto new users more rapidly by providing overall aver-ages until the user has rated enough articles to corre-late, and it will make full use of time spent readingmeasures to capture implicit ratings unobtrusively.Readers interested in using GroupLens, in adaptingtheir own news readers to use GroupLens, or in fol-lowing the ongoing trial, are invited to the GroupLenshome page at http://www.cs.umn.edu/Research/GroupLens.

AcknowledgmentsMany people participated in making GroupLens asuccess. Paul Resnick deserves special recognitionfor cofounding the project with John Riedl. Also,Danny Iacovou, Mitesh Susak, and Pete Bergstromwho worked on earlier versions of the system.

References1. Goldberg, D., Nichols, D., Oki, B. and Terry, D. Using collaborative filter-

ing to weave an information tapestry. Commun. ACM 35, 12. (1992) pp. 61–70.2. Hill, W., Stead, L., Rosenstein, M. and Furnas, G. Recommending and

evaluating choices in a virtual community of use. In Proceedings of the 1995ACM Conference on Human Factors in Computing Systems (1995). ACM, NewYork, pp. 194–201.

3. Maltz, D. Distributing information for collaborative filtering on Usenetnet news. Master’s thesis, M.I.T. Department of EECS, Cambridge, Mass.May 1994.

4. Maltz, D. and Ehrlich, K. Pointing the way: Active collaborative filtering.In Proceedings of the 1995 ACM Conference on Human Factors in Computing Sys-tems. (1995). ACM, New York.

5. Miller, B., Riedl, J. and Konstan, J. Experiences with GroupLens: MakingUsenet useful again. In Proceedings of the 1997 Usenix Winter Technical Con-ference. Jan. 1997.

6. Miller, B., Riedl, J., Konstan, J., Resnick, P., Maltz, D. and Herlocker, J. The GroupLens Protocol Specification.http://www.cs.umn.edu/Research/GroupLens/protocol.html.

7. Morita, M. and Shinoda, Y. Information filtering based on user behavioranalysis and best match text retrieval. In Proceedings of SIGIR ’94. ACM,New York.

8. Resnick, P., Iacovou, N., Sushak, M., Bergstrom, P., and Riedl, J. Group-Lens: An open architecture for collaborative filtering of netnews. In Pro-ceedings of the 1994 Computer Supported Cooperative Work Conference. (1994)ACM, New York.

9. Shardanand, U. and Maes, P. Social information filtering: Algorithms forautomating “word of mouth.” In Proceedings of the 1995 ACM Conference onHuman Factors in Computing Systems. (1995) ACM, New York, pp. 210–217.

10. Yan, T. and Garcia-Molina, H. SIFT: A tool for wide-area information dis-semination. In Proceedings of the USENIX 1995 Winter Technical Conference.(Jan. 1995) Usenix Assoc., New Orleans, La .

Joseph A. Konstan ([email protected]) is an asistant professor of computer science at the University of Minnesota, Minneapolis. He is also cofounder and consulting scientist at NetPerceptions, a new company developing and marketing GroupLenscollaborative filtering software based in Eden Prairie, Minn.Bradley N. Miller (bmiller @netperceptions.com) is cofounder and vice presidnet of product development at Net Perceptions, Inc., Eden Prairie, Minn.David Maltz is a Ph.D. student in computer science at Carnegie Mellon University studying mobile networking and computer-supported cooperative work.Jonathan L. Herlocker ([email protected]) is a Ph.D. student in computer science at the University of Minnesota wherehe is conducting research on flexible multimedia authoring andplayback systems.Lee R. Gordon ([email protected]) is a senior softwareengineer at Net Perceptions, Inc., Eden Prairie, Minn., and a M.S. candidate at the University of Minnesota, Minneapolis.John Reidl is an associate professor of computer science at theUniversity of Minnesota. He is also cofounder and chief technicalofficer of Net Perceptions. Inc. (http://www.netperceptions.com)

Permission to make digital/hard copy of part or all of this work for personal or class-room use is granted without fee provided that copies are not made or distributed forprofit or commercial advantage, the copyright notice, the title of the publication andits date appear, and notice is given that copying is by permission of ACM, Inc. To copyotherwise, to republish, to post on servers, or to redistribute to lists requires prior spe-cific permission and/or a fee.

© ACM 0002-0782/97/0300 $3.50

c

COMMUNICATIONS OF THE ACM March 1997/Vol. 40, No. 3 87

3InReference (http://www.reference.com/) DejaNews (http://www.de-janews.com/)