Advertising Insertion in Sports Webcasts

Kongwah Wan and Xin Yan, Infocomm Research

Multimedia at Work
Editor: Qibin Sun, Infocomm Research

1070-986X/07/$25.00 © 2007 IEEE. Published by the IEEE Computer Society.

In traditional broadcasting, the advertising model has long been a financial pillar. The onslaught of new media such as broadband and 3G has caused advertisers to sit up and take notice of the opportunities that new channels of video delivery have to offer. For instance, live sporting events are increasingly webcast over IP networks and 3G streaming servers. However, the standard model of a fixed-slot advertising run might not apply to these new media infrastructures: Why would anyone incur expensive 3G airtime to watch a commercial break? In addition, media’s mass digitization and the convenience of time-shifted, nonlinear viewing of digital content have also brought new challenges to the advertising industry. For instance, TiVo’s ability to fast-forward through advertisements has raised concerns over whether the 30-second spot has lost its bite.

In response to this complex media environment, advertisers are turning more and more to product placement, branded TV programming, and virtual advertising insertion. The key is to blend all advertising opportunities into the video content. Pioneering systems include the virtual first-down line in American football1 and the graphical overlay of athletes’ country flags and other performance-related information such as skiing paths2 and tennis shots.3 They demonstrate that an important factor in user acceptance is the value the technology adds to the game.

These systems generally apply camera-based sensors at the video source and reference manually input 3D grids for accurate subpixel object tracking and chroma keying. Hence, the systems decide all virtual effects centrally and permanently encode them into the video for broadcast downstream. Moreover, the system setup is expensive, making a viable business model an issue. The webcast infrastructure offers significant advantages over broadcast in this regard, including the following:

❚ Webcasts can reach a demographic segment.

❚ Webcasts combine traditional media’s familiarity with the Internet’s one-to-one interactivity.

❚ Webcast viewers are likely to be more tolerant of extraneous video effects such as advertising insertions than TV viewers because webcasts’ video quality is generally lower than TV’s.

❚ Webcast audiences are generally more technologically savvy, affluent, and likely to spend money on advertised items than TV audiences.4

By carefully timing the exposures and appropriately placing the virtual content in strategic positions in the video, webcasters can balance the advertisers’ need for more eyeballs and the viewers’ need for less clutter and interference. To this end, we created a sports advertising insertion system that combines automated sports content analysis with manual techniques that exploit an understanding of gameplay that only humans possess.

Prior work in sports content analysis

Substantial research has focused on machine algorithms for automating sports content analysis.5,6 Motivating this research is the notion that the ability to efficiently tag sports video with relevant metadata will enable content reuse and access.

In particular, Ekin and Tekalp’s algorithm uses cinematic features, such as shot type and shot length, to detect play breaks in a sports video.6 Xie et al. also noted that, during a game, the changes in a sports video’s visual features can be good indicators of the game’s natural play-break structures. So, they proceeded to train a detection system based on hidden Markov models.7 Unfortunately, as Chang’s paper clarified, the difficult task of machines achieving a level of understanding equivalent to that of humans remains.8 Moreover, Chang goes on to say that in the premium sports domain, where content value is high, content owners can easily afford manual annotations.

These considerations prompted us to adopt a toolkit approach to designing our sports advertisement insertion system. We aimed to expedite the operator’s laborious job of annotation, rather than to replace the operator altogether.

Overall system architecture

Figure 1 shows a high-level snapshot of the system in action during a baseball game. Video from a live sporting event feeds into a terminal, where a human operator monitors the game and manually inserts control signals by selecting from GUI buttons that denote game status, such as play break, home run, and pitching. The operator can also specify coordinate information, such as bounding boxes for advertising images, by click-dragging the mouse. The system time-stamps these control signals and multiplexes them into the video stream for decoding by a downstream player/receiver.
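As a rough illustration of the control signals described above, the following Python sketch models one signal as a time-stamped record and interleaves signals with video frames by time stamp. The names `ControlSignal`, `bbox_from_drag`, and `multiplex`, the field layout, and the choice to emit a control signal just before the frame that shares its time stamp are all assumptions made for illustration; the article does not specify the actual payload format.

```python
import heapq
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ControlSignal:
    """Hypothetical operator control signal (layout is an assumption)."""
    timestamp_ms: int                      # time stamp added by the system
    game_status: str                       # e.g. "play break", "home run", "pitching"
    bbox: Optional[Tuple[int, int, int, int]] = None  # (x, y, width, height)


def bbox_from_drag(x0, y0, x1, y1):
    """Normalize a mouse click-drag into (x, y, width, height)."""
    left, top = min(x0, x1), min(y0, y1)
    return (left, top, abs(x1 - x0), abs(y1 - y0))


def multiplex(frame_times_ms, signals):
    """Interleave control signals with (already sorted) frame time stamps.

    A control signal is emitted just before the video frame that carries
    the same time stamp, so a receiver sees it before decoding that frame.
    """
    control = ((s.timestamp_ms, 0, "control", s)
               for s in sorted(signals, key=lambda s: s.timestamp_ms))
    video = ((t, 1, "video", None) for t in frame_times_ms)
    merged = heapq.merge(control, video, key=lambda item: item[:2])
    return [(t, kind, payload) for t, _, kind, payload in merged]
```

For example, a drag from (120, 80) to (20, 30) during a pitching scene gives the bounding box (20, 30, 100, 50), and `multiplex([0, 40, 80], [signal])` places the signal stamped at 40 ms between the frames at 0 ms and 40 ms.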

While normal video decoding and playback runs on a client player, the system simultaneously inserts advertising content. First, it preloads a database of indexed advertising content (logos, images, or video) onto the client players/receivers. Then, during playback, control signals with the current presentation time stamps are decoded into a set of client-side actions. By predefining actions such as video overlay, the system can blend advertising content into the video.
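The client-side step can be sketched as follows, under simplifying assumptions: frames and advertising images are small grayscale grids (lists of rows), the preloaded database is a plain dictionary, and each decoded control signal is a dictionary with hypothetical keys `start_ms`, `duration_ms`, `ad_id`, `x`, `y`, and `alpha`. This is not the prototype's player code, only a minimal picture of a time-stamped overlay action.

```python
def overlay(frame, ad, x, y, alpha=1.0):
    """Return a copy of `frame` with `ad` alpha-blended at top-left (x, y).

    Pixels of the ad that fall outside the frame are skipped.
    """
    out = [row[:] for row in frame]
    for dy, ad_row in enumerate(ad):
        for dx, value in enumerate(ad_row):
            row, col = y + dy, x + dx
            if 0 <= row < len(out) and 0 <= col < len(out[0]):
                out[row][col] = round(alpha * value + (1 - alpha) * out[row][col])
    return out


def render(frame, pts_ms, signals, ad_db):
    """Apply every overlay action whose interval covers the time stamp pts_ms."""
    for sig in signals:
        if sig["start_ms"] <= pts_ms < sig["start_ms"] + sig["duration_ms"]:
            frame = overlay(frame, ad_db[sig["ad_id"]],
                            sig["x"], sig["y"], sig.get("alpha", 1.0))
    return frame


# Hypothetical preloaded advertising database and one decoded control signal.
ad_db = {"logo": [[100, 100]]}
signals = [{"start_ms": 1000, "duration_ms": 500,
            "ad_id": "logo", "x": 1, "y": 1, "alpha": 0.5}]
blank = [[0, 0, 0, 0] for _ in range(3)]
during = render(blank, 1200, signals, ad_db)   # inside the signal interval
after = render(blank, 2000, signals, ad_db)    # interval has ended
```

Because the overlay is applied at playback time rather than encoded upstream, swapping the contents of `ad_db` changes what the viewer sees without touching the video itself.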

Automatic machine detection

Figure 1 also shows a parallel video processing module that takes the live video as input. This automatic module complements the manual input system. For example, the machine runs a hard-cut detection algorithm and accurately identifies shot boundaries. This produces a more pleasant advertising insertion effect when combined with the operator’s manual inputs. For instance, while observing a particular baseball pitching scene, the operator can indicate, on the area of the video screen corresponding to the backstop, a preferred bounding box for advertising image insertion. The system then automatically extends the advertising insertion interval to all frames in the shot, hence causing the eventual advertising exposure to appear more natural and realistic. Moreover, if the image area within the operator-specified bounding box is homogeneous, such as a part of the backstop that is not already cluttered by advertising banners, the machine can also compute the separation of foreground objects from the background. Then, through the system’s chroma keying functionality, we can make the advertising insertion look more natural by causing the foreground objects to appear to walk in front of the overlay content.
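The interval extension can be sketched as below. The article does not say which hard-cut detector the system uses, so this example assumes a common stand-in: a cut is declared when the gray-level histograms of two consecutive frames differ by more than a threshold. Frames are flattened lists of 8-bit gray values of equal length; the function names and the threshold are hypothetical.

```python
def histogram(pixels, bins=8):
    """Coarse gray-level histogram of a flattened 8-bit frame."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return counts


def hard_cuts(frames, threshold=0.5):
    """Indices i at which a new shot starts (frame i differs sharply from i-1).

    The difference is the normalized L1 distance between histograms,
    which lies in [0, 1] when both frames have the same number of pixels.
    """
    cuts = []
    for i in range(1, len(frames)):
        h0, h1 = histogram(frames[i - 1]), histogram(frames[i])
        diff = sum(abs(a - b) for a, b in zip(h0, h1)) / (2 * len(frames[i]))
        if diff > threshold:
            cuts.append(i)
    return cuts


def shot_interval(cuts, marked_frame, n_frames):
    """First and last frame index of the shot containing marked_frame."""
    start = max([c for c in cuts if c <= marked_frame], default=0)
    end = min([c for c in cuts if c > marked_frame], default=n_frames) - 1
    return start, end


# Three dark frames, four bright frames, two dark frames.
frames = [[10] * 16] * 3 + [[200] * 16] * 4 + [[10] * 16] * 2
cuts = hard_cuts(frames)
interval = shot_interval(cuts, marked_frame=5, n_frames=len(frames))
```

If the operator draws the bounding box while frame 5 is on screen, the insertion is extended to the whole shot that contains frame 5 rather than starting and stopping mid-shot.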

Figure 1. System workflow. (Diagram labels: live sporting event; manual input and machine auto detect; manual and automatic control signals; video; multiplexer/encode; live stream; streaming center and video-on-demand center, each with decode, encode, and Ad-DB; file; client PCs.)

April–June 2007

Having an automatic module would also greatly facilitate detection of events or visual landmarks that regularly appear in the sporting event video. For example, in a tennis match, loud and sustained audience applause typically occurs at the end of a good play. The system can perform automatic audio detection of applause to isolate these moments9 as possible candidates for advertising content placement in a postclimax scene.10

As another example, when a soccer team adopts a highly defensive strategy, we expect gameplay to occur mostly in midfield, which viewers generally perceive as a lull in the game. Intuitively, viewers are more likely to accept a higher advertising frequency at such moments. Because the soccer midfield line is generally a long, white, vertical line, machine detection is relatively easy. Figure 2 shows other examples of visual landmark detection and advertising effects. Although computationally more involved, detecting clear line markings on the playing field is technically feasible.11,12
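To see why a long, white, vertical line is an easy target, consider the deliberately naive sketch below. It scans a grayscale frame (a list of rows) for the column with the largest share of bright pixels. This heuristic is an assumption made for illustration and is not the method of the cited detection work; real broadcast footage has perspective skew, occluding players, and other field markings that call for more robust techniques. The function name and both thresholds are hypothetical.

```python
def find_vertical_line(frame, white=200, min_fraction=0.8):
    """Return the column index of the strongest bright vertical line, or None.

    A column qualifies when at least `min_fraction` of its pixels have a
    gray value of `white` or more.
    """
    height = len(frame)
    best_col, best_fraction = None, 0.0
    for x in range(len(frame[0])):
        bright = sum(1 for y in range(height) if frame[y][x] >= white)
        fraction = bright / height
        if fraction >= min_fraction and fraction > best_fraction:
            best_col, best_fraction = x, fraction
    return best_col


# A 5x6 patch of "grass" (gray value 60) with a white line in column 3.
grass = [[60] * 6 for _ in range(5)]
field = [row[:3] + [255] + row[4:] for row in grass]
```

Here `find_vertical_line(field)` picks out column 3, while the plain grass patch yields `None`.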

An economy of advertising space

Adopting an advertising model similar to that of broadcasting, the webcast architecture (see Figure 1) makes decisions on advertising spaces centrally; that is, the available advertising slots are specified at the video source end on the basis of the current video’s visual content. However, the system doesn’t apply any advertising effects until farther downstream when the video reaches a streaming center or a video-on-demand center. You can think of these centers as middleman nodes that might cater to a large subscriber base made up of advertisers.

When distributed hierarchically, the centers become local and geographical stations that hold niche knowledge of the demographics they’re serving. With the video received, each center node can decode and retrieve all available (and allowable) advertising slots. Based on established criteria, the node then decides whether to buy. The system logs these decisions over the Web, and they become legally binding purchase agreements between parties.

A node that buys a slot can choose to embed designated advertising content into the video, making the insertions permanent fixtures in the video downstream. Or the node can choose to resell the slot to downstream takers. By customizing the advertising content database at these centers, each center maximizes its marketing opportunities by sending meaningful advertising messages to its targeted audience. At the hierarchy’s root node, the potential for personalized advertising to the individual is clear.
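The buy, embed, or resell choice can be pictured with a toy model. Everything specific in it is an assumption: the article does not state the purchase criteria, so the sketch uses a simple budget check, a fixed resale markup, and an in-memory list in place of the Web-based log. The names `AdSlot`, `CenterNode`, and `distribute` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AdSlot:
    slot_id: str
    price: float
    owner: Optional[str] = None
    embedded_ad: Optional[str] = None      # set once an ad is made permanent


@dataclass
class CenterNode:
    name: str
    budget: float
    local_ad: Optional[str] = None         # None means "buy in order to resell"
    markup: float = 1.0                    # resale price multiplier (assumed)
    log: List[str] = field(default_factory=list)

    def handle(self, slot: AdSlot) -> AdSlot:
        """Buy the slot if it is still free and affordable, then embed or resell."""
        if slot.embedded_ad is not None or slot.price > self.budget:
            return slot                    # already used upstream, or too expensive
        self.budget -= slot.price
        slot.owner = self.name
        self.log.append(f"{self.name} bought {slot.slot_id} at {slot.price}")
        if self.local_ad is not None:
            slot.embedded_ad = self.local_ad   # permanent for everything downstream
        else:
            slot.price *= self.markup          # offered to downstream takers
        return slot


def distribute(slot: AdSlot, nodes: List[CenterNode]) -> AdSlot:
    """Pass one slot down a chain of centers, from upstream to downstream."""
    for node in nodes:
        slot = node.handle(slot)
    return slot


regional = CenterNode("regional", budget=100.0, markup=1.5)
city = CenterNode("city", budget=20.0, local_ad="noodle-shop")
late = CenterNode("late", budget=50.0, local_ad="other")
slot = distribute(AdSlot("s1", price=10.0), [regional, city, late])
```

In this run the regional center buys the slot and resells it, the city center buys it at the marked-up price and embeds its own advertisement, and the last center sees a slot that is already filled.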

Implementations

A prototype for streaming advertising insertions in sports webcasts has been implemented on the Windows Media and Windows 2003 Streaming Server platform.13 Figure 3 shows the high-level schematic of the three main system components: the encoder, transcoder, and player. Developed for the baseball domain in collaboration with a Japanese partner, the encoder provides the basic video monitoring and annotating functionalities expected of a broadcast studio tool. Simple templates are available for monitoring teams, players, scores, and advertising. An illustrative score bar overlays the score at the top of the video.

Figure 2. Automatic detection of visual landmarks for advertising insertion in sports video. Panels (a) through (d).

Figure 3. Windows Media implementation. (Diagram labels: encoder SYS-A; Windows 2003 streaming server; Internet or broadband; transcoder SYS-B; overlay player SYS-C or Windows Media Player (WMP).)

Figure 4 shows a snapshot of the encoder implementing two simple forms of advertising overlay: text and image. These overlays are permanently encoded into the video, and each has a fixed location of exposure. As these types of insertion are common in video presentations, we offer them as a standard feature in our system. Our system’s niche, however, is that operators can add manual control signals by specifying signal duration (the minimal duration of the signal embedding into the video payload) and bounding box coordinates for advertising content placement (for example, the red bounding box in Figure 4). The system sends the signal downstream to the transcoder module, which decodes the control signal, decides whether to buy the advertising space specified by the bounding box coordinates, and, if it decides to buy, performs the image overlay.

Figure 5 shows two corresponding snapshots of the transcoder modules at two different receiving locations downstream from the encoder, each of which has decided to perform an insertion in the bounding box. Each transcoder module can serve audiences from different demographic locations and, hence, insert different advertising content in each bounding box. Note also that the transcoder module shows the “VOX” overlay, which the encoder module permanently encoded upstream. Figure 6 also shows an example of the chroma keying effect after separating the foreground objects (players) from the homogeneous background (backstop). The baseball player now appears to be standing in front of the inserted image. From here, the transcoder sends the new content downstream to the client player, as Figure 7 shows.
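The chroma keying effect can be approximated with a few lines of Python. The sketch assumes grayscale frames stored as lists of rows, a bounding box that lies fully inside the frame, an advertising image at least as large as the box, and a single key value with a tolerance standing in for the homogeneous backstop. The function name and the tolerance are hypothetical, and a real system would key on color and handle soft edges.

```python
def chroma_key_insert(frame, ad, bbox, key_value, tolerance=20):
    """Draw `ad` inside `bbox`, but only over pixels close to the key value.

    Pixels that differ from the background by more than `tolerance`
    (the foreground) are left untouched, so they appear in front of the ad.
    """
    x, y, width, height = bbox
    out = [row[:] for row in frame]
    for dy in range(height):
        for dx in range(width):
            if abs(frame[y + dy][x + dx] - key_value) <= tolerance:
                out[y + dy][x + dx] = ad[dy][dx]
    return out


# Backstop of gray value 100 with one dark "player" pixel at row 1, column 2.
scene = [[100, 100, 100, 100],
         [100, 100, 20, 100],
         [100, 100, 100, 100]]
ad = [[250, 250],
      [250, 250]]
keyed = chroma_key_insert(scene, ad, bbox=(1, 1, 2, 2), key_value=100)
```

The dark pixel survives inside the inserted region, which is the effect of the player standing in front of the advertisement.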

Conclusions

Sports content continues to generate a global appeal that transcends national, cultural, religious, and gender boundaries. Historically, TV technology’s success has been intertwined with the development of televised sports. In the famous words of pioneering TV sports director Harry Coyle, “Television got off the ground because of sports.” Sports showcases offer a splendid platform to promote new media technologies. During the 2006 FIFA (Fédération Internationale de Football Association) World Cup tournament in Germany, major mobile companies launched a plethora of mobile TV services with such offerings as video streaming, text-based services, ringtone downloads, and mobile blogging.

By augmenting a video presentation with advertising content, our semiautomatic system proposes a new way of enhancing the commercial value of sports video webcasts. Clearly, the number of advertising exposures must be managed to avoid viewers’ perceiving them as unnecessary clutter in the video. As with legacy video broadcasting systems staffed by human operators, this check can be easily managed manually. Automatic machine detection techniques, however, can further facilitate advertising content insertion.

Figure 4. Two types of encoder overlay: “EPSON” text overlay on the top-right score-bar and a “VOX” image overlay at the bottom right.

Figure 5. Transcoder modules at different receiving locations separately insert images into the bounding box location that the control signal specified.

Acknowledgments

The authors would like to thank Yiqun Li for her implementation of the advertising effect in baseball video and Akira Miyata of Digital VOX Corporation for valuable discussions.

References

1. PVI Virtual Media Services, http://www.pvi.tv/pvi/index.html.

2. Sportvision, http://www.sportvision.com.

3. Hawk-Eye Sports Tracking, http://www.hawkeyeinnovations.co.uk.

4. Arbitron/Edison Media Research, Internet Study V: Startling New Insights About the Internet and Streaming, The Arbitron Company & Edison Media Research, 2000; http://www.arbitron.com/study_m/internet_study_v.asp.

5. A. Kokaram et al., “Browsing Sports Video: Trends in Sports-Related Indexing and Retrieval Work,” IEEE Signal Processing Magazine, vol. 23, no. 2, 2006, pp. 47-58.

6. A. Ekin and A.M. Tekalp, “Generic Play-Break Event Detection for Summarization and Hierarchical Sports Video Analysis,” Proc. IEEE Int’l Conf. Multimedia and Expo (ICME), IEEE Press, 2003, pp. 169-172.

7. L. Xie et al., “Structure Analysis of Soccer Video with Domain Knowledge and Hidden Markov Models,” Pattern Recognition Letters, vol. 25, no. 7, 2004, pp. 767-775.

8. S.F. Chang, “The Holy Grail of Content-Based Media Analysis,” IEEE MultiMedia, vol. 9, no. 2, 2002, pp. 6-10.

9. Z. Xiong et al., “Audio Events Detection Based Highlights Extraction from Baseball, Golf and Soccer Games in a Unified Framework,” Proc. IEEE Conf. Acoustics Speech and Signal Processing (ICASSP), IEEE Press, vol. 5, 2003, pp. 632-635.

10. K. Wan and C. Xu, “Automatic Content Placement in Sports Highlights,” Proc. IEEE Int’l Conf. Multimedia and Expo (ICME), IEEE Press, 2006, pp. 1893-1896.

11. Y. Li, “Real Time Advertisement Insertion in Baseball Video Based on Advertisement Effect,” Proc. ACM Int’l Conf. Multimedia, ACM Press, 2005, pp. 343-346.

12. K. Wan et al., “Real-Time Goal-Mouth Detection in MPEG Soccer Video,” Proc. ACM Int’l Conf. Multimedia, ACM Press, 2003, pp. 311-314.

13. C. Simonetti, B. Birney, and J. Travis, “Using Windows Media Technologies for Advertising on the Internet,” Microsoft Digital Media Division; http://msdn2.microsoft.com/en-us/library/ms983661.aspx.

Readers may contact Kongwah Wan at kongwah@i2r.a-star.edu.sg.

Contact Multimedia at Work editor Qibin Sun at qibin@i2r.a-star.edu.sg.


Figure 6. Foreground objects can also be automatically separated from the homogeneous background.

Figure 7. The player plays back video received from the transcoder in Figure 5 that inserted the A-Star image.
