
Modeling CrowdSourcing Scenarios in Socially-Enabled Human Computation Applications


J Data Semant
DOI 10.1007/s13740-013-0032-2

ORIGINAL ARTICLE


Alessandro Bozzon · Piero Fraternali · Luca Galli · Roula Karam

Received: 7 January 2013 / Revised: 20 August 2013 / Accepted: 28 October 2013
© Springer-Verlag Berlin Heidelberg 2013

Abstract User models have been defined since the 1980s, mainly for the purpose of building context-based, user-adaptive applications. However, the advent of social networked media, serious games, and crowdsourcing/human computation platforms calls for a more pervasive notion of user model, capable of representing the multiple facets of social users and performers, including their social ties, interests, capabilities, activity history, and topical affinities. In this paper, we define a comprehensive model able to cater for all the aspects relevant for applications involving social networks and human computation; we capitalize on existing social user models and content description models, enhancing them with novel models for human computation and gaming activities representation. Finally, we report on our experiences in adopting the proposed model in the design and implementation of three socially enabled human computation platforms.

Keywords Crowdsourcing · Human computation · User modeling · Social networks · Serious games

A. Bozzon
Delft University of Technology, Delft, The Netherlands
e-mail: [email protected]

P. Fraternali · L. Galli (B) · R. Karam
Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milan, Italy
e-mail: [email protected]

P. Fraternali
e-mail: [email protected]

R. Karam
e-mail: [email protected]

1 Introduction

Human computation is a paradigm that advocates the integration of the computation power of machines with the perceptual, rational or social contribution of humans to solve a computational problem too hard to be solved by computers alone [36,42]. Within the human computation domain, crowdsourcing and Games With a Purpose play a central role. Crowdsourcing addresses the distributed assignment of work to an open community of executors [24], and it is a means and a facilitator for achieving human computation. Games with a Purpose (GWAPs) leverage the time spent online playing computer games by embedding complex problems that require human intelligence to be solved by the players [27,41].

Although the idea of distributing work among several users is obviously not new, human computation exploits the involvement of the user at a different scale and for a broader variety of purposes. First of all, aid is sought from users who are not preregistered to the application, but must be “recruited” dynamically; furthermore, in many applications the contribution of humans is a means for compensating deficiencies in automated algorithms, exploiting unique abilities of humans such as their visual perception and intuition.

More and more, crowdsourcing is being integrated within other applications; for example, multimedia search engines integrate the contribution of humans to label visual content and make it easier to retrieve by text-based search engines that work on the available metadata. Business process management applications have been extended socially, so as to engage in a business process people (e.g., customers) who would otherwise be excluded, for example for rating and commenting on alternative product concepts [11].

We envision a future in which crowdsourcing will be not only the foundation for standalone applications, like it is in the majority of cases today; rather, it will become a normal ingredient of the application architecture, much as web services are now in many contemporary applications.

The commoditization of crowdsourcing will require a shift in the way in which such a functionality is incorporated into applications. The current landscape of crowdsourcing platforms is highly fragmented: the interfaces for such activities as defining and posting tasks, controlling their execution, handling communication between the provider and the workers are mostly proprietary, platform-dependent and non-interoperable. Even the central asset of a human computation application, which is the representation of humans and of their contribution and work history, is not standardized.

This situation leads to difficulties in developing applications in a “crowdsourcing platform-independent way” and makes porting an application to a different platform a hard task.

An approach to alleviate the problem of platform-dependence is modeling [10]. Models are high-level representations of systems that abstract away the implementation details and are portable across technological spaces; they can be mapped, by means of model transformations, to the specific platform, by incorporating the platform-dependent features not in the model but in the transformation. This approach has been successfully applied to both standalone applications and to distributed applications integrated with external web services. The goal of this paper is to explore the usage of modeling techniques in the realm of crowd-powered applications, starting from the central assets to be modeled: the user and their activities.

Research in user modeling has been crucial for the definition of adaptive and context-aware applications [15]. A user model is a knowledge source that describes the aspects of the user that may be relevant for the behavior of a system, and it typically focuses on (1) the individual and his personal characteristics (e.g., profile data, history, preferences, and context of usage), and, with the advent of social applications, (2) the characterization dimensions related to the rich context of relations (e.g., friendship or following) and activities (e.g., content sharing, chatting, game playing) carried on in a networked social context. Recent trends in problem solving and application development call for richer user representations, where the modeled aspects go beyond mere access control, context-awareness, and personalization.

More recently, several works [5,7,8] called for methods able to blend human computation, which typically targets anonymous workers on human computation marketplaces like Amazon Mechanical Turk (https://www.mturk.com/) and Microtask.com (http://www.microtask.com/), with social networks, which are capable of interacting with real people, in real time, to capture their opinions, suggestions, and emotions. The rise of a new class of socially enabled human computation applications (e.g., Duolingo, http://duolingo.com/) and games with a purpose (e.g., Ingress, http://www.ingress.com/) calls for novel, more expressive user models, capable of representing all the articulations of the digital life of the user, as content producer and consumer, social network member, volunteer or paid worker in a crowdsourcing scenario, or player in a game-based application.

1.1 Original Contribution

The goal of this paper is to define a comprehensive model able to cater for all the relevant characteristics of a social user and performer, to enable the development of socially enabled human computation platforms that can gather, store and manage user- and content-related information coming from multiple knowledge sources, including social networks, gaming platforms and communities, and human computation markets. The main contributions of the paper in this direction are as follows:

1. the identification of the relevant components (sub-models) of a Social and Human Computation model that could cover all the aspects required by applications involving social networks and human computation;

2. the design of a metamodel that encodes the features of each of the identified sub-models for social and content description modeling, to provide a comprehensive set of concepts and abstractions augmented with novel details related to human computation and gaming activities, and

3. a discussion on the concrete usage of the devised model in the context of two real-world socially enabled human computation platforms developed within the CUbRIK Project and the CrowdSearcher project, and a general platform for games with a purpose, namely the FWAP (Framework With A Purpose) platform.

1.2 Running Example

To ease the discussion, we define a running example that will be consistently used through the whole paper to illustrate the capabilities of the proposed model.

The example exploits the architectural settings of Fig. 1a, where Jane, the user of the Fashion Trend crowdsourcing application, submits a picture of herself (along with a textual question) to receive comments about her personal dressing style and suggestions about other garments that could be used to improve her look. The Fashion Trend application interacts with the Socially enabled Human Computation Platform (SHCP) to involve human computation performers from social networks (e.g., Facebook), third-party human computation marketplaces (e.g., Amazon Mechanical Turk), and gaming platforms (e.g., Apple Game Center). Jane’s query is processed by a mix of automatic and human contribution activities, as described in Fig. 2.


Fig. 1 a Running example architecture and b a screenshot of the Sketchness GWAP

Fig. 2 Running example process

Before going for a walk with her friends, Jane takes a photo of herself wearing her new hat, and she submits the photo to the Fashion Trend application along with the question “Does this work? New hat :)”. The application splits the problem to be solved into two related tasks: (a) the retrieval of garments similar to the one worn by the user that submitted the photo, and (b) the collection of human opinions related to the appeal of the object in the photo. To retrieve similar garments, the application first identifies the specific clothes shown in the photo by exploiting known automatic algorithms (e.g., [44]); when the applied algorithm fails to recognize a garment, the activity is delegated to the SHCP, which solves the garment identification task by means of the Sketchness game with a purpose, a multiplayer game in which the players can assume two different roles, the Sketchers and the Guessers. In each round of the game a player is chosen at random to be the Sketcher while all the others will play as Guessers. A Sketcher is given as input an image and (s)he will be the only player with the rights to see it; (s)he is asked to provide a tag for a garment visible in the image, such as “tie” or “trousers”, and (s)he will then be asked to draw the contour of the object specified in the tag over the image within a limited period of time. A Guesser is asked to type guesses about the object being drawn in a text box. The Guessers cannot draw on the whiteboard and are able to see only the content that is being drawn by the Sketcher that has been chosen for that round, not the image. The winner of the game is the player who is able to guess more words or the one who lets the other users guess his own tags. In Fig. 1b the user interface of the game is shown.
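The role assignment described above for a Sketchness round can be sketched in a few lines. This is only an illustration under stated assumptions: the names `Round`, `assign_roles`, and `visible_to` are hypothetical and do not come from the Sketchness implementation; only the rules (one random Sketcher, everyone else a Guesser, the image visible to the Sketcher alone) are taken from the text.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Round:
    """One game round: a single Sketcher and the remaining Guessers."""
    sketcher: str
    guessers: List[str]


def assign_roles(players: List[str], rng: Optional[random.Random] = None) -> Round:
    """Pick one Sketcher at random; every other player becomes a Guesser."""
    if len(players) < 2:
        raise ValueError("a round needs at least two players")
    rng = rng or random.Random()
    sketcher = rng.choice(players)
    guessers = [p for p in players if p != sketcher]
    return Round(sketcher, guessers)


def visible_to(game_round: Round, player: str) -> Set[str]:
    """Only the Sketcher sees the image; Guessers see just the drawn strokes."""
    if player == game_round.sketcher:
        return {"image", "strokes"}
    return {"strokes"}
```

Passing a seeded `random.Random` instance makes the assignment reproducible, which may help when replaying a game session.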

The collection of human opinions is performed directly on the Facebook social network, by posting the picture on the wall of the Freedom And Fashion fashion addicts group and by inviting fashion experts belonging to such group to provide comments and suggestions.

Once the similar garments and the opinions have been collected, the application outcomes are shown to Jane. Additional details about the application logic and the exchanged data will be provided during the course of the paper.

1.3 Outline

The paper is organized as follows: Sect. 2 presents the proposed model, organized in different packages: the User and Social Model; the Content Model; the Action Model; the Gaming Model; and the Human Computation Model. Section 3 reports on three human computation platforms that have been implemented by using the proposed model. Section 4 overviews the related work on social user models, content modeling, and human computation models and finally compares the proposed approach to the state of the art in the respective fields. Section 5 draws the conclusions.

2 The Social and Human Computation Model

One of the key challenges for the development of a socially enabled human computation platform is the design of a unified data model for representing the relevant aspects of users, their social ties and activities, the communities where they are active, the actions they can contribute, and the contents which are the object of such actions.

As we will discuss in Sect. 4, currently there is no universal data model or standard able to embrace all the facets of the personal and social contribution of users. The envisioned model must convey in an integrated manner the profile features and social links and roles of users [18], the characteristics of the content objects they produce and consume [4], and the elementary actions and tasks they perform in virtual or real contexts of interest. Such actions and tasks are organized into processes to meet some global, community-wide goal, or special-purpose aspects, as required, e.g., when special tools like gaming applications are exploited to better engage users and foster their participation or exchange of opinions.

Such a data model should also be capable of (1) expressing the uncertainty of data, which is introduced by the automated collection procedures that are normally used to harvest users’ features and (2) managing the conflicts that arise from approximate feature extraction algorithms, contradictory data, or conflicting user actions.

Figure 3 depicts a bird’s eye view on the main sub-models that compose our community management framework, and that will be described in detail in the next sections. The Process model focuses on the global aspects of coordinating a set of human and automatic actions to achieve a specific purpose; as it does not differ from the workflow models of popular business processes and service orchestration languages (e.g., BPEL or BPMN [12]), its description is omitted.

Fig. 3 The social and human computation model

2.1 The User and Social Model

The User and Social model is devoted to the representation of humans in the context of human computation, by expressing the roles they can play as social actors, content producers and consumers. Furthermore, it describes the embedding of users in social networks by modeling their relations to communities and the most common properties that characterize a social activity profile.

2.1.1 Modeling Users

Figure 4 depicts the user taxonomy at the core of the User and Social Model. The main concept is the User, which specializes into Administrator, ContentProvider, and End-User. Administrators and ContentProviders denote roles that serve an internal purpose in the specific human computation platform: the former controls the system, whereas the latter provides ContentObjects. These internal roles can be extended to cater for a taxonomy of internal roles depending on the application domain. The End-User entity represents social users that interact with the platform consuming or producing resources.

Fig. 4 User taxonomy model


When registered in a human computation platform, end-users are further distinguished into

– AppUsers, who interact through an Application, e.g., the Fashion Trend Application of the running example; they can be characterized by application-specific properties (e.g., an ApplicationUserID and application-dependent profile data). An End-User not registered to the human computation platform can also provide useful information, e.g., by implicitly boosting the relevance of a given content object through share or re-post actions.

– Performers, end users that are registered explicitly for contributing to managed tasks; they have work-dependent attributes, e.g., their work history and quality data (e.g., error rates and other performance indicators).

Games are treated as a special class of applications for implicitly solving a human computation task. Therefore, a GamePlayer specializes in both AppUser and Performer. More information about how AppUsers, Performers, and GamePlayers are related to the other entities can be found, respectively, in the User and Social Model (described next), in the Action model, in the Gaming model, and in the Conflict Resolution model described later in this section.
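The taxonomy above maps naturally onto a class hierarchy in which GamePlayer inherits from both AppUser and Performer. The following is a minimal sketch, not the platform's actual code: the attribute names (`application_user_id`, `work_history`, `error_rate`) are assumptions loosely derived from the properties mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    name: str


@dataclass
class Administrator(User):
    """Internal role: controls the system."""


@dataclass
class ContentProvider(User):
    """Internal role: provides ContentObjects."""


@dataclass
class EndUser(User):
    """Social user that consumes or produces resources."""


@dataclass
class AppUser(EndUser):
    # Assumed field standing in for the ApplicationUserID property.
    application_user_id: str = ""


@dataclass
class Performer(EndUser):
    # Assumed work-dependent attributes (work history, quality data).
    work_history: List[str] = field(default_factory=list)
    error_rate: float = 0.0


@dataclass
class GamePlayer(AppUser, Performer):
    """A game is an application that implicitly solves a task,
    so a GamePlayer is both an AppUser and a Performer."""


jane = AppUser("Jane", application_user_id="jane-01")
clara = GamePlayer("Clara", application_user_id="clara-01")
```

All fields other than `name` carry defaults so that the multiple inheritance in `GamePlayer` does not run into the dataclass rule that non-default fields cannot follow default ones.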

Considering the running example, Jane has registered herself to the platform and she is modeled as an AppUser that interacts with the Fashion Trend Application. There are also other users involved in the example: Michael, a performer who works for the platform; and Clara and Pippa, two girls who play Sketchness to solve human computation tasks.

Michael is a frequent user of the platform and he is keen on providing suggestions to other users; therefore, he is registered within the system as a Performer. Clara and Pippa love to hang out in the system by playing games with their friends, so they are registered to the platform as GamePlayers; Clara is a friend of Jane while Pippa plays games very frequently during her spare time. Both of them are AppUsers because they are using one of the applications of the platform, a game. Implicitly they are also Performers since they are solving human computation tasks by playing games.

2.1.2 Modeling Users’ Social Space

Figure 5 depicts the model in charge of representing users’ relationships and interactions in the social space. The model refers to the topmost user type of the taxonomy in Fig. 4, so as to represent people regardless of their affiliation to the human computation platform. Users are also related to each other through UserRelationships of a given UserRelationshipType (e.g., friendship, geographical proximity).

Fig. 5 User and social model


The social space consists of ConflictResolutionPlatforms, i.e., platforms where users can perform tasks. For example, SocialNetworks (e.g., Facebook, Google+, LinkedIn, Twitter) and Gaming Platforms (e.g., Apple’s Game Center, Microsoft’s Xbox Live!) are specific types of ConflictResolutionPlatforms.

A User can be subscribed to zero or more ConflictResolutionPlatforms; each subscription goes with a ConflictResolutionPlatformMembership, i.e., an entity that contains the main authentication credentials to the social platform, plus some metadata that describes the User in the platform. Examples of such metadata are (i) a PlatformProfile, i.e., the set of personal user details stored within the platform, which also includes an open set of SocialProfileAttributes (e.g., birthdate, hometown, preferences); in our model, profile attributes are represented as a flat list of properties, but the adoption of more expressive representations (e.g., graphs) is supported. (ii) A set of UserNetworkRoles, which represent a measure of the importance of the User within the social network space; examples are classical indicators such as centrality, prestige, and authority. (iii) A set of TopicalAffinities, i.e., relationships with topics. An affinity link is represented by the UserTopicalAffinity, which embodies pointers to topics described as Entities of the Content and Content Description model.

Another central concept is that of Community, defined as a group of interacting people, living in some proximity (i.e., in space, time, or relationship) and sharing common values. A Community is characterized by a Name and by a set of Topics that define the common values that tie together its members (Topics are described by entities).

A CommunityMembership denotes the metadata about the community members, including (i) a set of CommunityMetrics, i.e., measures of the importance of the User within the Community; (ii) a set of TopicalAffinities, i.e., topical relationships with a given (set of) topics.

Users may have affinities only to a sub-set of the Topics that describe the community, and such affinity can involve other Users. Communities can be real, that is, proper sub-groups of a SocialNetwork (denoted as SocialNetworkCommunities) or DerivedCommunities, i.e., communities that span multiple social platforms according to some criterion (e.g., the union of the Facebook and G+ groups of Star Trek fans). The User and Social Model allows for a definition of GlobalNetworkMetrics, i.e., metrics that define the aggregate importance of a User across both social networks and communities.
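To make the distinction between SocialNetworkCommunities and DerivedCommunities concrete, the sketch below builds a derived community as the union of groups from different platforms, as in the Star Trek example. The field names and the `derive_community` helper are illustrative assumptions; the model itself does not prescribe how the derivation criterion is computed.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class Community:
    name: str
    topics: Set[str] = field(default_factory=set)
    members: Set[str] = field(default_factory=set)


@dataclass
class SocialNetworkCommunity(Community):
    """A real community: a proper sub-group of one social network."""
    social_network: str = ""


@dataclass
class DerivedCommunity(Community):
    """A community spanning several platforms according to some criterion."""


def derive_community(name: str,
                     sources: Iterable[SocialNetworkCommunity]) -> DerivedCommunity:
    """Union criterion (an assumption): merge topics and members of all sources."""
    derived = DerivedCommunity(name)
    for community in sources:
        derived.topics |= community.topics
        derived.members |= community.members
    return derived


fb_group = SocialNetworkCommunity(
    "Star Trek fans", {"Star Trek"}, {"anna", "bob"}, "Facebook")
gplus_group = SocialNetworkCommunity(
    "Trekkies", {"Star Trek"}, {"bob", "carl"}, "Google+")
all_fans = derive_community("Star Trek fans (all platforms)",
                            [fb_group, gplus_group])
```

Members are identified here by plain strings; matching the same person across platforms is a separate identity-resolution problem that this sketch does not address.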

Figure 6 reports an extract of the User and Social Model of the running example. Michael is registered on the “Facebook” social network, and he is a very authoritative member of a social fashion community called “Freedom and fashion” with the username “Fashionnaire”. Jane is also registered to Facebook along with her close friend Clara. Pippa is just a gamer and both Clara and Pippa are registered to the Gaming Platform “Apple Game Center”.

2.2 The Content Description Model

The Content Description model contains the concepts that denote the assets (e.g., blog posts, tweets, images, videos, entities of interest) associated with the user’s activity, the metadata (i.e., annotations) that describe such objects, and their associations with the users that produced them. The Content Description model is inspired by several existing content representation formats such as RuCOD [17] or MPEG-7 [38], on which it can be trivially mapped.

The model element ContentObject shown in Fig. 7 denotes an abstract entity representing a piece of information that can be accessible through some kind of storage system (e.g., relational, NoSQL, or graph databases). Each ContentObject is defined by (i) an IDentifier, to uniquely refer to a piece of content; and (ii) a URI, a string that unambiguously identifies the location of the ContentObject in the storage system (e.g., http://en.wikipedia.org/wiki/File:Mpeg.gif).

ContentObjects can be related to each other; such a relationship, which materializes in ContentRelationship objects, may be motivated by the presence of existing or created physical or logical relationships: for instance, a video object can be related to the HTML page object that contained its description; likewise, a video object can be related to the thumbnails (i.e., image objects) automatically generated from its keyframes. The ContentRelType attribute identifies the type of relationship (e.g., CrawledDescription, or DerivedObject).

The Content Description model also comprises a set of entities and relationships that express knowledge about a ContentObject. This knowledge, typically expressed as a metadata Annotation, can be automatically or manually produced and helps describe the ContentObject for search and retrieval purposes.

A ContentObject can feature zero or more ContentDescriptions, where each ContentDescription is characterized by a unique ID and by a Name that help identify the scope of the description (e.g., the same content can be described multiple times by several parties).¹
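As an illustration of ContentObjects and typed ContentRelationships, the following sketch links a video to its thumbnail and to the page that describes it. The class and attribute names mirror the model elements, but the `related` helper and the example URIs are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ContentObject:
    identifier: str  # uniquely refers to the piece of content
    uri: str         # location of the object in the storage system


@dataclass(frozen=True)
class ContentRelationship:
    source: ContentObject
    target: ContentObject
    content_rel_type: str  # e.g. "CrawledDescription" or "DerivedObject"


def related(obj: ContentObject,
            relationships: List[ContentRelationship],
            rel_type: str) -> List[ContentObject]:
    """Return the targets linked to obj by relationships of the given type."""
    return [r.target for r in relationships
            if r.source == obj and r.content_rel_type == rel_type]


# Placeholder URIs, used only for the example.
video = ContentObject("v1", "http://example.org/videos/v1.mp4")
page = ContentObject("p1", "http://example.org/pages/v1.html")
thumb = ContentObject("t1", "http://example.org/thumbs/v1-0.png")

links = [
    ContentRelationship(video, page, "CrawledDescription"),
    ContentRelationship(video, thumb, "DerivedObject"),
]
```

Calling `related(video, links, "DerivedObject")` yields the list containing only the thumbnail object.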

Annotations express metadata that compose a ContentDescription, and they describe the ContentObject as a whole. They are characterized by an AnnotationScheme (which uniquely identifies the type of annotation according, for instance, to the annotation component that generated it), a Name, a CreationTimeStamp, and a textual Description (or the DescriptionURI that points to an external description). There exist several kinds of Annotations; for instance (i) TextAnnotations contain textual values in a given Language; (ii) Low-Level Features contain array(s) of numerical values, typically representing the result of a numerical analysis of the content item; (iii) Entities are semantically defined metadata that correspond to real-world objects or occurrences as described in ontologies (e.g., DBPedia or Yago [40]).

¹ Please note that content descriptions typically also include information about temporal or physical segments of the described objects. For the sake of brevity we omit the description of this important content description aspect, although it is fully supported by our model.

Fig. 6 User and Social Model extract for the running example

An Annotation is typically created by automatic or human Actions, as described in Sect. 2.3. Annotations can belong to an AnnotationAggregate, to denote the fact that multiple annotations have been created by the same Action (e.g., a set of image tags created by an instance of a GamePlay with an image tagging GWAP), or that there exists a logical or functional dependency (expressed by the AggregateType attribute): for instance, one annotation can be created as a refinement of another one after being validated and corrected in a human computation task. An Annotation can be associated with AnnotationConfidence objects that state the level of uncertainty associated with the truth-value of an annotation.

Uncertain Information Representation. An important issue when dealing with human and automatic computation is the management of uncertain information, because both algorithms and user’s contributions are approximate and their trust level can be appraised only probabilistically. Uncertainty can be related to several concepts in the system, and, typically, is the result of an approximate approach to the determination of a given fact. For instance, textual annotations produced by automatic classification algorithms are commonly associated with a trust value, i.e., a number that estimates the correctness of the given classification. Figure 7 depicts how uncertainty is described in our model: under the generic term of Confidence, we define the uncertainty degree associated with a piece of information, and we allow such degree to be expressed as a Confidence Value (e.g., 0.8), as a Confidence Interval (e.g., [0.6, 0.8]), or as a Probability Distribution of a given type (e.g., normal, Poisson, etc.).
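The three ways of expressing Confidence can be sketched as interchangeable classes that share one method. This is an illustrative sketch only: reducing each form to a single `point_estimate` (the interval midpoint, the distribution mean) and the 0.5 review threshold are assumptions made for the example, not part of the model.

```python
from dataclasses import dataclass
from statistics import NormalDist
from typing import Union


@dataclass(frozen=True)
class ConfidenceValue:
    value: float  # e.g. 0.8

    def point_estimate(self) -> float:
        return self.value


@dataclass(frozen=True)
class ConfidenceInterval:
    low: float   # e.g. 0.6
    high: float  # e.g. 0.8

    def point_estimate(self) -> float:
        # Assumption: summarize the interval by its midpoint.
        return (self.low + self.high) / 2


@dataclass(frozen=True)
class NormalConfidence:
    """A Probability Distribution of type 'normal'."""
    mean: float
    stdev: float

    def point_estimate(self) -> float:
        # Assumption: summarize the distribution by its mean.
        return NormalDist(self.mean, self.stdev).mean


Confidence = Union[ConfidenceValue, ConfidenceInterval, NormalConfidence]


def needs_human_review(confidence: Confidence, threshold: float = 0.5) -> bool:
    """Flag annotations whose confidence falls below a (hypothetical) threshold."""
    return confidence.point_estimate() < threshold
```

With such a rule, a low-confidence automatic tag would be routed to a human computation task, while a tag with interval [0.6, 0.8] would be accepted as is.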

Figure 8 shows the initial status of the Content Objects managed by the running example scenario, that is, the image of the hat and the question submitted by Jane. The image must be described by the name of the garment and by its position; therefore, two different Content Descriptions, one associated with the retrieved tags and another associated with the segments generated to identify the objects, are needed. The two ContentObjects are related because they have been defined by the same Query. In the first step, the image is tagged by an automatic algorithm that produces the TextAnnotation “Poncho” with a low confidence value, indicating that the results of the processing were not reliable. The automatic garment detection algorithm returns a LowLevelFeature used to store the polyline that defines the contour of the object; this algorithm was also unable to identify the object within the image with good confidence. Both annotations will, therefore, be managed by human computation tasks.

2.3 The Action Model

The Action model describes two types of actions that can be performed on content objects: automatic actions done by software components like classification algorithms, and human actions, performed by users to detect and resolve conflicts, provide relevant feedback, etc. These human actions are called Tasks, which can be executed with a variety of approaches, from answering a query to performing work on demand.

Fig. 7 Content and description model

Fig. 8 Content Model extract for the running example

The Action Model depicted in Fig. 9 is strictly related to the User and Social model, as it represents the spectrum of actions that humans can perform on a ConflictResolutionPlatform.

We define an Action as an event (happening in a given time span delimited by a StartTime and an EndTime) that involves the interaction with, the processing of, or the creation of Content Objects, Annotations, and AnnotationAggregates. Actions can be associated with one or more ActionQuality values, i.e., values that denote the correctness or the completeness of an action. For human computation platforms, we distinguish HumanActions that fall under three main archetypes: Retrieval, Query, and TaskExecution actions.

The first two are examples of interactions performed in applications, and they involve the querying, consumption, or collection of content items. TaskExecutions, instead, relate specifically to human problem solving activities (e.g., rating, tagging, disambiguating, recognizing) and, therefore, are executed by Performers. A GamePlayAction is a specific type of TaskExecution that leverages the entertainment capabilities of online games to exploit Game Players to solve human computation tasks. More details about games are described in the Gaming model of Sect. 2.5.
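The action taxonomy described above can be sketched as a small class hierarchy. This is only an illustration under stated assumptions: the Python field names (`qualities`, `component`, `user`, `task_id`, `game`) and the timestamps are hypothetical, while the class names mirror the concepts of the model.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Action:
    start_time: datetime
    end_time: datetime
    # ActionQuality values; a plain list of floats is an assumption.
    qualities: List[float] = field(default_factory=list)

    def duration_seconds(self) -> float:
        return (self.end_time - self.start_time).total_seconds()


@dataclass
class AutomaticAction(Action):
    component: str = "unknown"  # e.g. a classification algorithm


@dataclass
class HumanAction(Action):
    user: str = "anonymous"


@dataclass
class Retrieval(HumanAction):
    pass


@dataclass
class Query(HumanAction):
    pass


@dataclass
class TaskExecution(HumanAction):
    task_id: str = ""


@dataclass
class GamePlayAction(TaskExecution):
    game: str = ""


# Clara's segmentation step from the running example (times are made up).
segment = GamePlayAction(
    start_time=datetime(2013, 1, 7, 10, 0, 0),
    end_time=datetime(2013, 1, 7, 10, 0, 45),
    user="Clara",
    task_id="SegmentExec1",
    game="Sketchness",
)
print(isinstance(segment, TaskExecution))  # True
print(segment.duration_seconds())          # 45.0
```

Modeling GamePlayAction as a subclass of TaskExecution means any code that aggregates task executions also handles game-based contributions without special cases.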

Figure 10 describes the actions performed by the different users of the running example. Michael, as a member of the Facebook social network, will perform the CommentExec1 task execution to create a new TextAnnotation containing his opinion on the posed question (e.g., “Fits well on your head!”). Clara and Pippa, as GamePlayers of the Sketchness game, perform three different GamePlayActions, and they will collaboratively create two LowLevelFeatures that can be used to describe the contours of the hat in the image provided by Jane.

2.4 Human Computation Model

Fig. 9 The action model

Fig. 10 Action Model extract for the running example

The Human Computation model, depicted in Fig. 11, expresses the uncertainty arising from automatically computed social data and content objects’ metadata. It also deals with conflicting opinions that may arise when humans are requested to perform a piece of work that may entail judgement or errors. The Human Computation model is related to the Action model, as conflicts are the source of tasks for human solvers, and it revolves around the concept of Conflict, i.e., a situation during the analysis of a given ContentObject where the absence of, or contradictions about, facts recorded in Annotations may arise.

Conflicts typically happen in the following scenarios [28]:

– Missing Annotation: during a given annotation Action, the performer (an annotation component or a human) is not able to find a suitable Annotation for the analyzed content.

– Uncertain annotation: during a given annotation Action, the performer creates an Annotation having an AnnotationConfidence within a given interval of confidence, thus leaving uncertainty about the truthfulness of such a value. For instance, referring to the running example, the annotation component for garment recognition reports the presence of a “poncho” in an image containing a “hat”. Uncertainty might also arise when the annotation Action has been performed within a given interval of ActionQuality, thus raising doubts about the actual quality of the annotation activity. For instance, this scenario may occur when the actions performed by a Performer have been marked (automatically by the system, or manually by another user) as poorly executed.

– Inconsistent annotations: a conflict may happen when merging the Annotations produced by two different annotators for the same ContentObject. For example, some Annotations could each be associated with a high AnnotationConfidence; however, they may lead to a wrong conclusion when they are put together, or they may contradict a fact that the system might not know yet.

According to the definition above, a Conflict is, therefore, characterized by an ID (identifier), by a (set of) ConflictualContentObjects, and by a (possibly empty) set of related ConflictualAnnotations, i.e., the set of Annotations that generated the conflict.
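The three conflict scenarios and the resulting Conflict record can be sketched as follows. This is a simplified illustration, not the detection logic of any of the platforms: the single-number confidence, the `uncertain_below` threshold (standing in for the "interval of confidence"), the `kind` field, and the identifier format are all assumptions made here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Annotation:
    content_object_id: str
    value: str
    confidence: float  # AnnotationConfidence, simplified to one number


@dataclass
class Conflict:
    conflict_id: str
    content_object_ids: List[str]   # ConflictualContentObjects
    annotations: List[Annotation]   # ConflictualAnnotations (may be empty)
    kind: str                       # "missing" | "uncertain" | "inconsistent"


def detect_conflict(content_object_id: str,
                    annotations: List[Annotation],
                    uncertain_below: float = 0.5) -> Optional[Conflict]:
    """Return a Conflict for the content object, or None if none is found."""
    own = [a for a in annotations if a.content_object_id == content_object_id]
    conflict_id = f"conflict-{content_object_id}"  # hypothetical ID scheme

    if not own:
        # Missing annotation: nothing suitable was produced.
        return Conflict(conflict_id, [content_object_id], [], "missing")

    uncertain = [a for a in own if a.confidence < uncertain_below]
    if uncertain:
        # Uncertain annotation: confidence falls in the "doubtful" range.
        return Conflict(conflict_id, [content_object_id], uncertain, "uncertain")

    if len({a.value.lower() for a in own}) > 1:
        # Inconsistent annotations: confident annotators disagree.
        return Conflict(conflict_id, [content_object_id], own, "inconsistent")

    return None


poncho = Annotation("img-hat", "Poncho", 0.3)
print(detect_conflict("img-hat", [poncho]).kind)  # uncertain
```

In the running example, the low-confidence “Poncho” tag would therefore produce an uncertain-annotation Conflict whose ConflictualAnnotations contain that single tag.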

When a Conflict occurs, it might need to be managed by human-enacted activities. Such activities are defined in a MacroTask that consists of a set of human-enacted atomic


Fig. 11 Human computation model

Tasks. MacroTasks and Tasks are instances of human computation activity archetypes defined in the Human Computation Task Metamodel of Fig. 11.

A HumanTaskType is the abstraction of a piece of work that needs to be completed by a specific number of performers in a given period of time. A HumanTaskType is described by an ID (identifier), by a Name, and by an Example (a textual description of the activities associated with the type of task). A HumanTaskType typically details some constraints about the execution of the type of task; for instance, the MinDuration and MaxDuration allowed for the overall task execution, the MaxCost allocated for the task, etc.

HumanTaskTypes specialize into MicroTaskType and MacroTaskType. A MicroTaskType represents a unit of human computation activity performed by one or more Performers; in the human computation literature, micro tasks are defined as highly fragmented tasks that do not require specialized skills and can be completed in a small amount of time. A MicroTaskType is characterized by an OperationType, i.e., a specific human computation activity type. Examples of OperationType are preference tasks and data manipulation tasks [7]. The former correspond to typical social interactions (like, dislike, comment, tag), while the latter (create, order, complete, find, cluster) abstract simple and classical primitives of relational query languages that are common in human computation and social computation activities.

A MacroTaskType represents an aggregation of one or more MicroTasks, organized in a workflow to achieve a high-level goal. An example of a MacroTaskType is the human computation “Tag/Segment/Verify” pattern that is used in the game described in the running example. Within a MacroTaskType aggregation, MicroTaskTypes present precedence relationships that define their order in the workflow. The “Tag/Segment/Verify” pattern, for instance, can be instantiated by pipelining three micro tasks.
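The precedence relationships inside a MacroTaskType can be read as a dependency graph whose topological order gives the workflow. The sketch below assumes that "Tag/Segment/Verify" is a strict three-step pipeline and encodes it as a hypothetical dictionary from each micro task type to its predecessors; it uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set

# Each MicroTaskType maps to the set of types that must complete before it.
TAG_SEGMENT_VERIFY: Dict[str, Set[str]] = {
    "Tag": set(),
    "Segment": {"Tag"},
    "Verify": {"Segment"},
}


def execution_order(precedence: Dict[str, Set[str]]) -> List[str]:
    """Return one valid execution order; raises CycleError on circular
    precedence relationships, which would make the workflow unschedulable."""
    return list(TopologicalSorter(precedence).static_order())


print(execution_order(TAG_SEGMENT_VERIFY))  # ['Tag', 'Segment', 'Verify']
```

The same representation also covers workflows that are not linear, for example two micro tasks that can run in parallel before a shared verification step.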

A MacroTask has a given MacroTaskType, and it is typically executed on one or more ConflictResolutionPlatforms


(e.g., social networks or human computation frameworks). The selection of the involved ConflictResolutionPlatforms is done through the application of a given PlatformSelectionStrategy, i.e., a numerical, logical, or heuristic method that decides which are the best platforms to adopt for the solution of a Conflict; for instance, if the conflict resolution task involves the evaluation of fashion pictures, then the system may decide that the best platform to tap for human computation is Facebook rather than Amazon Mechanical Turk. Likewise, a UserSelectionStrategy is a method that decides, for the selected ConflictResolutionPlatforms, which are the best performers to involve to satisfy the constraints defined in the MacroTaskType definition; for instance, in the running example Pippa is chosen over Michael to play a game since gaming is one of her passions. Finally, the ConflictResolutionStrategy decides how to split the execution of MicroTasks among the selected Performers, deciding, for instance, which conflictual Annotations or ContentObjects will be assigned to each Performer; moreover, the ConflictResolutionStrategy dictates the result aggregation method (e.g., Majority Vote) to use for creating the final output of a MacroTask, which, typically, consists of one or more new Annotation objects.
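As an illustration of the aggregation step, Majority Vote over the labels returned by several Performers can be sketched as below. Using the winning share of votes as the confidence of the aggregated Annotation, and normalizing labels by case and whitespace, are assumptions made for this example rather than choices stated by the model.

```python
from collections import Counter
from typing import List, Tuple


def majority_vote(labels: List[str]) -> Tuple[str, float]:
    """Aggregate performer labels into (winning label, share of votes).

    Ties are resolved in favour of the label that was seen first.
    """
    if not labels:
        raise ValueError("at least one label is required")
    counts = Counter(label.strip().lower() for label in labels)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(labels)


label, share = majority_vote(["Hat", "hat", "Poncho"])
print(label)  # hat
```

Here two of the three performers agree on “hat”, so the aggregated tag is “hat” with a vote share of two thirds.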

The decisions undertaken by the PlatformSelectionStrategy, UserSelectionStrategy, and ConflictResolutionStrategy are directly mapped into the MicroTasks that compose the given MacroTask, as each MicroTask is executed on a ConflictResolutionPlatform, by a (possibly singleton) set of Performers, operating on a (possibly overlapping) subset of the ContentObjects and Annotations.

As a MicroTask can be assigned to several Performers, each running instance of a MicroTask to be executed is associated with a TaskExecution, a type of HumanAction that contains information about the StartTime, EndTime, and QualityMetrics of the work performed by the single Performer, plus references to the Annotations created during the specific execution and related to other Annotations or ContentObjects.

An Annotation (or a set thereof) created during a MacroTask can be, by definition, conflictual and thus be the source of a new Conflict within the platform. The decision about the right course of action to undertake is typically related to the selected ConflictResolutionStrategy.

Figure 12 shows how the conflicts generated by the automatic algorithms of our running example are managed. To solve the conflict, the human computation platform has to recognize the object within the image, identify the contours of the garment, and validate the results. A MacroTask able to deal with the problem, following the pattern “Tag/Segment/Verify”, is thus planned. The MacroTask is composed of three different MicroTasks, each one dealing with a particular job to be solved to complete the whole task. Sketchness is used to complete all the MicroTasks; thus the ConflictResolutionPlatform is a GamingPlatform. A TextAnnotation describes the result of the performers’ work (a “Hat” with a relatively high ConfidenceValue). The aggregated LowLevelFeature used to describe the segments of the image, on the other hand, is less precise and has a lower ConfidenceValue. For the sake of simplicity, we omit the description of Michael’s contribution in giving an opinion on the submitted image, since it would be a simpler case of the example that has just been described. Clara and Pippa are playing Sketchness together. Clara is playing as a Sketcher and she is associated with two TaskExecutions (ChoiceExec1 and SegmentExec1) that represent the instantiations of the MicroTasks used to map the tasks to be executed (tagging the image and segmenting it) to specific gameplay actions (choosing the hidden garment and segmenting it). As a byproduct of play, Clara generates a TextAnnotation stating that in the image there is a “Hat” and a LowLevelFeature containing the polyline related to the traced contour. Pippa is a Guesser able to see just the contours traced by Clara; during the game she recognizes the object that has been segmented, and so she generates a new TextAnnotation to validate the results. Several other TaskExecutions are performed on the same ContentObject but are not represented in the figure for the sake of simplicity. All the annotations generated by the different TaskExecutions are then aggregated to produce two annotations that are the result of the MacroTask.

2.5 Gaming Model

The Gaming model focuses on a specific class of tasks deployed in the form of a game with a purpose (GWAP) and expresses the engagement and rewarding mechanisms typical of gaming (including gaming scores, leaderboards, and achievements). The Gaming model is related to the Action, User & Social, and Human Computation models to denote the assignment of a gaming session to a player for solving a Task.

The Gaming Model is depicted in Fig. 13. A Game is an entertainment application described by a Title and characterized by a Genre (e.g., Puzzle, Educational), a Mode (Single Player, Multi Player), and a Theme (e.g., Abstract, Comic, Crime, Science Fiction). A GamePlayAction is a human computation action that the user has performed while playing a Game during a specific session of that game, the Gameplay. Since the Gameplay tracks all the actions performed by different players during a specific running game, it is possible to retrieve social information regarding the relationship among the gamers. A Game may also have a list of available Roles that a GamePlayer may assume during a specific GamePlay; in the running example, the roles for the player can be the Sketcher and the Guesser. A Role can be described with a Name and a list of Abilities that define which are the allowed actions in the game for a particular role. A GamePlayer is a type of user described by customization attributes (e.g., Nickname, Motto) and accomplishments: for instance,


Fig. 12 Human Computation Model extract for the running example

Fig. 13 Gaming model

the PlayerLevel attribute represents the proficiency and the experience of a player; the PlayerTitle is a special recognition given to the player for his actions (e.g., a chivalry role); the PlayerType (e.g., Achiever, Explorer, etc.) is used to associate the player with a particular cluster of gamer type. GameStats are stored to keep track of the HoursPlayed by a player on a specific Game, the Score he has obtained, or other meaningful variables. Games have Achievements, i.e., means to foster an entertaining experience for users and a way to profile them.

An Achievement is a specific challenge or task that the player can perform to get a reward in terms of points or other special features (in-game items, artworks, behind-the-scenes videos); it is defined by a Category, which specifies the kind of task the achievement is associated with (such as Quests, Socializers, Grinders, and the like, as defined in [30]), and by a PointsGiven attribute, which contains the amount of points to be given if the requirements for the achievement have been met. Once a player reaches the goals of a listed


Fig. 14 Game Model extract for the running example

achievement, she will gain a Badge related to that specific achievement. A GameBadge is used to relate a player with the achievement she has obtained, and it is described by a CompletionPercentage attribute that shows how much the player has already achieved in order to complete a specific task.
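The relationship between Achievement, GameBadge, and CompletionPercentage can be sketched with two small data classes. The `target` and `progress` counters, the achievement name, its category, and its point value are hypothetical; the model only states that a badge carries a completion percentage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Achievement:
    name: str
    category: str      # e.g. one of the categories defined in [30]
    points_given: int
    target: int        # hypothetical: units of progress needed to complete


@dataclass
class GameBadge:
    player: str
    achievement: Achievement
    progress: int = 0  # hypothetical progress counter

    @property
    def completion_percentage(self) -> float:
        done = min(self.progress, self.achievement.target)
        return 100.0 * done / self.achievement.target

    @property
    def earned(self) -> bool:
        return self.progress >= self.achievement.target


three_wins = Achievement("Three wins in a row", "Grinders", points_given=50, target=3)
badge = GameBadge("Clara", three_wins, progress=3)
print(badge.completion_percentage)  # 100.0
print(badge.earned)                 # True
```

Deriving the percentage from a counter keeps the badge consistent with the underlying statistics instead of storing a number that could drift from them.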

Figure 14 represents the gaming scenario for the running example. Clara is a GamePlayer that plays a fashion Game called Sketchness. During the Gameplay session in which she participates, she performs different GamePlayActions that are allowed by her Role. As a Sketcher, she can choose one of the objects portrayed in the image (e.g., a “hat”) and she can draw the contour on the picture. Her GamePlayActions can be considered as the TaskExecutions associated with the MacroTask described in Fig. 11, used to generate a tag and segments to identify the object. Sketchness has a particular Achievement that requires a player to win three games in a row to be fulfilled. After Clara wins her last game, the GameStats assigned to her show that she was able to satisfy the requirements of the achievement; thus the related Badge has been issued, detailing the acquisition statistics with the use of a GameBadge.

3 Implementation and Experience

The illustrated models have been extensively adopted as core data and object models in the implementation of three different socially enabled Human Computation Platforms: the CUbRIK platform [19], Crowdsearcher [7], and the FWAP gaming framework.

3.1 CUbRIK

The CUbRIK project aims at developing a human computation platform for multimedia content processing and querying where humans and social networks are exploited to execute tasks that require human or social intelligence in the solution process. CUbRIK adopted a model-centric process for the design of the platform components and processes, with the ultimate goal of facilitating the interoperability of different components and pipelines used to extract knowledge from multimedia assets, based upon a coherent view of data and metadata.

CUbRIK is a distributed system layered in four main tiers, as shown in Fig. 15. The Content and user acquisition tier is responsible for registering content and users into the system, and it is described by the models defined in Sects. 2.2 and 2.1. A novelty in the approach proposed in CUbRIK is the possibility to exploit the PlatformMetrics and TopicalAffinities that have been described in Sect. 2.1.2 to assign the proper tasks to the performers based on allocation policies related to geographical, cultural, or topical affinities; this feature is handled by the Performer Manager, which is responsible for keeping data about performers used to optimize task allocation. Unlike traditional Human Computation platforms, CUbRIK handles two classes of users: performers and searchers. Performers, as they have been described in Sect. 2.1.1, form the traditional user base of any Human Computation platform; they execute tasks to provide contributions, semantic annotations, and conflict resolution. Searchers can be described as the AppUsers defined in Sect. 2.1.1; they use CUbRIK applications for finding and interacting with information and may also submit new multimedia content (images, video, audio, text) during their operations.
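An allocation policy based on topical affinity can be sketched as a simple ranking. This is not the Performer Manager's actual logic, which is not detailed here; the function name, the dictionary layout, and the affinity scores are hypothetical and only illustrate the idea of preferring performers with the highest TopicalAffinity for the topic of a task.

```python
from typing import Dict, List


def select_performers(affinities: Dict[str, Dict[str, float]],
                      topic: str,
                      how_many: int) -> List[str]:
    """Rank performers by their affinity for `topic` (missing topics count
    as 0.0) and keep the best `how_many`."""
    ranked = sorted(
        affinities,
        key=lambda name: affinities[name].get(topic, 0.0),
        reverse=True,
    )
    return ranked[:how_many]


# Made-up affinity scores for the users of the running example.
affinities = {
    "Michael": {"games": 0.2, "fashion": 0.7},
    "Pippa": {"games": 0.9, "fashion": 0.4},
    "Clara": {"games": 0.8},
}
print(select_performers(affinities, "games", 2))  # ['Pippa', 'Clara']
```

With these scores, a game-based task goes to Pippa and Clara, which matches the running example in which Pippa is preferred over Michael for playing a game.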

Content is added to a CUbRIK platform by uploading or crawling external elements with associated metadata and can be described with the elements defined in Sect. 2.2.

All the actions performed within the CUbRIK platform are stored to keep track of the users that have performed them and of their associated quality measures; this data is used to estimate the skills of the users registered in the system. Unlike traditional Crowdsourcing markets, CUbRIK is able to orchestrate tasks performed not only with the use of traditional applications but also with the use of specifically tailored games. The models defined in Sect. 2.3 are well suited to describe not only all the possible actions that can be performed in a multimedia search engine, such as querying or retrieving multimedia content, but also the actions typical of a human computation platform.

Fig. 15 The architecture of the CUbRIK system

The Conflict Manager deals with untrustworthy annotations and human computation tasks, according to the Human Computation model of Sect. 2.4.

3.2 Crowdsearcher

The Crowdsearcher system² is an interoperable mediator for the interconnection of search systems with human and social platforms.

The aim of Crowdsearcher [7] is to make the crowd consultation simpler and more efficient, possibly by including

2 The code and documentation of Crowdsearcher are available for download at http://crowdsearcher.search-computing.com

users belonging to social networks. Crowdsearcher relies on a query and execution model directly derived from the Human Computation model described in Sect. 2.4, which is used to guarantee human computation platform interoperability and the independence of human computation tasks from the human computation platform (Fig. 16).

Crowdsearcher instantiates micro task templates (i.e., MicroTaskTypes) by importing information from search systems, sends tasks to the social/crowd platform, gets a collection of performers involved, and gathers the results obtained by the micro task executions. The system acts in the context provided by a given social/crowd platform user, denoted as task master, who is instrumental to the crowdsourcing process, by being responsible for (and possibly covering the costs of) tasks which are spawned to the crowd and by offering friends and colleagues as performers. TaskExecutions might be performed


Fig. 16 Crowdsearcher system architecture

through a dedicated interface (hosted by Crowdsearcher or by using the native APIs offered by each platform, e.g., the Amazon APIs) or by embedding the query interface as a new application of the platform (e.g., a Facebook application [8]).

3.3 Framework with a Purpose

Games with a purpose (GWAPs) [41] are digital games where players generate useful data as a by-product of play. One major issue of this kind of application derives from the fact that the games are designed and tailored, on an ad hoc basis, around the specific task that has to be solved instead of focusing on the entertainment of the players. A Gaming Framework able to provide a set of tools and guidelines that can ease the development of novel engaging applications is being developed to address this issue. To provide incentives for the players and profile them based on the actions that they have performed during the game, the Gaming Framework makes use of an Achievement System [30].

An Achievement System can be structured into the components and data flows illustrated in Fig. 17 and has been described in [30].

Its goal is to receive Gameplay Events from a running game played by a specific user, process them, and return as output the updated gaming history data for that player, including the badges he may have acquired and an updated gameplayer profile. Gameplay Events represent the occurrence of meaningful game states, for instance, the achievement of a specified number of collected objects during a gameplay session; since the events that are meaningful to be tracked are related to the actions performed by the user, they can be modeled with the Gameplay Actions defined in Sect. 2.5. The Action Detection module is in charge of filtering raw gameplay events and of returning only the meaningful ones, which constitute the relevant achievement actions. All the achievement actions are collected and processed by a GameStat Updater Module, which is in charge of tracking and making persistent the monitored actions for the player that has performed them under the form of Gameplay Statistics. Gameplay statistics can be modeled as the Game Stats defined in Sect. 2.5 to keep track, for example, of the total score obtained by a player or the time he has spent playing a game, while the achievement actions are just a subset of all the possible registered Gameplay Actions, representing just the actions that are meaningful to be tracked.

The achievement actions, along with the updated player statistics, are the input to the Achievement Detector component, which checks if the required conditions for a particular achievement, defined through Achievement Descriptors, are met, assigns the associated badge to the player, and outputs the updated profile data of the player. The definition of Achievement and its associated Badge can also be mapped to the model that has been described in Sect. 2.5, while the Achievement Descriptors represent the Event-Condition-Action rules of an active database under the form of triggers used to define the requirements of an achievement.
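The data flow through Action Detection, the GameStat Updater, and the Achievement Detector can be sketched as a single event loop with Event-Condition-Action descriptors. This is a minimal illustration under stated assumptions, not the architecture of [30]: the event names, the dictionary-based events and statistics, and the descriptor format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

# Action Detection: only these (hypothetical) event types are meaningful.
RELEVANT_EVENTS = {"game_won", "game_lost"}

# Achievement Descriptors as Event-Condition-Action rules:
# on `event`, if `condition(stats)` holds, award `badge`.
DESCRIPTORS = [
    {
        "event": "game_won",
        "condition": lambda stats: stats["win_streak"] >= 3,
        "badge": "three_wins_in_a_row",
    },
]


def update_stats(stats: Dict[str, int], event: dict) -> None:
    """GameStat Updater: persist the effect of one achievement action."""
    if event["type"] == "game_won":
        stats["win_streak"] += 1
        stats["games_won"] += 1
    elif event["type"] == "game_lost":
        stats["win_streak"] = 0


def process(raw_events: Iterable[dict],
            stats: Dict[str, Dict[str, int]],
            badges: Dict[str, Set[str]]) -> None:
    """Run each raw Gameplay Event through the three stages in order."""
    for event in raw_events:
        if event["type"] not in RELEVANT_EVENTS:      # Action Detection
            continue
        player_stats = stats[event["player"]]
        update_stats(player_stats, event)             # GameStat Updater
        for rule in DESCRIPTORS:                      # Achievement Detector
            if rule["event"] == event["type"] and rule["condition"](player_stats):
                badges[event["player"]].add(rule["badge"])


stats = defaultdict(lambda: {"win_streak": 0, "games_won": 0})
badges = defaultdict(set)
process(
    [
        {"player": "Clara", "type": "game_won"},
        {"player": "Clara", "type": "cursor_moved"},  # filtered out
        {"player": "Clara", "type": "game_won"},
        {"player": "Clara", "type": "game_won"},
    ],
    stats,
    badges,
)
print(badges["Clara"])  # {'three_wins_in_a_row'}
```

Events are processed one at a time so that a condition is evaluated against the statistics as they stood right after the triggering event, which matters for streak-based achievements that a later loss would reset.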

There is no known representation in the literature of an architecture or a model used to define a general purpose Achievement System, since all the commercial systems, such as Xbox Live!, Playstation Network, and Steam, are proprietary. The model described in Sect. 2.5, whose use in the considered Achievement Framework we have detailed, is thus meaningful because it is able to abstract the features common to different systems into a homogeneous conceptual representation. This could serve as a reference for the purpose of player data portability across different games and as a blueprint for future interoperable achievement system designs.


Fig. 17 Achievement system architecture

3.4 Lessons Learnt

The experiences reported in the previous sections are just the first three instances of adoption for our model. While other usages are ongoing or planned, experiences acquired so far allow us to outline some remarks about the adoption of a unified, comprehensive abstraction in several steps of the development process, from the requirement collection to the API testing.

The set of designed features suffices to cover most of the requirements that emerged from the analysis of the projects. However, adaptations to specific, vertical domains (e.g., multimedia search in CUbRIK, or social search in CrowdSearcher) were often required; although expected, the additional design efforts had limited cost, mainly thanks to the modularity of our model, which allowed targeted modifications/extensions, e.g., to introduce additional types of low-level feature annotations. Moreover, the models allowed us to adopt a model-driven development approach, enacted by means of industrial-strength code-automation suites (e.g., WebRatio [2]), to quickly provide high-quality prototypes for requirement elicitation and design refinements. Model-driven development best practices also sped up the process of API creation and testing, where the various sub-models have been used as common references for the specification of data exchange formats in REST APIs. To this end, the availability of focused sub-models simplified the creation of API-specific adapters from the system data model(s) to the API formats, and the compatibility with existing exchange and storage formats greatly simplified the implementation of such adapters on top of existing open source code.

4 Related Work

This section gives an overview of the state of the art in the fields of models for (i) users, (ii) contents and content description, and (iii) human computation. In the following section, ten different User and Social models (listed in Table 1) will be compared over a set of dimensions that we consider sufficient and meaningful to characterize them. In particular, three macro-dimensions have been defined:

Basic User Modeling refers to the minimal set of dimensions that every User Model should have, such as the personal details of the user, the environment in which she can find herself, her interests and preferences, and her competences.

Social Features refers to the set of dimensions that characterize the social aspects of a user, such as her friendships and community memberships, her social activities and interactions, her affinity to specific topics, her role in a social network, and her trustworthiness.

Human Computation refers to the set of dimensions that are not present in traditional User and Social models but are required to handle Human Computation (e.g., the definition of Content and Tasks, the capability of handling Conflicts, the ability to track the role of the user in the system along with the history of her activities in special applications, etc.).

4.1 (Social) User Modeling

User models have been defined since the 1980s, mainly for the purpose of building context-based, user-adaptive applications. The need for more interoperable user models has been already identified in recent works (e.g., [15,43]), which


Table 1 Comparison of related works in (social) user and human computation modeling

                            CUMO  URM   GUMO  UUCM  GUMF  OpenS.  SIOC  FOAF  HiddenU  SWUM
Basic user modeling
  Personal information      Yes   Yes   Yes   Yes   Yes   Yes     Yes   Yes   Yes      Yes
  Context                   Yes   No    No    Yes   No    No      No    No    No       Yes
  Competence                Yes   No    Yes   Yes   Yes   Yes     No    Yes   Yes      Yes
  Preferences               Yes   Yes   Yes   Yes   Yes   No      Yes   Yes   Yes      Yes
Social features
  Community, group          No    No    No    No    No    Yes     Yes   Yes   Yes      Yes
  Social interaction        No    No    No    No    No    Yes     Yes   Yes   Yes      Yes
  Activity, social actions  No    No    No    No    No    Yes     Yes   Part  Yes      No
  User network role         No    No    No    No    No    Yes     Yes   Part  Yes      Yes
  Affinity                  No    No    No    No    No    No      No    No    No       No
  User reliability, trust   No    No    No    No    No    No      No    No    No       No
Human computation
  Content description       No    No    Yes   No    Yes   Yes     Yes   No    Yes      Yes
  Task                      No    Yes   No    Yes   No    Yes     Yes   No    Yes      No
  Role                      No    Yes   No    No    No    No      Yes   No    No       No
  Game history              No    No    No    No    No    No      No    Part  No       No
  Conflict                  No    No    No    No    No    No      No    No    No       No

focus on context adaptation of applications. This broad survey identifies characteristic dimensions, such as the user’s profile, task features, social relations, and the context, and it promotes interoperability with many generic user models at different levels [26,31], such as cognitive characteristics of the user (area of interest, competence, preference), social data (e.g., social relationships), environmental data (e.g., device, current time, language, location), and interaction data (e.g., current task, task role, task history).

Another survey [39] overviews the ontological technologies applied to user modeling for aggregating and aligning many RDF/OWL models in a decentralized approach. Coherent with this decentralized view is the General User Model Ontology (GUMO) [23], one of the most comprehensive user modeling approaches, which aims at providing a uniform interpretation of distributed user models that covers all aspects of a user’s life, from contact information and demographics to special information like mood, nutrition, or facial expressions [33]. The Cultural User Modeling Ontology (CUMO) [37] is used to represent the user’s cultural background for many applications such as e-governance, while including demographic information such as the user’s birthplace, the current and past residences, etc.

Unlike GUMO, CUMO is able to take into consideration the context of the user, even though it cannot consider the influence provided by different sources able to modify the user’s culture, such as her travels abroad.

Building user models from cross-system platforms [6] is another attempt to aggregate the distributed profiles for the same user in case of obsolete, missing, or scattered data. The Unified User Context Model (UUCM) [31] uses such an approach. As a common ontology-based user context model for the exchange of user profiles between multiple systems, it advocates the reuse of user profiles in different systems as a context passport and minimizes sparse or missing information. Its description of the users lacks expressivity, since it is limited to four disjoint dimensions and any information not fitting those dimensions cannot be modeled.

The User Role Model (URM) [45] is another ontology-based example for modeling users and their related roles according to the service they interact with. Every role is described by five dimensions: Generic information, Preference, Relationship, Task, and Taskrole, but it can be extended according to the features of the service provider and based on the required information.

User modeling on the Social Web is also widely used as a tool for recommendation strategies based on interests, culture, and topical affinity. The ImREAL³ project and Grapple [3]⁴ are representative examples. The latter is built upon GUMO [23]; it is well suited to the representation of user profiles and can be freely extended (RDF-based), but does not cover social relationships and social actions. The U2M⁵ project is an example of a user model and context ontology

3 http://www.imreal-project.eu
4 http://www.grapple-project.org
5 http://www.u2m.org


integrating GUMO with UserML [22] for future Web 2.0 extensions [23].

An in-depth comparison of models based on the basic features listed above is provided in [15]. However, all the models presented so far lack support for social data representation, such as the ability to model social relationships between users or their preferences; thus, more expressive models have to be introduced.

The Friend of a Friend (FOAF) [14]⁶ vocabulary is an ontology for describing users, their links to online communities, and their activities within Web pages using RDF/XML. It is simple and covers only basic user information (contact and demographics), their professional and personal lives, their friends, interests, and other social dispositions like group membership or knows relations to other FOAF profiles. Its support for Social Actions and the role of the user within a Network is scarce, since it lacks the expressive power to represent social interactions, user needs, and goals.
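As a rough illustration of the kind of information FOAF covers, the following sketch builds a small profile as subject-predicate-object triples in plain Python and prints it in an N-Triples-like form. The FOAF namespace and the terms `Person`, `name`, `interest`, and `knows` belong to the FOAF vocabulary; the example.org identifiers, the sample values, and both helper functions are hypothetical and serve only this example.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def foaf_profile(uri, name, interests=(), knows=()):
    """Return (subject, predicate, object) triples describing one person."""
    triples = [
        (uri, RDF_TYPE, FOAF + "Person"),
        (uri, FOAF + "name", name),
    ]
    # One triple per interest and per acquaintance.
    triples += [(uri, FOAF + "interest", topic) for topic in interests]
    triples += [(uri, FOAF + "knows", friend) for friend in knows]
    return triples


def to_ntriples(triples):
    """Serialize triples; objects starting with 'http' are treated as URIs."""
    lines = []
    for s, p, o in triples:
        obj = f"<{o}>" if o.startswith("http") else f'"{o}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# Hypothetical profile: all example.org identifiers are placeholders.
alice = foaf_profile(
    "http://example.org/people/alice",
    "Alice",
    interests=["http://example.org/topics/photography"],
    knows=["http://example.org/people/bob"],
)
print(to_ntriples(alice))
```

The triples state who Alice is, what she is interested in, and whom she knows, but nothing about what she did or what she aims to achieve, which is the limitation discussed above.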

The OpenSocial API [21]⁷ is another recent effort, with the goal of providing common data and operation model APIs to access major social networking applications such as Myspace, Orkut, Yahoo!, etc. OpenSocial provides a very abstract and generic data model, so as to be able to fit information coming from several social networks concerning an activity stream, groups of people, albums of media files, resource capabilities, etc.

Semantically Interlinked Online Communities (SIOC) [13]⁸ is an RDF/OWL representation to link and describe online community sites (e.g., message boards, wikis, weblogs, etc.). SIOC documents may use other existing ontologies, such as Dublin Core [34] metadata, FOAF, RSS 1.0, etc., to enrich the information described.

The Social Web User Model (SWUM) [16, 33] is a generic model that also focuses on the social Web domain; its goal is to support data sharing and aggregation of social network user models by aligning concepts, attributes (e.g., personal characteristics, interests and preferences, needs and goals, knowledge and background), user behaviors (e.g., previous search behavior using Google dashboard), location, and time. The user's category or community can be inferred, but there is no tracking of the user's actions.

TheHiddenU [25]⁹ project is an attempt in the direction of social user model unification [43], as it compares several models in terms of their coverage of different user and social concepts. The project Web site offers a comparison of many social user models that includes, as analysis dimensions, features such as demographics, competences, social relation, social interaction, activity/social activity, ownership, desires, preferences, beliefs, studies, works, tastes, feelings, production, privacy, provenance, resources, and participation.

⁶ http://www.foaf-project.org
⁷ http://www.opensocial.org
⁸ http://rdfs.org/sioc/spec
⁹ http://social-nexus.net

For semantic enrichment and mining of user profiles from the social Web, some frameworks are relevant, such as U-Sem [1]. TweetUM¹⁰ is a Twitter-based user modeling framework which allows developers to create semantically enriched user profiles inferred from users' tweets, retrieving all topics that the user is interested in, the entity information, the top hashtags, or the top entity cloud. GeniUS¹¹ is used to enrich the semantics of social data: given a stream of messages from the users' status, it can generate the topic and some user information.

As highlighted in Table 1, some of the dimensions that are essential in describing the user of a Human Computation platform from a social point of view are missing in all the models that are recognized as “Social”; this is natural and unsurprising, given the different scopes of the models. For example, no existing model appears to support the representation of affinity relationships between users and the content they consume and produce; the concept of trustworthiness and reliability of a user on a particular social platform is also completely overlooked.

The User and Social Model proposed in this paper, on the other hand, could easily be extended to embed the features defined in the above-mentioned models, as well as the missing ones; moreover, unlike existing works in the field of (social) user modeling, it includes the representation of human computation activities, performers, and applications.

4.2 Human Computation Activities and Performers Modeling

Despite the recent spread of human computation and crowdsourcing as research topics, to the best of our knowledge no existing work fully addresses the conceptual modeling or representation of human computation tasks and performers. Researchers have tried to break down and categorize human computation: Quinn and Bederson [35] categorize human computation applications according to six dimensions, namely performer motivation, quality, aggregation, human skill, participation time, and cognitive load. Little et al. [29] represent human computation as a set of operators organized in an iterative or parallel fashion. Bozzon et al. [7] enumerate a list of data-driven human computation activities to be associated with crowdsearching tasks; [9] proposes a model-driven approach for the specification of crowd-search tasks for automatic application generation; the approach includes two models: the “Query Task Model”, representing the meta-model of the query that is submitted to the crowd and the associated answers; and the “User Interaction Model”, which shows how a performer can interact with the query task model to perform a task.

¹⁰ http://www.wis.ewi.tudelft.nl/tweetum/
¹¹ http://www.wis.ewi.tudelft.nl/genius/
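The difference between an iterative and a parallel organization of human contributions can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the operator set of [29]: workers are simulated as plain functions, the iterative process hands each worker the result of the previous one, and the parallel process collects independent answers and aggregates them by majority vote. All function names and sample workers are hypothetical.

```python
from collections import Counter


def iterative(workers, seed):
    """Each worker refines the result produced by the previous worker."""
    result = seed
    for work in workers:
        result = work(result)
    return result


def parallel(workers, seed):
    """Workers answer independently; the most frequent answer wins."""
    answers = [work(seed) for work in workers]
    return Counter(answers).most_common(1)[0][0]


# Simulated workers (assumptions): two describers who extend a caption,
# and three labelers who tag the same image independently.
describers = [lambda text: text + " cat", lambda text: text + " on a sofa"]
labelers = [lambda img: "cat", lambda img: "cat", lambda img: "dog"]

print(iterative(describers, "A"))     # A cat on a sofa
print(parallel(labelers, "img.jpg"))  # cat
```

In the iterative case the quality of the final result depends on the order of the contributions, whereas in the parallel case it depends on the aggregation rule, here a simple majority vote.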

Content and content description modeling. To define the kinds of actions that performers can apply to content objects, such contents and their related attributes need to be defined. The current state of the practice in content management presents a number of metadata vocabularies dealing with the description of textual and multimedia content [20], typically allowing the description of high-level (e.g., title, description) or low-level features (e.g., color histogram, file format). Dublin Core [34] is a 15-element metadata vocabulary (created by domain experts in the field of digital libraries) intended to facilitate discovery of electronic resources, with no fundamental restriction on the resource type. MPEG-7 [38] offers a comprehensive set of multimedia description tools, which can be used by applications that enable quality access to content. MPEG-7 gives a generic framework that can support various applications, facilitating exchange and reuse of multimedia content across different application domains. RuCOD [17] is a framework for the description of rich media content, based on the concept of “content objects”, i.e., rich media presentations that enclose different types of media, along with real-world information and user-related information. The Content Description Model proposed in this paper is inspired by these standards, and it could easily be extended to embed the features defined in the above-mentioned models; also, the Content Description Model can be mapped to such representations with minimal effort, possibly including information related to the lineage of content objects and annotations.
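To make the distinction between high-level and low-level descriptions concrete, the sketch below represents the metadata of a content object as a Python dictionary and separates the fields that belong to the 15 elements of the Dublin Core Metadata Element Set from everything else. The element names are those of the Dublin Core standard; the sample record, its values, and the helper function are hypothetical.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def split_record(record):
    """Split a metadata record into Dublin Core fields and all other fields."""
    core = {k: v for k, v in record.items() if k in DC_ELEMENTS}
    extra = {k: v for k, v in record.items() if k not in DC_ELEMENTS}
    return core, extra


# Hypothetical content object: values are placeholders for illustration.
photo = {
    "title": "Duomo di Milano at dusk",
    "creator": "A. Performer",
    "format": "image/jpeg",
    "type": "StillImage",
    "color_histogram": [0.2, 0.5, 0.3],  # low-level feature, not in Dublin Core
}

core, extra = split_record(photo)
print(sorted(core))   # ['creator', 'format', 'title', 'type']
print(sorted(extra))  # ['color_histogram']
```

Fields that fall outside the 15 elements, such as the color histogram here, are the kind of low-level features that vocabularies like MPEG-7 are designed to describe.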

From Table 1 it can be seen that some of the models described above already possess the capability of representing some Human Computation facets. GUMO, and thus GUMF, are able to deal with Images, with the possibility to adapt and extend the description to any other kind of multimedia content. OpenSocial deals with content in the form of “MediaItems” to represent images, movies, and audio; SIOC treats them as “Items”, while CUMO, URM, UUCM, and FOAF in their current versions are not able to deal with content objects, since they have no representation for them. SWUM, being based on RDF, and HiddenU, being based on OWL, are able to describe resources and multimedia content.

As far as the definition of a Task is concerned, several models support its representation, even though not at the same level of detail as defined in this work. URM and UUCM have a “Task” dimension used to describe task-related information about the user. SIOC, thanks to its support for Enterprise data integration, is able to keep track of the tasks performed by users and their role within the system. OpenSocial can describe tasks performed by the users with the notion of “ActivityObject”, and HiddenU describes the relationship between the user, the content, and the process used to generate such content.

To conclude, even though Online Gaming Communities such as Xbox Live! and Steam are exerting more and more influence on the everyday life of players, only FOAF takes into consideration the concept of gaming community, by defining the possible platforms and associated accounts.

5 Conclusions

In this paper, we have presented a model for the conceptualization and representation of socially enabled human computation applications, defined as applications that make intensive use of human contribution and social networks to solve complex tasks. The model capitalizes on existing user models by integrating the representation of traditional aspects, like user profiles and roles, with novel features such as membership in multiple communities, centrality metrics, content affinity information, and capabilities. We also propose novel models for the representation of human computation and gaming activities, specifically tailored to the actions that users can perform to help problem solving, and to the link between actions and problems, expressed by the generic notion of conflict.

The illustrated model is at the base of the implementation of several socially enabled human computation platforms that exploit multiple social networks, crowdsourcing platforms, and interaction styles to engage users in difficult multimedia processing tasks, and we reported on our experience with its adoption in the platforms' design and development phases.

Acknowledgments This work has been partially supported by the BPM4People project (http://www.bpm4people.org), funded by the Capacities - Research for SMEs Program of the Research Executive Agency of the European Community; by the CUbRIK project (http://www.cubrikproject.eu/), funded by the European Community Seventh Framework Programme (FP7/2007–2013); and by the Dutch national program COMMIT (http://www.commit-nl.nl).

References

1. Abel F, Celik I, Hauff C, Hollink L, Houben G-J (2011) U-Sem: semantic enrichment, user modeling and mining of usage data on the social web. CoRR, abs/1104.0126

2. Acerbis R, Bongio A, Brambilla M, Butti S, Ceri S, Fraternali P (2008) Web applications design and development with WebML and WebRatio 5.0. In: Paige RF, Meyer B (eds) Objects, components, models and patterns, volume 11 of Lecture Notes in Business Information Processing. Springer, Berlin, pp 392–411

3. Aroyo L, Houben G-J (2010) User modeling and adaptive semantic web. Semantic Web 1(1–2):105–110

4. Axenopoulos A, Daras P, Malassiotis S, Croce V, Lazzaro M, Etzold J, Grimm P, Massari A, Camurri A, Steiner T, Tzovaras D (2012) I-search: a unified framework for multimodal search and retrieval, pp 130–141

5. Baeza-Yates RA, Ceri S, Fraternali P, Giunchiglia F (eds) (2012) Proceedings of the first international workshop on crowdsourcing web search, Lyon, volume 842 of CEUR workshop proceedings. CEUR-WS.org


6. Berkovsky S, Kuflik T, Ricci F (2009) Cross-representation mediation of user models. User Model User Adapt Interact 19(1–2):35–63

7. Bozzon A, Brambilla M, Ceri S (2012) Answering search queries with CrowdSearcher. In: WWW, pp 1009–1018

8. Bozzon A, Catallo I, Ciceri E, Fraternali P, Martinenghi D, Tagliasacchi M (2012) A framework for crowdsourced multimedia processing and querying. In: CrowdSearch, pp 42–47

9. Bozzon A, Mauri A, Brambilla M (2012) A model-driven approach for crowdsourcing search. In: CrowdSearch, pp 31–35

10. Brambilla M, Cabot J, Wimmer M (2012) Model-driven software engineering in practice. Synthesis lectures on software engineering. Morgan & Claypool Publishers

11. Brambilla M, Fraternali P, Ruiz CKV (2012) Combining social web and BPM for improving enterprise performances: the BPM4People approach to social BPM. In: Mille et al. [32], pp 223–226

12. Brambilla M, Fraternali P, Vaca C (2011) BPMN and design patterns for engineering social BPM solutions. In: Proceedings of the fourth international workshop on BPM and social software, BPMS

13. Breslin JG, Harth A, Bojars U, Decker S (2005) Towards semantically-interlinked online communities. In: Proceedings of the second European conference on the semantic web: research and applications, ESWC’05. Springer, Berlin, pp 500–514

14. Brickley D, Miller L (2010) FOAF Vocabulary Specification 0.97. Namespace document

15. Carmagnola F, Cena F, Gena C (2011) User model interoperability: a survey. User Model User Adapt Interact 21(3):285–331

16. Cena F, Dattolo A, De Luca EW, Lops P, Plumbaum T, Vassileva J (2011) Semantic adaptive social web, pp 176–180

17. Daras P, Axenopoulos A, Darlagiannis V, Tzovaras D, Le Bourdon X, Joyeux L, Verroust-Blondet A, Croce V, Steiner T, Massari A, Camurri A, Morin S, Mezaour A-D, Sutton L, Spiller S (2011) Introducing a unified framework for content object description. IJMIS 2(3/4):351–375

18. Birgit P, Kapsammer E, Mitsch S et al (2011) A first step towards a conceptual reference model for comparing social user profiles

19. Fraternali P et al (2012) The CUbRIK project: human-enhanced time-aware multimedia search. In: Mille et al. [32], pp 259–262

20. Geurts J, van Ossenbrugen J, Hardman L (2005) Requirements for practical multimedia annotation. In: Multimedia and the semantic web, 2nd European semantic web conference

21. Häsel M (2011) Opensocial: an enabler for social applications on the web. Commun ACM 54(1):139–144

22. Heckmann D, Krüger A (2003) A user modeling markup language (UserML) for ubiquitous computing. In: Brusilovsky P, Corbett AT, de Rosis F (eds) User modeling 2003, Proceedings of the 9th international conference, UM 2003, Johnstown, volume 2702 of Lecture Notes in Computer Science. Springer, Berlin, pp 393–397

23. Heckmann D, Schwarzkopf E, Mori J, Dengler D, Kröner A (2007) The user model and context ontology GUMO revisited for future web 2.0 extensions. In: Proceedings of the international workshop on contexts and ontologies: representation and reasoning (C & O:RR)

24. Howe J (2006) The rise of crowdsourcing. Wired 14(6)

25. Kappel G, Schönböck J, Wimmer M, Kotsis G, Kusel A, Pröll B, Retschitzegger W, Schwinger W, Wagner RR, Lechner S (2010) TheHiddenU: a social nexus for privacy-assured personalisation brokerage. In: ICEIS, Proceedings of the 12th international conference on enterprise information systems, pp 158–162

26. Kobsa A (2001) Generic user modeling systems. User Model User Adapt Interact 11(1–2):49–63

27. Law E, von Ahn L (2009) Input-agreement: a new mechanism for collecting data using human computation games. In: Proc. CHI 2009, pp 1197–1206

28. Cheng-Yu L, Soo V-W (2006) The conflict detection and resolution in knowledge merging for image annotation. Inf Process Manage 42(4):1030–1055

29. Little G, Chilton LB, Goldman M, Miller RC (2010) Exploring iterative and parallel human computation processes. In: Proceedings of the ACM SIGKDD workshop on human computation, HCOMP ’10. ACM, New York, pp 68–76

30. Luca Galli PF. Achievement systems explained. In: Serious gaming and social connect. Springer, Berlin

31. Mehta B, Niederée C, Stewart A, Degemmis M, Lops P, Semeraro G (2005) Ontologically-enriched unified user modeling for cross-system personalization, pp 119–123

32. Mille A, Gandon FL, Misselis J, Rabinovich M, Staab S (eds) (2012) Proceedings of the 21st world wide web conference, WWW 2012, companion volume. ACM, Lyon

33. Plumbaum T, Wu S, De Luca EW, Albayrak S (2011) User modeling for the social semantic web. In: SPIM, pp 78–89

34. Powell A, Nilsson M, Naeve A, Johnston P (2005) Dublin core metadata initiative - abstract model, White Paper

35. Quinn A, Bederson B (2009) A taxonomy of distributed human computation. Technical report

36. Quinn AJ, Bederson BB (2011) Human computation: a survey and taxonomy of a growing field. In: Proceedings of the 2011 annual conference on human factors in computing systems, CHI ’11, pp 1403–1412

37. Katharina R, Gerald R, Bernstein A (2007) An approach to overcome the personalization bootstrapping problem, cultural user modeling with CUMO

38. Salembier P, Sikora T (2002) Introduction to MPEG-7: multimedia content description interface. Wiley, New York

39. Sosnovsky SA, Dicheva D (2010) Ontological technologies for user modelling. IJMSO 5(1):32–71

40. Suchanek FM, Kasneci G, Weikum G (2007) Yago: a core of semantic knowledge. In: Proceedings of the 16th international conference on world wide web, WWW ’07. ACM, New York, pp 697–706

41. von Ahn L (2006) Games with a purpose. Computer 39:92–94

42. von Ahn L (2009) Human computation. In: CIVR

43. Wischenbart M, Mitsch S, Kapsammer E, Kusel A, Pröll B, Retschitzegger W, Schwinger W, Schönböck J, Wimmer M, Lechner S (2012) User profile integration made easy: model-driven extraction and transformation of social network schemas. In: Mille et al. [32], pp 939–948

44. Yamaguchi K, Hadi Kiapour M, Ortiz LE, Berg TL (2012) Parsing clothing in fashion photographs. In: CVPR, IEEE, pp 3570–3577

45. Zhang F, Song Z, Zhang H (2006) Web service based architecture and ontology based user model for cross-system personalization, pp 849–852
