Software Architecture 18 Crowd Sourcing Haopeng Chen REliable, INtelligent and Scalable Systems Group (REINS) Shanghai Jiao Tong University Shanghai, China http://reins.se.sjtu.edu.cn/~chenhp e-mail: [email protected]


Page 2: Software Architecture 18 Crowd Sourcing

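Traditional Software Engineering

• Addresses the software crisis
  – Develops software with "engineering" methods to improve development efficiency and correctness
• Characteristics
  – Elite-driven
  – Plan-driven
  – Closed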

Page 3: Software Architecture 18 Crowd Sourcing

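New Problems Facing Software Development

• How can "feature-rich" software be constructed and evolved quickly?
  – Software is large, has many functions, is updated frequently, and must be highly extensible; development technologies change rapidly
• How can available resources be fully used and refined into better solutions or target software?
  – Software development is becoming homogeneous: the Internet (e.g., Google, SourceForge) holds abundant code resources and technical solutions, yet a great deal of development is duplicated
• How can software development be coupled with the structure of social networks?
  – In a highly connected social network, the best information found so far spreads quickly, so software can be developed faster
• How can software be evolved to adapt to new requirements and new environments?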

Page 4: Software Architecture 18 Crowd Sourcing

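Developing an Integrated Development Environment

• How would one use software engineering to develop an IDE that supports editing, compiling, debugging, and running programs in many languages?
  – Requirements analysis
  – Architecture design
  – Implementation
  – Testing
  – Deployment
  – Maintenance
  – Version upgrades
  – …

Drawback: the software is not flexible enough; it cannot adapt to new languages or support new functional requirements

[Figure: Visual Studio timeline]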

Page 5: Software Architecture 18 Crowd Sourcing

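The Eclipse Integrated Development Environment

• Four components: JDT supports Java development, CDT supports C/C++ development, PDE supports plug-in development, and the Eclipse Platform is an open, extensible IDE that provides the building blocks and foundation for constructing and running integrated software development tools
• The Eclipse Platform lets tool builders independently develop tools that integrate seamlessly with other people's tools, so that users cannot tell where one function ends and another begins
• Eclipse could become an integrator IDE for development in any language; users only need to download the plug-in for each language

Eclipse is a successful software project that relies on crowd intelligence; no one can predict which direction it will take!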

Page 6: Software Architecture 18 Crowd Sourcing

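The Eclipse Integrated Development Environment (continued)

• The Eclipse project has become one in which both companies and individuals can take part

[Charts: contributions by company over 3 months; contributions by individual over 3 months]

• Anyone can report Eclipse bugs online (Bugzilla)
• Stimulated and motivated by crowd intelligence, Eclipse keeps growing more capable and more stable in quality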

Page 7: Software Architecture 18 Crowd Sourcing

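Goal

• Given that current software development is fast, scattered, and changing, use crowd intelligence as the driving force to build highly complex, highly open, high-quality software
• Potential approach: crowd-intelligence-driven software construction and evolution
  – Key prerequisites: social networks, problem search
  – Research problem 1: crowd-intelligence-driven construction of complex software
  – Research problem 2: software co-evolution
  – Research problem 3: continuous software quality assurance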

Page 8: Software Architecture 18 Crowd Sourcing

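Theoretical Foundations

[Diagram labels: software construction and evolution theory; crowd intelligence; crowd-intelligence-driven software construction and evolution]

• Software construction and evolution
  – Divide and conquer: decompose the development task into parallel or hierarchical tasks and assign them to developers, who may interact with one another
  – Through software evolution, give the software new functions or adapt it to new environments
• Crowd intelligence
  – As in ant colony algorithms: "each ant attends only to the immediate information within a very small range and decides by applying a few simple rules to this local information. In this way, complex behavior emerges in the colony as a whole"
• Crowd-intelligence-driven software construction and evolution
  – Driven by crowd intelligence, balance software goals, architecture design, algorithm implementation, quality control, and organizational form in order to construct and evolve software efficiently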

Page 9: Software Architecture 18 Crowd Sourcing

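The Essence of Crowd-Intelligence-Driven Software Construction and Evolution

• Powerful crowd development capacity
  – A crowd can develop complex software efficiently
• Crowd intelligence guides the direction in which software develops
  – Through crossover, mutation, reproduction, and so on, software can converge quickly and can also diverge quickly
• Software construction and evolution
  – Crowd intelligence drives a construction and evolution process that yields high-quality, highly complex, highly open software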

Page 10: Software Architecture 18 Crowd Sourcing

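Patterns of Crowd-Intelligence-Driven Software Construction

• Task division, assignment, and composition
• Social-network-based collaborative construction by multiple teams (example: the Codebook project)
• Reuse of existing projects (example: development of the Chrome browser)
• Open software construction (example: Chrome browser plug-ins)
• Search-based problem solving in software construction
• Automatic software generation and crossover
• Crowd-AI interaction
• Developing software through crowdsourcing (example: TopCoder)
• …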

Page 11: Software Architecture 18 Crowd Sourcing

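A Simple Software Construction Process

• The development task can be decomposed into small, independent subtasks
• Lacks intelligence
  – Not suited to complex, innovative, highly skilled development tasks

Intelligence must be introduced so that the process suits large-scale, diverse software construction and evolution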

Page 12: Software Architecture 18 Crowd Sourcing

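A Complex Software Construction Process

• A comprehensive platform is needed to manage tasks and developers
• A complex task must be decomposed into smaller subtasks, each designed to fit particular needs or characteristics so that it can be assigned to a suitable group of developers. The crowd must be properly motivated, selected (e.g., by reputation), and organized (e.g., in a hierarchy)
• Tasks may be organized into multi-stage workflows in which developers collaborate synchronously or asynchronously. AI may guide the crowd (or be guided by it)
• Software quality must be assured: each developer's output must be of high quality, and the pieces must fit together cleanly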

Page 13: Software Architecture 18 Crowd Sourcing

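Task Division and Composition

• Divide the construction task, describe and manage the dependencies among tasks and subtasks, and compose the results
  – Model software behavior
  – Encapsulate and reuse mature design patterns
  – Keep tasks cohesive and loosely coupled
  – To improve construction, try out and iterate on different workflow parameters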

Page 14: Software Architecture 18 Crowd Sourcing

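Social-Network-Based Collaborative Construction by Multiple Teams

• Codebook is a social-network-based web service that supports task coordination in large development projects
  – By sharing artifacts and tasks, it identifies the relationships between relevant people and artifacts and provides various kinds of coordination information
  – In the example figure, development manager Pam and developer Dave work together on the Squre method; Bug #673 also ties Pam and Dave closely together …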

Page 15: Software Architecture 18 Crowd Sourcing

Reusing Existing Projects

• The Google Chrome browser brings together more than 100 open-source projects

David M. Gay's floating point routines; open-vcdiff; gpsd; libjingle; mtpd; Simplified Wrapper and Interface Generator (SWIG)
dynamic annotations; WebKit; GIMP Toolkit; libjpeg; Netscape Plugin Application Programming Interface (NPAPI); talloc
Netscape Portable Runtime (NSPR); MSDN sample code; hunspell; libjpeg-turbo; ocmock; tcmalloc
google-glog's symbolization library; Almost Native Graphics Layer Engine; hunspell dictionaries; International Phone Number Library; OpenMAX IL; tlslite
valgrind; Darwin; hyphen; libpng; opus; undoview
xdg-mime; Apple sample code; IAccessible2 COM interfaces for accessibility; libsrtp; OTS (OpenType Sanitizer); The USB ID Repository
xdg-user-dirs; WebKit private system interface; iccjpeg; libusb; PLY (Python Lex-Yacc); Internationalization Library for v8
Breakpad, An open-source multi-platform crash reporting system; Android; icon_family; libva; Protocol Buffers; Webdriver
BSDiff; Binary-, RedBlack- and AVL-Trees in Python and Cython; icu; libvpx; Python FTP server library; WebRTC
XZ Utils; bsdiff; Chinese and Japanese Word List; WebP image encoder/decoder; mock; The Windows Installer XML (WiX)
Google code support upload script; bspatch; ISimpleDOM COM interfaces for accessibility; libxml; Quick Color Management System; wtl
Java Native Interface from Android NDK; bzip2; jemalloc; libxslt; re2 - an efficient, principled regular expression library; x86inc
mock4js; Google Cache Invalidation API; jsoncpp; libyuv; google-safe-browsing; XUL Runner SDK
Mozilla Personal Security Manager; Compact Language Detection; google-jstemplate; LZMA SDK; sfntly; yasm
Network Security Services (NSS); codesighs; Khronos header files; mach_override; simplejson; zlib
google-url; devscripts; launchpad-translations; mesa; skia; V8 JavaScript Engine
native client; Expat XML Parser; LCOV - the LTP GCOV extension; modp base64 decoder; SMHasher; Strongtalk
Native Client SDK; eyesfree; LCOV - the LTP GCOV extension; NSBezierPath additions from Sean Patrick O'Brien; Snappy: A fast compressor/decompressor
Network Security Services (NSS); ffmpeg; LevelDB: A Fast Persistent Key-Value Store; Mongoose webserver; speex
Spdyshark; flac; NVidia Control X Extension Library; Mozc Japanese Input Method Editor; sqlite
PPAPI; Flot Javascript/JQuery library for creating graphs; libevent; Cocoa extension code from Camino; Sudden Motion Sensor library
seccompsandbox; OpenGL ES 2.0 Programming Guide; libexif; mt19937ar; SwiftShader software renderer

Page 16: Software Architecture 18 Crowd Sourcing

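Open Software Construction

• Google Chrome browser plug-ins can greatly extend the browser's functionality
  – Plug-in capabilities: capturing the content of particular web pages, capturing HTTP messages, capturing user browsing actions, …
  – Plug-ins are easy to develop; the language is JavaScript, so developers can get started quickly
  – Plug-in developers can upload their plug-ins to the Chrome Web Store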

Page 17: Software Architecture 18 Crowd Sourcing

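Search-Based Problem Solving in Software Construction

• Use heuristic search to solve key problems or algorithms in software construction
  – Define the search space for the key construction problem (i.e., the space formed by a set of candidate solutions)
  – Search the solution space heuristically
  – Design metrics to evaluate solutions
• Heuristic search can be applied throughout software development
  – Requirements engineering
  – Software design
  – Software construction
  – …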
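The three steps on this slide (define a search space, search it heuristically, score candidates with a metric) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the problem (choosing which requirements to implement under a cost budget), the function name `hill_climb`, and the use of plain hill climbing as the heuristic. The slides do not name a specific problem or search method.

```python
import random


def hill_climb(values, costs, budget, steps=1000, seed=0):
    """Hill climbing over bit vectors (hypothetical example problem).

    Search space: every subset of requirements, encoded as a 0/1 list.
    Metric: total value of the chosen requirements, or -1 if over budget.
    """
    rng = random.Random(seed)
    n = len(values)

    def fitness(selection):
        cost = sum(c for c, s in zip(costs, selection) if s)
        if cost > budget:
            return -1  # infeasible candidates always lose
        return sum(v for v, s in zip(values, selection) if s)

    current = [0] * n  # the empty selection is always feasible
    best = fitness(current)
    for _ in range(steps):
        neighbour = current[:]
        neighbour[rng.randrange(n)] ^= 1  # flip one randomly chosen bit
        score = fitness(neighbour)
        if score > best:  # accept strict improvements only
            current, best = neighbour, score
    return current, best


selection, value = hill_climb([10, 6, 4, 7], [5, 3, 2, 4], budget=9)
print(selection, value)
```

Hill climbing stops at a local optimum; the genetic operators discussed later in the deck are one way to escape such optima.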

Page 18: Software Architecture 18 Crowd Sourcing

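Automatic Software Generation and Crossover

• Use machine learning to generate software automatically
  – Mine historical code repositories for useful code
  – Synthesize code from partial input/output data
• Software crossover
  – Reuse components or high-quality code from legacy software
  – Generate the target software by crossing multiple programs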

Page 19: Software Architecture 18 Crowd Sourcing

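Developing Software Through Crowdsourcing

• TopCoder is a website for programmers that uses competitions, ratings, and payment to attract many of them to work in their spare time
  – TopCoder's clients include America Online (AOL), Merrill Lynch, and others
  – TopCoder splits some software projects into many small units, publishes them online, and invites top programmers worldwide to compete for them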

Page 20: Software Architecture 18 Crowd Sourcing

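Crowd-AI Interaction

Crowd guides AI / AI guides crowd

• People become computational components that carry out the (computational) tasks AI systems cannot handle
  – Algorithms benefit from crowd-generated training data
  – Algorithms simulate human cognition
  – Design machine learning algorithms
  – Design algorithms that trade off the cost and performance of software development
• AI guides crowd development
  – Build development models through machine learning and mining of historical data
  – AI and the crowd interact, train and learn from each other, and jointly control the complex software construction process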

Page 21: Software Architecture 18 Crowd Sourcing

The Theory of Evolution

• "It is not the strongest of the species that survives, nor the most intelligent, but the one most adaptable to change."

(On the Origin of Species, Charles Darwin)

Page 22: Software Architecture 18 Crowd Sourcing

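An Example of Evolution: the Linux Operating System

• Linux was started by Linus Torvalds, then a student at the University of Helsinki, Finland
• Linux is free software: users can obtain it and its source code at no cost and may modify and extend them freely
• Linux has developed freely, giving rise to hundreds of (commercial and non-commercial) distributions, such as Red Hat and Debian, each with its own characteristics
• Today, thanks to its sound design, strong performance, and solid backing from well-known companies such as IBM and Intel, Linux has gradually become one of the mainstream operating systems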

Page 23: Software Architecture 18 Crowd Sourcing

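Genetic Algorithms in Software Evolution

• Main features of genetic algorithms
  – Bit-string representation
  – Proportional (fitness-proportionate) selection
  – Selection, crossover, mutation, elitism, and reproduction as the main ways of producing new individuals

How should individuals be represented in software evolution?

What are the crossover and mutation operators for software evolution?

Should the evolution of the social network be used to guide software evolution?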
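The features listed on this slide can be seen together in a minimal genetic algorithm. The sketch below is illustrative only: the toy objective (OneMax, which counts the 1 bits), the function name `evolve`, and all parameter values are assumptions, and it says nothing about how a real program would be encoded as an individual, which is exactly the open question the slide raises.

```python
import random


def evolve(n_bits=20, pop_size=30, generations=40, p_mut=0.02, seed=1):
    """Minimal GA with bit strings, proportional selection and elitism."""
    rng = random.Random(seed)
    fitness = sum  # OneMax: fitness is the number of 1 bits (toy objective)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    initial_best = max(map(fitness, pop))

    for _ in range(generations):
        elite = max(pop, key=fitness)
        # Proportional selection; +1 keeps every weight strictly positive.
        weights = [fitness(ind) + 1 for ind in pop]
        children = [elite[:]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.choices(pop, weights=weights, k=2)
            cut = rng.randrange(1, n_bits)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation, applied independently to each position.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children

    return initial_best, max(map(fitness, pop))


start, end = evolve()
print(f"best fitness: {start} -> {end}")
```

Because the elite individual is copied without mutation, the best fitness can never decrease from one generation to the next.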

Page 24: Software Architecture 18 Crowd Sourcing

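Software Co-evolution

• Competitive co-evolution
  – Between antivirus software and viruses, or between operating systems (e.g., competition between Windows and Linux)
  – Programs in a population adapt to one another as they evolve, so that all of them co-evolve
  – How should competitors be set up for the target software, and which fitness measure should be used?
• Cooperative co-evolution
  – The software is partitioned; each "component" evolves independently while the software as a whole is optimized
  – Particle swarm optimization and ant colony algorithms are typical examples of cooperative co-evolution
  – How should information be exchanged between software and the social network, among software "components", and between components and the environment?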

Page 25: Software Architecture 18 Crowd Sourcing

Software Quality Expectations

[Figure: failure rate versus time, with an ideal curve and an actual curve; after each change, the failure rate rises because of side effects]

Page 26: Software Architecture 18 Crowd Sourcing

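Software Quality Assurance

• During intelligence-driven construction and evolution, software goes through crossover, mutation, reproduction, and co-evolution, so its quality fluctuates frequently
  – How can the impact of crowd intelligence on software quality be measured?
• Qualitative and quantitative
  – How can quality be improved during construction and evolution?
• Quality needs to rise steadily
  – How can a quality assurance framework and methods be established for intelligence-driven construction and evolution?
• Traditional testing and analysis may not suit (crowd-intelligence-driven) software construction and evolution; new quality assurance methods are needed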

Page 27: Software Architecture 18 Crowd Sourcing

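A Platform for Crowd-Intelligence-Driven Software Construction and Evolution

• Support crowd-intelligence-style software construction
  – Support software development, debugging, integration, deployment, operation, and management
  – Support heuristic search and genetic algorithms to solve key hard problems
• Support software co-evolution
  – Support competitive and cooperative software evolution
  – Support software crossover, mutation, and reproduction
• Support a continuous quality assurance framework, tools, and methods
  – Support analysis of frequent software changes and the discovery and repair of errors
  – Support quality assurance for software that evolves while in operation
• Build an Internet-based software repository to accumulate and manage software assets and to support crowd-intelligence-driven construction and evolution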

Page 28: Software Architecture 18 Crowd Sourcing

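Core Scientific Problems

• Search: computability theory and the non-determinism of problem solving
• Evolution: context-based reasoning about and transformation of software behavior
• Quality: fluctuation and convergence of software quality

Scientific challenges

• Crowd-intelligence-driven software construction
• Software co-evolution
• Continuous software assurance

Scientific problems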

Page 29: Software Architecture 18 Crowd Sourcing

Introduction

• What is crowdsourcing?
  – Outsourcing a task via an open call
• Crowdsourcing websites
  – MTurk / TopCoder / Upwork / CrowdFlower
  – 80,000 jobs and 5+ million workers on Upwork
• Task-worker matching plays a crucial role
• How to describe task requirements and worker skills
• What criteria should be given in the description

Page 30: Software Architecture 18 Crowd Sourcing

Introduction

• Description in natural language (Taskcn)
  – Not machine-readable
  – Inefficient
  – Subjective
• Tags (Upwork)
  – Not sufficient to articulate task publishers' needs
    • task A (Java && Javascript)
    • task B (Java || Javascript)
  – Requirements are not exhaustive
  – Matching rules vary with skills
  – There may be no single suitable worker for the task
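The task A / task B example can be made concrete. In the sketch below, both tasks carry the same tag set, so a tag-only description cannot tell them apart, while a boolean constraint can. The function names and the sample worker are hypothetical and chosen only for illustration.

```python
# Both tasks would be published with the same tags.
TAGS_A = {"Java", "JavaScript"}
TAGS_B = {"Java", "JavaScript"}


def task_a(skills):
    """Task A: the worker needs Java AND JavaScript."""
    return "Java" in skills and "JavaScript" in skills


def task_b(skills):
    """Task B: the worker needs Java OR JavaScript."""
    return "Java" in skills or "JavaScript" in skills


java_only_worker = {"Java"}
print(TAGS_A == TAGS_B)          # the tag sets are indistinguishable
print(task_a(java_only_worker))  # the worker does not satisfy task A
print(task_b(java_only_worker))  # the same worker satisfies task B
```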

Page 31: Software Architecture 18 Crowd Sourcing

A Solution: STWM

• Meta-model for description
  – Extensible / customizable
• Self-adaptive task-worker matching algorithm
  – Efficient
  – Matches tasks and workers according to customized rules
  – Recommends suitable workers in a certain order
• Team formation
  – Workers form a team
  – Recommends teams to the task publisher

Page 32: Software Architecture 18 Crowd Sourcing

Framework of STWM

Page 33: Software Architecture 18 Crowd Sourcing

Meta-model for description

• constraint: must be satisfied during the matching process
• match: =, within, >, <, ≠, … (API provided)
• composite: Max, Min, ∪, ∩, … (API provided)

Page 34: Software Architecture 18 Crowd Sourcing

Meta-model for description

[Figure: definitions of the metadata class for the properties time, pay, and language skill]

Page 35: Software Architecture 18 Crowd Sourcing

Meta-model for description

[Figure: a definition of the class language and two instances of this class]

Page 36: Software Architecture 18 Crowd Sourcing

Meta-model for description

Page 37: Software Architecture 18 Crowd Sourcing

Meta-model for description

// language_of_task1
{
  "name": "language",
  "value": ["java", "C++", "javaScript"],
  "weight": 0.9,
  "skill_level": {
    "class": "com.stwm.Operation",
    [
      {"op1": ">", "args1": [3.0, "double"]},
      {"op2": ">", "args2": [3.0, "double"]},
      {"op3": ">", "args3": [3.0, "double"]}
    ]
  },
  "constraint": {
    "class": "com.stwm.Operation",
    [
      {"op1": "in", "args1": ["java", "collection"]},
      {"op2": "in", "args2": ["C++", "collection"]},
      {"op3": "||", "args3": ["?", "?"]},
      {"op4": "in", "args4": ["javaScript", "collection"]},
      {"op5": "&&", "args5": ["?", "?"]},
      {"op6": "contain", "args6": [["java", "C++", "javaScript"], "collection"]}
    ]
  }
}

Page 38: Software Architecture 18 Crowd Sourcing

Meta-model for description

• Class definition for task and worker:

score: a criterion used to sort matched workers

skill_weight: the weight of skill_requirement among the four property requirements listed in the task class

Page 39: Software Architecture 18 Crowd Sourcing

Matching algorithm for individual worker

• Necessary property: a property p of the task is necessary
  – if p.domain ≠ skill and p.weight ≥ baseline_weight
  – if p.domain = skill and p.weight ≥ avg_weight(skills)
• Calculation formula of worker.score
  – w is an instance of class worker; p’ is the property of worker w with the same property name as p
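The two conditions for a necessary property can be sketched as a small filter. The `Property` dataclass, the function name `necessary_properties`, and the default `baseline_weight` of 0.5 are assumptions for illustration; the slides give neither a value for baseline_weight nor the class layout. The score formula itself is not reproduced here because it does not survive in the slide text.

```python
from dataclasses import dataclass


@dataclass
class Property:
    name: str
    domain: str   # e.g. "skill", "time", "pay" (assumed domain labels)
    weight: float


def necessary_properties(props, baseline_weight=0.5):
    """Return the task properties that count as necessary.

    Skill properties are compared with the average weight of all skill
    properties; every other property is compared with baseline_weight.
    """
    skills = [p for p in props if p.domain == "skill"]
    avg_weight = sum(p.weight for p in skills) / len(skills) if skills else 0.0
    return [
        p for p in props
        if (p.domain == "skill" and p.weight >= avg_weight)
        or (p.domain != "skill" and p.weight >= baseline_weight)
    ]


task_props = [
    Property("java", "skill", 0.9),
    Property("css", "skill", 0.3),
    Property("pay", "pay", 0.6),
    Property("time", "time", 0.2),
]
print([p.name for p in necessary_properties(task_props)])
```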

Page 40: Software Architecture 18 Crowd Sourcing

Matching algorithm for individual worker

Algorithm 1. Task-worker matching algorithm
Input: Set<worker> W; task T.
Output: Set<worker> W’

function matching(Set<worker> W, task T):
    FinalSet, PreferCandidate, Candidate, W’ ← ∅;
    Set<worker> ℛ = Cluster(W, T);
    for each worker w in ℛ:
        if for ∀p ∊ M, p.match(p’) = true then
            Calculate w.score;
            FinalSet = FinalSet ∪ w;
        else if for ∀p ∊ M’, p.match(p’) = true then
            Calculate w.score;
            PreferCandidate = PreferCandidate ∪ w;
        else if ∃p ∊ M, p.match(p’) = true then
            Candidate = Candidate ∪ w;
        end if
    end for
    if FinalSet ≠ ∅ then
        W’ = Sort(T, FinalSet);
    else if PreferCandidate ≠ ∅ then
        W’ = Sort(T, PreferCandidate);
    end if
    return W’;
end function

Time complexity: O(km)
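Algorithm 1 can be sketched in Python under several loudly stated assumptions. M is read as the full set of task requirements and M’ as the necessary ones, which the slides suggest but do not state outright. The `Requirement` and `Worker` classes, the "at least a minimum value" match rule, and the weighted-sum score are all placeholders, since the slide's score formula is not available. The Cluster step is omitted.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    name: str
    minimum: float          # hypothetical match rule: worker value >= minimum
    weight: float
    necessary: bool = False

    def match(self, value):
        return value is not None and value >= self.minimum


@dataclass
class Worker:
    name: str
    props: dict
    score: float = 0.0


def matching(workers, requirements):
    """Sort workers into FinalSet / PreferCandidate / Candidate, then rank."""
    final_set, prefer_candidate, candidate = [], [], []
    necessary = [r for r in requirements if r.necessary]  # assumed meaning of M'
    for w in workers:  # the Cluster(W, T) pre-filter is left out of this sketch
        hits = [r for r in requirements if r.match(w.props.get(r.name))]
        if len(hits) == len(requirements):
            bucket = final_set
        elif necessary and all(r in hits for r in necessary):
            bucket = prefer_candidate
        else:
            if hits:
                candidate.append(w)  # kept aside; Algorithm 1 never ranks these
            continue
        w.score = sum(r.weight for r in hits)  # placeholder, not the slide's formula
        bucket.append(w)
    chosen = final_set or prefer_candidate
    return sorted(chosen, key=lambda w: w.score, reverse=True)


reqs = [Requirement("java", 3.0, 0.9, necessary=True), Requirement("pay", 20, 0.3)]
crowd = [Worker("alice", {"java": 4, "pay": 30}),
         Worker("bob", {"java": 4}),
         Worker("carol", {"pay": 30})]
print([w.name for w in matching(crowd, reqs)])
```

As in the pseudocode, workers who satisfy only the necessary requirements are returned only when nobody satisfies all of them.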

Page 41: Software Architecture 18 Crowd Sourcing

Matching algorithm for individual worker

Page 42: Software Architecture 18 Crowd Sourcing

Team formation algorithm

• Worker W
  – a_ij = 1 if the j-th worker W_j has the i-th property of I (I_i)
  – a_ij = 0 otherwise
• Team Q
  – q_i = 1 if the i-th property of I (I_i) is covered by the team Q
  – Compute the team profile q, which defines the expertise of the team as the (binary) sum of the properties of each individual
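The team profile q is a binary OR over the columns of the selected workers. A minimal sketch follows, assuming the matrix is stored as a list of rows (one row per property, one column per worker); the function name `team_profile` and the sample data are hypothetical.

```python
def team_profile(a, team):
    """Binary sum (logical OR) of the property columns of the team members.

    a[i][j] is 1 if worker j has property i, else 0.
    team is an iterable of worker indices j.
    """
    team = list(team)
    return [int(any(row[j] for j in team)) for row in a]


# Rows: properties I_1..I_3; columns: workers W_1..W_3 (illustrative data).
a = [
    [1, 0, 0],  # I_1: only W_1 has it
    [0, 1, 0],  # I_2: only W_2 has it
    [0, 0, 1],  # I_3: only W_3 has it
]
print(team_profile(a, [0, 1]))  # W_1 and W_2 together cover I_1 and I_2
```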

Page 43: Software Architecture 18 Crowd Sourcing

Team formation algorithm

• The team formation problem can be formally formulated as a binary integer program as follows, where c_j represents the cost of choosing the worker w_j.
• Meta-RaPS-SCP
  – Produces a feasible solution for a set covering problem (SCP) instance
  – Effective, simple, randomized
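The formula for the binary integer program does not survive in the slide text. The standard set covering formulation, which is consistent with the definitions of a_ij and c_j given on these slides, is shown below as a reconstruction under assumptions: x_j is a decision variable introduced here (x_j = 1 if worker w_j is chosen), m is the number of properties the task requires, and n is the number of workers.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{j=1}^{n} c_j x_j \\
\text{s.t.} \quad & \sum_{j=1}^{n} a_{ij} x_j \ge 1, \qquad i = 1, \dots, m \\
& x_j \in \{0, 1\}, \qquad j = 1, \dots, n
\end{aligned}
```

Each covering constraint says that at least one chosen worker has property I_i, which is the condition q_i = 1 in the team profile.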

Page 44: Software Architecture 18 Crowd Sourcing

Team formation algorithm

Algorithm 2. Team formation
Input: Set<worker> W; task T; int preferCount; int maxLoops.
Output: Set<team> Qs; // a team is a set of workers

function teamFormation(Set<worker> W, task T, int preferCount, int maxLoops):
    Qs ← ∅; Set<Property> PSet ← ∅; int i = 0;
    while Qs.size < preferCount && i ≤ maxLoops:
        Boolean isFeasible = true;
        team Q = Meta-RaPS-SCP(T, W, %priority, %restriction);
        if Q == ∅
            return Qs;
        end if
        for each property p of task T:
            PSet = ∅;
            for all workers in team Q add p’ to PSet;
            // p’ is the property of the worker in team Q with the same property name as p
            Property pt = p.composite(PSet);
            // the composite function is given in the definition of p as shown in the meta-model
            if p.match(pt) = false then
                isFeasible = false;
                break;
            end if
        end for
        if isFeasible = true then
            Qs = Qs ∪ Q;
        end if
        i++;
    end while
    return Qs;
end function
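The outer loop of Algorithm 2 can be sketched as follows. This is not Meta-RaPS-SCP: the helper `random_greedy_cover` is a simple randomized greedy set cover used as a stand-in, and its `restriction` parameter only loosely mirrors %restriction. Worker skills are modelled as plain sets, composite is taken to be set union, and match is taken to be "the required skills are a subset"; all of these are assumptions for illustration.

```python
import random


def random_greedy_cover(required, workers, rng, restriction=0.6):
    """Stand-in for Meta-RaPS-SCP: randomized greedy set cover (assumption)."""
    uncovered, team = set(required), []
    while uncovered:
        gains = {n: len(s & uncovered) for n, s in workers.items() if n not in team}
        best = max(gains.values(), default=0)
        if best == 0:
            return []  # nobody can cover what is left
        # Pick at random among workers whose gain is close to the best gain.
        shortlist = [n for n, g in gains.items() if g > 0 and g >= restriction * best]
        pick = rng.choice(shortlist)
        team.append(pick)
        uncovered -= workers[pick]
    return team


def team_formation(required, workers, prefer_count=4, max_loops=100,
                   restriction=0.6, seed=0):
    rng = random.Random(seed)
    teams = set()  # Qs: a set of teams, each stored as a frozenset of names
    for _ in range(max_loops + 1):  # mirrors i = 0 .. maxLoops inclusive
        if len(teams) >= prefer_count:
            break
        team = random_greedy_cover(required, workers, rng, restriction)
        if not team:
            return teams
        combined = set().union(*(workers[n] for n in team))  # composite = union
        if set(required) <= combined:                        # match = subset test
            teams.add(frozenset(team))
    return teams


pool = {"ann": {"Java"}, "bo": {"JavaScript"},
        "cy": {"Java", "JavaScript"}, "di": {"Ruby"}}
print(team_formation({"Java", "JavaScript"}, pool))
```

When no combination of workers can cover the requirements, the function returns an empty set, which corresponds to the "no suitable team found" outcome reported in the experiments.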

Page 45: Software Architecture 18 Crowd Sourcing

Simulation experiments

• Python Scrapy used to grab worker data from Upwork
• 500 pages of worker data (4500)
• Re-description to construct properties
• Construct 10,000 workers (each one has at least 1 and at most 6 properties)
• 1 master, 3 slaves

Page 46: Software Architecture 18 Crowd Sourcing

Simulation experiments

• Exp1: experiment for task-worker matching with comparison
  – Same skill requirements, different preferences

Page 47: Software Architecture 18 Crowd Sourcing

Simulation experiments

• Exp1: experiment for task-worker matching with comparison

[Figure: definition for the property database]

Page 48: Software Architecture 18 Crowd Sourcing

Simulation experiments

• Exp1: experiment for task-worker matching with comparison

Page 49: Software Architecture 18 Crowd Sourcing

Simulation experiments

• Tags description for task A and task B

Page 50: Software Architecture 18 Crowd Sourcing

Simulation experiments

• Analysis:
• Validate the effectiveness, extensibility, and correctness of the meta-model
  – database property
• Validate the correctness of the matching algorithm
• Validate the efficiency of the matching process
  – Cluster: nearly 40%
  – The whole matching process: 3.67 s
  – (description process 2.34 s, cluster 0.82 s)

Page 51: Software Architecture 18 Crowd Sourcing

Simulation experiments

• Exp2: experiment for team formation
  – set preferCount = 4, maxLoops = 100, %priority = 80% and %restriction = 60%.

Page 52: Software Architecture 18 Crowd Sourcing

Simulation experiments

• Exp2: experiment for team formation

Nearly 1.13 s, 0.03 s for the Meta-RaPS-SCP procedure

Page 53: Software Architecture 18 Crowd Sourcing

Simulation experiments

• Then set stricter constraints on the properties of task C:
  – langOfC.value = {Java, JavaScript, Ruby, Html5}
  – constraint: Java && JavaScript && Ruby && Html5
  – payOfC.value = 50
  – No suitable team found
  – Set %priority = 5%, %restriction = 90%
  – Or increase maxLoops = 200, 400, 600, 800
  – Still no suitable team
  – The task should be re-described

Page 54: Software Architecture 18 Crowd Sourcing

System Design

Page 55: Software Architecture 18 Crowd Sourcing

System Design

• Clustering strategy: reduce the search space
  – First division: based on the type
    • Platform-customized
    • Flexible and scalable
    • Avoids the curse of dimensionality
  – Second division: based on the K-Means algorithm
    • K-Means: simple and efficient
    • Maximum iteration number: controls convergence
    • Last output as input
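The second division can be illustrated with a plain K-Means loop that has an iteration cap and accepts starting centroids, so that the centroids from a previous run can be fed back in ("last output as input"). This is a minimal single-machine sketch with hypothetical names and data; it is not the deck's Map-Reduce implementation, and it assumes max_iter is at least 1.

```python
def k_means(points, centroids, max_iter=20):
    """Plain K-Means with an iteration cap and caller-supplied start centroids."""
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign the point to the nearest centroid (squared Euclidean distance).
            nearest = min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # Recompute each centroid; an empty cluster keeps its old centroid.
        updated = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centroids:  # converged before reaching the cap
            break
        centroids = updated
    return centroids, clusters


workers = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, clusters = k_means(workers, [(0, 0), (10, 10)])
print(centroids)

# Warm start: reuse the previous centroids when a new worker arrives.
centroids2, _ = k_means(workers + [(1, 0)], centroids)
print(centroids2)
```

Starting from the previous centroids usually needs fewer iterations than starting from scratch, because most workers stay in the same cluster between runs.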

Page 56: Software Architecture 18 Crowd Sourcing


Page 57: Software Architecture 18 Crowd Sourcing

• Map-Reduce implementation

Page 58: Software Architecture 18 Crowd Sourcing

System Design

• Dynamic measurement
  – Reasons for dynamic measurement
  – Objective evaluations given by different publishers
  – Sliding window analysis
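Sliding window analysis can be sketched as a rating that only looks at a worker's most recent evaluations, so that the measurement follows changes in performance instead of averaging over the whole history. The class name `SlidingRating`, the window size, and the use of a plain mean are assumptions; the slides do not give the actual measurement formula.

```python
from collections import deque


class SlidingRating:
    """Average of the most recent publisher evaluations (hypothetical metric)."""

    def __init__(self, window=3):
        # A deque with maxlen drops the oldest evaluation automatically.
        self.recent = deque(maxlen=window)

    def add(self, score):
        self.recent.append(score)
        return self.current()

    def current(self):
        if not self.recent:
            return None  # no evaluations yet
        return sum(self.recent) / len(self.recent)


rating = SlidingRating(window=3)
for score in [5, 5, 2, 2, 2]:
    rating.add(score)
print(rating.current())  # only the last three evaluations count
```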

Page 59: Software Architecture 18 Crowd Sourcing

Experiments

• Data
  – Python Scrapy: 50,000 pieces of data from Upwork.com
  – Transform to suitability description
  – Simulated 500,000 developers

Page 60: Software Architecture 18 Crowd Sourcing

Experiments

• Efficiency of clustering method

Page 61: Software Architecture 18 Crowd Sourcing

Experiments

• Effectiveness of clustering method

Page 62: Software Architecture 18 Crowd Sourcing

Experiments

• Accuracy of dynamic measurement
  – before
  – after

Page 63: Software Architecture 18 Crowd Sourcing

Experiments

• Accuracy of dynamic measurement
  – Most suitable worker: A, B, C
  – Most suitable worker: B, A, A

Page 64: Software Architecture 18 Crowd Sourcing

Publications

• Fu Y, Chen H, Song F. STWM: A Solution to Self-adaptive Task-Worker Matching in Software Crowdsourcing[M]//Algorithms and Architectures for Parallel Processing. Springer International Publishing, 2015: 383-398.

• Song F, Chen H, Fu Y. An Approach to Rapid Worker Discovery in Software Crowdsourcing[M]//Algorithms and Architectures for Parallel Processing. Springer International Publishing, 2015: 370-382.

Page 65: Software Architecture 18 Crowd Sourcing

Thank You!
