DESCRIPTION
Software evolution research is a thriving area of software engineering research. Recent years have seen growing interest in a variety of evolution topics, as witnessed by the growing number of publications dedicated to the subject. Without attempting to be complete, this talk gives an overview of emerging trends in software evolution research: extending the traditional boundaries of software, growing attention to the social and socio-technical aspects of software development processes, and interdisciplinary research that applies techniques from other research areas to the study of software evolution and, conversely, software evolution research techniques to other areas. Because a large body of software evolution research is empirical in nature, we face important challenges pertaining to the reproducibility and generalizability of that research.
Citation preview
Software Evolution anno 2014: directions and challenges
Alexander Serebrenik
@aserebrenik
2008
Time for a new book!
2014
2008 vs. 2014
From systems to ecosystems
Business-oriented view
“a set of actors functioning as a unit and interacting with a shared market for software and services, together with the relationships among them.”
with thanks to International Data Corporation (IDC)
Development-centric view
a collection of software projects that are developed and evolve together in the same environment
with thanks to Bram Adams
Socio-technical view
a community of persons (end-users, developers, debuggers, …) contributing to a collection of projects
Technical
Scientific
Practical
Legal and ethical
Technical challenges
• eliminate non-names
• eliminate specific quirks
• group “similar” names
  – first/last name
  – textual similarity
  – latent semantic analysis
• (correct groups manually)
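The identity-merging steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the method used in the talk: the `NON_NAMES` list, the separator "quirks" handled in `normalize`, the 0.8 threshold, and the choice of `difflib.SequenceMatcher` as the textual similarity measure are all hypothetical. Latent semantic analysis and the manual correction step are not covered.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of strings that are not personal names.
NON_NAMES = {"root", "admin", "unknown", "nobody"}


def normalize(alias: str) -> str:
    """Lowercase, drop an e-mail domain, and turn separators into spaces."""
    alias = alias.lower().split("@")[0]
    alias = re.sub(r"[._\-]+", " ", alias)
    return " ".join(alias.split())


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Textual similarity via difflib's ratio (an assumed choice of measure)."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def group_aliases(aliases):
    """Greedily put each alias into the first group with a similar member."""
    groups = []  # each group is a list of original aliases
    for alias in aliases:
        name = normalize(alias)
        if not name or name in NON_NAMES:
            continue  # eliminate non-names
        for group in groups:
            if any(similar(name, normalize(member)) for member in group):
                group.append(alias)
                break
        else:
            groups.append([alias])
    return groups


groups = group_aliases(
    ["John Smith", "john.smith@example.org", "root", "Jon Smith", "Alice Jones"]
)
for group in groups:
    print(group)
```

Greedy grouping depends on input order and on the threshold, which is one reason the final manual correction step remains necessary.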
Technical challenges
Structured data: 2008
Unstructured data: 2014
Scientific challenges
Raw data   Processed data set   Tools & scripts   #MSR papers 2004-2009
Y          Y                    Y                   2
Y          Y                    N                   2
Y          P                    Y                   1
Y          P                    P                   2
Y          P                    N                   2
Y          N                    Y                  16
Y          N                    P                  19
Y          N                    N                  64
P          N                    Y                   1
P          N                    N                   2
N          Y                    N                   2
N          P                    N                   1
N          N                    Y                   7
N          N                    P                   2
N          N                    N                  31
N/A        N/A                  N/A                17
We share raw data but rarely share tools – reinventing the wheel anybody?
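The claim can be checked against the counts on the slide. The sketch below assumes the column order is raw data, processed data set, tools & scripts, and that a "Y" marks an artifact as shared; both are readings of the slide, not something it states explicitly.

```python
# Rows copied from the slide:
# (raw data, processed data set, tools & scripts, number of MSR papers).
ROWS = [
    ("Y", "Y", "Y", 2), ("Y", "Y", "N", 2), ("Y", "P", "Y", 1),
    ("Y", "P", "P", 2), ("Y", "P", "N", 2), ("Y", "N", "Y", 16),
    ("Y", "N", "P", 19), ("Y", "N", "N", 64), ("P", "N", "Y", 1),
    ("P", "N", "N", 2), ("N", "Y", "N", 2), ("N", "P", "N", 1),
    ("N", "N", "Y", 7), ("N", "N", "P", 2), ("N", "N", "N", 31),
    ("N/A", "N/A", "N/A", 17),
]


def papers_with(column: int, value: str) -> int:
    """Sum the paper counts of all rows whose given column has the given value."""
    return sum(row[3] for row in ROWS if row[column] == value)


total = sum(row[3] for row in ROWS)
raw_shared = papers_with(0, "Y")
tools_shared = papers_with(2, "Y")

print(f"raw data shared:  {raw_shared}/{total} = {raw_shared / total:.1%}")
print(f"tools shared:     {tools_shared}/{total} = {tools_shared / total:.1%}")
```

Under that reading, 108 of the 171 papers share raw data, while only 27 share tools and scripts.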
Practical challenges
• How can we share our big data with other researchers?
• Different formats, different tools, storage problems, …
• How can we make our research results useful to practitioners and development communities?
• How can we build tools and dashboards that integrate our findings?
Legal and ethical challenges
(especially for survey data)
http://www.intracto.com/blog/online-privacy-belangrijk
k-anonymity
l-diversity
t-closeness
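As a reminder of what the first two notions measure: a data set is k-anonymous if every combination of quasi-identifier values is shared by at least k records, and (distinct) l-diverse if every such group contains at least l different values of the sensitive attribute. The sketch below computes both for a made-up survey table; the field names and records are hypothetical, and t-closeness is not covered.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifiers."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def l_diversity(records, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values found within any group."""
    values = {}
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        values.setdefault(key, set()).add(r[sensitive])
    return min(len(v) for v in values.values())


# Hypothetical, already generalized survey responses.
responses = [
    {"age": "20-29", "country": "NL", "answer": "yes"},
    {"age": "20-29", "country": "NL", "answer": "no"},
    {"age": "30-39", "country": "BE", "answer": "yes"},
    {"age": "30-39", "country": "BE", "answer": "yes"},
    {"age": "30-39", "country": "BE", "answer": "no"},
]

k = k_anonymity(responses, ["age", "country"])
l = l_diversity(responses, ["age", "country"], "answer")
print(f"k = {k}, l = {l}")
```

A shared survey data set with k = 1 contains at least one respondent who is unique on the chosen quasi-identifiers and may therefore be re-identifiable.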
2008 vs. 2014
From “traditional” to “non-traditional” artifacts:
What is software?
http://ctms.engin.umich.edu/CTMS/index.php?example=Introduction&section=SimulinkModeling
Maintainability??? Evolution???
BumbleBee: a refactoring tool for spreadsheets
with thanks to Felienne Hermans
http://help.eclipse.org/juno/index.jsp?topic=%2Forg.eclipse.m2m.atl.doc%2Fguide%2Fconcepts%2FModel-Transformation.html
• describe evolutionary steps
• relate to changes of other artifacts
• describe prevalence in practice
• support automation
New kind of verification
artifacts
2008
2009
2012
2013
2008 vs. 2014
From technical to socio-technical perspective:
Who are these people?
What do they do?
> 90% in WordPress & Drupal
> 95% in FLOSS surveys
> 87% in GNOME
> 70% in software-related jobs (NSF)
MEN
FLOSS 2013
Europe, US, CA, AU
Brazil/Argentina
How can we reliably and efficiently identify gender, age, location?
Technical challenges
?
Name + Location = Gender
Lonzo ⇒ Alonzo
w35l3y ⇒ wesley
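A minimal sketch of the idea, assuming a two-step approach: first undo "leet" spelling in the alias, then look the name up per country. The substitution table and the tiny lookup dictionary are illustrative assumptions, not the data used in the talk; a real resolver would need name lists per country and would return "?" far more often.

```python
# Assumed digit-to-letter substitutions; "1" could equally stand for "i".
LEET = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t"})


def normalize_name(alias: str) -> str:
    """Lowercase the alias and replace leet digits by letters."""
    return alias.lower().translate(LEET)


# Toy lookup table: the same first name can indicate a different gender
# depending on the country, which is why location is part of the key.
GENDER_BY_NAME_AND_COUNTRY = {
    ("wesley", "US"): "M",
    ("andrea", "IT"): "M",
    ("andrea", "DE"): "F",
}


def infer_gender(alias: str, country: str) -> str:
    """Return 'M', 'F', or '?' when the name/country pair is unknown."""
    key = (normalize_name(alias), country)
    return GENDER_BY_NAME_AND_COUNTRY.get(key, "?")


print(normalize_name("w35l3y"))
print(infer_gender("w35l3y", "US"))
print(infer_gender("Andrea", "IT"), infer_gender("Andrea", "DE"))
```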
<title>Ben Kamens</title>…<h1>We’re willing to be embarrassed about what we <em>haven’t</em> done…</h1>
Heuristics: title + first h1
Ben Kamens We’re willing to be embarrassed about what we haven’t done…
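The title + first h1 heuristic can be sketched with Python's standard `html.parser` module. The class and function names are hypothetical, and the talk does not say which HTML parser was used; this only illustrates how the text above could be obtained from a page.

```python
from html.parser import HTMLParser


class TitleAndFirstH1(HTMLParser):
    """Collect the text of <title> and of the first <h1>, ignoring inner tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = ""
        self._current = None   # tag whose text is being collected
        self._h1_done = False  # set once the first <h1> has been closed

    def handle_starttag(self, tag, attrs):
        if tag == "title" or (tag == "h1" and not self._h1_done):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            if tag == "h1":
                self._h1_done = True
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data


def title_and_first_h1(html: str) -> str:
    """Return '<title text> <first h1 text>' as one string."""
    parser = TitleAndFirstH1()
    parser.feed(html)
    parser.close()
    return f"{parser.title.strip()} {parser.h1.strip()}".strip()


page = "<title>A. Developer</title><h1>Notes on <em>software</em> evolution</h1><h1>Older post</h1>"
print(title_and_first_h1(page))
```

The resulting string is what would then be handed to a named entity tagger to find the person's name.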
<PERSON>Ben Kamens</PERSON> We’re willing to be embarrassed about what we haven’t done…
Stanford Named Entity Tagger
Quality of gender resolution: Survey

Self-identification   As inferred          Total
                      M      F      ?
M                     60     3      43     106
F                      2     5       4      11

Self-identification   As inferred          Total
                      M      F      ?
M                     90     3      13     106
F                      2     9       0      11
+ avatars, other social media sites (manually)
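To compare the two tables, it helps to separate how often gender could be resolved at all from how often a resolved gender agrees with the respondent's self-identification. The sketch below assumes that rows are self-identified gender, that columns are inferred gender with "?" meaning unresolved, and that the second table is the one obtained after adding avatars and other social media sites; the metric names are hypothetical.

```python
def resolution_quality(matrix):
    """matrix[self_identified][inferred] -> count; '?' means unresolved."""
    total = sum(sum(row.values()) for row in matrix.values())
    resolved = sum(
        n for row in matrix.values() for label, n in row.items() if label != "?"
    )
    correct = sum(matrix[gender][gender] for gender in matrix)
    return {
        "coverage": resolved / total,                 # share of people resolved
        "accuracy_when_resolved": correct / resolved,  # agreement with survey
    }


automatic = {
    "M": {"M": 60, "F": 3, "?": 43},
    "F": {"M": 2, "F": 5, "?": 4},
}
with_manual = {
    "M": {"M": 90, "F": 3, "?": 13},
    "F": {"M": 2, "F": 9, "?": 0},
}

for label, matrix in [("automatic", automatic), ("with manual step", with_manual)]:
    quality = resolution_quality(matrix)
    print(label, {name: round(value, 3) for name, value in quality.items()})
```

Under these assumptions, coverage rises from 70 of 117 respondents to 104 of 117, while agreement among resolved cases moves from 65 of 70 to 99 of 104.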
.cpp .po
.jpg
/test/
/library/ .doc
makefile .sql .conf
Occasional contributors
Frequent contributors
How can we reliably and efficiently identify human activities?
Technical challenges
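One possible starting point is to classify the files a contributor touches by path and extension, using patterns like those shown earlier (.cpp, .po, .jpg, /test/, /library/, .doc, makefile, .sql, .conf). The activity labels, the rule order, and the sample paths below are assumptions made for illustration, not the classification used in the talk.

```python
import re
from collections import Counter

# Hypothetical rules, checked in order; the first matching rule wins.
RULES = [
    (re.compile(r"(^|/)test/"), "testing"),
    (re.compile(r"(^|/)library/"), "library"),
    (re.compile(r"(^|/)makefile$", re.IGNORECASE), "build"),
    (re.compile(r"\.cpp$"), "coding"),
    (re.compile(r"\.po$"), "localization"),
    (re.compile(r"\.jpg$"), "images"),
    (re.compile(r"\.doc$"), "documentation"),
    (re.compile(r"\.sql$"), "database"),
    (re.compile(r"\.conf$"), "configuration"),
]


def classify(path: str) -> str:
    """Map a file path to an activity label, or 'unknown' if no rule matches."""
    for pattern, activity in RULES:
        if pattern.search(path):
            return activity
    return "unknown"


def activity_profile(paths):
    """Count how many touched files fall into each activity."""
    return Counter(classify(p) for p in paths)


# Hypothetical files touched by one contributor.
touched = ["src/main.cpp", "src/test/parser.cpp", "po/nl.po", "Makefile", "README"]
profile = activity_profile(touched)
print(profile)
```

Profiles like this could then be compared between occasional and frequent contributors, though path-based rules will misclassify projects that follow other layout conventions.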