This year's tutorials cover a broad range of topics, including model checking, product lines, software analytics, testing, search-based software engineering, human-computer interaction, and software mining. The following is the list of tutorials:
T1. Build Your Own Model Checker with PAT (Sept 3, half day, AM)
T2. Product Line Variability (Sept 3, half day, AM)
T3. Software Analytics: Challenges and Opportunities (Sept 4, half day, PM)
T4. Automated GUI Testing of Android Apps: Challenges, Approaches, Tools, and Best Practices (Sept 3, half day, PM)
T5. Search Based Software Engineering: Foundations, Challenges and Recent Advances (Sept 3, half day, PM)
T6. Learn to Build Automated Software Analysis Tools with Graph Paradigm and Interactive Visual Framework (Sept 4, full day)
T7. Testing Stochastic Software (Sept 4, half day, AM)
T8. Mining and Modelling Unstructured Data (Sept 4, half day, AM)
T9. Using Docker Containers to Improve Reproducibility in Software Engineering Research (Sept 4, half day, PM)
Build Your Own Model Checker with PAT (Sept 3, half day, AM)
Model checking has established itself as an effective method for automatic system analysis and verification, and it is making its way into many domains and methodologies. Applying model checking techniques to a new domain (which probably has its own dedicated modeling language) is, however, far from trivial.
A translation-based approach works by translating a domain-specific language into the input language of an existing model checker. Because that model checker is not designed for the domain (or, equivalently, the language), such translations are often ad hoc. Ideally, each application domain would have its own optimized model checker; implementing one with reasonable efficiency, however, requires years of dedicated effort.
In this tutorial, we will briefly survey a variety of model checking techniques. We will then show, step by step, how to develop a model checker for a language combining real-time and probabilistic features using PAT (Process Analysis Toolkit), and demonstrate that developing your own model checker with reasonable efficiency can take as little as a few weeks. The PAT system is designed to facilitate the development of customized model checkers. It has an extensible and modularized architecture to support new languages (and their operational semantics), new state reduction or abstraction techniques, new model checking algorithms, etc. Since its introduction 5 years ago, PAT has attracted more than 3500 registered users (from 800+ organisations in 60 countries) and has been applied to develop model checkers for 20 different languages.
Software development has entered a mass-production era. To ensure quality, software verification is becoming a compulsory step in the software development life cycle, especially for safety-critical systems. Among the principal validation/verification methods (e.g., simulation, testing and theorem proving), model checking has emerged as a promising and powerful approach to automatically verify software systems, e.g., complex circuit designs, communication protocols, driver software, software process models, software requirement models, architectural frameworks, product lines and system implementations.
Model checking is the application of an automatic process to verify whether a model satisfies a property by exhaustively exploring the state space of the model. It has since grown into a broad area encompassing many different approaches (e.g., explicit-state and symbolic model checking), catering for different properties (e.g., temporal logics, refinement relationships) and state space reduction techniques (e.g., partial order reduction, symmetry reduction). Applying model checking in a new domain requires an in-depth understanding of model checking techniques. Unfortunately, this complexity prevents many domain experts, who may not be experts in model checking, from successfully applying it to their domains.
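The exhaustive state-space exploration at the heart of explicit-state model checking can be sketched in a few lines. The toy counter model below is purely illustrative and unrelated to PAT's input languages; it shows the essential loop of checking a safety invariant and returning a counterexample trace on violation.

```python
from collections import deque

def check_invariant(init, next_states, invariant):
    """Explicit-state safety check: BFS over the reachable state space.

    Returns None if the invariant holds in every reachable state,
    otherwise a counterexample trace from an initial state to a violation.
    """
    frontier = deque((s, [s]) for s in init)
    visited = set(init)
    while frontier:
        state, trace = frontier.popleft()
        if not invariant(state):
            return trace  # counterexample: shortest path to the violation
        for succ in next_states(state):
            if succ not in visited:
                visited.add(succ)
                frontier.append((succ, trace + [succ]))
    return None

# Toy model: a counter that may increment or reset; invariant: value < 5.
trace = check_invariant(
    init=[0],
    next_states=lambda s: [s + 1, 0],
    invariant=lambda s: s < 5,
)
# trace is the counterexample [0, 1, 2, 3, 4, 5]
```

Real model checkers add the ingredients the tutorial covers on top of this skeleton: a proper modeling language and operational semantics for `next_states`, richer property languages than a simple invariant, and state reduction techniques to tame the explosion of `visited`.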
In this tutorial, we cover the basics of model checking and explain how to adopt it in new application domains. The tutorial consists of two parts. The first part briefly surveys state-of-the-art model checkers and discusses the challenges of applying model checking techniques. The second (and main) part details the steps necessary to build a model checker for your own language. In particular, we will show how the PAT framework helps, using a concrete example: developing, step by step, a model checker for hierarchical real-time probabilistic systems.
Attendees will learn every step of developing a model checker. The intended audience of this tutorial includes academics, graduate students, researchers, software architects and analysts. We particularly welcome those who want to apply formal methods (especially model checking techniques) to their own application domains. Experts will be there to guide them and provide feedback.
DONG Jin-Song is an Associate Professor at the School of Computing, National University of Singapore. He received Bachelor (1st class honours) and PhD degrees in Computing from the University of Queensland in 1992 and 1996. From 1995 to 1998, he was a research scientist at CSIRO in Australia. Since 1998 he has been a faculty member in the School of Computing at the National University of Singapore (NUS) and a PhD supervisor at the NUS Graduate School (NGS). He is the co-director of IPAL, the Singapore-French joint research lab. Jin Song is on the editorial boards of ACM Transactions on Software Engineering and Methodology and Formal Aspects of Computing. He is on the steering committees of APSEC, FME and ICFEM and has been general/program chair for a number of international conferences, including the 19th International Symposium on Formal Methods (FM 2014) in Singapore. He has been a Visiting Fellow (2006) at Oxford University and a Visiting Professor (since 2009) at the National Institute of Informatics, Japan. Outside of work, he plays competitive tennis and coaches top-ranked junior players in Singapore (including his own 3 kids); he also developed Markov Decision Process (MDP) models for tennis strategy analysis in PAT. More information is available at https://www.comp.nus.edu.sg/~dongjs/.
SUN Jun is an Assistant Professor at the Singapore University of Technology and Design (SUTD). He received Bachelor and PhD degrees in computing science from the National University of Singapore (NUS) in 2002 and 2006. In 2007, he received the prestigious LEE KUAN YEW postdoctoral fellowship in the School of Computing of NUS. In 2010, he joined SUTD as an Assistant Professor, and he was a visiting scholar at MIT from 2011 to 2012. Jun's research interests include software engineering, formal methods, program analysis and cyber-security. He is a co-founder of the PAT model checker and has more than 140 publications to date. More information is available at http://people.sutd.edu.sg/~sunjun/.
LIU Yang is an Assistant Professor at Nanyang Technological University, Singapore. He graduated in 2005 with a Bachelor of Computing from the National University of Singapore (NUS). He obtained his PhD in 2010 and continued with postdoctoral work at NUS, MIT and SUTD. In 2011, he was awarded the Temasek Research Fellowship at NUS to be the Principal Investigator in the area of cyber security. In fall 2012, he joined Nanyang Technological University as a Nanyang Assistant Professor. Dr. Liu specializes in software verification, security and software engineering. His research bridges the gap between the theory and practical use of formal methods and program analysis to evaluate the design and implementation of software for high assurance and security. His work led to the development of the state-of-the-art model checker PAT. More information is available at http://www.ntu.edu.sg/home/yangliu/.
Product Line Variability (Sept 3, half day, AM)
In this tutorial, participants will learn the difference between software variability and product line variability, and will learn and apply the key concepts of product line variability modelling. Besides the integrated modelling of product line variability information, participants will learn how to define variability in a separate model, and the key benefits of defining and communicating product line variability in an orthogonal model. They will also learn the core concepts of modelling product line variability and the difference between an application variability model and the product line (domain) variability model. Running examples will illustrate the concepts taught.
- Software Product Line Engineering
- Key Principles
- Systematic, planned, pro-active Reuse
- Managed variability
- Software Product Line Framework
- Overview of Domain Requirements Engineering, Domain Design, Domain Realization, Domain Testing
- Overview of Application Requirements Engineering, Application Design, Application Realization, Application Testing
- Product Line Variability
- Key Concepts of Product Line Variability
- Variability of Software
- Recognition of Product Line Variability in Software Artefacts
- Need for Explicit Documentation of Variability
- Types of Product Line Variability
- Binding Product Line Variability
- Defining the Binding
- Performing the Binding
- Modeling Product Line Variability
- Integrated Variability Modeling
- Benefits and Drawbacks
- Related Work: Feature Models
- Orthogonal Variability Modeling (OVM)
- OVM Language Definition
- Interrelating Variability and other Development Artifacts
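The separation between variation points, variants, and constraints described in the outline above can be sketched in code. The variation points, variants, and requires-constraint below are invented illustrative examples, not OVM's concrete syntax or any real product line:

```python
# Hypothetical example: an orthogonal variability model kept separate from
# the development artifacts it annotates. Each variation point offers a set
# of variants; constraints relate choices across variation points.
variation_points = {
    "Payment":  {"CreditCard", "Invoice"},
    "Security": {"Standard", "High"},
}
# Requires-constraint: choosing CreditCard requires High security.
requires = {("Payment", "CreditCard"): ("Security", "High")}

def valid_binding(binding):
    """Check that a product configuration (binding) picks defined variants
    and satisfies all requires-constraints."""
    for vp, variant in binding.items():
        if variant not in variation_points.get(vp, set()):
            return False
    for (vp, variant), (req_vp, req_variant) in requires.items():
        if binding.get(vp) == variant and binding.get(req_vp) != req_variant:
            return False
    return True

ok = valid_binding({"Payment": "CreditCard", "Security": "High"})
```

Binding the variability (choosing a variant per variation point) turns the domain variability model into one concrete application, which is the domain/application distinction the tutorial develops.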
Professionals (managers, software engineers, architects, product managers), researchers, lecturers, and students
Prof. Dr. Klaus Pohl is the director of paluno, the Ruhr Institute for Software Technology, and holds a full professorship for software systems engineering at the University of Duisburg-Essen. He received his Ph.D. and his habilitation in computer science from RWTH Aachen. He is involved in various technology transfer as well as major research projects focusing on different aspects of product line engineering. Klaus Pohl is (co-)author of over 250 refereed publications and served as program chair for conferences such as the IEEE Intl. Conference on Software Engineering (ICSE 2013), Intl. Requirements Engineering Conference (RE '02), the Experience Reports Track of the 27th ICSE in 2005, the German Software Engineering Conference (SE 2005), the 9th Intl. Software Product Line Conference (SPLC Europe 2005) and the 18th Intl. Conference on Advanced Information Systems Engineering (CAiSE 2006), and he is a member of the organizing committee of VaMoS (the international workshop on Variability Modeling of Software-intensive Systems). More information is available at https://sse.uni-due.de/team/leitung/prof-dr-klaus-pohl/.
Software Analytics: Challenges and Opportunities (Sept 4, half day, PM)
Nowadays, software development projects produce a large number of software artifacts, including source code, execution traces, end-user feedback, and informal documentation such as developers' discussions, change logs, Stack Overflow posts, and code reviews. Such data embeds rich and significant knowledge about software projects, their quality and services, and the dynamics of software development. Most often, this data is not organized, stored, or presented in a way that is immediately useful to software developers and project managers in supporting their decisions. To help developers and managers understand their projects and how they evolve, and to support their decision-making process, software analytics - the use of analysis, data, and systematic reasoning for making decisions - has emerged as a field of modern data analysis. While the results obtained from analytics-based solutions proposed so far are promising, several challenges remain in adopting software analytics into software development processes and in developing and integrating analytics tools in practical settings.
We therefore propose a tutorial on software analytics. The tutorial will start with an introduction of software analytics. Next, we will discuss the main challenges and opportunities associated with software analytics based on the examples from our own research. These examples will cover a range of topics leveraging software analytics. The topics include mobile apps quality, code review process and its quality, analytics for bug report management, as well as the use of analytics to solve scheduling problems in the cloud.
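To give a small taste of the kind of analysis involved, here is a minimal descriptive-analytics sketch over commit data. The commit records and field names are hypothetical, not drawn from any real repository or mining tool:

```python
from collections import Counter

# Hypothetical commit records as mined from a version control history.
commits = [
    {"files": ["ui/login.c", "core/net.c"], "bugfix": True},
    {"files": ["core/net.c"],               "bugfix": True},
    {"files": ["docs/readme.md"],           "bugfix": False},
]

# Rank files by how often they appear in bug-fix commits: a simple
# "defect hotspot" signal a manager could act on.
fix_counts = Counter(
    f for c in commits if c["bugfix"] for f in c["files"]
)
hotspots = fix_counts.most_common()
```

Real analytics pipelines add the hard parts the tutorial discusses: linking commits to bug reports reliably, normalizing noisy data, and presenting the results so that they actually inform decisions.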
This half-day tutorial will focus on the presentation of software analytics, its main challenges and opportunities, as well as a sample of recent research works that make use of it. More precisely, the tutorial will explain, present, and discuss the following:
- Introduction to software analytics
- Challenges and opportunities associated with software analytics
- Examples of software engineering problems making use of software analytics.
- Summary and recap of the tutorial
This tutorial is intended for both novices and experts, academics and industrial practitioners. It will provide participants with an understanding of software analytics, the actionable software artefacts it leverages, and a sample of research problems making use of it. We will also discuss the opportunities and challenges associated with this emerging field of software engineering. Participants are encouraged to talk about their recent work related to the tutorial (if any) and to share their experiences and the major challenges they have faced. Experts will be there to guide them and provide feedback.
Latifa Guerrouj is an Assistant Professor at the Department of Software Engineering and Information Technologies of École de Technologie Supérieure, Montréal, Canada. She holds a Ph.D. from the Department of Computer Science and Software Engineering (DGIGL) of Polytechnique de Montréal (2013) and an engineering degree with honours in Computer Science (2008). Latifa's research interests span several software engineering areas, including empirical software engineering, software analytics, data mining, and mining software repositories. More info at http://latifaguerrouj.ca/.
Olga Baysal is an Assistant Professor at the School of Computer Science, Carleton University, Ottawa, Canada. She received a Ph.D. (2014) in Computer Science from the University of Waterloo, Canada. Her research interests span a wide range of software engineering areas, including empirical software engineering, mining software repositories, software analytics, software maintenance and evolution, and human aspects of software engineering. Much of Olga's work focuses on understanding how software engineers create, use and maintain software systems. More info at http://olgabaysal.com/.
Foutse Khomh is an Assistant Professor at the École Polytechnique de Montréal, where he leads the SWAT Lab on software analytics and cloud engineering research. Prior to this position he was a Research Fellow at Queen's University (Canada), working with the Software Reengineering Research Group and the NSERC/RIM Industrial Research Chair in Software Engineering of Ultra Large Scale Systems. He received his Ph.D. in Software Engineering from the University of Montreal in 2010. His main research interest is in the field of empirical software engineering, with an emphasis on developing techniques and tools to improve software quality. He co-founded the International Workshop on Release Engineering and was one of the editors of the first special issue on Release Engineering in the IEEE Software magazine. More info at http://www.khomh.net/.
Xin Xia received his PhD degree from the College of Computer Science and Technology, Zhejiang University, China, in 2014. He is currently a research assistant professor in the same college. He has published in many major international conferences and journals in software engineering, including TSE, ASE, ISSTA, EMSE, ICSME, TR, ASEJ, SANER, ICPC, MSR, ESEM, and ISSRE. His research interests include software analytics, empirical studies, and mining software repositories. More info at http://mypage.zju.edu.cn/en/xinxia.
Automated GUI Testing of Android Apps: Challenges, Approaches, Tools, and Best Practices (Sept 3, half day, PM)
The last decade has seen a tremendous proliferation of mobile computing in our society. Billions of users have access to millions of mobile applications that can be installed directly on their mobile devices and electrical appliances such as TV set-top boxes. Factors such as new monetization/revenue models, programming models, and distribution infrastructures contribute to an "attractive" movement that captivates new and traditional developers, as well as a crowd of other professionals who explore, design, and implement mobile apps. Also, the need for "enterprise apps" that support startups or serve as a new frontend for traditional companies is pushing software-related professionals to embrace mobile technologies. However, the nature of the economy (devices, apps, markets) imposes new challenges on how mobile apps are envisioned, designed, implemented, tested, released, and maintained. For instance, mobile developers and testers face the following critical challenges: (i) continuous pressure from the market for frequent releases, (ii) platform fragmentation at device and OS levels, and (iii) rapid platform/library evolution and API instability.
In order to deal with the aforementioned challenges, continuous testing of mobile apps on a large set of device configurations and under different contextual events (e.g., WiFi connectivity) is a "must-have" in the development process to ensure quality. However, this must be enabled within the constraints of tight release schedules and limited developer and hardware resources. Additionally, both practitioners and researchers must contend with mobile-specific challenges during the execution and testing of mobile apps, including: the event-driven nature of mobile apps, gesture-based interactions, interfaces with sensors, and the possibility of multiple contextual states (e.g., WiFi/GPS on/off).
This tutorial aims at providing the ASE community with up-to-date information on the state of the art and the state of the practice regarding mobile app testing. Specifically, it will address the challenges, approaches, tools, and best practices for GUI testing of Android apps. The tutorial content will help participants understand the main challenges behind mobile testing, and will provide them with useful and actionable information concerning the pros and cons of the approaches and tools available for mobile app GUI testing. Additionally, it will provide guidelines for designing their own infrastructure for mobile app testing. The content of this tutorial is based on the knowledge and experience gathered during the last four years of academic research and industrial collaborations by the members of the SEMERU group at the College of William and Mary.
Our goal with this tutorial is to provide participants with (i) an overview of the automated approaches for GUI testing of Android apps, (ii) hands-on experience with representative tools, and (iii) guidelines and solutions for building their own infrastructures for large-scale testing. To achieve this goal, the tutorial will be structured around the following major topics:
- Mobile GUI testing concepts and challenges (30 mins): participants will be introduced to the basic concepts related to GUI testing and the major challenges for mobile testing, such as Android Framework and SDK limitations, cold starts, GUI event coordination, problematic components, API instability, external dependencies, and the issues of scaling virtual devices, among others.
- Approaches and tools (75 mins): we will present the major approaches for automated GUI testing of mobile apps; in particular, we will cover random/fuzz testing, model-based testing, record-and-replay, crowdsourced testing, and APIs for testing automation (i.e., Espresso and UI Automator). This segment of the tutorial will distill the advantages and disadvantages of these categories, and will further discuss the virtues of specific features implemented as well as natural directions for future work on such tools. In addition, participants will be asked to use some representative industry-strength tools, such as Monkey, Barista, STF, Espresso, and Hierarchy Viewer, as well as some research prototypes, such as MonkeyLab and CrashScope.
- Enabling testing of Android apps (30 mins): we will present an infrastructure designed to overcome most of the challenges of GUI testing of Android apps. The infrastructure enables large-scale execution of Android apps by relying on open-source software and commodity machines. It also follows the open/closed principle to enable customized testing.
- Improving bug reporting (30 mins): during this segment of the tutorial we will present novel ideas and tools which leverage dynamic analysis techniques to better support reporting, detecting, and reproducing bugs/crashes in mobile applications. The first of these tools is an off-device bug reporting mechanism that uses information gleaned from program analysis to help guide a user through reporting detailed reproduction steps for a bug. The second is a novel automated input generation technique capable of discovering, reporting (with reproduction steps and screenshots), and reproducing Android application crashes.
- SEMERU tools demo (30 mins): we will briefly present some tools we have made available to the community for automatic detection of crashes, improved bug reporting, automatic translation of test scripts, and on-device collection of reproduction steps; participants will then use those tools.
- Discussion (15 mins): we will conclude by (i) presenting our vision of future work in automated GUI testing of mobile apps, and (ii) discussing open challenges/issues with the participants.
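To give a flavor of the random (Monkey-style) testing strategy covered in the approaches segment, here is a minimal sketch against a toy event-driven app model. The app model, screen names, and event names are all invented for illustration; none of this is a real Android API or the behavior of the actual Monkey tool:

```python
import random

# Toy event-driven app model: screens mapped to the events they accept and
# the screen each event leads to. "crash" marks a crashing transition.
app = {
    "home":  {"tap_login": "login", "tap_about": "about"},
    "login": {"tap_back": "home", "tap_submit": "crash"},
    "about": {"tap_back": "home"},
}

def random_test(app, start="home", seed=0, max_events=100):
    """Fire random events; return the event trace if a crash is reached.

    Seeding makes the run reproducible, which is what turns a random
    crash into actionable reproduction steps.
    """
    rng = random.Random(seed)
    screen, trace = start, []
    for _ in range(max_events):
        event = rng.choice(sorted(app[screen]))
        trace.append(event)
        screen = app[screen][event]
        if screen == "crash":
            return trace  # reproduction steps for the crash
    return None

trace = random_test(app)
```

The strategies contrasted in the tutorial differ in how they pick the next event: random testing picks blindly as above, model-based testing walks an inferred GUI model systematically, and record-and-replay reuses traces captured from real users.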
The tutorial targets researchers (students and faculty) and practitioners with different levels of experience in automated software engineering (i.e., novice, intermediate, and expert). No prior knowledge of Android development is required. The presenters will introduce basic concepts of GUI-based testing and automated approaches for GUI testing of mobile apps.
Mario Linares-Vasquez recently received his Ph.D. from the College of William and Mary, advised by Dr. Denys Poshyvanyk, and has joined Universidad de los Andes (Colombia) as a tenure-track Assistant Professor. His research interests include evolution and maintenance of mobile apps, automated GUI testing, and the application of data mining and machine learning techniques to support software engineering tasks. His dissertation focused on supporting the evolution and maintenance of Android apps by relying on novel combinations of dynamic analysis, mining software repositories, and static analysis. His papers have been published in top peer-reviewed software engineering venues such as TSE, EMSE, FSE, ICSE, ISSTA, ASE, ICSME, ICST, and MSR. He was awarded a Best Paper Award at ICSME'13, and two ACM SIGSOFT Distinguished Paper Awards at ESEC/FSE'15 and ICPC'16. Linares-Vasquez has served as a program committee member for MSR, SANER, and ICPC. He has also reviewed for various SE journals, including EMSE, JSS, IST, and IEEE Software. More information available at
Kevin Moran is currently a Ph.D. student in the Computer Science Department at the College of William and Mary. He is a member of the SEMERU research group and is advised by Dr. Denys Poshyvanyk. His main research interest is facilitating the processes of software engineering, maintenance, and evolution with a focus on mobile platforms. He graduated with an M.S. degree from William & Mary in August 2015, and his thesis focused on improving bug reporting for mobile apps through novel applications of program analysis techniques. He has published in several top peer-reviewed software engineering venues including ICSE, ESEC/FSE, ICST, and MSR. He was recently recognized as the second-overall winner among graduate students in the ACM Student Research Competition at ESEC/FSE'15. Moran has served as an external reviewer for ICSE, ICSME, FSE, APSEC, and SCAM. More information available at
Denys Poshyvanyk is an Associate Professor in the Computer Science Department at the College of William and Mary, where he leads the SEMERU research group. He received his Ph.D. from Wayne State University, where he was advised by Dr. Andrian Marcus. His current research is in the area of software engineering, evolution and maintenance, program comprehension, reverse engineering, software privacy, repository mining, traceability, performance testing, mobile app (Android) development and testing, energy consumption, and reuse. His papers received several Best Paper Awards at ICPC'06, ICPC'07, ICSM'10, SCAM'10, and ICSM'13, and ACM SIGSOFT Distinguished Paper Awards at ASE'13, ICSE'15, ESEC/FSE'15, and ICPC'16. He is also a recipient of the NSF CAREER award (2013) and the Plumeri Award for Faculty Excellence (2016). Dr. Poshyvanyk previously presented a technical briefing at ICSE'12 on "Software Engineering in the Age of Data Privacy". More information available at http://www.cs.wm.edu/~denys.
Search Based Software Engineering: Foundations, Challenges and Recent Advances (Sept 3, half day, PM)
A growing trend has begun in recent years to move software engineering problems from human-based search to machine-based search that balances a number of constraints to achieve optimal or near-optimal solutions. As a result, human effort is moving up the abstraction chain to focus on guiding the automated search rather than performing the search itself. This emerging software engineering paradigm is known as Search Based Software Engineering (SBSE). It uses search-based optimization techniques, mainly from the evolutionary computation literature, to automate the search for optimal or near-optimal solutions to software engineering problems. The SBSE approach can be, and has been, applied to many problems in software engineering that span the spectrum of activities from requirements to maintenance and reengineering. Success has already been achieved in requirements, refactoring, project planning, testing, maintenance and reverse engineering. However, several challenges must still be addressed to tackle the growing complexity of today's software systems in terms of the number of objectives, constraints and inputs/outputs.
Most software engineering problems are multi- and many-objective by nature, requiring trade-offs between several competing goals. In addition, several software engineering solutions lack robustness due to the dynamic environments of software systems (e.g., requirements change over time). Furthermore, it is essential to understand the points at which human oversight, intervention, resumption of control and decision making should impinge on automation. Human programmers might reject changes made by any automated programming technique: if they feel that they have little understanding or control, there will be a natural reluctance to trust the automated tool. Thus, we have to consider different levels of automation when adapting search algorithms to software engineering problems.
In this tutorial, we will first give an overview of SBSE, and then focus on case studies that we proposed, along with our research groups and industrial partners, addressing the above challenges, including: many-objective software re-modularization, bi-level defect detection, and interactive dynamic multi-objective optimization for software refactoring. Finally, we will discuss possible new research directions in SBSE.
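To show how a software engineering decision becomes a search problem, here is a minimal single-objective sketch. The refactoring gains, costs, and budget are invented toy numbers, and a simple hill climber stands in for the evolutionary algorithms used in SBSE practice:

```python
import random

# Toy problem: choose a subset of candidate refactorings that maximizes
# quality gain without exceeding an effort budget (illustrative numbers).
gains  = [5, 3, 8, 2, 7]   # quality improvement per refactoring
costs  = [4, 2, 6, 1, 5]   # effort per refactoring
BUDGET = 9

def fitness(bits):
    """Fitness guides the search; infeasible solutions are penalized."""
    cost = sum(c for b, c in zip(bits, costs) if b)
    gain = sum(g for b, g in zip(bits, gains) if b)
    return gain if cost <= BUDGET else -1

def hill_climb(seed=0, steps=200):
    """Local search: repeatedly flip one decision, keep non-worse results."""
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in gains]
    for _ in range(steps):
        cand = best[:]
        cand[rng.randrange(len(cand))] ^= 1   # mutation: flip one bit
        if fitness(cand) >= fitness(best):
            best = cand
    return best, fitness(best)

solution, score = hill_climb()
```

The multi- and many-objective formulations discussed above replace the single scalar `fitness` with a vector of competing objectives and keep a whole Pareto front of trade-off solutions instead of a single `best`.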
- Introduction and Context
- SBSE foundations and concepts
- Case studies
- Search-based Model-Driven Engineering
- Search-based Software Refactoring
- Search-based Software Testing
- Future research directions
The targeted audience is researchers interested in learning how optimization techniques can be applied to real-world software engineering problems. We do not expect the audience to have a strong background in optimization.
Dr. Marouane Kessentini is an Assistant Professor in the Department of Computer and Information Science at the University of Michigan-Dearborn. He is the founder of the Search-Based Software Engineering (SBSE) research lab. He has several collaborations with industrial companies on the use of computational search, machine learning and evolutionary algorithms to address software engineering problems such as software quality, software testing, software migration, and software evolution. He received his PhD from the University of Montreal in 2012 and a Presidential BSc Award from the President of Tunisia in 2007. He has received many grants from both industry and federal agencies and has published around 75 papers in search-based software engineering journals and conferences, including 3 best paper awards. He has served as a program committee member for several major conferences (GECCO, MODELS, ICMT, SSBSE, etc.), an editorial board member of several journals (SQJ, ASE, IST, TEVC and EMSE), and an organization member of many conferences and workshops. He was also the co-chair of the SBSE track at the GECCO 2014 and GECCO 2015 conferences and is now the general chair of the 8th IEEE Search Based Software Engineering Symposium (SSBSE 2016). He is also the founder of the North American Symposium on Search Based Software Engineering, funded by the National Science Foundation (NSF), and a guest editor of the first Special Issue on Search Based Software Engineering in the IEEE Transactions on Evolutionary Computation journal (2016). He is an invited speaker at the 2016 IEEE World Congress on Computational Intelligence (Vancouver, Canada), giving a tutorial on SBSE. More information is available at http://www-personal.umd.umich.edu/~marouane/.
Dr. Ali Ouni is a Research Assistant Professor in the Department of Computer Science at Osaka University, where he is a member of the Software Engineering Laboratory (SEL). He received his Ph.D. degree in computer science from the University of Montreal in 2014. For his exceptional Ph.D. research productivity, he was awarded the Excellence Award from the University of Montreal. He was a visiting researcher at Missouri University of Science and Technology and at the University of Michigan, in 2013 and 2014 respectively. His research interests are in software engineering, including software maintenance and evolution, refactoring of software systems, software quality, service-oriented computing, and the application of artificial intelligence techniques to software engineering. He has served as a program committee member and reviewer for several journals and conferences. He is a member of the IEEE and the IEEE Computer Society. More information is available at https://sites.google.com/site/ouniaali/.
Learn to Build Automated Software Analysis Tools with Graph Paradigm and Interactive Visual Framework (Sept 4, full day)
Software analysis has become complex enough to be intimidating to new students and professionals. It can be difficult to know where to start with over three decades of staggering research in data and control flow analyses and a plethora of analysis frameworks to choose from, ranging in maturity, support, and usability. While textbooks, surveys and papers help, nothing beats the personal experience of implementing and experimenting with classic algorithms.
With support from DARPA, we have developed a graph paradigm enabled with an interactive visual framework for implementing and experimenting with software analysis algorithms. Parsed programs, along with pre-computed data and control flows, are stored as a graph database so that analyzers with varying accuracy and scalability tradeoffs can be easily implemented using a high-level query language. The graphical as well as textual composition of queries, interactive visualization, and the two-way correspondence between the code and its graph models are integrated through a platform called Atlas. With this machinery, the implementation and visualization effort is reduced by as much as 10- to 50-fold, making it much easier to learn about and do research on software analysis algorithms, with applications to software safety and security. The tutorial will provide the necessary background, including implementations of widely used algorithms. Participants will learn to prototype several algorithms in a short timeframe.
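To illustrate the flavor of the graph paradigm (without reproducing Atlas's actual schema or query API, which differ), here is a sketch of a parsed program stored as a typed edge set and queried with small composable operations. The node names, edge kinds, and functions are all hypothetical:

```python
# Hypothetical program graph: typed edges over program elements.
edges = [
    ("main", "call", "parse"),
    ("main", "call", "render"),
    ("parse", "call", "read_file"),
    ("read_file", "dataflow", "buffer"),
    ("buffer", "dataflow", "render"),
]

def successors(nodes, kind):
    """One query step: follow edges of a given kind out of a node set."""
    return {dst for src, k, dst in edges if k == kind and src in nodes}

def reachable(roots, kind):
    """Transitive closure along one edge kind (e.g., the call graph)."""
    seen, frontier = set(roots), set(roots)
    while frontier:
        frontier = successors(frontier, kind) - seen
        seen |= frontier
    return seen

# Query: everything transitively callable from main.
calls = reachable({"main"}, "call")
```

Because each query step maps a node set to a node set, analyses compose naturally (e.g., restrict the call closure to nodes that also have `dataflow` edges), which is the property that makes graph-based analyzers quick to prototype.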
- Introduction to the graph paradigm with interactive visualization.
- Overarching fundamental concepts and algorithms, and a survey of the seminal research on software analysis algorithms.
- Demonstrations of implementing classic algorithms using the graph paradigm.
- A case study of building a software verification tool for C and its application to verify safety properties of the Linux kernel.
- Case studies to illustrate how to build advanced toolboxes for Java and Java bytecode and use them for research, teaching, or industry adoption.
- Resources for the participants to continue their studies beyond the tutorial.
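To give a flavour of the graph paradigm, the sketch below runs a toy reachability query over a pre-computed control-flow graph. Everything here (the graph, the node names, and the query function) is an illustrative assumption written in Python; it is not the Atlas query API.

```python
# Illustrative only: a toy "graph paradigm" query, not the Atlas API.
# A program's control flow is stored as a graph; analyses then become
# graph traversals, such as forward reachability from a statement.

from collections import deque

# Hypothetical control-flow graph: node -> list of successor nodes
cfg = {
    "entry": ["read_input"],
    "read_input": ["check"],
    "check": ["sanitize", "log_error"],
    "sanitize": ["use_input"],
    "log_error": ["exit"],
    "use_input": ["exit"],
    "exit": [],
}

def forward_reachable(graph, start):
    """Return all nodes reachable from `start` (a forward slice)."""
    seen, work = {start}, deque([start])
    while work:
        node = work.popleft()
        for succ in graph.get(node, []):
            if succ not in seen:
                seen.add(succ)
                work.append(succ)
    return seen

# Query: which statements can execute after `check`?
print(sorted(forward_reachable(cfg, "check")))
# → ['check', 'exit', 'log_error', 'sanitize', 'use_input']
```

In a graph database the same query is a one-liner over stored edges; the point of the paradigm is that analyzers compose such traversals instead of re-implementing parsing and flow computation.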
The tutorial will exemplify how automated software engineering research, teaching, and practice can benefit from the graph paradigm and interactive visual framework. Participants can bring laptops to gain firsthand experience applying the graph paradigm interactively. We will provide, on memory sticks, the Atlas platform, reference implementations of classic algorithms, a set of commonly needed analyzers, and the C and Java test programs. The tutorial will present a conceptual framework that is applicable to any programming language. The Atlas platform and the specific implementations will be for C, Java, or Java bytecode.
Suresh Kothari (Suraj) is the Richardson Professor of Electrical and Computer Engineering (ECE) at Iowa State University (ISU) and the founder of EnSoft. He has served as a PI for the DARPA Automated Program Analysis for Cybersecurity (APAC) program and a Co-PI for the DARPA Software Enabled Control (SEC) program. Currently he is a PI for the Space/Time Analysis for Cybersecurity (STAC) program. EnSoft, the company he founded in 2002, provides software products and services worldwide to more than 300 companies, including all major avionics and automobile companies in the United States, Asia, and Europe. In 2012 he was awarded the Iowa State Board of Regents Professor Award for excellence in research, teaching, and service. He has served as a Distinguished ACM Lecturer. More information is available at http://class.ece.iastate.edu/kothari/.
Ben Holland is a research scientist at ISU working on DARPA projects. He has extensive experience of writing program analyzers to detect novel and sophisticated malware in Android applications. He has served on the ISU team as a key analyst for DARPA's APAC program. He has given talks at Derbycon 4.0 in Louisville, Kentucky and at DARPA's headquarters in Arlington, Virginia. His past work experience has been in mission assurance at MITRE, government systems at Rockwell Collins, and systems engineering at Wabtec Railway Electronics. He holds a master's degree in Computer Engineering and Information Assurance, a B.S. in Computer Engineering, and a B.S. in Computer Science. Currently he serves on the ISU team for DARPA's STAC program. More information is available at https://ben-holland.com/.
Testing Stochastic Software (Sept 4, half day, AM)
Despite 40 years of research in automated test data generation, testing programs with nondeterministic behaviour remains a major challenge. Traditional testing techniques are designed for programs that always produce the same output for any given set of inputs. Increasingly, however, researchers and industry practitioners are interested in developing software that has stochastic behaviour. Examples include machine learning, Search-Based Software Engineering, metaheuristic optimisation and Monte-Carlo simulations. These approaches are used to comprehensively explore the range of potential outcomes and can identify novel solutions to complex problems. Yet they also create additional challenges for software testing, since we often cannot say for certain whether any one particular output is correct. Failures in stochastic software can lead to losses of time, money and reputation, and even to injury and death. However, existing techniques are unable to cope with uncertainty in the correctness of the outputs. How then can we test such stochastic software?
This tutorial introduces a statistical approach to testing stochastic software. Instead of checking the software outputs one at a time for each test, the output distributions are investigated over multiple executions. This is important since, even though the outputs observed so far may appear to be correct, the next execution could potentially produce an unexpected result (for exactly the same inputs). Statistical tests will therefore be presented (based on frequentist, likelihood and Bayesian statistics) to estimate how likely it is that the software contains a fault. Some stochastic programs are particularly difficult to test because their behaviour depends non-linearly on a large number of variables. In these cases, faults may exist in the software for a long time before they are noticed, and it is difficult to know whether a fault has been fixed. This tutorial therefore includes search-based techniques for finding input values that make discrepancies more apparent in the outputs. This, in combination with metaheuristics and pseudo-oracles, provides a comprehensive solution for testing stochastic software. Practical examples are provided from a number of different fields to illustrate the challenges and solutions to testing stochastic software. The techniques presented will also be useful in coping with other forms of uncertainty in the software's output, and should help to inspire further research into effective new techniques for test data generation.
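As a minimal sketch of the distribution-level idea, the Python fragment below runs a stochastic program many times and compares the observed output distribution against the expected one using a Pearson chi-squared statistic. The program under test, sample size, and significance threshold are illustrative assumptions, not material from the tutorial itself.

```python
# A minimal sketch of distribution-level testing for stochastic software.
# Instead of checking a single output, run the program many times and
# compare the observed output distribution with the expected one.

import random
from collections import Counter

def stochastic_program(rng):
    """Hypothetical program under test: should return 0..5 uniformly."""
    return rng.randrange(6)

def chi_squared(observed, expected):
    """Pearson chi-squared statistic between observed and expected counts."""
    return sum((observed.get(k, 0) - e) ** 2 / e for k, e in expected.items())

rng = random.Random(42)          # fixed seed for a repeatable demo
runs = 6000
counts = Counter(stochastic_program(rng) for _ in range(runs))
expected = {k: runs / 6 for k in range(6)}

stat = chi_squared(counts, expected)
# The critical value for chi-squared with 5 degrees of freedom at
# alpha = 0.05 is about 11.07; a much larger statistic suggests a fault.
print(f"chi-squared = {stat:.2f}:",
      "suspicious" if stat > 11.07 else "plausible")
```

Note that a passing run does not prove correctness, and a single failing run does not prove a fault; repeated significance at a chosen alpha is what raises suspicion, which is exactly why the tutorial frames this in terms of estimating how likely it is that the software contains a fault.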
- Introduction to stochastic software, its uses in various areas of application, and the challenges in testing it compared with traditional deterministic software
- Exploration of the various statistical approaches that can be applied to test stochastic software, including Bayesian and likelihood-based statistics as well as metaheuristics and pseudo-oracles
- Presentation of techniques to identify subtle faults in stochastic software through search-based optimisation and statistical fitness functions
- Description of probabilistic model checking techniques to verify whether, or to what extent, the behaviour of stochastic software satisfies its specification
- Demonstration of stochastic software testing on programs from a variety of application areas, and explanation of how these techniques can be used to account for other forms of uncertainty
The tutorial is intended to be of interest to participants from both industry and academia, including those who are:
- Researchers looking for interesting new opportunities and challenges for software testing
- Software engineers seeking new ways of addressing the problems they face in testing their software
- Anyone who is curious about the ways in which software engineering and statistics can be combined
Dr Matthew Patrick is a Wellcome Trust Junior Interdisciplinary Fellow in the Department of Plant Sciences at the University of Cambridge, holds a Research Associateship at Corpus Christi College, Cambridge, and is a Fellow of the Cambridge Philosophical Society. He is currently researching new techniques to address specific challenges involved with testing scientific software (stochasticity, big data and mathematical assumptions). Dr Patrick is also interested in combining software engineering and statistical analysis more broadly, with particular expertise in biologically-inspired techniques such as evolutionary algorithms and mutation testing. Prior to joining the University of Cambridge, he worked in the Agronomic Information Services group at Syngenta Crop Protection AG in Basel, Switzerland, and in 2013 he obtained a PhD in Computer Science from the University of York, UK for research into mutation testing. More information is available at http://www.plantsci.cam.ac.uk/directory/patrick-matthew.
Dr Guoxin Su is a Senior Research Fellow in the Department of Computer Science, School of Computing at the National University of Singapore (NUS). He is interested in developing novel methods of probabilistic model checking and improving its applicability (e.g., efficiency and flexibility) to a variety of real-world problem areas, including adaptive software, runtime monitoring, networking and cybersecurity. He has published a number of papers in the premier venues of Software Engineering and Formal Methods, including IEEE TSE, ICSE, CONCUR, ATVA and FASE. Prior to joining NUS, he received a PhD from the University of Technology, Sydney in 2013. More information is available at http://www.comp.nus.edu.sg/~sugx/.
Mining and Modelling Unstructured Data (Sept 4, half day, AM)
Artifacts containing natural language, like Q&A websites (e.g., Stack Overflow), tutorials, and development emails, are essential to support software development. They have become a popular subject for software engineering research. The analysis of such artifacts is particularly challenging because of their heterogeneity: These resources consist of natural language interleaved with fragments of multiple programming and markup languages. Our tutorial is aimed at overcoming this challenge, by first discussing the state of the art of methodologies to analyze unstructured data, along with their current limitations and challenges. Then, it focuses on our efforts towards a systematic approach to model the contents of such artifacts. This in turn enables novel holistic analyses that fully exploit their intrinsic heterogeneous nature. We describe the theoretical foundations of our StORMeD framework, how it can be used to extract a full-fledged model of development artifacts, and how it can be leveraged to construct various types of analyses, such as summarization.
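The core idea of modelling a heterogeneous artifact as typed fragments rather than flat text can be sketched in a few lines. The Python below is purely illustrative: it is not the StORMeD API (which is Scala-based and builds a full heterogeneous AST), and the `<code>` fragment convention is a hypothetical simplification.

```python
# Illustrative sketch (not the StORMeD API): model a heterogeneous
# artifact as a sequence of typed fragments instead of flat text.

import re

def parse_artifact(text):
    """Split a post into ("text", ...) and ("code", ...) fragments."""
    # Hypothetical convention: code fragments are wrapped in <code>...</code>.
    fragments = []
    for chunk in re.split(r"(<code>.*?</code>)", text, flags=re.S):
        chunk = chunk.strip()
        if not chunk:
            continue
        if chunk.startswith("<code>"):
            body = chunk[len("<code>"):-len("</code>")].strip()
            fragments.append(("code", body))
        else:
            fragments.append(("text", chunk))
    return fragments

post = ("How do I copy a list in Python? <code>b = a[:]</code> "
        "Slicing produces a shallow copy of the list.")
print(parse_artifact(post))
```

Once fragments are typed, each can be handed to the right analysis (an island parser for the code, NLP for the prose), which is the holistic view the tutorial develops in far richer form with the H-AST.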
Part 1: Theoretical Session
- Context: We discuss the structure of heterogeneous artifacts and their importance for SE research.
- State of the Art: We illustrate the main approaches in literature highlighting their limitations and challenges.
- Technical Concepts: We introduce StORMeD and its technical concepts, including the grammar implementation and how it models artifacts through a heterogeneous abstract syntax tree (H-AST).
Part 2: Hands-on Session
- StORMeD service and API Usage: We introduce the practical usage of the service and its API to parse and model artifacts, including the analysis of Stack Overflow discussions.
- Application to Summarization: We present the implementation of a practical application to summarize Stack Overflow discussions using a holistic similarity metric.
- Wrap-up: We conclude the tutorial by discussing other potential applications and ideas where our approach can be beneficial.
Our tutorial targets academics who consider heterogeneous development artifacts as the subject of any kind of study. In particular, any analysis that takes the intrinsic heterogeneous nature of the artifacts into account, i.e., a holistic analysis, can potentially benefit from our approach. Thus, we believe that researchers in the mining software repositories community, the program comprehension community, and in general software maintenance and evolution, constitute the best match as a target audience for this tutorial.
We expect the audience to possess basic knowledge of parsing and grammars. We plan to use the Scala programming language for the tutorial, but without using any advanced features: We expect the audience to be experienced with object-oriented programming and basic functional programming, like lambda expressions and simple higher-order functions (e.g., map, reduce).
Luca Ponzanelli has been a PhD student at the Università della Svizzera italiana (USI) since September 2012. He is working in the REVEAL research group under the supervision of Prof. Dr. Michele Lanza. He received his MSc from USI in 2012, and his BSc from the University of Milano-Bicocca in 2010.
Andrea Mocci is a postdoctoral researcher at Università della Svizzera italiana (USI), working in the REVEAL research group headed by Prof. Michele Lanza. His interests include the analysis of development artifacts, developer behavior, and software components. He obtained his PhD in 2011 under the supervision of Prof. Carlo Ghezzi, and he was a postdoctoral fellow at MIT in 2012 with Prof. Daniel Jackson.
Michele Lanza is a full professor at the faculty of informatics of Università della Svizzera italiana, where he founded the REVEAL research group in 2004. He co-authored over 150 journal and conference papers and the book Object-Oriented Metrics in Practice. His activities span various international software engineering research communities. He has served on the program committees of ICSE, FSE, ICSME, ICPC, and MSR. He has given tutorials and technical briefings at ICSE 2006, OOPSLA 2007, ICSE 2008, and ICSE 2011.
Using Docker Containers to Improve Reproducibility in Software Engineering Research (Sept 4, half day, PM)
The ability to replicate and reproduce scientific results has become an increasingly important topic for many academic disciplines. In computer science and, more specifically, software engineering, contributions of scientific work rely on developed algorithms, tools and prototypes, quantitative evaluations, and other computational analyses. Published code and data come with many undocumented dependencies and configurations that are internal knowledge, making reproducibility hard to achieve. This tutorial presents how Docker containers can overcome these issues and aid the reproducibility of research artifacts in software engineering, and discusses their applications in the field.
- Introduction to Containers and Reproducibility of SE Research.
The tutorial will start by introducing the term reproducibility in relation to SE research. It will then introduce container technologies and how they can help with reproducibility.
- Docker Container Basics.
The tutorial will give a short overview of the Docker ecosystem and introduce its basic building blocks and tooling. This part will also walk through the process and the concrete instructions necessary to build an initial container.
- Software Engineering Use Cases.
This part of the tutorial will walk through 2 concrete use cases found in software engineering research. First, it will show how to package a typical evaluation/quantitative analysis part of a SE research paper where the analysis has been conducted with the statistical package R. As a second use case, it will look at how to package an existing, open sourced prototype implementation of a Software Engineering research paper. In both examples, the concrete instructions to construct the Docker image will be elaborated along the way.
- Open Challenges and Limitations.
We conclude the tutorial with a discussion of the open challenges that remain in the area of reproducibility and the limitations that still exist.
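For the first use case, packaging an R-based quantitative analysis, a minimal Dockerfile might look like the sketch below. The base image tag, file names, and entry script are illustrative assumptions, not the tutorial's actual material:

```dockerfile
# Start from a pinned R base image so the environment is fixed
FROM r-base:3.3.1

# Copy the analysis script and data into the image
COPY analysis.R data.csv /home/analysis/

# Run the analysis when the container starts
WORKDIR /home/analysis
CMD ["Rscript", "analysis.R"]
```

With such a file in place, `docker build -t myanalysis .` followed by `docker run myanalysis` reproduces the analysis on any machine with Docker installed, because the dependencies and configuration live in the image rather than in undocumented local setup.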
This tutorial is suitable for both academic researchers and industry professionals who want to learn more about Docker containers and reproducibility in general. No prior knowledge of Docker or any other container technology is necessary. To follow along with the instructions, we assume basic skills in working with the Linux console (e.g., bash). The audience will be pointed to further material, for those who want to learn more about container technologies.
Jurgen Cito is a Ph.D. candidate at the University of Zurich, Switzerland. In his research, he investigates the intersection between software engineering and cloud computing. In the summer of 2015, he was a research intern at the IBM T.J. Watson Research Center in New York, where he worked on cloud analytics based on Docker containers. That year he also won the local Docker Hackathon in New York City with the project docker-record. More recently, he has been a mentor at various Docker events and meetups. More information is available at: http://www.ifi.uzh.ch/seal/people/cito.html.
Harald C. Gall is a professor of software engineering in the Department of Informatics at the University of Zurich, Switzerland. His research interests include software engineering, focusing on software evolution, software quality analysis, software architecture, reengineering, collaborative software engineering, and service centric software systems. He was the program chair of the European Software Engineering Conference and the ACM SIGSOFT ESEC-FSE in 2005 and the program co-chair of ICSE 2011. More information is available at: http://www.ifi.uzh.ch/seal/people/gall.html.