Using Docker Containers to Improve Reproducibility in Software and Web Engineering Research

Monday, June 6th, 2016 11:00-15:30, Room 355

Slides: slideshare

The ability to replicate and reproduce scientific results has become an increasingly important topic for many academic disciplines. In computer science and, more specifically, software and Web engineering, contributions of scientific work rely on developed algorithms, tools and prototypes, quantitative evaluations, and other computational analyses. Published code and data come with many undocumented assumptions, dependencies, and configurations that are internal knowledge and make reproducibility hard to achieve. This tutorial presents how Docker containers can overcome these issues and aid the reproducibility of research artefacts in software engineering and discusses their applications in the field.

Pre-requisites: Follow the instructions here to come prepared for the tutorial with the required software tools installed on your machine.

Jürgen Cito is a Ph.D. candidate at the University of Zurich, Switzerland. In his research, he investigates the intersection between software engineering and cloud computing. In the summer of 2015, he was a research intern at the IBM T.J. Watson Research Center in New York, where he worked on cloud analytics based on Docker containers. That year he also won the local Docker Hackathon in New York City with the project docker-record. More information is available at: http://www.ifi.uzh.ch/seal/people/cito.html

Vincenzo Ferme is a Ph.D. candidate at the Università della Svizzera Italiana (USI), Lugano, Switzerland. In his research, he is involved in the BenchFlow Project. The goal of the project is to design the first benchmark for assessing and comparing the performance of workflow management systems. In the context of the project, he is developing a framework for automated software performance benchmarking that largely relies on Docker. More information is available at: http://www.vincenzoferme.it

Harald C. Gall is a professor of software engineering in the Department of Informatics at the University of Zurich, Switzerland. His research interests include software engineering, focusing on software evolution, software quality analysis, software architecture, reengineering, collaborative software engineering, and service-centric software systems. He was the program chair of the European Software Engineering Conference and the ACM SIGSOFT ESEC-FSE in 2005 and the program co-chair of ICSE 2011. More information is available at: http://www.ifi.uzh.ch/seal/people/gall.html

Recommender Systems meet Linked Open Data

Tuesday, June 7th, 2016 11:00-13:00, Room 355

Slides: slideshare

Information overload is a problem we experience daily when accessing information channels such as a Web site, a mobile application or even a set-top box. There is a clear need for applications that can guide users through an apparently chaotic information space, filtering, in a personalized way, only those elements that may be of interest to them. Recommender systems were originally conceived with e-commerce scenarios in mind, but they rapidly spread to different knowledge and application domains and are nowadays a fundamental building block of many personalized information access systems. Together with the transformation of the Web from a distributed and hyperlinked repository of documents to a distributed repository of structured knowledge, a new generation of recommendation engines has emerged in recent years. Today, large amounts of RDF data are available in the Web of Data, but only a few applications really exploit their potential. The availability of such data is an opportunity to feed personalized information access tools such as recommender systems. These new engines rely on the use of Linked Data as a source of information to enrich item descriptions and provide new services.

The main goals of this tutorial are:

  • Provide an introduction to recommender systems by describing the main approaches available to design and feed the recommendation engine;
  • Show how to exploit the information available in the Linked Open Data cloud to develop a new generation of recommender systems.
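
As a rough illustration of the second goal, the sketch below treats Linked Data style triples as item features and ranks unseen items by feature overlap with the items a user liked. It is a minimal content-based example and not the tutorial's own material: the triples, the identifiers such as `film:A`, and the choice of Jaccard similarity are all assumptions made for illustration.

```python
# Toy item descriptions as (subject, predicate, object) triples.
# All identifiers are made up for illustration, not taken from a real dataset.
TRIPLES = [
    ("film:A", "director", "person:X"),
    ("film:A", "genre", "genre:SciFi"),
    ("film:B", "director", "person:X"),
    ("film:B", "genre", "genre:Drama"),
    ("film:C", "director", "person:Y"),
    ("film:C", "genre", "genre:SciFi"),
    ("film:D", "director", "person:Z"),
    ("film:D", "genre", "genre:Comedy"),
]


def item_features(triples):
    """Group the (predicate, object) pairs of each subject into a feature set."""
    features = {}
    for subject, predicate, obj in triples:
        features.setdefault(subject, set()).add((predicate, obj))
    return features


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(liked, triples, top_n=2):
    """Rank items the user has not liked by similarity to the liked items."""
    features = item_features(triples)
    # The user profile is the union of the features of all liked items.
    profile = set()
    for item in liked:
        profile |= features.get(item, set())
    scores = {
        item: jaccard(profile, feats)
        for item, feats in features.items()
        if item not in liked
    }
    # Sort by descending score, then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]
```

Calling `recommend(["film:A"], TRIPLES)` ranks `film:B` and `film:C` first, since each shares one of the two features of `film:A`. In a real system the triples would come from a SPARQL endpoint or an RDF dump rather than a hard-coded list.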

Tommaso Di Noia has been an Associate Professor in the field of Information Processing Systems at Polytechnic University of Bari since 2014. Currently, his main research topics deal with Linked Open Data and how to leverage the knowledge encoded in Big Data datasets in order to develop content-based/collaborative/context-aware recommendation engines (recommender systems). Strongly related to this latter research topic is the analysis and modeling of User Profiles in Information Retrieval scenarios. As for Linked Open Data, he is interested in the whole process of production, publication, maintenance and exploitation of the ultimate technological solutions for Open Data.

A Declarative Approach to Information Extraction using Web Service API

Tuesday, June 7th, 2016 14:00-17:30, Room 355

The number of diverse web services that we use regularly is increasing significantly. Most of these services are managed by autonomous service providers. It has become very difficult to get a unified view of this widespread data, which in all likelihood is of substantial importance to enterprises. A classical approach followed by enterprises is to write applications in imperative languages that make use of the web service APIs. Such an approach does not scale and is difficult to maintain, considering the ever-evolving web services landscape. This tutorial explores a declarative approach to information extraction from web services using basic web and database technologies. It is targeted at an audience from both industry and academia and requires a basic understanding of database principles and web technologies.
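
To make the contrast with imperative code concrete, here is a minimal sketch of one possible declarative setup. It is not the presenters' method. The two services, their JSON payloads, the `MAPPINGS` table and the `tasks` schema are all hypothetical. The idea shown is only that per-service knowledge lives in data (a field-to-column mapping), while the question itself is asked in SQL.

```python
import json
import sqlite3

# Hypothetical JSON payloads standing in for the responses of two web service APIs.
RESPONSES = {
    "service_a": json.dumps([
        {"id": 1, "title": "Fix login", "state": "open"},
        {"id": 2, "title": "Update docs", "state": "closed"},
    ]),
    "service_b": json.dumps([
        {"key": "T-7", "summary": "Add search", "status": "open"},
    ]),
}

# Declarative part: which JSON field of each service feeds which column.
# Supporting a new service means adding an entry here, not writing new code.
MAPPINGS = {
    "service_a": {"task_id": "id", "name": "title", "status": "state"},
    "service_b": {"task_id": "key", "name": "summary", "status": "status"},
}


def load(conn, responses, mappings):
    """Load every response into one unified table, driven only by the mappings."""
    conn.execute(
        "CREATE TABLE tasks (source TEXT, task_id TEXT, name TEXT, status TEXT)"
    )
    for source, payload in responses.items():
        mapping = mappings[source]
        for record in json.loads(payload):
            conn.execute(
                "INSERT INTO tasks VALUES (?, ?, ?, ?)",
                (
                    source,
                    str(record[mapping["task_id"]]),
                    record[mapping["name"]],
                    record[mapping["status"]],
                ),
            )


def open_tasks(conn):
    """The unified view is queried declaratively with plain SQL."""
    rows = conn.execute(
        "SELECT source, name FROM tasks WHERE status = 'open' ORDER BY source, name"
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
load(conn, RESPONSES, MAPPINGS)
```

Here `open_tasks(conn)` returns the open tasks of both services in one result. When a provider renames a field, only the corresponding mapping entry changes.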

John Samuel is a post-doctoral researcher at LIRIS, Université de Lyon, France. He obtained his PhD in Computer Science from Université Blaise Pascal, France, in 2014. Prior to that, he worked as a software engineer at Yahoo!, Bangalore. His research interests include data integration, analysis and visualization, web services, knowledge representation and geographical information systems.
Homepage: https://liris.cnrs.fr/john.samuel/

Christophe Rey obtained his PhD in Computer Science in 2004 at Blaise Pascal University, France. Since 2005, he has been an associate professor at the same university. His main research topics are relational databases and knowledge representation and reasoning (using description logic), applied to data integration, and especially the mediation approach with LAV mappings.
Homepage: http://fc.isima.fr/~crey/

Design science research in information systems and software systems engineering

Wednesday, June 8th, 2016 11:00-15:30, Room 355

Slides: PDF

The last ten years have seen a surge of interest in design science research in information systems, and in empirical research in software engineering. In this tutorial I present a framework for design science in information and software systems engineering that shows how, in design science research, we iterate over designing new artifacts and empirically investigating these artifacts. To be relevant, the artifacts should potentially contribute to organizational goals, and to be empirically sound, research to validate new artifacts should provide insight into the effects of using these artifacts in an organizational context. The logic of both of these activities, design and empirical research, is that of rational decision making. I show how this logic can be used to structure our technical and empirical research goals and questions, as well as how to structure reports about our technical or empirical research. This gives us checklists for the design cycle used in technical research and for the empirical cycle used in empirical research. Finally, I will discuss in more detail what the role of theories in design science research is, and how we use theory to state research questions and to generalize the research results.

Roel Wieringa occupies the chair of Information Systems at the Department of Computer Science at the University of Twente, The Netherlands. His research interests include requirements engineering, enterprise architecture, and design science research methodology for information systems and software engineering. He has written three books: Requirements Engineering: Frameworks for Understanding (Wiley, 1996), Design Methods for Reactive Systems: Yourdon, Statemate and the UML (Morgan Kaufmann, 2003), and Design Science Methodology for Information Systems and Software Engineering (Springer, 2014).

Intended Audience

Researchers planning to do empirical work in the context of a technical research project. The audience will obtain an understanding of how to guard the relevance of their research for practice by doing problem-oriented research, as well as how to validate new technology empirically in a methodologically sound way.

Distributed Web Applications with IPFS

Thursday, June 9th, 2016 9:00-17:30, Room 355

IPFS, the InterPlanetary File System, is the distributed and permanent Web, a protocol to make the Web faster, more secure, open and available. IPFS could be seen as Git meets a BitTorrent swarm, exchanging objects within one Git repository. In other words, IPFS provides a high throughput content-addressed block storage model, with content-addressed hyperlinks. This forms a generalised Merkle DAG, a data structure upon which one can build versioned file systems, blockchains, and even a Permanent Web. IPFS combines a Distributed Hash Table, an incentivised block exchange, and a self-certifying namespace. IPFS has no single point of failure, and nodes do not need to trust each other.

In this full-day tutorial, participants will learn about the IPFS Application Stack, namely: libp2p, the networking layer of IPFS, which supports multiple transport protocols and routing mechanisms; bitswap, the data exchange protocol that enables peers to request and offer blocks of data; Merkle DAG, a Merkle Tree type data structure where blobs of data are referenced by their cryptographic hash, so that their integrity can be validated and they can be discovered in the network; and the API, the interface used by other applications to use IPFS. The tutorial will have presentation and hands-on components, and will end with a discussion to help attendees understand how IPFS can be used for more specific use cases.
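
The idea of referencing data by its cryptographic hash can be sketched in a few lines. The example below is a toy, in-memory illustration and not IPFS code: the `BlockStore` class, the JSON node encoding and the use of plain SHA-256 hex digests as addresses are assumptions made here for brevity, and real IPFS uses its own object encoding and hash format.

```python
import hashlib
import json


class BlockStore:
    """Toy content-addressed store: a node's address is the hash of its bytes."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes, links=()):
        # A node holds raw data plus the addresses of its children.
        # sort_keys makes the encoding deterministic, so equal content
        # always produces the same address.
        node = json.dumps(
            {"data": data.hex(), "links": list(links)}, sort_keys=True
        ).encode("utf-8")
        address = hashlib.sha256(node).hexdigest()
        self._blocks[address] = node
        return address

    def get(self, address):
        node = self._blocks[address]
        # Integrity check: anyone can verify a block against its address.
        if hashlib.sha256(node).hexdigest() != address:
            raise ValueError("block does not match its address")
        decoded = json.loads(node)
        return bytes.fromhex(decoded["data"]), decoded["links"]


def read_all(store, address):
    """Walk the DAG depth-first and concatenate the data of every node."""
    data, links = store.get(address)
    return data + b"".join(read_all(store, link) for link in links)


store = BlockStore()
leaf_a = store.put(b"hello ")
leaf_b = store.put(b"world")
# The root links to its children by hash, forming a small Merkle DAG.
root = store.put(b"", links=[leaf_a, leaf_b])
```

Because the root's address depends on the addresses of its children, changing any leaf yields a different root address, while storing identical content twice yields the same address. This is what makes such links self-verifying and deduplicating.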

David Dias is a P2P Software Engineer and Researcher at Protocol Labs (http://ipn.io/), the company behind IPFS. Previously, he worked in the security and web development industry at ^lift security. David holds a Master of Science with a focus on P2P, having built the first P2P DHT using WebRTC specifically for the Web Platform for job execution distribution. David is a frequent speaker on P2P, security and distributed systems. Currently he is also an invited Professor at the University of Lisbon, having developed a new postgraduate course on modern Web development. Follow him at @daviddias.

Friedel Ziegelmayer is a Software Engineer and Researcher at Protocol Labs (http://ipn.io), the company building IPFS. Previously he studied mathematics at the Albert-Ludwig-University in Freiburg. He also works on other open source projects like the test runner Karma.
You can follow him at @dignifiedquire or look at his code.

Intended Audience

Developers, students and researchers. A general interest in cryptography, distributed systems and P2P protocols is recommended. Familiarity with JavaScript or Go is a bonus.

Learning Outcomes

  • How to use IPFS to build a distributed Application
  • Understand how using libp2p enables IPFS


During the course of this tutorial, participants will learn how IPFS was designed and the reasons behind its architecture decisions, and will build a Web application using IPFS. The topics covered are:

  • Merkle Data Structures
  • Distributed Hash Tables
  • Peer Routing strategies
  • Peer Discovery strategies
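
As a taste of the hash table and peer routing topics, the sketch below shows the XOR distance metric used by Kademlia-style DHTs to decide which peers are responsible for a key. It illustrates one common strategy and makes no claim about what the tutorial itself covers. The peer names and the helper functions `node_id`, `xor_distance` and `closest_peers` are hypothetical.

```python
import hashlib


def node_id(name: str) -> int:
    """Derive a 256-bit identifier by hashing a name (peer or content key)."""
    return int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: the bitwise XOR of two identifiers."""
    return a ^ b


def closest_peers(key: str, peers, k=2):
    """Return the k peers whose identifiers are closest to the key's identifier."""
    target = node_id(key)
    return sorted(peers, key=lambda p: xor_distance(node_id(p), target))[:k]
```

Peers and keys share one identifier space, so a lookup for a key is routed towards the peers closest to it under this metric. A key that hashes to exactly a peer's identifier has distance zero to that peer.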