This repository contains the source code implementation of FOCUS and the datasets used to replicate the experimental results of our ICSE'19 paper:
FOCUS: A Recommender System for Mining API Function Calls and Usage Patterns
Phuong T. Nguyen, Juri Di Rocco, Davide Di Ruscio, Lina Ochoa, Thomas Degueule, Massimiliano Di Penta
A pre-print version of the paper is available here.
Our paper has been awarded two badges by the ICSE 2019 Artifact Evaluation Track, namely "Artifacts Available" and "Artifacts Evaluated." This means that all the related artifacts have been properly documented, and they are consistent, complete, and reproducible. Furthermore, they include appropriate evidence to facilitate future reuse and reproduction. We also strictly adhere to norms and standards of the research community for artifacts of this type.
Detailed instructions on how to experiment with the artifacts are provided here.
FOCUS is a context-aware collaborative-filtering recommendation system that exploits cross relationships among OSS projects to suggest the inclusion of additional API invocations and concrete API usage patterns. The current implementation targets Java code specifically.
Implementing a collaborative-filtering recommendation system requires assessing the similarity of two customers, which in our setting are two software projects. Existing approaches consider that any two projects using an API of interest are equally valuable sources of knowledge. Instead, we postulate that not all projects are equal when it comes to recommending usage patterns: a project that is highly similar to the project currently being developed should provide higher quality patterns than a highly dissimilar one.
Our collaborative-filtering recommendation system attempts to narrow down the search scope by only considering the projects that are the most similar to the active project. Therefore, methods that are typically used conjointly by similar projects in similar contexts tend to be recommended first.
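The idea of restricting the search to the most similar projects can be illustrated with a minimal sketch. It is not the FOCUS implementation: it assumes each project is reduced to a flat set of API invocation names, uses Jaccard similarity as a stand-in for the similarity measure described in the paper, and scores each candidate invocation by the summed similarity of the neighbours that use it. The class name `SimilarProjectRecommender`, the method names, and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SimilarProjectRecommender {

    /** Jaccard similarity of two invocation sets (illustrative stand-in, not FOCUS's measure). */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    /**
     * Ranks invocations that the active project does not use yet.
     * Only the k most similar projects contribute; each candidate is scored
     * by the summed similarity of the neighbours that contain it.
     */
    static List<String> recommend(Set<String> active, Map<String, Set<String>> others, int k) {
        // Similarity of every other project to the active one.
        Map<String, Double> sim = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : others.entrySet()) {
            sim.put(e.getKey(), jaccard(active, e.getValue()));
        }

        // Most similar projects first; ties broken by name for determinism.
        List<String> names = new ArrayList<>(others.keySet());
        names.sort((x, y) -> {
            int bySim = Double.compare(sim.get(y), sim.get(x));
            return bySim != 0 ? bySim : x.compareTo(y);
        });

        // Accumulate scores from the top-k neighbours only.
        Map<String, Double> scores = new HashMap<>();
        for (String name : names.subList(0, Math.min(k, names.size()))) {
            for (String invocation : others.get(name)) {
                if (!active.contains(invocation)) {
                    scores.merge(invocation, sim.get(name), Double::sum);
                }
            }
        }

        // Highest-scoring invocations first; ties broken by name.
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((x, y) -> {
            int byScore = Double.compare(scores.get(y), scores.get(x));
            return byScore != 0 ? byScore : x.compareTo(y);
        });
        return ranked;
    }

    public static void main(String[] args) {
        // Hypothetical data: invocation names are plain strings for brevity.
        Set<String> active = Set.of("Paths.get", "Files.exists");
        Map<String, Set<String>> others = Map.of(
            "p1", Set.of("Paths.get", "Files.exists", "Files.readAllLines"),
            "p2", Set.of("Paths.get", "Files.exists", "Files.readAllLines", "Files.write"),
            "p3", Set.of("Math.sqrt", "Math.abs"));
        // Prints [Files.readAllLines, Files.write]
        System.out.println(recommend(active, others, 2));
    }
}
```

With `k = 2`, the dissimilar project `p3` is never consulted, so its invocations do not appear in the ranking at all. This is the narrowing effect described above, shown here on a deliberately simplified data model.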
We incorporate these ideas in a new context-aware collaborative filtering recommender system that mines OSS repositories to provide developers with API FunctiOn Calls and USage patterns: FOCUS. Our approach employs a new model to represent mutual relationships between projects and collaboratively mines API usage from the most similar projects.
This repository is organized as follows:
- The tools directory contains the implementation of the different tools we developed:
  - Focus: The Java implementation of FOCUS
  - FocusRascal: A set of tools written in Rascal that are used to (i) transform raw Java source and binary code into FOCUS-processable and PAM-processable data, and (ii) retrieve concrete Java usage patterns
  - PAM: A set of Python scripts for comparing our approach with PAM
- The dataset directory contains the datasets described in the paper that we use to evaluate FOCUS:
  - jars: 3,600 JAR files extracted from Maven Central (the raw MVL dataset)
  - MV_L: meta-data of the MVL dataset (extracted from 3,600 JAR files)
  - MV_S: meta-data of the MVS dataset (extracted from 1,600 JAR files)
  - SH_L: meta-data of the SHL dataset (extracted from the source code of 610 GitHub projects)
  - SH_S: meta-data of the SHS dataset (extracted from the source code of 200 GitHub projects)
Note 1: The archive of 5,147 Java projects retrieved from GitHub via the Software Heritage archive is available at this URL.
Note 2: The results presented in the paper can be reproduced by following the instructions contained in the Focus directory.
If you find our work useful for your research, please cite the paper using the following BibTeX entry:
@inproceedings{Nguyen:2019:FRS:3339505.3339636,
author = {Nguyen, Phuong T. and Di Rocco, Juri and Di Ruscio, Davide and Ochoa, Lina and Degueule, Thomas and Di Penta, Massimiliano},
title = {FOCUS: A Recommender System for Mining API Function Calls and Usage Patterns},
booktitle = {Proceedings of the 41st International Conference on Software Engineering},
series = {ICSE '19},
year = {2019},
location = {Montreal, Quebec, Canada},
pages = {1050--1060},
numpages = {11},
url = {https://doi.org/10.1109/ICSE.2019.00109},
doi = {10.1109/ICSE.2019.00109},
acmid = {3339636},
publisher = {IEEE Press},
address = {Piscataway, NJ, USA},
}