
Checkstyle GSoC 2022 Project Ideas

Roman Ivanov edited this page Apr 20, 2024 · 23 revisions

Participated, https://summerofcode.withgoogle.com/archive/2022/organizations/checkstyle



Extension with smaller projects:


Project Name: Auto-fix Module

Skills required: intermediate Java

Project type: new feature implementation.

Project goal: implement new module, test it on real projects

Project size: large

Mentors: Roman Ivanov, Daniel Mühlbachler

Description: Checkstyle is known as a tool that raises numerous minor issues. There are so many of them, and they are so minor, that it is hard to find the time and an engineer to fix them. Most of the issues are easy to fix, but navigating to the relevant part of the code and making the fix takes time. Engineers could spend this time doing something more valuable. Auto-fix functionality could significantly simplify the introduction of Checkstyle to a project, as it would do most of the tedious work automatically.

The majority of Checkstyle violations specifically target code formatting. IDE formatting settings are often out of sync with the Checkstyle configuration. An IDE can fix the code itself as part of its auto-formatting, and Checkstyle should be able to do the same. Each Check that targets code formatting should have “Fix” functionality built in. This functionality will convert code with a violation to compliant code without any user interaction. Such functionality is in huge demand by users.

In the scope of this project, it is required to review all existing auto-fix functionality in plugins and tools to learn what challenges they face and to collect the full list of requirements for such a task. Implement auto-fix for formatting Checks as part of a special Module that takes all reported violations and fixes those that support auto-fix. If the resulting functionality proves to be easy to maintain and reusable by Checkstyle plugins, then API changes can be proposed for the core library to allow any plugin to reuse it.
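To make the idea of a fixing Module more concrete, here is a minimal sketch of how fixes could be applied once each violation has been turned into a text edit. It is only an illustration under stated assumptions: `AutoFixSketch`, `TextEdit` and `apply` are hypothetical names and are not part of the Checkstyle API, and real fixes would need to be derived from the reported violations.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: none of these types exist in Checkstyle. */
class AutoFixSketch {

    /** Replaces the characters in [start, end) of the source with new text. */
    static final class TextEdit {
        final int start;
        final int end;
        final String replacement;

        TextEdit(int start, int end, String replacement) {
            this.start = start;
            this.end = end;
            this.replacement = replacement;
        }
    }

    /**
     * Applies non-overlapping edits from the end of the source towards the
     * start, so offsets of not-yet-applied edits stay valid.
     */
    static String apply(String source, List<TextEdit> edits) {
        List<TextEdit> sorted = new ArrayList<>(edits);
        sorted.sort(Comparator.comparingInt((TextEdit e) -> e.start).reversed());
        StringBuilder result = new StringBuilder(source);
        int previousStart = Integer.MAX_VALUE;
        for (TextEdit edit : sorted) {
            if (edit.end > previousStart) {
                throw new IllegalArgumentException("overlapping edits");
            }
            result.replace(edit.start, edit.end, edit.replacement);
            previousStart = edit.start;
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // Two whitespace fixes around '=' in "int x=1;".
        List<TextEdit> edits = new ArrayList<>();
        edits.add(new TextEdit(5, 5, " "));
        edits.add(new TextEdit(6, 6, " "));
        System.out.println(apply("int x=1;", edits));
    }
}
```

Applying edits back to front is one simple way to avoid recalculating offsets after each change; rejecting overlapping edits keeps two Checks from rewriting the same span in conflicting ways.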

More details at https://github.com/checkstyle/checkstyle/issues/7427

Links to similar tools: https://docs.openrewrite.org/tutorials/automatically-fix-checkstyle-violations

AI auto-fix for Checkstyle: https://link.springer.com/article/10.1007/s10664-021-10107-0


Project Name: Optimization of distance between methods in single Java class

Skills required: basic Java, good analytical abilities, good background in mathematics.

Project type: new feature implementation.

Project goal: to make quality practices automated and publicly available.

Project size: large

Mentors: Roman Ivanov, Baratali Izmailov

Description:

This task is an ambitious attempt to improve code readability by minimizing the jumps/scrolls a user has to make in a source file to see the details of a method's implementation when looking at the method's first usage.

It is required to analyse a lot of code and find a model that minimizes the distance between a method's first usage and its declaration in the same file, while respecting users' preferences to keep overloaded and overridden methods grouped together. Other preferences may appear during the investigation of open-source projects.

The first step has already been done by our team: we created a web service, methods-distance, that calculates distances between methods and builds a DSM matrix to ease analysis. We already use it in our project.

As a second step, it is required to take the matrix of distances between methods and optimize it with some empirical algorithm that lets the user define the expected model of a class through arguments. This will allow the algorithm to be used as a Check that enforces code structure automatically at build time.
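As a rough illustration of the metric being optimized, the sketch below scores a method ordering by summing, for every called method, the number of positions between its declaration and the first method in file order that calls it. This is an assumed, simplified model, not the one used by methods-distance; `MethodDistanceSketch` and `totalDistance` are hypothetical names, and the call graph is given by hand.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a "distance to first usage" score for one class. */
class MethodDistanceSketch {

    /**
     * Sums, over every called method, the number of positions between its
     * declaration and the first method (in file order) that calls it.
     */
    static int totalDistance(List<String> order, Map<String, List<String>> calls) {
        Map<String, Integer> firstCaller = new HashMap<>();
        for (int i = 0; i < order.size(); i++) {
            List<String> callees =
                calls.getOrDefault(order.get(i), Collections.<String>emptyList());
            for (String callee : callees) {
                firstCaller.putIfAbsent(callee, i);
            }
        }
        int total = 0;
        for (Map.Entry<String, Integer> entry : firstCaller.entrySet()) {
            int declaration = order.indexOf(entry.getKey());
            if (declaration >= 0) { // ignore calls to methods outside the class
                total += Math.abs(declaration - entry.getValue());
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, List<String>> calls = new HashMap<>();
        calls.put("process", Arrays.asList("validate", "save"));
        calls.put("save", Arrays.asList("log"));

        List<String> scattered = Arrays.asList("process", "log", "save", "validate");
        List<String> reordered = Arrays.asList("process", "validate", "save", "log");
        System.out.println(totalDistance(scattered, calls)); // 6
        System.out.println(totalDistance(reordered, calls)); // 4
    }
}
```

An optimizing Check would search for an ordering with a lower score, subject to extra constraints such as keeping overloaded methods adjacent; those constraints are left out here.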

Results of the project:

  • an article with all details of the analysis and the algorithm;
  • a new Checkstyle Check with the optimization algorithm, to share the algorithm with the whole Java community.

Proof of necessity: we have a number of PRs where contributors put new methods at arbitrary places in a class, although a better place is close to the first usage. Example #1, Example #2, Example #3, ....


Project Name: Reconcile formatters of Eclipse, NetBeans and IntelliJ IDEA IDEs by Checkstyle config.

Skills required: basic Java.

Project type: new feature implementation, analysis of existing IDE features.

Project goal: to make well-known quality practices publicly available.

Project size: large

Mentors: Roman Ivanov, Pavel Bludov

Description:

Using different IDEs in the same team is already a serious problem, as different IDEs format code based on their own rules and configurations. Unwanted formatting changes creep into the code, which complicates the code-review process. The problem becomes more acute when a project uses a static analysis tool like Checkstyle that has a wide range of code formatting Checks.

It is required to make it possible to use the same Checkstyle config in IDEs without conflicts with the IDEs' internal formatters. This will let team members choose their IDE independently while keeping the same format and code style throughout the team.

The main focus of this project is the analysis of the formatting abilities of IDEs (indentation, import order, declaration order, separator/operator wrap, ...). Existing Checkstyle Rules need to be updated to work in a similar and non-conflicting way.

Results of the project:

  • create IDE configurations for the Checkstyle project, so that the Checkstyle team can use them to auto-format code to conform to the checkstyle_check.xml file used by Continuous Integration
  • create Checkstyle config that follows default Eclipse formatting + inspection rules
  • create Checkstyle config that follows default IntelliJ IDEA formatting + inspection rules
  • create Checkstyle config that follows default NetBeans formatting + inspection rules

Proof of necessity: mail-list post #1, mail-list post #2, mail-list post #3, discussion #1


Project Name: OpenJDK Code Convention coverage

Skills required: basic Java.

Project type: new feature implementation.

Project goal: to make well-known quality practices publicly available.

Project size: large

Mentors: Roman Ivanov, Pavel Bludov

Description:

OpenJdk Code Convention was one of the first guidelines on how to write Java code. It is marked as outdated (because of the date of its last update), but the best practices described there do not have an expiration date. The new OpenJDK Java Style Guidelines are close to a final version and will most likely be the successor of OpenJdk Code Convention. But a number of Apache projects still follow the OpenJdk rules, so both configurations are needed by the community.

OpenJdk Code Convention is already partly covered by Checkstyle, where it is known as Sun Code Convention. Many validation Rules have been added and changed in Checkstyle since Sun's configuration was created (in 2004).

During the project, it is required to review both documents in detail and prove publicly that Checkstyle covers all guideline rules. Missing functionality needs to be created, and blocking bugs need to be fixed. The page OpenJdk Java Style Checkstyle Coverage needs to be updated. A new page, "New OpenJDK's Java Style Checkstyle Coverage", needs to be created. Both pages need to be formatted in the same way as Google's Java Style Checkstyle Coverage.

Proof of necessity: javadoc issues on GitHub; results of open survey; requests from users for OpenJDK coverage support.


Project Name: Coverage of Documentation Comments Style Guide

Skills required: basic Java.

Project type: new feature implementation.

Project goal: to make well-known quality practices publicly available.

Project size: large

Mentors: Roman Ivanov, Pavel Bludov

Description:

The project will mainly focus on automating Documentation Comments (javadoc) guidelines with Checkstyle Checks. Reliable comment parsing was a major improvement in Checkstyle during GSoC 2014; the results achieved then need to be reused to reliably implement automation of Javadoc best practices.

A separate configuration file with the newly created Checks needs to be created. Documentation best practices do not make sense for all projects: Javadoc validation matters only for library projects that need to publish their documentation online.

The result of this project will be a configuration file with the maximum possible coverage of the Comment style guide. The report should look like Google's Java Style Checkstyle Coverage. If there is time left, we can focus on coverage of the guidelines from https://blog.joda.org/2012/11/javadoc-coding-standards.html

Proof of necessity: javadoc issues on GitHub.


Project Name: Spellcheck of Identifiers by English dictionary

Skills required: intermediate Java.

Project type: new feature implementation.

Project goal: implement spell checking for all identifiers in Java code.

Project size: large

Mentors: Roman Ivanov, Andrei Paikin

Description:

The correct spelling of words in code is very important, since a typo in the name of a method that is part of an API could result in a serious problem. Mistakes in names also make reading code frustrating and misleading, especially when a one-letter typo forces a developer to read the javadoc or even the implementation of the method. The two most popular IDEs (Eclipse and IntelliJ IDEA) already have spell-check ability. It will be beneficial for Checkstyle to have the same functionality, usable in any Continuous Integration system through the Command Line Interface or as part of a build tool (maven, ant, gradle, ...), with a wide range of options to customize it to users' needs. Features of existing spell-checkers need to be analyzed: IntelliJ IDEA Spellchecking, Eclipse Spelling. There are a number of open-source projects that do spell-checking. It is ok to reuse them if the license is compatible. Examples: https://code.google.com/archive/p/bspell/ , http://www.softcorporation.com/products/spellcheck/, ... https://github.com/giraciopide/shellcheck-maven-plugin
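One possible starting point, shown here only as a sketch, is to split each identifier into words on camelCase and underscore boundaries and look each word up in a dictionary. `IdentifierSpellSketch`, `misspelledWords` and the tiny hand-written dictionary are assumptions for illustration; a real Check would load a full English dictionary and would need rules for abbreviations and digits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch: splits identifiers into words and checks a dictionary. */
class IdentifierSpellSketch {

    /** Word boundaries: lower-to-upper case, end of an acronym, or underscore. */
    private static final String BOUNDARY =
        "(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|_";

    /** Returns the lower-cased words of the identifier missing from the dictionary. */
    static List<String> misspelledWords(String identifier, Set<String> dictionary) {
        List<String> result = new ArrayList<>();
        for (String word : identifier.split(BOUNDARY)) {
            String lower = word.toLowerCase(Locale.ROOT);
            if (!lower.isEmpty() && !dictionary.contains(lower)) {
                result.add(lower);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed toy dictionary; a real one would hold thousands of words.
        Set<String> dictionary = new HashSet<>(Arrays.asList(
            "get", "user", "name", "max", "value", "parse", "xml", "file"));
        System.out.println(misspelledWords("getUserNmae", dictionary));  // [nmae]
        System.out.println(misspelledWords("MAX_VALUE", dictionary));    // []
        System.out.println(misspelledWords("parseXMLFile", dictionary)); // []
    }
}
```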


Project Name: Automate verification of documentation for all modules and generation of web site content based on javadoc of modules

Skills required: intermediate Java

Project type: creation of new functionality.

Project goal: organize documentation and automate its maintenance

Project size: medium

Mentors: Roman Ivanov

Description: Checkstyle is an active project. Our user base is always requesting that existing functionality be expanded and that brand new features be added. As these features are added to the core Checkstyle project, documentation must be updated so that users not involved in the request learn of their existence. Some changes can drastically change the default behavior of a module. Documentation becomes extremely important in helping users understand how our modules work and how they can be configured to fit each person's needs without looking at the source code.

Documentation is mostly a manual process, and it is easy to forget to update it during the fix workflow. Missing documentation can go unnoticed for years, as users can only go by the documentation to know what exists. Even when a gap is caught and someone tries to fill it, some contributors are not aware of our best practices for writing documentation.

We want to automate most of our documentation creation to avoid the manual processes involved. Automation will ensure that all Checkstyle documentation follows a strict standard that we define. Besides ensuring that all configurable options are documented, it will help detect whether the current usage examples are sufficient or more are needed. It will ensure that the examples provided are valid, compilable if they are Java, and do or do not produce violations as described for the configuration and Check in question. For any new module, it will print out a template for the contributor to follow and fill in with the information specific to that Check, such as descriptions.

As part of this project, students must ensure that all documentation verification passes for the existing documentation and that xdoc/html content is generated automatically and does not need manual updates.


Project Name: Practice what you preach

Skills required: intermediate Java

Project type: improving quality.

Project goal: improving automation of code review.

Project size: large

Mentors: Roman Ivanov, Nick Mancuso, Andrei Paikin

Description: There are a lot of static analysis tools for the Java language, and it is not a problem to activate them in a project and get a report on what to improve. The problem is the next step: how to start fixing the reported problems gradually while keeping the same focus on feature delivery and not letting anybody else contribute new violations while the team is fixing old problems.

Finding time to fix old problems is a hard but possible task. Not all engineers agree to spend time on this, so usually a small group of engineers starts doing it to let others see the benefits, and later the number of involved engineers grows. The most frustrating fact for engineers is that while they resolve such violations, other contributors create new ones (most of the time unintentionally). Fixing a problem after the code is merged to the common code base is several times harder than fixing it at the time of initial implementation.

The goal of this project is to find ways to activate more analysis tools on our code and to enforce stricter rules step by step, without interfering with other engineers and without letting them contribute new violations once some rules are enforced in a certain part of the code.

Exact tasks with goal above in mind:

  1. move the TeamCity inspections config to a configuration file kept in the code, and explain every inspection suppression in a separate javadoc tag to make explicit why the violation is a false positive or wontfix.
  2. enforce 0 violations by errorprone over our code.
  3. use archunit in UTs for design verification within classes and between classes.
  4. extend the usage of pitest to activate all mutators and cover all surviving mutations with tests; resolve all existing suppressions.
  5. activate checkerframework and fix the violations it reports.

Project Name: Automate release process of checkstyle

Skills required: basic Java and basic bash scripting, basic CI knowledge

Project type: infrastructure improvement

Project goal: remove last human factor point from maintenance of project

Project size: medium

Mentors: Roman Ivanov, Nick Mancuso, Andrei Paikin

Description: An open-source project is really free when it does not depend on particular people, everyone can contribute to it, and a group of people can adjust Continuous Integration (CI) and do an official release easily.

In our project the CI process is already defined in code, and everyone can see it and propose changes to it. But our release process is only 80% automated, and unfortunately it depends on 1 person. The release process takes some time to complete, which is why releases are scheduled "once a month", but in reality we are ready to release at any point in time from any commit, as our development process already implies this. The release process focuses on delivering details to users rather than just bumping a version and sharing jars. So we focus on generating release notes to share all details with users and publishing them on web hosting and social media. The release process requires a lot of actions from a particular person who has admin access to the websites/hosts and other accounts needed to make a release.

The goal of this project is to finish the automation so that any person with Read-Write access in GitHub can do a release by triggering a CI execution. The release process should live in scripts/code in our repo(s), so that everyone can see what is going on and, if required, do a release manually in case CI is experiencing downtime or any other problems.

Completion of this project will help us increase the frequency of releases if required, so that users can receive a released version almost right after code changes are merged to the main code base. It will make contributing to the project more attractive, as it will be clear that as soon as a code change is accepted you can use it in the project where you are experiencing the defect. The frequency of releases will be defined later and will probably be based on the severity of fixed defects; contributors can tell us how long they are ok to wait for a release.

Project Name: Regression Testing Tool and HTML Report Generator for Pull Request

Skills required: basic Java, or Shell, or Groovy, or Scala; basic understanding of testing principles.

Project type: creation of testing tools.

Project goal: to enforce quality and ease the implementation of new Rules in the project.

Description:

Checkstyle needs a tool that does regression testing based on a proposed patch (Pull Request). The tool needs to ensure that fixing an issue does not introduce new problems or unexpected behaviours. It is required to parse git changes and find the changed modules. Based on the list of changed modules, it should generate testing configurations (Checkstyle configuration files) and run them with binaries built from the code before the change and with the proposed change. A diff report (differences in violations) needs to be generated and be ready to share on the web.
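The core of the diff report can be sketched as a set difference between the violations reported by the two binaries. The snippet below is a simplified illustration, not the regression-tool implementation: `ViolationDiffSketch` and `onlyIn` are hypothetical names, and violations are assumed to be plain `file:line: message` strings.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: compares violations before and after a patch. */
class ViolationDiffSketch {

    /** Returns the violations present in the first set but not in the second, sorted. */
    static Set<String> onlyIn(Set<String> first, Set<String> second) {
        Set<String> result = new TreeSet<>(first);
        result.removeAll(second);
        return result;
    }

    public static void main(String[] args) {
        Set<String> before = new TreeSet<>(Arrays.asList(
            "A.java:10: Missing a Javadoc comment.",
            "B.java:3: Line is longer than 100 characters."));
        Set<String> after = new TreeSet<>(Arrays.asList(
            "A.java:10: Missing a Javadoc comment.",
            "C.java:7: Unused import."));

        System.out.println("Removed by patch: " + onlyIn(before, after));
        System.out.println("Added by patch:   " + onlyIn(after, before));
    }
}
```

Comparing raw strings is the simplest possible key; a real report would likely compare structured fields (file, line, column, module, message) so that the differences can be grouped per module.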

Each module could have a set of manually prepared configuration chunks that should be used if that module was changed. But full automation is highly desirable.

Base Repo: https://github.com/checkstyle/regression-tool

Proof of necessity: issues on GitHub requesting new validations; mail-list thread that describes the reason for the temporary moratorium on new Rules in Checkstyle; official sandbox project with about 40 additional Rules; validation ideas that would be good to borrow from Groovy experience; just another custom Rules for Checkstyle: 1, 2, 3, 3, 4, 5, 6, ... ; results of open survey, wiki page with ideas, link to issues that are created based on discussions in the team.


Project Name: Pitest Resolution

Project type: Resolving outstanding Pitest Issues

Project goal: to enforce quality and reduce backlog of pitest suppressions

Description:

Checkstyle recently introduced a new pitest suppression model and new mutators to our repo and suppressed the currently outstanding pitest issues. PIT is a state of the art mutation testing system, providing gold standard test coverage for Java and the JVM. It ensures that the code we have written is backed by tests that demonstrate why it is written as it is. Some of our code base has been around for a while and was likely introduced without tests thorough enough to show all aspects of its intended behavior.

Checkstyle needs these suppressions reviewed to identify whether new tests and input files can resolve them and show that the code is functionally sound as it is. This may require diving into the logic of the modules, either to come up with a test or to decide what to do with the suppression. Checkstyle has a tool in another repo that helps with some of these pitest issues by identifying sources that can be used as inputs for tests; it is listed in the helpful links.

Helpful Links: pitest homepage, thought process example in working pitest, example pitest resolved issue, tool assisting with checkstyle pitest


Project Name: Break Away from Maven Plugin

Skills required: basic Java, Shell, Groovy

Project goal: to allow Checkstyle to not be held back on changes because of maven-checkstyle-plugin

Description:

Checkstyle is a library used by many other tools. We have become too dependent on another tool, maven-checkstyle-plugin, for all our CI testing and custom regression. This reliance keeps preventing Checkstyle from releasing breaking changes in our repo, as they also break all usages in CI. It has constantly required us to do workarounds so as not to disturb our connection to and reliance on maven-checkstyle-plugin.

Checkstyle needs to break away and really only rely on tools we maintain. Below is a list of connected issues which detail some of the areas that need to change in order to break away from this plugin.

Connected Issues: Launch/Diff Groovy should remove use of maven-checkstyle-plugin, Convert sevntu-checkstyle-check to ant run, Convert regressions that use maven-checkstyle-plugin to CLI based

Example of Plugin Issue: Upgrade XML logger to XML 1.1
