Skip to content

GitLab Action to autograde projects based on a configurable set of metrics

License

Notifications You must be signed in to change notification settings

uhafner/autograding-gitlab-action

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Autograding GitLab Action

GitHub Actions CodeQL

This action autogrades projects based on a configurable set of metrics and gives feedback on merge requests (or single commits) in GitLab. I use this action to automatically grade student projects in my lectures at the Munich University of Applied Sciences.

You can see the results of this action in the example pull request of a fake student project. You will find another GitLab pipeline example in the top-level folder of this project.

Pull request comment

Please have a look at my companion coding style and Maven parent POM to see how to create Java projects that can be graded using this GitLab action. If you are hosting your project on GitHub, then you might be interested in my identical GitHub action as well.

Both actions are inspired by my Jenkins plugins:

The Jenkins plugins work in the same way but are much more powerful and flexible and show the results additionally in Jenkins' UI.

Please note that the action works on report files that are generated by other tools. It does not run the tests or static analysis tools itself. You need to run these tools in a previous step of your pipeline. See the example below for details. This has the advantage that you can use a tooling you are already familiar with. So the action will run for any programming language that can generate the required report files. There are already more than one hundred analysis formats supported. Code and mutation coverage reports can use the JaCoCo, Cobertura, OpenCover and PIT formats, see the coverage model for details. Test results can be provided in the JUnit, XUnit, or NUnit XML-formats.

Configuration

The autograding action must be added as a separate step of your GitLab pipeline since it is packaged in a Docker container. This step should run after your normal build and testing steps so that it has access to all produced artifacts. Make sure to configure your build to produce the required report files (e.g., JUnit XML reports, JaCoCo XML reports, etc.) even when there are test failures or warnings found. Otherwise, the action will only show partial results.

In the following example, this step is bound to the test stage. You need to provide a GitLab access token as CI/CD Variable GITLAB_TOKEN to create comments in the merge request. The action will use the token to authenticate against the GitLab API. See the next section for details.

build:
  stage: build # (compile, test with code and mutation coverage, and run static analysis)
  artifacts:
    paths:
      - target # Make results available for the autograding action in the next stage
  script:
    - mvn -V --color always -ntp clean verify -Dmaven.test.failure.ignore=true -Ppit

test:
  image: uhafner/autograding-gitlab-action:v1 # Or use a specific tag like 1.9.0
  stage: test
  variables: 
    # You need to provide a GitLab access token as CI/CD Variable GITLAB_TOKEN in the GitLab user interface 
    CONFIG: > # Override default configuration: just grade the test results
      {
      "tests": {
        "tools": [
          {
            "id": "test",
            "name": "Unittests",
            "pattern": "**/target/*-reports/TEST*.xml"
          }
        ],
        "name": "Unit and integration tests",
        "skippedImpact": -1,
        "failureImpact": -5,
        "maxScore": 100
        }
      }
    SKIP_LINE_COMMENTS: 'true'
  script:
    - java -cp @/app/jib-classpath-file edu.hm.hafner.grading.gitlab.GitLabAutoGradingRunner

Action Parameters

This action can be configured using the following environment variables (see example above):

  • GITLAB_TOKEN: mandatory GitLab access token, see next section for details.
  • CONFIG: "{...}": optional configuration, see next sections for details, or consult the autograding-model project for the exact implementation. If not specified, a default configuration will be used.
  • DISPLAY_NAME: "Name in the comment title": optional name in the comment title (overwrites the default: "Autograding score").
  • SKIP_LINE_COMMENTS: true: Optional flag to skip the creation of commit comments at specific lines (for warnings and missed coverage). By default, line comments are created.
  • MAX_WARNING_COMMENTS: <number>: Optional parameter to limit the number of warning comments at specific lines. By default, all line comments are created.
  • MAX_COVERAGE_COMMENTS: <number>: Optional parameter to limit the number of coverage comments at specific lines. By default, all line comments are created.
  • SKIP_DETAILS: true: Optional flag to skip the details of the results (e.g., stack trace of failed tests, autograding detail tables) in the commit or merge request comment. By default, the detailed report is created.

GitLab Access Token

The action needs a GitLab access token to create comments in the commit notes or merge request. You can create a new token with a meaningful name in your GitLab user or group settings. This token name will be used as author for all notes. The token needs the permissions api, read_api, read_repository, write_repository

Group Access Token

This token then needs to be added as a CI/CD Variable in your GitLab project settings. The name of the variable must be GITLAB_TOKEN and the value is the token you just created.

CI/CD Variables

Metrics Configuration

The individual metrics can be configured by defining an appropriate CONFIG environment variable (in JSON format) in your GitLab pipeline:

Currently, you can select from the metrics shown in the following sections. Each metric can be enabled and configured individually. All of these configurations are composed in the same way: you can define a list of tools that are used to collect the data, a name and icon (Markdown identifier) for the metric, and a maximum score. All tools need to provide a pattern where the autograding action can find the result files in the workspace (e.g., JUnit XML reports). Additionally, each tool needs to provide the parser ID of the tool so that the underlying model can find the correct parser to read the results. See analysis model and coverage model for the list of supported parsers.

Additionally, you can define the impact of each result (e.g., a failed test, a missed line in coverage) on the final score. The impact is a positive or negative number and will be multiplied with the actual value of the measured items during the evaluation. Negative values will be subtracted from the maximum score to compute the final score. Positive values will be directly used as the final score. You can choose the type of impact that matches your needs best.

Test statistics (e.g., number of failed tests)

Test statistics

This metric can be configured using a JSON object tests, see the example below for details:

{
  "tests": {
    "tools": [
      {
        "id": "test",
        "name": "Unittests",
        "pattern": "**/junit*.xml"
      }
    ],
    "name": "JUnit",
    "passedImpact": 10,
    "skippedImpact": -1,
    "failureImpact": -5,
    "maxScore": 100
  }
}

You can either count passed tests as positive impact or failed tests as negative impact (or use a mix of both). Skipped tests will be listed individually. For failed tests, the test error message and stack trace will be shown directly after the summary in the pull request.

Code or mutation coverage (e.g., line coverage percentage)

Code coverage summary

This metric can be configured using a JSON object coverage, see the example below for details:

{
  "coverage": [
    {
      "tools": [
        {
          "id": "jacoco",
          "name": "Line Coverage",
          "metric": "line",
          "sourcePath": "src/main/java",
          "pattern": "**/jacoco.xml"
        },
        {
          "id": "jacoco",
          "name": "Branch Coverage",
          "metric": "branch",
          "sourcePath": "src/main/java",
          "pattern": "**/jacoco.xml"
        }
      ],
      "name": "JaCoCo",
      "maxScore": 100,
      "coveredPercentageImpact": 1,
      "missedPercentageImpact": -1
    },
    {
      "tools": [
        {
          "id": "pit",
          "name": "Mutation Coverage",
          "metric": "mutation",
          "sourcePath": "src/main/java",
          "pattern": "**/mutations.xml"
        }
      ],
      "name": "PIT",
      "maxScore": 100,
      "coveredPercentageImpact": 1,
      "missedPercentageImpact": 0
    }
  ]
}

You can either use the covered percentage as positive impact or the missed percentage as negative impact (a mix of both makes little sense but would work as well). Please make sure to define exactly a unique and supported metric for each tool. For example, JaCoCo provides line and branch coverage, so you need to define two tools for JaCoCo. PIT provides mutation coverage, so you need to define a tool for PIT that uses the metric mutation.

Missed lines or branches as well as survived mutations will be shown as comments in the merge request:

Line coverage comment Branch and mutation coverage comment

Static analysis (e.g., number of warnings)

Static analysis

This metric can be configured using a JSON object analysis, see the example below for details:

{
  "analysis": [
    {
      "name": "Style",
      "id": "style",
      "tools": [
        {
          "id": "checkstyle",
          "name": "CheckStyle",
          "pattern": "**/target/checkstyle-result.xml"
        },
        {
          "id": "pmd",
          "name": "PMD",
          "pattern": "**/target/pmd.xml"
        }
      ],
      "errorImpact": 1,
      "highImpact": 2,
      "normalImpact": 3,
      "lowImpact": 4,
      "maxScore": 100
    },
    {
      "name": "Bugs",
      "id": "bugs",
      "icon": "bug",
      "tools": [
        {
          "id": "spotbugs",
          "name": "SpotBugs",
          "sourcePath": "src/main/java",
          "pattern": "**/target/spotbugsXml.xml"
        }
      ],
      "errorImpact": -11,
      "highImpact": -12,
      "normalImpact": -13,
      "lowImpact": -14,
      "maxScore": 100
    }
  ]
}

Normally, you would only use a negative impact for this metric: each warning (of a given severity) will reduce the final score by the specified amount. You can define the impact of each severity level individually.

All warnings will be shown as comments in the merge request:

Warning annotations

Monitoring Project Quality

You can also use this action to monitor the quality of your project without creating a grading score: use only configurations that do not set the maxScore properties. Then this action basically works like a stand-alone version of my Jenkins Warnings and Coverage plugins, see the following screenshots.

Quality Monitor Overview Quality Monitor Details