Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] CPD: Include processing errors in XML report #4992

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

adangel
Copy link
Member

@adangel adangel commented May 3, 2024

Describe the PR

Collects processing errors and includes them in CPD report and renders them
in XML. It uses the same format as the PMD report (although we have no formal
schema for CPD XML reports).

This is a nice addition to #4991 - it allows you to look at the report file
rather then at the log output to find any processing errors.

This extends the CPD XML report format. We extends the format in the past (adding file elements,
adding attributes) without bigger problems in e.g. maven-pmd-plugin, which reads this format.

Related issues

Ready?

  • Added unit tests for fixed bug/feature
  • Passing all unit tests
  • Complete build ./mvnw clean verify passes (checked automatically by github actions)
  • Added (in-code) documentation (if needed)

@adangel adangel added the in:cpd Affects the copy-paste detector label May 3, 2024
@pmd-test
Copy link

pmd-test commented May 3, 2024

1 Message
📖 Compared to master:
This changeset changes 0 violations,
introduces 0 new violations, 0 new errors and 0 new configuration errors,
removes 0 violations, 0 errors and 0 configuration errors.
Download full report as build artifact

Generated by 🚫 Danger

@jsotuyod
Copy link
Member

jsotuyod commented May 9, 2024

I'm torn on this change… it may break any tool parsing this XML if it's not defensive… At the same time, for CPD, unlike PMD, we don't have an XML Schema for these reports (we should!), so it's not like we have made any explicit promises…

I can't really make up my mind to be honest…

@oowekyala
Copy link
Member

oowekyala commented May 11, 2024

I think it's fine to introduce new elements. We also have added attributes in the past. My intuition here is that if you integrate PMD via parsing the report, you allow using any PMD version, and despite that you are not minimally defensive, then you're doing it wrong. PMD is actively developed and it should be understood that the report format might vary somewhat between versions in compatible ways. Although we should probably write up in an ADR what we understand are "compatible ways".

@adangel
Copy link
Member Author

adangel commented May 16, 2024

Adding additional elements definitely creates the risk to break some tools, because they didn't expect this (whether this is good or wrong that the tool is not developed defensive doesn't matter).
I see two options to avoid this:

  1. provide a new report format (e.g. xml2). We could also introduce a schema at the same time. It has the downside, if we need to change the format again, we would create xml3 and so on...
  2. only output the errors in the xml format, if the new report property "renderErrors" is explicitly enabled. By default, the format would not change. In the next major version, we could change the default or always output errors. Downside is, that we need to cleanup later. I guess, this is true for both options (we'd need remove the old format or remove the property or change the default of the property). A variant of this option would be to add a property "version" which controls the report format version.

For tools, that use the report, it is no difference: If they want to handle errors (e.g. for displaying), they anyway need to change. And then they would just need an additional change to enable this feature to use it. If they don't want to use this feature, they are not forced to change anything (they would also not need to change, if they ignored additional unknown elements in the report).

Not sure, if that would be overengineered already. But I think, we somehow should consider the XML formats (and JSON formats) as API... In that sense: enhancements are possible in minor versions. And yes, it boils down to describe "what we understand are "compatible ways"."...

@jsotuyod
Copy link
Member

provide a new report format (e.g. xml2). We could also introduce a schema at the same time. It has the downside, if we need to change the format again, we would create xml3 and so on...

I'd rather not… more classes, duplication, having to deprecate them, the switch for customer is more complicated and needed even if they don't care about errors / are parsing the report defensively, so we are impacting a 100% of users for something we expect to be an issue for a minority of integrations.

only output the errors in the xml format, if the new report property "renderErrors" is explicitly enabled. By default, the format would not change. In the next major version, we could change the default or always output errors. Downside is, that we need to cleanup later.

This is a viable option. Unfortunately, a rather slow one…

  • We add the property now, defaults to false, warns that it will be turned to true in the next major
  • We default it to true in PMD 8, mark the property as deprecated and warn if overwritten
  • We remove the property in PMD 9

A variant of this option would be to add a property "version" which controls the report format version.

Also viable, same timeline and execution as the previous one.

We could also introduce a schema at the same time.

We can actually introduce the schema right now for all alternatives… the schema will state that there will be 0 or more <error> nodes at the end after duplicates. That holds true under the current format for all alternatives, even if not including the nodes is decided by a property.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in:cpd Affects the copy-paste detector
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants