-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Publish as modular JAR(s) to Maven Central #14
Comments
As far as Maven goes, this project so far isn’t internally relying on Maven for anything. Maven and the Central Repository aren’t the sole way or even the primary way that the HTML parser releases has been distributed. And Maven isn’t the primary build tool used for building releases. Given that, making any disruptive changes to the sources primarily just to get things to work more easily with Maven seems like a “tail wagging the dog” kind of thing; in particular, the following:
Personally speaking, I don’t use Maven at all. For the HTML checker and its dependencies, the only reason I publish Maven packages is that others have asked for them. So I’m not super enthusiastic about the prospect of changes being made to the HTML parser sources if those changes would end up meaning that I’d then need to put significant time into making fixes myself to the HTML checker to align it with the HTML parser changes. However, all that said, if @hsivonen sees benefits in what you’re proposing, then I would not be opposed — and I’d be willing to commit time to reviewing and testing patches. |
Thanks for your feedback. Just for the record, my sole goal is to have modular JAR(s) published to Maven Central. I'm open to any suggestions on how to achieve that. What I've described, is just 1 way to do so, but one that I'm willing to help out with. And I could keep the packages unchanged, such that there shouldn't be any code changes at all, except for XOM/jchardet removal. |
As mentioned by @sideshowbarker, changing the source layout just to adapt to Maven defaults does not look reasonable. My projects do not follow Maven defaults, instead I change the POM to override the defaults.
I assume that you mean a multi-release jar, but recent JDKs cannot target 1.5 (1.7 is the minimum AFAIR). So the question is whether the users of this project require 1.5 binary compatibility, or which one would be the minimum acceptable. Otherwise the idea of adding a
No matter which tool is used to build a release, setting an automatic module name would be useful to those of us that have builds for recent JDKs; my pull request does that for Maven. Otherwise, people using JDKs version 9 or later are going to get similar warnings to the following one that I get when I build my css4j-agent module with Maven:
Having a full JPMS modularization like @anthonyvdotbe proposes would be good, but my pull request is a simpler solution that is available just now. More information about module names: |
Yeah, going for 8 years without a new release is a long time, so I agree it’s time we should strongly consider trying to get a new release out.
I also agree that given the shift in Java overall to a module system, it makes sense to strongly consider moving to distribution of a modular jar. And as far as Central, in practice even for non-Maven users, it now seems to be the single primary common place to get jars from — so I also agree it makes sense to provide distributions there.
I personally have no objection to first just merging that small change from #13, if @hsivonen is supportive. But even if we were to only do that, we’d still need to do a new release, right? If we’re gonna release a new jar at all, I guess it makes sense to strongly consider doing the work to ensure it’s got the whole set of current-best-practice characteristics for jar releases that are expected by users nowadays. But I can’t commit to support for moving forward with anything until @hsivonen weighs in. |
Other projects are releasing a jar with an automatic module first, and then another with full modularization. But any issues about split packages etc. should be resolved before that. Also, my PR assumes that there will be a single module. I am unaware of any split-package issue inside this project (yes the dependencies do have them), and the single-module approach seems correct unless one wants to start making deep changes to the project just now. But of course if one and only one release is going to be produced, by all means let it have full JPMS. |
@anthonyvdotbe I forgot to mention something about this:
There are tricks to use dependencies with split packages (I have a similar problem in my css4j-dom4j module), so XOM could possibly be kept. Yes, I know that's less than perfect. And jchardet could be kept as a filename-generated module name (much like this project's htmlparser.jar is currently). Again, yes I know... |
I'm completely OK with publishing a new release. I should have done it many times over the past years. It's been a "next week" item for way too long. 8 years indeed is a long time. 😬 I also want to get rid of jchardet and the ICU detectors. Neither represents present-day browser-compatible behavior. Ideally, these would be replaced with chardetng, but adding JNI dependencies isn't really in the Java culture, and I haven't seriously investigating compiling it into Java byte code. (asmble might be an avenue.) I'd like to better understand the motivation for removing XOM support. I gather that Maven ships both source packages and binary packages. The parser is designed such that it requires a XOM jar to be present at build time, but once you have the binary jar, if you don't want to use XOM, you don't need to have XOM present. How big a problem is this with Maven? It also seems bad to make breaking changes (shuffle around the Java package naming structure) for the sake of Maven. So I'd be OK with
Do I understand correctly that some policy change prevents next step being just bumping the version number is |
I believe that he wants to remove XOM because it is not modularized.
Neither Maven nor JPMS require that (in the specific case of this project). The motivation to move saxtree is unclear to me.
Looks like a good candidate for a package that is not to be JPMS-exported (if not removed finally).
They were considering to automatically enforce the rejection, but do not think that they finally did it. In any case, this is only going to affect the Maven deployment if your build is a JPMS-modularized one. The warning is mostly due to concerns about a possible 'DLL hell' if explicit module names and filename-based names are mixed together. That's why people are encouraged to declare a name ASAP. |
asmble looks interesting indeed. Another alternative might be to run it on GraalVM. There's also project Panama, but that's WIP and focused on C for now. Either way, just to set expectations, supporting chardetng is out of my league :)
tl;dr I'll leave XOM support untouched, since it doesn't pose a problem to neither Maven, nor Java modules. (The motivation was: we should start with a clean slate and drop support for anything that isn't itself modularized (or is likely to be anytime soon). And modularizing XOM will be hard, because it depends, both directly & indirectly, on xml-apis:xml-apis (the version that is currently depended on by htmlparser, 1.1, uses its predecessor, xerces:xmlParserAPIs). The problem with these artifacts, is that they package APIs which are part of Java SE's
No problem, the package structure will be retained. (The motivation for asking about moving it into the
Given the above, I'm proposing to remove the
I'm not sure I understand your question, but I can explain what the warning means: JARs which aren't modularized (i.e. don't contain a module foo {
requires bar;
} but as long as |
Something that has not been mentioned: you are already declaring a module name, just OSGi instead of JPMS:
So you are already shipping those packages under that name (not sure if you have real OSGi users though). |
There is no need to have two modules, you can have the two packages under the same module. |
Yes, but from the following:
it follows that |
Modules are intended to group related packages, and Not to talk about potential additional work for those in the Let's see what @hsivonen thinks though. |
I’ve come around to realizing that the best way to avoid any problems with keeping the I also want to retract something related that I wrote here earlier:
I realize now it wasn’t really relevant to mention that here — because for one thing, the mechanism the HTML checker uses to build the integrated parser is that it directly walks the Also, any choice of build tools we end up deciding on within the parser repo itself isn’t gonna affect the checker build of the parser. I would just be keeping the checker build of the parser the same anyway, regardless. So please ignore anything I wrote here previously about needing to consider the HTML checker usage of the parser when making decisions about how we want to do things here going forward. Whatever we decide, it should just be based on trying to determine what’s best for the wider community of users for the parser, and current best practices. |
To summarize, here's the open questions I'd like confirmation on. Is it ok to...:
I also filed elharo/xom#142 w.r.t. XOM support: to be able to publish modular JARs, we'd need a XOM release which has at least |
There are about 60% of developers still on Java 8, according to latest surveys, and OpenJDK 8 is supported until May 2026. Figures are between 3 and 7% for Java 7 usage. My project produces multi-release JARs valid for Java 8 and 11 (7 and 11 for 1.0), and a dependency that only runs on 11 would be useless. |
Good point, I scrapped the question: the sources themselves will still be compiled for Java SE 8. |
Not sure that I follow you here: do you mean that the source level will be 8, or that you compile targeting Java 8? The former would be irrelevant to the JAR, and the latter rules out |
With "the sources themselves", I meant: "everything but module-info.java". So: |
|
Strictly speaking it won't be a multi-release JAR, but even it it were: I don't see what point you're trying to make? The JAR will work as a modular JAR on Java SE 11+, and as a plain old JAR on Java SE 8. |
My point is to try to clarify what you are proposing, because I suggested to target different releases in my first response to your post (where you talked about keeping source level 1.5), and your last summary still involved a Java 11-only thing. Saying "the sources themselves will still be compiled for Java SE 8" in the context of a modular project is something that needs clarification. An then, I wonder why Java 8 and not 7. |
I never talked about keeping source level 1.5. I said:
which I thought was sufficiently clear to mean:
|
Motivation:
I'm willing to help by providing PRs such as:
module-info.java
file, so I'd like to upgrade the compiler level to 11 as well (the code can stay on the 1.5 syntactic level. I'm not planning to make any code changes, apart from package declarations and import statements)xml-apis:xml-apis
Maven artifact, and thus has a "split packages" issueHowever, I have a few questions:
@hsivonen @sideshowbarker what are your thoughts on this?
The text was updated successfully, but these errors were encountered: