Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop Ruby as a build dependency #740

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

Conversation

ffrank
Copy link
Contributor

@ffrank ffrank commented Feb 27, 2024

We have been using Ruby for two things

  1. a test script to find non-conforming YAML files (which has not worked in a long time and had been disabled)
  2. the make yamlfmt target, which is a very hacky approach to having gofmt but for YAML, which is also an odd proposition, given that whitespace has semantic meaning in YAML

This change adds the Python based yamllint to our build dependencies. It has a much smaller footprint compared to bringing in Ruby, especially for folks who want to build mgmt on their regular work/play machines.

The test script now works and gives actionable error output.

The yamlfmt target essentially does the same thing now, though. No automatic "fixing" of YAML formats anymore, but then we probably also don't expect to write that much YAML any longer.

Linter settings were chose so that few changes are needed in existing YAML. It's another benefit of this approach that we can now curate our linter configuration.

@ffrank
Copy link
Contributor Author

ffrank commented Feb 27, 2024

I have no idea why the file-mode.sh test suddenly fails in github CI. Works On My Machine.

Makefile Outdated
@@ -215,7 +215,7 @@ gofmt:
find . -maxdepth 9 -type f -name '*.go' -not -path './old/*' -not -path './tmp/*' -not -path './vendor/*' -exec goimports -w {} \;

yamlfmt:
find . -maxdepth 3 -type f -name '*.yaml' -not -path './old/*' -not -path './tmp/*' -not -path './omv.yaml' -exec ruby -e "require 'yaml'; x=YAML.load_file('{}').to_yaml.each_line.map(&:rstrip).join(10.chr)+10.chr; File.open('{}', 'w').write x" \;
find . -maxdepth 3 -type f -name '*.yaml' -not -path './old/*' -not -path './tmp/*' -not -path './omv.yaml' | xargs yamllint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently there's a tool called yamlfix that might satisfy this.
https://github.com/lyz-code/yamlfix

@@ -1,3 +1,4 @@
---
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could disable the requirement for this in yamllint if we wanted

@purpleidea
Copy link
Owner

I have no idea why the file-mode.sh test suddenly fails in github CI. Works On My Machine.

Race bugs in testing env. I will disable it.

@purpleidea
Copy link
Owner

then we probably also don't expect to write that much YAML any longer.

Quite true. I expect a few incoming config file things, but in general it's not as much of an issue.

@ffrank
Copy link
Contributor Author

ffrank commented Feb 29, 2024

In github CI I realized we have two more ruby users: mdl and fpm. fpm is not so painful because it doesn't need to be part of the build, and if we want to run it in CI, github actions can still just run ruby setup actions.

However, mdl is an entire different story. I tried substituting it with pymarkdownlnt, which has a limitation around tables (I can work around), but also has an annoying bug that occurs when indented blocks appear in enumerations or bullet point lists, and TABs are used for indentation. This begs the questions:

Why do we insist on tabs in markdown? It's not best practice and we explicitly disable the MD010 rule that would warn us about this. Linting life would be much easier without tabs. Would spaces be so bad here?

(This mostly happens in docs/language-guide.md.)

@purpleidea
Copy link
Owner

Why do we insist on tabs

I love my tabs. For many reasons unfortunately.

  1. Accessibility
  2. Not having to hammer my space bar to indent
  3. Easy of use for when you want to see more or less indents

...

@purpleidea
Copy link
Owner

(And in theory we should be adding more markdown docs as time goes on, so we definitely want to keep the validator.)

@ffrank
Copy link
Contributor Author

ffrank commented Feb 29, 2024

To illustrate the issue in the previous comment:

$ pymarkdownlnt scan docs/language-guide.md 

Unexpected Error(BadTokenizationError): An unhandled error occurred processing the document.

The issue is a failed assertion in this parser. I believe that it's caused by a bug, and I have been looking into it, but debugging someone else's parser/lexer code is...challenging.

@purpleidea
Copy link
Owner

While I love the idea of dropping the use of any ruby, it being there is not personally blocking me atm.

But if you're really keen on this issue, maybe this would help though: https://github.com/gomarkdown/markdown ?

@ffrank
Copy link
Contributor Author

ffrank commented Feb 29, 2024

Oh I definitely want to be sure we keep linting the markdown. We should even work towards disabling fewer rules imo.

So yeah I filed jackdewinter/pymarkdown#1015 and will also look at the gomarkdown thingy.

@ffrank ffrank force-pushed the noruby branch 2 times, most recently from 9034c4c to 8570028 Compare March 1, 2024 23:44
@ffrank
Copy link
Contributor Author

ffrank commented Mar 2, 2024

The tip of this now has a shape like I could imagine it. Let me know your thoughts please.

If you all find this an acceptable way forward, I will squash it down.

Note that I went for pipenv over just jamming pip packages into everyone's systems (except pipenv itself...apparently it's best practice to install it through pip). I hope this will be more sustainable (and not more fragile).

@purpleidea
Copy link
Owner

Wow, lots of neat work, thanks Felix!

I've just had a short read through as a diff of the whole thing, rather than each commit separately. There are clearly a bunch of great changes I can see already.

There are a few unknowns I'd need to dig into first. My first question is that if we're moving from ruby to python, is this a major improvement? It definitely seems to be from a packaging point of view. I'd like to rule out gomarkdown though first, at least for the mdl alternative.

Lastly, how did you want to squash? (I do not recommend squashing all of it into a single commit.) If there are a few stand-alone / easy commits to squash and/or merge early, maybe I can make this easier to review by merging those first?

I have a very large feature branch that I'll be merging shortly, and my goal is to merge this right after that. I don't think there are would be any major conflicts, but will do it in that order just to be safe.

Cheers!

@ffrank
Copy link
Contributor Author

ffrank commented Mar 5, 2024

There are a few unknowns I'd need to dig into first. My first question is that if we're moving from ruby to python, is this a major improvement? It definitely seems to be from a packaging point of view. I'd like to rule out gomarkdown though first, at least for the mdl alternative.

Python is already a dependency, and no modern Linux box can function without python. It's a lot less noise on contributor's machines in this way. Ruby is rather obscure. Even Puppet users don't install it anymore, since Puppet vendors its own Ruby (go figure).

I will try replacing pymarkdown with gomarkdown before I proceed.

Lastly, how did you want to squash? (I do not recommend squashing all of it into a single commit.) If there are a few stand-alone / easy commits to squash and/or merge early, maybe I can make this easier to review by merging those first?

Mainly I want to get rid of some intermediary steps like ruby-yaml-fix -> yamllint -> yamlfix. Also if I go for gomarkdown, any mention of pymarkdown needs not be in our history.

But yes, otherwise I'm no fan of single-commit branches either. Some history can be helpful in archaeology down the line.

@ffrank ffrank marked this pull request as draft March 5, 2024 17:17
@ffrank
Copy link
Contributor Author

ffrank commented Mar 5, 2024

So gomarkdown does not come with linting functionality. There is gomarkdown/mdfmt, but it's unmaintained and go build fails.

A slightly better maintained version is at https://github.com/shurcooL/markdownfmt/ and it does build, but it's not really customizable, and at a glance, it seems to do the opposite of what we want, e.g. joining our nice and carefully crafted 80 character line paragraphs into single long lines.

@purpleidea
Copy link
Owner

haha, okay! I'm okay with the python route for now if you think that's preferable. I'm a bit concerned about the open issue and wouldn't mind waiting until that's fixed before we swap it though.

Would you be able to rebase/squash your commits into a few separate ones based on logical functionality that you think makes sense and I'll review and merge those?

@purpleidea
Copy link
Owner

Or I could pick off the easy ones first if you'd like. LMK

@ffrank
Copy link
Contributor Author

ffrank commented Mar 5, 2024

I will clean up and ping you.

@ffrank ffrank marked this pull request as ready for review March 6, 2024 15:05
@ffrank
Copy link
Contributor Author

ffrank commented Mar 6, 2024

Okay, go.

@ffrank
Copy link
Contributor Author

ffrank commented Mar 10, 2024

@purpleidea this is ready for review.

When error messages are written to stdout, they will be considered as
output in case we want to fail from inside $( ) or backticks, and then
the error does not end up on the terminal.
We pull in pipenv and install pip packages for yamlfix and pymarkdownlnt
into the virtualenv. This allows us to drop mdl and later replace the ruby
hack in Makefile that currently does our yamlfmt.
At the same time, introduce yamlfix for a proper "make yamlfmt" workflow
that does not rely on the old ruby based hack.

Also change existing YAML to reflect the result.
Add a fold in github actions output around the ragel build.

Run the commit-message test locally, so that error can be detected
before pushing to CI. We also now accept two-letter topics.

Some minor improvement in the testing scripts.
@ffrank
Copy link
Contributor Author

ffrank commented Apr 1, 2024

Update, I'm in communication with @jackdewinter in jackdewinter/pymarkdown#1015 in order to solve the limitation around tab characters. Once the fix works, it will allow us to remove the workaround from test-markdownlint.sh.

But I feel that we can merge as is now, and remove the workaround later, so this needs not be blocked.

@@ -10,11 +10,12 @@ echo running "$0"
ROOT=$(dirname "${BASH_SOURCE}")/..
cd "${ROOT}" || exit 1

commit_title_regex='^\([a-z0-9]\(\(, \)\|[a-z0-9]\)\+[a-z0-9]: \)\+[A-Z0-9][^:]\+[^:.]$'
commit_title_regex='^\([a-z0-9]\(\(, \)\|[a-z0-9]\)*[a-z0-9]: \)\+[A-Z0-9][^:]\+[^:.]$'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It drops the random requirement that the "topic:" prefix has at least three characters in the topic.

@@ -183,5 +184,13 @@ then
test_commit_message_common_bugs $commit
done
fi
else # assume local branch
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sweet!

@@ -155,42 +157,30 @@ test_commit_message_body() {
if [[ -n "$TRAVIS_PULL_REQUEST_SHA" ]]
then
commits=$(git log --format=%H origin/${TRAVIS_BRANCH}..${TRAVIS_PULL_REQUEST_SHA})
[[ -n "$commits" ]]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we nuking everything from here to the end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just avoid copy/paste by doing the for commit in $commits work at the bottom of the script.

then
echo "FAIL: Commit message should match the following regex: '$commit_title_regex'"
echo "('$title' does not match)"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@purpleidea
Copy link
Owner

I've cherry-picked a few of these commits. I think we should hold off on the markdown/yaml thing for the moment. I need to have a deeper look into it all. Thanks for the fun commits!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants