-
Notifications
You must be signed in to change notification settings - Fork 1k
GitLab CI workflow with sanitizers #6746
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #6746 +/- ##
=======================================
Coverage 98.59% 98.59%
=======================================
Files 79 79
Lines 14665 14665
=======================================
Hits 14459 14459
Misses 206 206 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Did you have a look at the I'm not sure why the most recent run was cancelled, I'll close & re-open to re-trigger: https://github.com/Rdatatable/data.table/actions/runs/12886173136 At a higher level, if performance is bad (say >=20% slower than any current PR-level action), I would recommend only running such an action on To date, |
Thank you for pointing me towards the devcontainer! Unlike I don't think it we'll get it significantly faster than 4 minutes (or anywhere close to 40 seconds), especially if optional dependencies are introduced, so let's leave it on |
Using Installing the dependencies from source (including those from I suppose we could try to abuse the cache system by making an individual cache key for every needed package (plus its version plus the version of R-devel), but it's definitely not intended to be used this way. Is GitLab CI better for tasks like these, with slowly changing 100-megabyte caches? |
GLCI is probably the right way to go (cc @jangorecki & @ben-schwen). Just want to note that because of our minimal approach to dependencies, the cache approach should be OK in general -- I guess the most important package to check with the sanitizer will be bit64, which is relatively stable / won't change noticeably more often than R (I am the maintainer & certainly don't plan a lot of activity 😉). The other packages are probably not important for testing with the sanitizer per se, so if they are slightly behind CRAN it's unlikely to cause issues. |
In GLCI we also cache the pkgs and install them but it seems that in this GHA we also install all dependencies for these pkgs?
In comparison in GLCI we only install dependencies from DESCRIPTION and
|
These are the strong dependencies of the packages from
(Including
Not installing the packages from |
oh, I definitely would not concern myself with getting other.Rraw working in the sanitizer GHA :) |
Also: - Do not check pull requests - Try to cache installed dependencies
Also, add a temporary suppression for R's .Call() due to a minor 'bit' problem.
f4fd007
to
f218563
Compare
Tested on GitLab. The existing workflow is quite comprehensive, I didn't have to do much. The sanitized checks add slightly more than 10 minutes to the CI time. |
.dev/ubsan.supp
Outdated
@@ -0,0 +1,2 @@ | |||
# TODO(bit>4.5.0.1): remove after a new 'bit' version is on CRAN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
planning to do that next after data.table release :)
.gitlab-ci.yml
Outdated
# run the main checks with Address(+Leak),UBSanitizer enabled | ||
test-lin-san: | ||
<<: *test-lin | ||
image: docker.io/rocker/r-devel-ubsan-clang |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this r-devel-gcc-strict
image equivalent?
(thinking it may be preferable to consistently use Jan's repo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way the entire R and (almost) all packages are compiled with -fsanitize=address,undefined
. I think there is a way compile just data.table
with sanitizers by setting PKG_CFLAGS
, PKG_LIBS
, but then either the sanitizers must be manually preloaded using LD_PRELOAD
or R must be linked with custom MAIN_LD
(FLAGS
) set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, cc @ben-schwen / @jangorecki though who are more familiar with GLCI. Thanks!
I would rather avoid using not ours docker images due to, as it happened in the past already, some of the issues in our tests might be caused by specifics of those images. It also happens they pulled in extra dependencies. Therefore having an own copy of that images already prevents breaking changes or new dependencies being pulled in, which can break some tests. Ultimately would be to move our images from my namespace to Rdatatable namespace. |
I would also favor using our own images to not carry the extra dependency burden. |
@jangorecki I was unable to sign up for a new GitLab account due to their verification requirements, so the |
^ merged the above. I think there's a follow-up step we need to publish the image? We should really write the procedure down... |
Hmm, right, the container image would need to be built in |
@ben-schwen / @jangorecki what would it take to move the images into the Rdatatable account? |
Not much. Move or fork the repo (depending on if Jan wants to maintain some versions). Then we might have to initially push the images (but this might can be skipped if we move the repo) |
Yes, there should be not much extra effort. The only purpose for those images was data.ravle so there is no need to maintain them in my namespace if they will be moved to Rdatatable. |
Great! Jan I think you need to initiate the move, I didn't think I have high enough permissions |
Transfer does not work due to
It will be easier to just take a subset of images that are actually in use and clone them to Rdatatable. |
It was easiest to just https://gitlab.com/Rdatatable/dockerfiles We can prune as follow-ups. History & everything is retained since it's just a copy of the same What I don't know is what's next, e.g. re-registering the images. |
The |
FYI R devel image was set to be build on schedule, each night, not only on pushes |
I may just be bad at GitLab, but it looks like as |
I think you need to do it at a group level (because of our open source license which has a number of limited seats) and not at a repo level. At a repo level, I also can't add members. In any case, I added @aitap |
Thank you very much, @ben-schwen. It's a bit strange that GitLab used to require the |
Now tested on GitLab, takes 10 minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like we're in a lot better shape, thanks!
let's leave this sit another day. I started pre release stuff for bit last night, might actually submit it today in which case we can knock off that TODO already.
bit 4.6.0 on shelves now |
They were only needed for a previous version of the 'bit' package.
GitLab CI seems to run the jobs under `sh -e`, which terminates the step on first failure. Use || to capture the initial failures and fail for real later.
Two problems remaining:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM TY!
There are multiple ways to write the workflow in order to detect the issues like #6733.
One is to be quite close to the reference R-hub workflow description, which is in the
HEAD
of the current pull request. Here's how it fails on a recentmaster
.The other is to write a more custom workflow, still reusing the container maintained by Gábor Csárdi and the infrastructure providing binary packages linked with
libc++
for the clang+sanitizers check. It's in the parent of the current pull request or the GHA-sanitizers branch. Here's what the failure looks like.(The supported option is to use
rhub::rhub_setup()
.)Comments from people experienced with GHA are very welcome; this is the third time I'm doing any CI and the first time I'm writing a non-throwaway GitHub workflow.