Skip to content

GitLab CI workflow with sanitizers #6746

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Mar 12, 2025
Merged

GitLab CI workflow with sanitizers #6746

merged 10 commits into from
Mar 12, 2025

Conversation

aitap
Copy link
Contributor

@aitap aitap commented Jan 21, 2025

There are multiple ways to write the workflow in order to detect the issues like #6733.

One is to be quite close to the reference R-hub workflow description, which is in the HEAD of the current pull request. Here's how it fails on a recent master.

The other is to write a more custom workflow, still reusing the container maintained by Gábor Csárdi and the infrastructure providing binary packages linked with libc++ for the clang+sanitizers check. It's in the parent of the current pull request or the GHA-sanitizers branch. Here's what the failure looks like.

(The supported option is to use rhub::rhub_setup().)

Comments from people experienced with GHA are very welcome; this is the third time I'm doing any CI and the first time I'm writing a non-throwaway GitHub workflow.

@aitap aitap requested a review from MichaelChirico as a code owner January 21, 2025 11:40
Copy link

codecov bot commented Jan 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.59%. Comparing base (c1d04a4) to head (188b171).
Report is 17 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6746   +/-   ##
=======================================
  Coverage   98.59%   98.59%           
=======================================
  Files          79       79           
  Lines       14665    14665           
=======================================
  Hits        14459    14459           
  Misses        206      206           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@MichaelChirico
Copy link
Member

MichaelChirico commented Jan 21, 2025

Did you have a look at the clang-ubsan .devcontainer we have? I have used that in the past. I am not sure about performance in GHA.

https://github.com/Rdatatable/data.table/blob/7445df7473869a73103717a3f34d4e233d1c012b/.devcontainer/r-devel-clang-ubsan/Dockerfile

I'm not sure why the most recent run was cancelled, I'll close & re-open to re-trigger:

https://github.com/Rdatatable/data.table/actions/runs/12886173136

At a higher level, if performance is bad (say >=20% slower than any current PR-level action), I would recommend only running such an action on master.

To date, master-only checks have been done in GLCI -- @ben-schwen / @jangorecki do you know if we've ever had a UBSan (or MSan etc) action set up on GitLab?

@aitap
Copy link
Contributor Author

aitap commented Jan 22, 2025

Thank you for pointing me towards the devcontainer! Unlike rocker/r-devel-ubsan, rocker/r-devel-san-clang does have AddressSanitizer enabled, and we can try to additionally enable LeakSanitizer like here.

I don't think it we'll get it significantly faster than 4 minutes (or anywhere close to 40 seconds), especially if optional dependencies are introduced, so let's leave it on master.

@aitap
Copy link
Contributor Author

aitap commented Jan 28, 2025

Using rocker/r-devel-ubsan-clang with GitHub Actions takes a bit more effort.

Installing the dependencies from source (including those from inst/tests/other.Rraw) takes more than 20 minutes. The installed packages can be cached, saving time for future re-runs, but the cache is immutable, leaving us the choice between keying the cache by R-devel version only (making the cached packages gradually more out of date until a new R version is released) and keying the cache by versions of all involved packages (expiring the cache as a whole the moment a single package needs to be updated).

I suppose we could try to abuse the cache system by making an individual cache key for every needed package (plus its version plus the version of R-devel), but it's definitely not intended to be used this way.

Is GitLab CI better for tasks like these, with slowly changing 100-megabyte caches?

@MichaelChirico
Copy link
Member

MichaelChirico commented Jan 28, 2025

GLCI is probably the right way to go (cc @jangorecki & @ben-schwen).

Just want to note that because of our minimal approach to dependencies, the cache approach should be OK in general -- I guess the most important package to check with the sanitizer will be bit64, which is relatively stable / won't change noticeably more often than R (I am the maintainer & certainly don't plan a lot of activity 😉). The other packages are probably not important for testing with the sanitizer per se, so if they are slightly behind CRAN it's unlikely to cause issues.

@ben-schwen
Copy link
Member

ben-schwen commented Jan 29, 2025

In GLCI we also cache the pkgs and install them but it seems that in this GHA we also install all dependencies for these pkgs?

Installing package(s) 'BiocVersion', 'bit64', 'bit', 'R.utils', 'xts', 'zoo',
  'yaml', 'knitr', 'markdown', 'ggplot2', 'hexbin', 'plyr', 'dplyr', 'caret',
  'gdata', 'nlme', 'sf', 'nanotime'
also installing the dependencies ‘listenv’, ‘parallelly’, ‘future’, ‘globals’, ‘digest’, ‘shape’, ‘future.apply’, ‘numDeriv’, ‘progressr’, ‘SQUAREM’, ‘diagram’, ‘lava’, ‘colorspace’, ‘tzdb’, ‘cpp11’, ‘prodlim’, ‘timechange’, ‘stringi’, ‘farver’, ‘labeling’, ‘munsell’, ‘RColorBrewer’, ‘viridisLite’, ‘fansi’, ‘pkgconfig’, ‘utf8’, ‘proxy’, ‘iterators’, ‘data.table’, ‘clock’, ‘gower’, ‘hardhat’, ‘ipred’, ‘lubridate’, ‘purrr’, ‘tidyr’, ‘timeDate’, ‘stringr’, ‘wk’, ‘R.oo’, ‘R.methodsS3’, ‘evaluate’, ‘highr’, ‘xfun’, ‘commonmark’, ‘cli’, ‘glue’, ‘gtable’, ‘isoband’, ‘lifecycle’, ‘rlang’, ‘scales’, ‘tibble’, ‘vctrs’, ‘withr’, ‘Rcpp’, ‘generics’, ‘magrittr’, ‘pillar’, ‘R6’, ‘tidyselect’, ‘e1071’, ‘foreach’, ‘ModelMetrics’, ‘pROC’, ‘recipes’, ‘reshape2’, ‘gtools’, ‘classInt’, ‘DBI’, ‘s2’, ‘units’, ‘RcppCCTZ’, ‘RcppDate’

In comparison in GLCI we only install dependencies from DESCRIPTION and

‘R.oo’, ‘R.methodsS3’, ‘lattice’, ‘evaluate’, ‘highr’, ‘xfun’, ‘commonmark’

@aitap
Copy link
Contributor Author

aitap commented Jan 29, 2025

These are the strong dependencies of the packages from inst/tests/other.Rraw:

parse('inst/tests/other.Rraw')[[1]][[3]] |> eval() |>
 tools::package_dependencies(which = 'strong', recursive=TRUE) |>
 unlist() |> unique()
  [1] "cli"          "glue"         "grDevices"    "grid"         "gtable"
  [6] "isoband"      "lifecycle"    "MASS"         "mgcv"         "rlang"
 [11] "scales"       "stats"        "tibble"       "vctrs"        "withr"
 [16] "utils"        "methods"      "nlme"         "graphics"     "Matrix"
 [21] "splines"      "farver"       "labeling"     "munsell"      "R6"
 [26] "RColorBrewer" "viridisLite"  "fansi"        "magrittr"     "pillar"
 [31] "pkgconfig"    "colorspace"   "lattice"      "utf8"         "Rcpp"
 [36] "generics"     "tidyselect"   "ggplot2"      "e1071"        "foreach"
 [41] "ModelMetrics" "plyr"         "pROC"         "recipes"      "reshape2"
 [46] "stats4"       "class"        "proxy"        "codetools"    "iterators"
 [51] "data.table"   "dplyr"        "clock"        "gower"        "hardhat"
 [56] "ipred"        "lubridate"    "purrr"        "tidyr"        "timeDate"
 [61] "stringr"      "tzdb"         "cpp11"        "rpart"        "survival"
 [66] "nnet"         "prodlim"      "timechange"   "stringi"      "diagram"
 [71] "KernSmooth"   "lava"         "tools"        "shape"        "future.apply"
 [76] "numDeriv"     "progressr"    "SQUAREM"      "future"       "globals"
 [81] "parallel"     "digest"       "listenv"      "parallelly"   "zoo"
 [86] "gtools"       "bit"          "evaluate"     "highr"        "xfun"
 [91] "yaml"         "classInt"     "DBI"          "s2"           "units"
 [96] "wk"           "bit64"        "RcppCCTZ"     "RcppDate"     "R.oo"
[101] "R.methodsS3"

(Including data.table itself, which is interesting.)

BiocVersion is caused by me being lazy and relying on BiocManager::install to only install/update the packages where the latest version is not already installed.

Not installing the packages from other.Rraw will indeed save a lot of time.

@MichaelChirico
Copy link
Member

oh, I definitely would not concern myself with getting other.Rraw working in the sanitizer GHA :)

@aitap aitap marked this pull request as draft February 13, 2025 20:52
Also:
- Do not check pull requests
- Try to cache installed dependencies
Also, add a temporary suppression for R's .Call() due to a minor 'bit' problem.
@aitap aitap force-pushed the GHA-sanitizers-rhubv2 branch from f4fd007 to f218563 Compare February 13, 2025 21:49
@aitap
Copy link
Contributor Author

aitap commented Feb 14, 2025

Tested on GitLab. The existing workflow is quite comprehensive, I didn't have to do much. The sanitized checks add slightly more than 10 minutes to the CI time.

@aitap aitap marked this pull request as ready for review February 14, 2025 07:39
@aitap aitap changed the title GitHub workflow with sanitizers GitLab CI workflow with sanitizers Feb 14, 2025
.dev/ubsan.supp Outdated
@@ -0,0 +1,2 @@
# TODO(bit>4.5.0.1): remove after a new 'bit' version is on CRAN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

planning to do that next after data.table release :)

.gitlab-ci.yml Outdated
# run the main checks with Address(+Leak),UBSanitizer enabled
test-lin-san:
<<: *test-lin
image: docker.io/rocker/r-devel-ubsan-clang
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this r-devel-gcc-strict image equivalent?

https://gitlab.com/jangorecki/dockerfiles/-/blame/master/r-devel/r-devel-gcc-strict?ref_type=heads#L43

(thinking it may be preferable to consistently use Jan's repo)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way the entire R and (almost) all packages are compiled with -fsanitize=address,undefined. I think there is a way compile just data.table with sanitizers by setting PKG_CFLAGS, PKG_LIBS, but then either the sanitizers must be manually preloaded using LD_PRELOAD or R must be linked with custom MAIN_LD(FLAGS) set.

Copy link
Member

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, cc @ben-schwen / @jangorecki though who are more familiar with GLCI. Thanks!

@jangorecki
Copy link
Member

jangorecki commented Feb 15, 2025

I would rather avoid using not ours docker images due to, as it happened in the past already, some of the issues in our tests might be caused by specifics of those images. It also happens they pulled in extra dependencies. Therefore having an own copy of that images already prevents breaking changes or new dependencies being pulled in, which can break some tests. Ultimately would be to move our images from my namespace to Rdatatable namespace.
If PR passed CI tests on GL then LGTM. To test on glci simply push to branch there.

@ben-schwen
Copy link
Member

I would also favor using our own images to not carry the extra dependency burden.
Otherwise LGTM.

GLCI results

@aitap
Copy link
Contributor Author

aitap commented Feb 17, 2025

@jangorecki I was unable to sign up for a new GitLab account due to their verification requirements, so the !11 pull request with a new image is from an old throwaway account of mine.

@MichaelChirico
Copy link
Member

^ merged the above. I think there's a follow-up step we need to publish the image? We should really write the procedure down...

@aitap
Copy link
Contributor Author

aitap commented Feb 23, 2025

Hmm, right, the container image would need to be built in .gitlab-ci.yml, and the builds don't currently work due to the lack of compute minutes.

@MichaelChirico
Copy link
Member

@ben-schwen / @jangorecki what would it take to move the images into the Rdatatable account?

@ben-schwen
Copy link
Member

Not much. Move or fork the repo (depending on if Jan wants to maintain some versions). Then we might have to initially push the images (but this might can be skipped if we move the repo)

@jangorecki
Copy link
Member

Yes, there should be not much extra effort. The only purpose for those images was data.ravle so there is no need to maintain them in my namespace if they will be moved to Rdatatable.

@MichaelChirico
Copy link
Member

Great! Jan I think you need to initiate the move, I didn't think I have high enough permissions

@jangorecki
Copy link
Member

Transfer does not work due to

Project cannot be transferred to a different top-level namespace, because image tags are present in its container registry

It will be easier to just take a subset of images that are actually in use and clone them to Rdatatable.

@MichaelChirico
Copy link
Member

It was easiest to just git push everything to the new remote:

https://gitlab.com/Rdatatable/dockerfiles

We can prune as follow-ups. History & everything is retained since it's just a copy of the same .git repo.

What I don't know is what's next, e.g. re-registering the images.

@aitap
Copy link
Contributor Author

aitap commented Feb 28, 2025

The !1 merge request at the new location of the containers repo should build and publish the new container.

@jangorecki
Copy link
Member

FYI R devel image was set to be build on schedule, each night, not only on pushes

@MichaelChirico
Copy link
Member

I may just be bad at GitLab, but it looks like as Maintainer I can't add @aitap as a "group member" over there, I think @jangorecki / @ben-schwen as Owner would need to do so.

@ben-schwen
Copy link
Member

I may just be bad at GitLab, but it looks like as Maintainer I can't add @aitap as a "group member" over there, I think @jangorecki / @ben-schwen as Owner would need to do so.

I think you need to do it at a group level (because of our open source license which has a number of limited seats) and not at a repo level. At a repo level, I also can't add members. In any case, I added @aitap

@aitap
Copy link
Contributor Author

aitap commented Mar 4, 2025

Thank you very much, @ben-schwen. It's a bit strange that GitLab used to require the gitlab.com/aitap account to undergo verification and now the requirement is seemingly lifted, but I'm definitely not complaining about it.

@jangorecki jangorecki removed their request for review March 4, 2025 12:28
@aitap
Copy link
Contributor Author

aitap commented Mar 4, 2025

Now tested on GitLab, takes 10 minutes.

Copy link
Member

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like we're in a lot better shape, thanks!

let's leave this sit another day. I started pre release stuff for bit last night, might actually submit it today in which case we can knock off that TODO already.

@MichaelChirico
Copy link
Member

bit 4.6.0 on shelves now

aitap added 2 commits March 6, 2025 17:40
They were only needed for a previous version of the 'bit' package.
GitLab CI seems to run the jobs under `sh -e`, which terminates the step
on first failure. Use || to capture the initial failures and fail for
real later.
@aitap
Copy link
Contributor Author

aitap commented Mar 10, 2025

Two problems remaining:

  • Need to properly fail the test if R CMD check succeeds but UBSan complains about something
  • Need to make sure the new bit is installed for the check
    • Had to clear the cache for that to happen

188b171 tested on GitLab.

@aitap aitap marked this pull request as ready for review March 10, 2025 15:17
Copy link
Member

@ben-schwen ben-schwen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM TY!

@aitap aitap merged commit 494c311 into master Mar 12, 2025
10 checks passed
@aitap aitap deleted the GHA-sanitizers-rhubv2 branch March 12, 2025 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants