Skip to content

[rkeops] Setting the number of cores when using CPU backend does not seem to work #383

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
astamm opened this issue Oct 1, 2024 · 4 comments

Comments

@astamm
Copy link

astamm commented Oct 1, 2024

Hi,
As the title of the issue explains, a small reprex for this (Hausdorff distance computation on functional data set).

Generate some data

library(purrr)
library(roahd)
library(withr)

N <- 100
P <- 200
L <- 3
time_grid <- seq(0, 1, length.out = P)
C1 <- exp_cov_function(time_grid, alpha = 0.1, beta = 0.2)
C2 <- exp_cov_function(time_grid, alpha = 0.2, beta = 0.5)
C3 <- exp_cov_function(time_grid, alpha = 0.3, beta = 1)
centerline <- matrix(c(
  sin(2 * pi * time_grid),
  sqrt(time_grid),
  10 * (time_grid - 0.5) * time_grid),
  nrow = 3, byrow = TRUE
)

withr::with_seed(1234, {
  dat <- generate_gauss_mfdata(
    N, L, centerline,
    correlations = c(0.5, 0.5, 0.5),
    listCov = list(C1, C2, C3)
  )
})

dat <- purrr::map(1:N, \(n) {
  rbind(dat[[1]][n, ], dat[[2]][n, ], dat[[3]][n, ])
})
dat <- dat[1:21]

Implement dist() function for Hausdorff distance via {rkeops}

library(rkeops)
reticulate::use_virtualenv(virtualenv = "rkeops", required = TRUE)
check_rkeops()

hausdorff_distance_rkeops <- function(x, y) {
  x_i <- Vi(x)
  y_j <- Vj(y)
  D_ij <- norm2(x_i - y_j)
  max(max(min(D_ij, "i")), max(min(D_ij, "j")))
}

hausdorff_distance_rkeops(t(dat[[1]]), t(dat[[2]]))

dist_rkeops <- function(x) {
  x <- purrr::map(x, t)
  N <- length(x)
  K <- N * (N - 1) / 2
  out <- purrr::map_dbl(1:K, \(k) {
    k <- k - 1
    i <- N - 2 - floor(sqrt(-8 * k + 4 * N * (N - 1) - 7) / 2.0 - 0.5);
    j <- k + i + 1 - N * (N - 1) / 2 + (N - i) * ((N - i) - 1) / 2;
    i <- i + 1
    j <- j + 1
    hausdorff_distance_rkeops(x[[i]], x[[j]])
  })
  
  attributes(out) <- NULL
  attr(out, "Size") <- N
  lbls <- names(x)
  attr(out, "Labels") <- if (is.null(lbls)) 1:N else lbls
  attr(out, "Diag") <- FALSE
  attr(out, "Upper") <- FALSE
  attr(out, "method") <- "hausdorff"
  class(out) <- "dist"
  out
}

Benchmarks

rkeops_use_cpu(ncore = 4L)
bench::mark(
  D1 <- dist_rkeops(dat), iterations = 10L
)
# A tibble: 1 × 13
  expression            min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result
  <bch:expr>          <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>
1 D1 <- dist_rkeops(… 1.33s  1.39s     0.701    7.05MB     4.56    10    65      14.3s <dist>
rkeops_use_cpu(ncore = 1L)
bench::mark(
  D2 <- dist_rkeops(dat), iterations = 10L
)
# A tibble: 1 × 13
  expression            min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result
  <bch:expr>          <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>
1 D2 <- dist_rkeops(… 1.43s  1.54s     0.641    6.82MB     4.23    10    66      15.6s <dist>
@astamm
Copy link
Author

astamm commented Oct 1, 2024

Benchmarking on the whole set of 100 curves leads to equivalent median times (35s) and memory allocation (159MB).

@astamm
Copy link
Author

astamm commented Oct 2, 2024

I see that setting the number of cores is handled by {RhpcBLASctl}. This package has the following warning:

openmp-related functions only work at build-time instructions. Omp_get * now returns NA_Integer if R was not built with openmp instructions. NOTE: if R itself is built without OpenMP instructions and BLAS thread is used with OpenMP, this package will not work...XD

In fact, if I run locally RhpcBLASctl::omp_get_num_procs(), it returns NA because R is built by default without OpenMP on macOS.

I will try to get a build with OpenMP to understand if this issue persists. However, it could be good to mention that CPU parallelisation does not work with default R builds on macOS.

@astamm
Copy link
Author

astamm commented Oct 2, 2024

Instructions to (re-)enable OpenMP support with macOS clang compiler can be found here: https://mac.r-project.org/openmp/. Once done, re-installing from source RhpcBLASctl make it discover OpenMP and correctly find the number of cores.

It is trickier with rkeops because it is a wrapper that behind the scene uses the pykeops Python package. Hence, on macOS, one also needs to somehow tell the python installer where to find OpenMP libs and headers to properly build pykeops with OpenMP support.

@astamm
Copy link
Author

astamm commented Oct 2, 2024

The best way to handle this would be to provide a configure.ac file which would generate a configure file via a call to autoconf in which we can generate different Makevars files depending on the detected OS (see https://www.r-bloggers.com/2017/08/setting-up-optional-openmp-support-with-rcpparmadillo/ for example).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant