Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests are very slow and a lot of time is spent in the wait4() system call #733

Open
brgl opened this issue Jun 2, 2023 · 4 comments
Open
Labels
Component: Bash Code Everything regarding the bash code Priority: Medium Wrong or misleading documentation, broken behavior with workaround Status: Unconfirmend No reproducer was provided or the reproducer did not work for maintainers. Type: Bug

Comments

@brgl
Copy link

brgl commented Jun 2, 2023

Describe the bug
The libgpiod project uses bats for its command-line tools tests. The tests use linux kernel's gpio-sim module for mocking up GPIO chips and in general cover the same functionality in shell tools as the GLib tests covering the C library.

While it's expected for the C tests to run much faster than bash ones, the bats test-suite is painfully slow - it takes 26 seconds on a regular work laptop to complete vs 0,28 seconds for the C tests on the same machine.

Upon investigation it turned out that the majority of the time in all sub-processes is spent waiting for child processes to exit as evidenced by the following output of strace -c -f:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 91.48   23.231926        1598     14537      6173 wait4
  2.30    0.584469           0    614218           rt_sigprocmask
  1.46    0.370631           2    158077         8 read
  1.21    0.308347           3     97831     80115 readlink
  0.64    0.162784          20      8059           clone
  0.51    0.129295           1     75338      7495 newfstatat
  0.41    0.104257          20      5081       498 execve
  0.24    0.061996           1     42027           mmap
  0.24    0.059774           1     51657     37681 ioctl
  0.20    0.050864           0    139673           rt_sigaction
  0.18    0.046581           0     57523      3246 close
  0.18    0.045685           3     13972           write
  0.10    0.025445           1     15038           mprotect
  0.10    0.025105           1     19771      6448 lseek
  0.10    0.025086           1     16949         9 openat
  0.09    0.022840           0     25950       657 fcntl
  0.06    0.016108           0     16903           brk
  0.04    0.010011           0     14011           dup2
  0.04    0.009773           2      4685           munmap
[...]

This doesn't seem normal and I'd like to request some help in figuring out what's wrong. I'm posting it as a bats issue but it may very well be an issue in how I'm using the framework.

To Reproduce
Steps to reproduce the behavior:

  1. Clone current master of the libgpiod repo
  2. run ./autogen.sh --enable-tests --enable-tools --enable-gpioset-interactive && make
  3. run sudo ./tools/gpio-tools-test and observe slow execution.

Please note that your kernel needs to be built with the gpio-sim module enabled for the tests to work.

Alternatively one can use yocto to build a qemu vm allowing to run these tests without recompiling the kernel - if anyone is interested, I can prepare a repo that would automate the build.

Expected behavior
Tests execute much faster and don't spend ~92% of time in wait4().

Environment (please complete the following information):
Tested on two separate environments with same results - one was a regular Ubuntu 23.04 system, the other was a custom Yocto based distro.

  • Bats version: 1.8.2 and 1.9.0 respectively.
  • operating system (including version): Ubuntu 20.04 and Yocto based on current poky master branch.
  • bash --version: 5.2.15
  • Install method: distribution package and bitbake recipe respectively
@brgl brgl added Priority: NeedsTriage Issue has not been vetted yet Type: Bug labels Jun 2, 2023
@martin-schulze-vireso
Copy link
Member

I'd be interested in the yocto build for qemu to reproduce this.

@martin-schulze-vireso martin-schulze-vireso added Priority: Medium Wrong or misleading documentation, broken behavior with workaround Status: Unconfirmend No reproducer was provided or the reproducer did not work for maintainers. Component: Bash Code Everything regarding the bash code and removed Priority: NeedsTriage Issue has not been vetted yet labels Jun 3, 2023
@brgl
Copy link
Author

brgl commented Jun 5, 2023

@martin-schulze-vireso Here you go

Let me know if anything is missing from the README

@brgl
Copy link
Author

brgl commented Jun 15, 2023

I think this may just be the way bats works - running the bats' own test-suite results in a very similar strace output:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 94,78   87,937096        2697     32603     14298 wait4
  1,20    1,117514           0   1695481           rt_sigprocmask
  1,09    1,008525           0   1045150        27 read
  0,66    0,608701          36     16741           clone
  0,48    0,441647          11     37725     23464 execve
  0,21    0,199233           0    274617    107718 newfstatat
  0,20    0,182347           1     98627      4962 openat
  0,18    0,165625           0    169950           mmap
  0,15    0,134789           0    184782    162361 ioctl
  0,14    0,129233           3     40735           write
  0,12    0,113985           0    198739      4196 close
  0,10    0,095850           0    139639      3159 fcntl
  0,09    0,086048           0    312173           rt_sigaction
  0,08    0,070505           0     95010     34204 lseek
  0,06    0,053366           0     56400           dup2
  0,06    0,052509           1     51470           mprotect
  0,05    0,045429           7      5704      3457 mkdir

@martin-schulze-vireso
Copy link
Member

That may well be but maybe we can improve Bats runtime...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Bash Code Everything regarding the bash code Priority: Medium Wrong or misleading documentation, broken behavior with workaround Status: Unconfirmend No reproducer was provided or the reproducer did not work for maintainers. Type: Bug
Projects
None yet
Development

No branches or pull requests

2 participants