Fix InactiveDestroy #175

rovo89 · 2022-11-28T15:58:12Z

These commits fix the "fast" cleanup when attribute InactiveDestroy is set.

rovo89 · 2022-11-28T15:59:58Z

I'm still looking for a good way to avoid $sth->finish() being called on destroy when InactiveDestroy is set. This is important when trying to disconnect without fetching all pending results and possible in some cases where a child process is forked and joined while fetching results.

Without this change, the END block of DBI would call disconnect_all() and that would call mysql_close() on all open connections. This is bad because forks would destroy their parent's context on exit as shown in the example for AutoInactiveDestroy in the DBI docs. This example fails with "DBD::MariaDB::db do failed: Lost connection to MySQL server during query" without this fix: $dbh->{'AutoInactiveDestroy'} = 1; if (my $pid = fork()) { waitpid($pid, 0); $dbh->do('SELECT 1'); } else { exit(0); }

DBI already calls dbd_st_finish() before dbd_st_destroy() if the handle is active. For inactive handles no finish() should be necessary. With InactiveDestroy set, the handle is always considered inactive. Skipping finish() could lead to "out of sync" situations and inconsistencies, however, the attribute is explicitly meant to skip all DB-related destroy actions, especially for the case when the connection will be closed anyway. The only thing which is not cleaned up by finish() is the last result. This commit fixes the following example: $dbh->{'AutoInactiveDestroy'} = 1; my $sth = $dbh->prepare('SELECT id FROM bcwb_maint_types; SELECT SLEEP(1); SELECT text FROM bcwb_maint_types;'); $sth->execute(); my $i = 0; do { printf "Result %d\n", ++$i; while (my $id = $sth->fetchrow_array()) { print $id, "\n"; } print "---\n"; if ($i == 1) { if (my $pid = fork()) { waitpid($pid, 0); } else { exit(0); } } } while ($sth->more_results);

rovo89 · 2022-11-29T15:43:20Z

@pali Could you have a look please? Thanks!

pali · 2022-11-29T21:11:07Z

I have looked at it but I'm really not sure if this work correctly. There are lot of pitfalls and edge cases. I think that I had in my mind some solution for this issue but I have never implemented it. Moreover such feature must be covered by tests and currently project lacks big CI testing with different Perl versions, different mysql/mariadb client versions and server versions. Travis is already gone, so it needs to be replaced by for example Github actions. And just now I fixed AppVeyor CI for Windows builds. So I would propose to postpone this change after CI matrix testing is again working to prevent any breakage. I'm not really sure that changing code related to PL_dirty does not cause segfault in some library combination.

rovo89 · 2022-11-30T08:21:25Z

I can understand the need for tests. Even for a single database though, it will be quite hard to verify correctness.

For a statement handle destroy, I think it's required to:

Bring the DB into a state where it can execute other statements again (e.g. read pending rows and results from the socket)
For server side prepare: Tell the database were not going to use the statement anymore (mysql_stmt_close)
Free structures allocated via MariaDB API (e.g. mysql_free_result, mysql_stmt_free_result)
Free DBI-related structures

For a database handle destroy:

Destroy all statements (should be ensured by Perl reference counting)
Tell the database we're going to disconnect (mysql_close)
Free structures allocated via MariaDB API (but I don't think there are any for the database handle?)
Free DBI-related structures

If InactiveDestroy is set, any communication with the database must be skipped. The socket is shared, so if one process implicitly destroys all DBI objects on exit, it must be ensured that it does not:

Read anything (!) from the socket (e.g. the pending results), otherwise the parent process will wait for the data infinitely.
A side-effect is that the exit will be faster because it doesn't need to process potentially huge results. I had a case where I set InactiveDestroy even on the parent process. Processing millions of rows takes a while, and if the user pressed Ctrl+C it would need to fetch all pending rows from the database before destroying everything. Setting the attribute made the process finish immediately.
Tell the database to destroy anything statement- or database-related, because it's still in use by the parent process.
Tell the database we're going to disconnect, otherwise the server closes the connection and the parent process gets a "Lost connection" error.

The problem is that I don't think there is a way to check whether communication over the socket has happened (or not), whether there is still anything allocated on DB side or via the MariaDB API (or not, if we still need it) and whether DBI structures are still allocated.

I think we can only test the side-effects, i.e. connect, send a query, fork, wait for the child to exit, and see if we can still fetch the results. That's basically the two examples in my commit messages. I can try to pack that into a test case. A little problem is that with the implicit $sth->finish(), the child process has already read the pending result from the socket, and the parent process will hang polling for that exact same information. Maybe a timeout would do?

These test cases can only verify though that not too much was freed, but not if something was missed to be freed.

rovo89 · 2022-11-30T08:29:59Z

I'm not really sure that changing code related to PL_dirty does not cause segfault in some library combination.

As far as I understood the comment, the purpose was to avoid accessing $dbh during global destruction as it might have already been destroyed. However the only remaining code doesn't need the database handle, it just frees the memory allocated by the MariaDB API. It wouldn't hurt much to keep the check because the whole process will anyway be destroyed milliseconds later, but I don't think it's necessary.

DBI already takes care to skip the $sth->finish() call when it's not safe/required:

PL_dirty
database handle inactive
statement handle inactive
InactiveDestroy set on the statement handle (which effectively turns the active flag off during destroy)

Point of the commit is not to duplicate this work, but only free what might still be allocated.

pali · 2022-12-04T11:46:53Z

Tell the database we're going to disconnect (mysql_close)

This is an issue. You must call mysql_close (or similar function) to release memory allocated for $dbh even when InactiveDestroy is set. Otherwise you introduce memory leak.

And I think this is the reason why InactiveDestroy was not properly implemented yet.

bugfood · 2022-12-28T22:39:51Z

Thank you for your work on this. I ran into this bug with my own code and wrote a test script (thinking I was going to write a new bug report).

My test script was showing two different types of failure, apparently at random:

The child closes the connection to the server, resulting in the parent failing, as reported in this PR.
The child reports an error instead: panic: DBI active kids (-1) < 0 or > kids (0)

I tested this PR (cherry-picked on top of current master), and the child no longer closes the connection, which is good. I still get the error from the child about half the time, though. I don't know if this is a separate bug or if it is somehow related. It doesn't happen at all with DBD::mysql.

Here is my test script (renamed as .txt to get around github's filtering):
forktest.pl.txt

Here is a good example run, using patched DBD::MariaDB::

$ strace -f -o /tmp/strace.ok.log ./forktest.pl -d MariaDB -f -a
connect to: DBI:MariaDB:;host=localhost
child 480267 : initiating normal exit
parent 480266 : child process complete
parent 480266 : exit

strace.ok.log

Here is a failing example run, also using patched DBD::MariaDB:

$ strace -f -o /tmp/strace.bad.log ./forktest.pl -d MariaDB -f -a
connect to: DBI:MariaDB:;host=localhost
child 480286 : initiating normal exit
panic: DBI active kids (-1) < 0 or > kids (0) at /usr/lib/x86_64-linux-gnu/perl5/5.36/DBI.pm line 759.
END failed--call queue aborted.
parent 480285 : child process complete
parent 480285 : exit

strace.bad.log

I'm using perl v5.36.0 (Debian package 5.36.0-6) with DBI version 1.643 (Debian package 1.643-4).

Thanks,
Corey

pali · 2023-08-26T21:57:58Z

@rovo89 you can use a hack/trick with invalidating file descriptor before calling mysql_close() when you need connection to stay open but want to free mysql client resources. Current master branch already contains similar hack/trick for Windows builds due to dichotomy between file descriptor and socket handle in Perl API and Windows API. This could be a way how to properly implement InactiveDestroy.

cbrandtbuffalo · 2024-01-19T20:44:34Z

We use this feature in Request Tracker, so we can add some possible additional test cases. We are working on a branch to support running with DBD::MariaDB and have gotten most of our tests passing. One remaining failure is one that relies on InactiveDestroy. We have some code that uses it here:

https://github.com/bestpractical/rt/blob/6b3a2158b5d502a9115a59efb5c550050f9b72f2/lib/RT/Util.pm#L60

And tests here:

https://github.com/bestpractical/rt/blob/5.0/db-type-mariadb/t/api/safe-run-child-util.t

Was InactiveDestroy working in the past in DBD::mysql and something changed in MariaDB?

choroba · 2024-02-12T21:07:13Z

@cbrandtbuffalo: The code comments suggest InactiveDestroy never worked properly in DBD::mysql, either. There were resource leaks preventing the connection from closing (regardless of InactiveDestroy). The leaks were fixed in DBD::MariaDB, which means the connection is now always closed.

choroba · 2024-02-12T21:09:06Z

@cbrandtbuffalo: Can you check the WIP change here with your tests?
https://github.com/perl5-dbi/DBD-MariaDB/compare/master...choroba:inactive-destroy?expand=1

cbrandtbuffalo · 2024-02-13T14:44:54Z

Yes, I can take a look. All of our tests run in github actions, so the environment is available in the RT github repo. To test, it looks like I'll need to create a new build of the base Debian docker image using your PR. I'll comment if I'm able to run a test.

cbrandtbuffalo · 2024-02-16T20:38:38Z

The RT tests pass for me when running your branch locally on a Mac. For some reason, the tests still fail in github actions using this docker image: https://hub.docker.com/layers/bpssysadmin/rt-base-debian/RT-5.0.3-buster-testmariadb2-20240215/images/sha256-838a7cb36dad1a2883c30dab9ac6e94cf30d354efdabe2214cf122d9fb5241a5?context=repo I'm still working on debugging. It's possible the build isn't correctly grabbing and building the branch.

cbrandtbuffalo · 2024-02-20T17:59:20Z

We fixed the docker image and can confirm the branch allows our RT tests to pass. You can see the green checkmark here: https://github.com/bestpractical/rt/tree/5.0/db-type-mariadb. As noted before, we also see passing tests running locally on Mac OS X. We also confirmed it passed on Ubuntu.

cbrandtbuffalo · 2024-03-01T15:54:33Z

@pali Do you have other thoughts on this branch based on the testing noted above? Our use case for this feature is we have a running web server process (RT) which can fork other processes. When the forked process exits, we don't want it to disconnect the active DB connection the main RT process is using, so we use this flag.

cbrandtbuffalo · 2024-07-03T19:43:59Z

I noticed some activity with a few commits from @Tux and @choroba so I wanted to ping this PR just in case there might be a release in the future. I know people have had limited time, so I'm happy to help if there's anything do.

rovo89 added 2 commits November 29, 2022 16:41

rovo89 force-pushed the inactive_destroy branch from b77cef5 to ff8dc1d Compare November 29, 2022 15:41

rovo89 marked this pull request as ready for review November 29, 2022 15:43

rovo89 mentioned this pull request Mar 8, 2023

Various fixes for mariadb_use_result #174

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix InactiveDestroy #175

Fix InactiveDestroy #175

rovo89 commented Nov 28, 2022 •

edited

Loading

rovo89 commented Nov 28, 2022

rovo89 commented Nov 29, 2022

pali commented Nov 29, 2022

rovo89 commented Nov 30, 2022

rovo89 commented Nov 30, 2022

pali commented Dec 4, 2022

bugfood commented Dec 28, 2022

pali commented Aug 26, 2023

cbrandtbuffalo commented Jan 19, 2024

choroba commented Feb 12, 2024

choroba commented Feb 12, 2024

cbrandtbuffalo commented Feb 13, 2024

cbrandtbuffalo commented Feb 16, 2024

cbrandtbuffalo commented Feb 20, 2024

cbrandtbuffalo commented Mar 1, 2024

cbrandtbuffalo commented Jul 3, 2024

Fix InactiveDestroy #175

Are you sure you want to change the base?

Fix InactiveDestroy #175

Conversation

rovo89 commented Nov 28, 2022 • edited Loading

rovo89 commented Nov 28, 2022

rovo89 commented Nov 29, 2022

pali commented Nov 29, 2022

rovo89 commented Nov 30, 2022

rovo89 commented Nov 30, 2022

pali commented Dec 4, 2022

bugfood commented Dec 28, 2022

pali commented Aug 26, 2023

cbrandtbuffalo commented Jan 19, 2024

choroba commented Feb 12, 2024

choroba commented Feb 12, 2024

cbrandtbuffalo commented Feb 13, 2024

cbrandtbuffalo commented Feb 16, 2024

cbrandtbuffalo commented Feb 20, 2024

cbrandtbuffalo commented Mar 1, 2024

cbrandtbuffalo commented Jul 3, 2024

rovo89 commented Nov 28, 2022 •

edited

Loading