Truncate large coro repr in retry log output#9197
Truncate large coro repr in retry log output#9197ernestprovo23 wants to merge 2 commits intodask:mainfrom
Conversation
When retry() falls back to str(coro) for the log message, truncate the representation to 200 characters to prevent excessively large logs. This was observed in P2P shuffles where functools.partial repr includes serialized binary data. Closes dask#8529
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 31 files ± 0 31 suites ±0 11h 10m 30s ⏱️ - 11m 22s For more details on these failures, see this check. Results for commit 5cb36b3. ± Comparison against base commit 4fb4814. ♻️ This comment has been updated with latest results. |
jacobtomlinson
left a comment
There was a problem hiding this comment.
There is a recommended solution in the issue to use reprlib to help with this truncation instead of simply chopping the end of the string off #8529 (comment).
Replace manual str[:200] + "..." truncation with stdlib reprlib.Repr() per reviewer feedback. reprlib handles truncation consistently and is a well-known stdlib pattern for this use case.
|
thanks for catching that, updated to use |
|
hey @jacobtomlinson — just checking in on this. the reprlib approach is in place per your suggestion (lines 385-390). the Windows CI failures look like pre-existing flakiness (investigating now), all Linux and macOS tests pass. happy to adjust anything else if needed. |
jacobtomlinson
left a comment
There was a problem hiding this comment.
Left a couple more comments
| aRepr = reprlib.Repr() | ||
| aRepr.maxother = 200 | ||
| operation = aRepr.repr(coro) |
There was a problem hiding this comment.
Let's clean this up into a simple one-liner.
| aRepr = reprlib.Repr() | |
| aRepr.maxother = 200 | |
| operation = aRepr.repr(coro) | |
| operation = reprlib.Repr(maxother=200).repr(coro) |
| jitter_fraction=0, | ||
| ) | ||
|
|
||
| import logging |
There was a problem hiding this comment.
Can you move imports to the top of the file. Claude loves importing stuff randomly in the middle of code for some reason.
| handler = logging.Handler() | ||
| handler.emit = lambda record: log_messages.append(record.getMessage()) | ||
|
|
||
| logger = logging.getLogger("distributed.utils_comm") | ||
| logger.addHandler(handler) | ||
| try: | ||
| with pytest.raises(MyEx): | ||
| asyncio_run(f(), loop_factory=get_loop_factory()) | ||
| finally: | ||
| logger.removeHandler(handler) |
There was a problem hiding this comment.
Can you just use pytest's caplog instead of doing all this?
Summary
retry()falls back tostr(coro)for log messages, truncate the representation to 200 characters to prevent excessively large logsfunctools.partialrepr includes serialized binary data (see Fix excessive logging on P2P retry #8511)Closes #8529
Test plan
test_retry_truncates_large_coro_reprverifies 500-char repr is truncatedoperationparameter bypasses truncation (preserves full user-provided descriptions)🤖 Generated with Claude Code