Skip to content

Truncate large coro repr in retry log output#9197

Open
ernestprovo23 wants to merge 2 commits intodask:mainfrom
ernestprovo23:fix-retry-excessive-logs
Open

Truncate large coro repr in retry log output#9197
ernestprovo23 wants to merge 2 commits intodask:mainfrom
ernestprovo23:fix-retry-excessive-logs

Conversation

@ernestprovo23
Copy link
Contributor

Summary

  • When retry() falls back to str(coro) for log messages, truncate the representation to 200 characters to prevent excessively large logs
  • This was observed in P2P shuffles where functools.partial repr includes serialized binary data (see Fix excessive logging on P2P retry #8511)
  • Added test verifying truncation behavior

Closes #8529

Test plan

  • New test test_retry_truncates_large_coro_repr verifies 500-char repr is truncated
  • Existing retry tests unaffected (no truncation when repr is small)
  • Explicit operation parameter bypasses truncation (preserves full user-provided descriptions)

🤖 Generated with Claude Code

When retry() falls back to str(coro) for the log message, truncate
the representation to 200 characters to prevent excessively large
logs. This was observed in P2P shuffles where functools.partial repr
includes serialized binary data.

Closes dask#8529
@github-actions
Copy link
Contributor

github-actions bot commented Feb 22, 2026

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    31 files  ± 0      31 suites  ±0   11h 10m 30s ⏱️ - 11m 22s
 4 114 tests + 1   4 007 ✅ ± 0    104 💤 ±0  3 ❌ +1 
59 651 runs  +15  57 173 ✅ +13  2 474 💤 ±0  4 ❌ +2 

For more details on these failures, see this check.

Results for commit 5cb36b3. ± Comparison against base commit 4fb4814.

♻️ This comment has been updated with latest results.

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a recommended solution in the issue to use reprlib to help with this truncation instead of simply chopping the end of the string off #8529 (comment).

Replace manual str[:200] + "..." truncation with stdlib reprlib.Repr()
per reviewer feedback. reprlib handles truncation consistently and is
a well-known stdlib pattern for this use case.
@ernestprovo23
Copy link
Contributor Author

thanks for catching that, updated to use reprlib.Repr() with maxother=200 instead of manual slicing. keeps the truncation logic cleaner and more standard.

@ernestprovo23
Copy link
Contributor Author

hey @jacobtomlinson — just checking in on this. the reprlib approach is in place per your suggestion (lines 385-390). the Windows CI failures look like pre-existing flakiness (investigating now), all Linux and macOS tests pass. happy to adjust anything else if needed.

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple more comments

Comment on lines +389 to +391
aRepr = reprlib.Repr()
aRepr.maxother = 200
operation = aRepr.repr(coro)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's clean this up into a simple one-liner.

Suggested change
aRepr = reprlib.Repr()
aRepr.maxother = 200
operation = aRepr.repr(coro)
operation = reprlib.Repr(maxother=200).repr(coro)

jitter_fraction=0,
)

import logging
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move imports to the top of the file. Claude loves importing stuff randomly in the middle of code for some reason.

Comment on lines +259 to +268
handler = logging.Handler()
handler.emit = lambda record: log_messages.append(record.getMessage())

logger = logging.getLogger("distributed.utils_comm")
logger.addHandler(handler)
try:
with pytest.raises(MyEx):
asyncio_run(f(), loop_factory=get_loop_factory())
finally:
logger.removeHandler(handler)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just use pytest's caplog instead of doing all this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

retry can produce excessively large logs

2 participants