
Conversation

@jnicoulaud-ledger (Contributor)

  • create log directory if missing
  • retry on some 5XX statuses
  • add request retry to unclaimed_sv_rewards.py
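
A rough sketch of the retry behaviour described above (a sketch only, not this PR's actual code: `aiohttp` as the HTTP client, the concrete status set, attempt count, and backoff schedule are all assumptions; the real helper `__post_with_retry_on_statuses` appears in the diff further down):

```python
import asyncio
import os

import aiohttp

# "some 5XX statuses": the concrete set retried here is an assumption.
RETRIED_STATUSES = {500, 502, 503, 504}


def ensure_log_dir(log_dir: str) -> None:
    """Create the log directory if it does not exist yet."""
    os.makedirs(log_dir, exist_ok=True)


async def post_with_retry_on_statuses(
    session: aiohttp.ClientSession,
    url: str,
    payload: dict,
    max_attempts: int = 5,
    backoff_s: float = 1.0,
) -> dict:
    """POST `payload` to `url`, retrying on transient 5XX responses."""
    for attempt in range(1, max_attempts + 1):
        async with session.post(url, json=payload) as resp:
            if resp.status in RETRIED_STATUSES and attempt < max_attempts:
                await asyncio.sleep(backoff_s * attempt)  # linear backoff
                continue
            resp.raise_for_status()  # non-retried errors still raise
            return await resp.json()
```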

Pull Request Checklist

Cluster Testing

  • If a cluster test is required, comment /cluster_test on this PR to request it, and ping someone with access to the DA-internal system to approve it.
  • If a hard-migration test is required (from the latest release), comment /hdm_test on this PR to request it, and ping someone with access to the DA-internal system to approve it.

PR Guidelines

  • Include in the release notes any change that might be observable by our partners or affect their deployment.
  • Specify fixed issues with Fixes #n, and mention issues worked on using #n
  • Include a screenshot for frontend-related PRs - see README or use your favorite screenshot tool

Merge Guidelines

  • Make the git commit message look sensible when squash-merging on GitHub (most likely: just copy your PR description).

@martinflorian-da (Contributor)

@jnicoulaud-ledger please let us know if/once you'd like us to trigger a full CI run! You also might need to force-push to rewrite your git history so that your commit messages carry Signed-off-by trailers... (see https://github.com/hyperledger-labs/splice/blob/main/CONTRIBUTING.md#testing).

f"{self.url}/api/scan/v0/updates", json=payload

json = await self.__post_with_retry_on_statuses(
f"{self._get_current_url()}/api/scan/v0/updates",

probably best to switch to /v2/updates as a prudent engineering measure:

```yaml
/v1/updates/{update_id}:
  get:
    deprecated: true
    tags: [ deprecated ]
    x-jvm-package: scan
    operationId: "getUpdateByIdV1"
    description: |
      Returns the update with the given update_id.
      Unlike /v0/updates/{update_id}, this endpoint returns responses that are consistent across different
      scan instances. Event ids returned by this endpoint are not comparable to event ids returned by /v0/updates.
      The order of items in events_by_id is not defined.
/v2/updates:
  post:
    tags: [external, scan]
    x-jvm-package: scan
    operationId: "getUpdateHistoryV2"
    description: |
      Returns the update history in ascending order, paged, from ledger begin or optionally starting after a record time.
      Compared to `/v1/updates`, the `/v2/updates` removes the `offset` field in responses,
      which was hardcoded to 1 in `/v1/updates` for compatibility, and is now removed.
      `/v2/updates` sorts events lexicographically in `events_by_id` by `ID` for convenience, which should not be confused with the
      order of events in the transaction, for this you should rely on the order of `root_event_ids` and `child_event_ids`.
      Updates are ordered lexicographically by `(migration id, record time)`.
      For a given migration id, each update has a unique record time.
      The record time ranges of different migrations may overlap, i.e.,
      it is not guaranteed that the maximum record time of one migration is smaller than the minimum record time of the next migration,
      and there may be two updates with the same record time but different migration ids.
```
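
For reference, sequential paging against `/v2/updates` might look roughly like the sketch below (not the script's actual code: the request body shape with `page_size`, `after_record_time`, and `after_migration_id`, the `transactions` response field, and `aiohttp` are assumptions pieced together from this thread):

```python
import aiohttp


async def fetch_all_updates(base_url: str, page_size: int = 100) -> list[dict]:
    """Page through /v2/updates from ledger begin, cursoring each request on
    the (migration id, record time) of the last update of the previous page."""
    updates: list[dict] = []
    after = None  # None => start from ledger begin
    async with aiohttp.ClientSession() as session:
        while True:
            payload: dict = {"page_size": page_size}
            if after is not None:
                payload["after"] = after  # assumed field names, see below
            async with session.post(
                f"{base_url}/api/scan/v2/updates", json=payload
            ) as resp:
                resp.raise_for_status()
                page = (await resp.json()).get("transactions", [])
            if not page:
                return updates
            updates.extend(page)
            last = page[-1]
            after = {
                "after_record_time": last["record_time"],
                "after_migration_id": last["migration_id"],
            }
```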

@jose-velasco-ieu (Contributor)

Maybe we could increase the default page-size from 100 to 1000. I've been testing locally (connected through a VPN with 20 Mb/s download speed), and using 1000 improves the timing a bit.
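
For instance, the knob could be exposed as a CLI flag (hypothetical flag name and default; the script's actual CLI is not shown in this thread):

```python
import argparse

parser = argparse.ArgumentParser(description="unclaimed_sv_rewards.py options (sketch)")
# Hypothetical: lets operators override how many updates each /v2/updates page returns.
parser.add_argument(
    "--page-size", type=int, default=1000,
    help="number of updates requested per Scan API page",
)
```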

@jose-velasco-ieu (Contributor) commented Dec 2, 2025

I've been thinking about improving performance by issuing parallel calls to the Scan API, but I’m not sure that’s possible. From what I can tell, pagination in /v2/updates is strictly sequential: each request needs the after_record_time and after_migration_id returned by the last transaction of the previous page. That creates a hard dependency chain where page N+1 cannot be requested until page N has been fetched. In other words, it seems like the Scan API enforces linear pagination, which would prevent issuing multiple updates() calls in parallel — unless there’s some alternative pagination mechanism I’m missing.

@meiersi-da (Contributor)

> I've been thinking about improving performance by issuing parallel calls to the Scan API, but I'm not sure that's possible. From what I can tell, pagination in /v2/updates is strictly sequential: each request needs the after_record_time and after_migration_id returned by the last transaction of the previous page. That creates a hard dependency chain where page N+1 cannot be requested until page N has been fetched. In other words, it seems like the Scan API enforces linear pagination, which would prevent issuing multiple updates() calls in parallel — unless there's some alternative pagination mechanism I'm missing.

I believe you could organize the parallel fetching by time intervals. For example, run ingestion for every day separately. The ingestion can start at the beginning of the day, and then fetch all the pages until it sees the first update after the current day.
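
A rough sketch of that idea (assumptions throughout: the same request/response shape as the paging sketch above, ISO-8601 record times, and a single migration id of 0; real data may span multiple migrations with overlapping record-time ranges, as the /v2/updates description notes):

```python
import asyncio
from datetime import datetime, timedelta

import aiohttp


async def fetch_day(
    session: aiohttp.ClientSession, base_url: str, day_start: datetime
) -> list[dict]:
    """Ingest one day's updates: start the cursor at the beginning of the day
    and page sequentially until the first update past the end of the day."""
    day_end = day_start + timedelta(days=1)
    updates: list[dict] = []
    after_record_time = day_start.isoformat()
    while True:
        payload = {
            "page_size": 1000,
            "after": {
                "after_record_time": after_record_time,
                "after_migration_id": 0,  # assumption: single migration
            },
        }
        async with session.post(
            f"{base_url}/api/scan/v2/updates", json=payload
        ) as resp:
            resp.raise_for_status()
            page = (await resp.json()).get("transactions", [])
        if not page:
            return updates
        for tx in page:
            if datetime.fromisoformat(tx["record_time"]) >= day_end:
                return updates  # first update after this day: stop
            updates.append(tx)
        after_record_time = page[-1]["record_time"]


async def fetch_range(base_url: str, first_day: datetime, num_days: int) -> list[dict]:
    """Run one ingestion task per day in parallel, concatenating in day order."""
    async with aiohttp.ClientSession() as session:
        days = [first_day + timedelta(days=i) for i in range(num_days)]
        per_day = await asyncio.gather(
            *(fetch_day(session, base_url, day) for day in days)
        )
    return [tx for day in per_day for tx in day]
```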

