reporter: don't expire actively used executables #247

Merged on Nov 21, 2024 (4 commits)

@Gandem Gandem commented Nov 20, 2024

Ensure that actively used executable info does not expire from the cache by extending its lifetime with GetAndRefresh().

Fixes #244

@Gandem Gandem requested review from a team as code owners November 20, 2024 10:31
@Gandem Gandem changed the title from "reporter: fix executable cache expiry" to "reporter: don't expire actively used executables" Nov 20, 2024

@rockdaboot rockdaboot left a comment

LGTM

@@ -147,7 +151,7 @@ func NewOTLP(cfg *Config) (*OTLPReporter, error) {
 	if err != nil {
 		return nil, err
 	}
-	executables.SetLifetime(1 * time.Hour) // Allow GC to clean stale items.
+	executables.SetLifetime(executableCacheLifetime) // Allow GC to clean stale items.
Member

Shouldn't we also increase the size of this cache? 4096 seems small, given that we still haven't solved the core issue here: items may be dropped from this cache (whether through time-based expiration or because the LRU is full), and there's no control or guarantee as to when they'll be re-inserted.

I'd set the cache size to 16384 until we really solve the problem.
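
The second way an entry can be lost, capacity-based eviction, can be sketched as follows. This is an assumption-laden toy built on the standard library's `container/list`, not the cache used by the reporter; `lru`, `pair`, and `newLRU` are hypothetical names. It shows that once the cache is full, inserting a new key drops the least recently used one regardless of any lifetime setting.

```go
package main

import (
	"container/list"
	"fmt"
)

// pair is the payload stored in each list element.
type pair struct {
	key, value string
}

// lru is a hypothetical minimal LRU without expiry, used only to show
// capacity-based eviction. capacity must be at least 1.
type lru struct {
	capacity int
	order    *list.List               // front = most recently used
	index    map[string]*list.Element // key -> element in order
}

func newLRU(capacity int) *lru {
	return &lru{capacity: capacity, order: list.New(), index: make(map[string]*list.Element)}
}

// Add inserts or updates key. When the cache is full, the least recently
// used entry is evicted to make room.
func (c *lru) Add(key, value string) {
	if el, ok := c.index[key]; ok {
		el.Value = pair{key, value}
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		delete(c.index, oldest.Value.(pair).key)
		c.order.Remove(oldest)
	}
	c.index[key] = c.order.PushFront(pair{key, value})
}

// Get returns the value and marks the key as most recently used.
func (c *lru) Get(key string) (string, bool) {
	el, ok := c.index[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(pair).value, true
}

func main() {
	c := newLRU(2)
	c.Add("exeA", "build-id-a")
	c.Add("exeB", "build-id-b")
	c.Get("exeA")               // exeA becomes most recently used
	c.Add("exeC", "build-id-c") // cache is full, so exeB is evicted

	_, okA := c.Get("exeA")
	_, okB := c.Get("exeB")
	fmt.Println(okA, okB)
}
```

A larger capacity makes this kind of eviction less frequent, but as the comment above notes, it does not guarantee that a dropped entry is ever re-inserted.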

Contributor

No strong opinion on the cache size (4096 seems quite large to me).
Ideally, we should have metrics for expiration and eviction, then get numbers from production systems.

We can also make the cache sizes and lifetimes configurable.

And we can create an LRU wrapper that automatically resizes the LRUs, as suggested at #248 (comment)

Maybe better to continue at #244 (comment)
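
The metrics suggestion in this comment could look roughly like the sketch below. It assumes the cache can report why an entry was removed; `removalReason`, `cacheStats`, `record`, and `snapshot` are hypothetical names and do not correspond to any existing API in the repository. The point is only that expirations and evictions are counted separately, so production numbers can show which of the two dominates.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// removalReason says why an entry left the cache.
type removalReason int

const (
	reasonExpired removalReason = iota // the entry's lifetime elapsed
	reasonEvicted                      // the LRU was full
)

// cacheStats is a hypothetical pair of counters, safe for concurrent use.
type cacheStats struct {
	expired atomic.Uint64
	evicted atomic.Uint64
}

// record increments the counter matching the removal reason.
func (s *cacheStats) record(r removalReason) {
	switch r {
	case reasonExpired:
		s.expired.Add(1)
	case reasonEvicted:
		s.evicted.Add(1)
	}
}

// snapshot returns the current counter values.
func (s *cacheStats) snapshot() (expired, evicted uint64) {
	return s.expired.Load(), s.evicted.Load()
}

func main() {
	var stats cacheStats
	stats.record(reasonEvicted)
	stats.record(reasonEvicted)
	stats.record(reasonExpired)

	expired, evicted := stats.snapshot()
	fmt.Printf("expired=%d evicted=%d\n", expired, evicted)
}
```

If evictions dominate, a larger cache size helps; if expirations dominate, the lifetime or the refresh behavior is the more relevant knob.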

Contributor

We should also not forget that (currently), the reporter implementation in this repository is just for demo/example purposes.

Contributor Author

No strong opinion on my side either - went ahead and bumped it to 16384 in bf83645

To make it less likely that a full LRU evicts the metadata of a valuable executable, see open-telemetry#247 (comment)
@rockdaboot rockdaboot merged commit 0dac5dd into open-telemetry:main Nov 21, 2024
23 checks passed
Successfully merging this pull request may close these issues.

Native frames: file name and gnu build id missing after one hour of run time
3 participants