Skip to content

Conversation

@audiomuze
Copy link

During scans we repeatedly resolve identical contributor strings; Slim::Schema::Contributor->add() can do multiple DB round-trips per call.

  • Add a scan-local LRU+TTL memoization cache in Schema.pm around contributor resolution in _mergeAndCreateContributors (scanner-only) to reduce redundant contributor DB work during library scans.
  • Cache key includes contributor + MBID + sort + extid; role is intentionally excluded.
  • Cache size is bounded and derived from existing dbhighmem pref; cache is cleared at scan end. Consider adding additional dropdown "Extreme" in Database Memory Config to enable cache size to be increased to 100K for large libraries.
  • When DEBUGLOG is enabled, emit end-of-scan stats plus a conservative heuristic estimate of avoided SQL (with an END-hook fallback for scanner runs where request events don’t fire).

Add scan-local LRU+TTL cache and DEBUGLOG stats to reduce redundant contributor DB work during library scans.
Copy link
Member

@michaelherger michaelherger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep things simple. IMHO this is way over-engineered, with most of the code satisfying the dev's geek mode only, adding no value to the user.

Comment on lines 380 to +403
if ( !main::SCANNER ) {
require Slim::Control::Request;
# Wipe cached data after rescan
Slim::Control::Request::subscribe( sub {
$class->wipeCaches;
Slim::Schema::Album->addReleaseTypeStrings;
}, [['rescan'], ['done']] );
}
elsif ( main::DEBUGLOG ) {
require Slim::Control::Request;
# Initialize scan-local caches.
$class->_initScanContributorCache;
$class->_registerScanContributorCacheEndHook;

# Scanner-only stats, emitted once per scan if DEBUGLOG is enabled.
Slim::Control::Request::subscribe( sub {
$class->_logScanContributorCacheStats;
}, [['rescan'], ['done']] );
}
elsif ( main::SCANNER ) {
require Slim::Control::Request;
# Initialize scan-local caches even if DEBUGLOG is off.
$class->_initScanContributorCache;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not going to fly. We don't want to use Slim::Control::Request in the scanner. You don't need to subscribe to those events in the scanner, do you? The scanner would already have exited when the server signalled "scanner done".

And having a different flow for debug mode sounds wrong to me, too. I believe in the above the elsif (main::SCANNER) would hardly ever be hit, as nobody is going to disable debug logging for the scanner manually.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Comment on lines +3293 to +3294
# only store EXTID for track artist, as we don't have it for other roles
my $extid = ($tag eq 'ARTIST' && $attributes->{'ARTIST_EXTID'}) ? $attributes->{'ARTIST_EXTID'} : undef;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an assumption which might be valid now and for some services. But this can change at any time, if it's even true now. Wouldn't Qobuz import different roles, too?

We should probably review even the existing implementation...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what the streamer plugins do or don't provide.

I could drop the assumption and change the code to:
my $extid = $attributes->{ $tag . '_EXTID' }; ?

Comment on lines +87 to +110
my %_scanContributorAddCache;
my $_scanContributorAddCacheTied = 0;
my $_scanContributorAddCacheMaxEntries;
my $_scanContributorCacheEndHooked = 0;
my $_scanContributorCacheApproxSize = 0;

sub _scanContributorCacheMaxEntries {
# Use the existing cross-platform `dbhighmem` preference as a proxy
# Keep this deliberately bounded: on low-RAM systems, very large hash-based
# caches can become noticeable during big scans.
return $prefs->get('dbhighmem') ? 50000 : 15000;
}

sub _initScanContributorCache {
my $class = shift;
return unless main::SCANNER;
return if $_scanContributorAddCacheTied;

my $entries = $class->_scanContributorCacheMaxEntries;
$_scanContributorAddCacheMaxEntries = $entries;
$_scanContributorCacheApproxSize = 0;
tie %_scanContributorAddCache, 'Tie::Cache::LRU::Expires', EXPIRES => 3600, ENTRIES => $entries;
$_scanContributorAddCacheTied = 1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this complicated setup? Why not just initialise the cache like we do for the content type? A line or two instead of dozens of them and repeated check for whether it's been initialised or not?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can simplify, but before I do, is sizing the cache according to the user's selected preference in Database Memory Config something you think should be lost or retained? To my mind limiting the cache size is necessary given Lyrion is expected to run on SOC with < 1GB RAM. On the other hand limiting the cache size to satisfy only RAM deficient (mostly legacy) SOCs would significantly limit the benefit of including the cache in the first place.

I had toyed with being more direct around cache size based on installed RAM but that would mean adding additional modules, which I figured you'd be against.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally reasonable. But this doesn't need much code. See eg. https://github.com/LMS-Community/slimserver/blob/public/9.1/Slim/Control/Queries.pm#L69

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This work for you?

# Scan-local contributor cache.
tie %_scanContributorAddCache, 'Tie::Cache::LRU::Expires',
	EXPIRES => 3600,
	ENTRIES => ($prefs->get('dbhighmem') ? 50000 : 15000)
if main::SCANNER;

Comment on lines +125 to +128
# Optional instrumentation (scanner + DEBUGLOG only)
# These counters are used to quantify effectiveness and provide a conservative
# estimate of avoided DB statements.
my (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too much noise for my taste. Who'd be interested in having this other than you as the developer? That's only adding overhead and code for little to no use for the user. Please remove this. You can still have those lines in your clone. But why should we have that much code to satisfy the geek in me for this one particular cache, when we don't do any of this anywhere else?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I specifically included it as a temporary aid to help you assess whether or not the cache has sufficient impact to warrant the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants