Hey, thanks for the cool library! I'm using it in Loupe.
I noticed a performance issue:
<?php
use Nitotm\Eld\LanguageDetector;
include __DIR__ . '/vendor/autoload.php';
$languageDetector = new LanguageDetector();
$languageDetector->langSubset(['de']);
var_dump($languageDetector->detect('Guten Tag.'));

On the first call, this will write a subset for just de into the vendor (/subsets) directory. This is perfect!
However, the LanguageDetector always loads the small.php ngrams, even though I only need the subset for German.
It also loads the dataset in __construct(), so you cannot instantiate a LanguageDetector object without the data being loaded into memory. The data is loaded even if nobody ever calls ->detect() on the object. Ideally this would be converted to lazy evaluation, so the data is only loaded when it is used for the first time 😊
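
To illustrate what I mean by lazy evaluation, here is a minimal sketch. It is not the library's code: `LazyDetector`, its loader callback, and the tiny word lists are all made up for this example, and the scoring is a placeholder rather than real ngram matching. The only point is that the constructor stores a loader and the data is fetched on the first `detect()` call.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: none of these names come from the library itself.
final class LazyDetector
{
    /** @var array<string, string[]>|null Null until the data is first needed. */
    private ?array $data = null;

    /** @var callable Returns the dataset, e.g. by including a subset file. */
    private $loader;

    public function __construct(callable $loader)
    {
        // Only remember how to load the data; nothing is read here.
        $this->loader = $loader;
    }

    public function detect(string $text): ?string
    {
        $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

        $best = null;
        $bestScore = 0;
        // Placeholder scoring: count the known words per language.
        foreach ($this->data() as $language => $tokens) {
            $score = count(array_intersect($words, $tokens));
            if ($score > $bestScore) {
                $best = $language;
                $bestScore = $score;
            }
        }

        return $best;
    }

    private function data(): array
    {
        // Load on first use, then reuse the cached copy.
        if ($this->data === null) {
            $this->data = ($this->loader)();
        }

        return $this->data;
    }
}

$loads = 0;
$detector = new LazyDetector(function () use (&$loads): array {
    $loads++;
    // Stand-in for including only the subset file that is actually needed.
    return ['de' => ['guten', 'tag'], 'en' => ['good', 'day']];
});

// $loads is still 0 here: constructing the object did not load anything.
$detector->detect('Guten Tag.');
// $loads is now 1, and further detect() calls reuse the cached data.
```

With something along these lines, the loader could also pick the subset file instead of small.php when a subset is configured, since the decision is deferred until the data is actually needed.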