Hi! I am using Laypa for baseline detection as part of Loghi. I've noticed that, even when the input image contains a single whole line, the Laypa model sometimes splits it into two separate lines during inference because the gap between two words is slightly too long. This in turn leads to errors in the subsequent transcription.
Is it possible to adjust the config to improve the inference results while keeping the original model weights? Thank you!
Unfortunately, I don't think there is a lot you can do to improve results without finetuning. The one thing you could look at is the internal size used (INPUT.MIN_SIZE_TEST and INPUT.MAX_SIZE_TEST). However, this might negatively impact performance in other ways, since the model has not been trained at that size.
Do you have some more info on the type of image where this problem occurs? We have seen this kind of behavior on very small images, for example. Also worth noting that we are trying to prevent the opposite of your problem as well: text lines that are close together but should be separated (e.g. newspapers).
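As a sketch of what such an override could look like, assuming Laypa reads Detectron2-style YAML configs (only the two key names come from the comment above; the numeric values here are illustrative, not recommendations):

```yaml
# Hypothetical config override: raise the internal resize bounds used at
# inference. A larger internal size keeps more pixels between distant words,
# which may help the model treat a wide gap as part of the same line.
# Caveat: the model was not trained at this size, so quality elsewhere
# (e.g. separating genuinely distinct lines) may degrade.
INPUT:
  MIN_SIZE_TEST: 1536   # Detectron2's default is 800; value chosen for illustration
  MAX_SIZE_TEST: 2048   # Detectron2's default is 1333; value chosen for illustration
```

It is worth testing a few values on the affected pages and checking that other pages do not regress before adopting one.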
Here are two examples. In the first pair of images, the line "med de frisinnade" has an unwanted baseline break; in the second pair, the text after "§6" has several unwanted baseline breaks due to the large spaces between words. These should logically be one line, but the model breaks them up.