It is important that every domain has a page in the index with is_home=true, so that the domain appears on the Browse page. See also #102.
However, some sites have home pages which exceed the maximum page size (1MB) and so aren't indexed. Examples include https://www.gleech.org/ at 5.1MB, https://www.allendowney.com/blog/ at 1.5MB and https://www.swyx.io/ at 1.4MB.
The workaround is to identify domains without home pages via consistencycheck.py, check whether the home page size is >1MB, and if so set a different home page (e.g. the /about page, which isn't ideal) and reindex. That's a bit of a faff though, so I want to see if there's a better way of handling this.
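For illustration, the first step of the workaround (finding domains with no home page in the index) could look roughly like the sketch below. This is not the actual consistencycheck.py logic. The function name `domains_missing_home` and the `domain` field are assumptions, and only `is_home` comes from the description above.

```python
from collections import defaultdict


def domains_missing_home(docs):
    """Return the sorted domains that have indexed pages but none with is_home=true.

    `docs` is an iterable of dicts, one per indexed page. The `domain` key is
    an assumed field name; `is_home` is the flag described in this issue.
    """
    has_home = defaultdict(bool)
    for doc in docs:
        # A domain counts as covered once any of its pages is flagged as home.
        has_home[doc["domain"]] |= bool(doc.get("is_home"))
    return sorted(domain for domain, ok in has_home.items() if not ok)
```

Each domain returned would then need its home page size checked by hand, as described above.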
Ideas include logging and alerting when this is detected during indexing, and potentially even seeing if there is a way of still adding the home page URL to the search index without indexing the content.
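A minimal sketch of how both ideas might fit together, assuming the indexer has the response body available as bytes. Everything here is hypothetical rather than the project's actual code: the function `build_index_doc`, the constant `MAX_PAGE_BYTES`, the logger name, and the `url`, `domain` and `content` fields are made up for illustration, and only `is_home` and the 1MB limit come from this issue.

```python
import logging

# Assumed value for the 1MB limit mentioned above; the real setting may differ.
MAX_PAGE_BYTES = 1_000_000

logger = logging.getLogger("indexing")  # hypothetical logger name


def build_index_doc(url, domain, body, is_home):
    """Build the document to index for one page, or return None to skip it.

    Pages within the size limit are indexed normally. An oversized home page
    is logged and kept as a URL-only entry so the domain still has an
    is_home=true document. Other oversized pages are skipped, as before.
    """
    if len(body) <= MAX_PAGE_BYTES:
        content = body.decode("utf-8", errors="replace")
        return {"url": url, "domain": domain, "is_home": is_home, "content": content}
    if is_home:
        logger.warning(
            "Home page %s is %d bytes, over the %d byte limit; indexing URL only",
            url, len(body), MAX_PAGE_BYTES,
        )
        return {"url": url, "domain": domain, "is_home": True, "content": None}
    return None
```

The warning gives an alerting hook, and the URL-only document would keep the domain visible on the Browse page, at the cost of the home page content not being searchable.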