Let's say we have N PODs: P1, P2, P3 ... PN. Each POD contains an index dedicated to the targeted subject.
Targeted subject:
The subject is a URI (OBJ)
The subject is a literal (LIT)
Example scenario for OBJ: each POD contains some products, such as carrots, tomatoes, apples and so on. Each POD also contains a product type index listing all products of a certain type. We want a client App to be able to list all the products of a certain type that are hosted on P1 ... PN. When the user clicks on a product type, the App will show all the products of that type.
Example scenario for LIT: each POD contains some persons, each with a family name. Each POD also contains a family name index, which groups family names by their first 3 letters. The App will display all the persons, across all the PODs, whose family name matches the one provided by the user.
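To make the LIT scenario concrete, here is a minimal sketch of a family name index grouped by the first 3 letters. The function names, the in-memory dict layout and the example URIs are assumptions for illustration only; a real index would be stored as resources on the POD.

```python
from collections import defaultdict


def build_family_name_index(persons, prefix_len=3):
    """Group (family_name, uri) pairs by the lowercased first letters of the family name.

    `persons` is an iterable of (uri, family_name) tuples (hypothetical shape).
    """
    index = defaultdict(list)
    for uri, family_name in persons:
        index[family_name[:prefix_len].lower()].append((family_name, uri))
    return dict(index)


def lookup(index, family_name, prefix_len=3):
    """Return the URIs of persons whose family name matches exactly (case-insensitive)."""
    bucket = index.get(family_name[:prefix_len].lower(), [])
    return [uri for name, uri in bucket if name.lower() == family_name.lower()]


# Hypothetical data hosted on one POD.
persons = [
    ("https://p1.example/people/1", "Martin"),
    ("https://p1.example/people/2", "Marchand"),
    ("https://p1.example/people/3", "Dupont"),
]
index = build_family_name_index(persons)
matches = lookup(index, "Martin")
```

Only the bucket for the prefix `mar` has to be read to answer a query for "Martin", which is the point of grouping by prefix.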
I see 2 strategies here:
On The Fly (OTF): The App can browse the indexes on each POD and merge them on the fly to get the response.
Agent (BOT): The App can read an already merged index stored on a POD (its own or another one); this index is maintained by a bot (agent).
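A minimal sketch of the OTF strategy, assuming each POD's index has already been fetched and parsed into a dict mapping a key (a product type or a name prefix) to a list of URIs. That dict shape and the function name are assumptions; fetching and parsing are left out.

```python
def merge_on_the_fly(pod_indexes, key):
    """Merge the entries for `key` from every POD index.

    Duplicates are dropped and first-seen order is preserved.
    """
    seen = set()
    merged = []
    for pod_index in pod_indexes:
        for uri in pod_index.get(key, []):
            if uri not in seen:
                seen.add(uri)
                merged.append(uri)
    return merged


# Hypothetical product type indexes from three PODs.
p1 = {"vegetable": ["https://p1.example/carrot", "https://p1.example/tomato"]}
p2 = {"vegetable": ["https://p2.example/leek"], "fruit": ["https://p2.example/apple"]}
p3 = {"fruit": ["https://p3.example/pear"]}

vegetables = merge_on_the_fly([p1, p2, p3], "vegetable")
```

With the BOT strategy, the same merge would run ahead of time in the agent, and the App would read the stored result instead of calling this at query time.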
Equity
How to manage "equity": taking the same number of results from every POD, for example the first from P1, the first from P2, the first from P3, the second from P3, the second from P1, the second from P2, etc.
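One possible way to get this behaviour is a round-robin interleave of the per-POD result lists. The sketch below visits the PODs in the same fixed order in every round, which is an assumption; the order within a round could just as well rotate.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel for PODs that have run out of results


def interleave_fairly(results_per_pod):
    """Take one result from each POD per round until every list is exhausted."""
    merged = []
    for round_items in zip_longest(*results_per_pod, fillvalue=_MISSING):
        merged.extend(item for item in round_items if item is not _MISSING)
    return merged


fair = interleave_fairly([["a1", "a2", "a3"], ["b1"], ["c1", "c2"]])
```

A POD with fewer results simply drops out of later rounds, so a large POD cannot crowd out the first results of a small one.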
Score
Do we want to favor some PODs over others? Some of them may be more relevant in terms of speed, accuracy, update frequency, etc.
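If PODs are scored, one hypothetical way to combine score and equity is to rank each result by its position divided by the score of its POD, so that a POD with twice the score contributes results twice as fast. The scoring formula and the names below are assumptions, not something settled in this issue.

```python
def merge_by_score(results_per_pod, scores):
    """Order results by (position + 1) / score; ties keep the POD insertion order.

    `results_per_pod` maps a POD name to its ordered results.
    `scores` maps a POD name to a positive number (higher means favored).
    """
    ranked = []
    for pod, results in results_per_pod.items():
        for position, item in enumerate(results):
            ranked.append(((position + 1) / scores[pod], item))
    ranked.sort(key=lambda pair: pair[0])  # list.sort is stable
    return [item for _, item in ranked]


weighted = merge_by_score(
    {"P1": ["a1", "a2", "a3", "a4"], "P2": ["b1", "b2"]},
    {"P1": 2.0, "P2": 1.0},
)
```

With equal scores this reduces to a plain round-robin over the PODs.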
Scalability
What are important metrics for indexing strategies?
What metrics could be used to trigger the application of different indexing strategies? In real time or not?
What is the threshold above which it becomes critical to split a large index into smaller pieces?
What are the ceilings below which the OTF strategy remains efficient?
How to calculate the indexing cost?
How to calculate the parsing cost?
How to calculate the querying cost?
Calculate the ratio between the number of items and the weight of various index types.
What mechanisms could be used to sync the merging of distributed indexes? Loops? Notifications like ActivityPub?
Security
When a distributed indexing strategy is synced in real time with the growth of the data, re-computation of the indexes is triggered whenever the data volume increases or decreases (e.g. splitting a large file, merging small files). An attacker could repeatedly change the volume of data on one or several PODs of the cluster to try to overload the server. Which mechanisms could be used to prevent this?
introduce a delay between strategy changes
limit the number of changes over time (e.g. at most X changes in 1 hour)
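Both mitigations can be combined in a small guard that decides whether a strategy change is allowed. This is only a sketch: the class name, the parameters and the choice to pass the current time explicitly (in seconds) are assumptions made for illustration.

```python
from collections import deque


class StrategyChangeGuard:
    """Allow a strategy change only if a minimum delay has passed since the last
    accepted change and fewer than `max_changes` were accepted in the window."""

    def __init__(self, min_delay_s, max_changes, window_s):
        self.min_delay_s = min_delay_s
        self.max_changes = max_changes
        self.window_s = window_s
        self._accepted = deque()  # timestamps of accepted changes, oldest first

    def allow(self, now):
        # Forget accepted changes that have left the sliding window.
        while self._accepted and now - self._accepted[0] >= self.window_s:
            self._accepted.popleft()
        # Enforce the delay between two consecutive changes.
        if self._accepted and now - self._accepted[-1] < self.min_delay_s:
            return False
        # Enforce the cap on changes within the window.
        if len(self._accepted) >= self.max_changes:
            return False
        self._accepted.append(now)
        return True


# At most 2 changes per hour, at least 60 seconds apart.
guard = StrategyChangeGuard(min_delay_s=60, max_changes=2, window_s=3600)
decisions = [guard.allow(t) for t in (0, 30, 60, 200, 3600)]
```

Rejected requests are not recorded, so an attacker hammering the guard does not extend the lockout for legitimate changes.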