> Probably via a Zanzibar-based system ...
> ... a search index that respects access control
This is exactly the part I want to understand. How are you modifying your search index so that it respects access control?
There are some ways I can think of, but I want to learn more from others about how they are doing it:
* each object stores metadata listing which access groups can read it; at query time, I first fetch the groups the user belongs to and send them as a filter in the search query
* fetch all matching objects, hope the list is not huge, check at run time whether the user can access each item, and remove the ones they cannot
* ...
You either compute access at query time, which might be costly, or you pre-compute it at write time, but then you need to keep at least two data sources in sync with the index: objects (who can access what can change at the object level) and groups (a group can gain or lose permissions).
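To make the trade-off concrete, here is a minimal in-memory sketch of the two options listed above. Everything in it is hypothetical (`Doc`, `INDEX`, `USER_GROUPS`, `can_access`, and the substring match standing in for a real search engine); it only illustrates where the access check happens, not how any particular system does it.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    text: str
    # Pre-computed at write time: groups allowed to read this document.
    allowed_groups: set[str] = field(default_factory=set)


# Hypothetical stand-ins for the search index and the group membership store.
INDEX = [
    Doc("d1", "quarterly report", {"finance"}),
    Doc("d2", "quarterly roadmap", {"eng"}),
    Doc("d3", "team lunch menu", {"eng", "finance"}),
]
USER_GROUPS = {"alice": {"eng"}, "bob": {"finance"}}


def search_prefiltered(term: str, user: str) -> list[str]:
    """Option 1: send the user's groups with the query and filter in the index."""
    groups = USER_GROUPS.get(user, set())
    return [d.doc_id for d in INDEX if term in d.text and d.allowed_groups & groups]


def can_access(user: str, doc: Doc) -> bool:
    """Stand-in for a per-object check against the access control service."""
    return bool(doc.allowed_groups & USER_GROUPS.get(user, set()))


def search_postfiltered(term: str, user: str) -> list[str]:
    """Option 2: fetch every match first, then drop what the user cannot see."""
    matches = [d for d in INDEX if term in d.text]  # may be very large
    return [d.doc_id for d in matches if can_access(user, d)]
```

Both functions return the same results here. The difference is cost and freshness: option 1 depends on `allowed_groups` in the index being kept up to date, while option 2 does one access check per match at query time.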
One approach is to have the centralized service answer a broader question: given this user, what rules determine whether a document is accessible to them? The service returns a set of rules, and you embed those restrictions in your query.
An example access service response would be: this user can access data from groups they are part of + documents for which a share exists towards this user + documents for which a share exists to any of the user's groups.
This is not exactly the same as the first option you described, because instead of storing access controls in the index data, you use the available metadata + the rules from the access control service.
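A rough sketch of that flow, under stated assumptions: `fetch_rules` stands in for the call to the access control service, and the metadata fields `owner_group`, `shared_with_users`, and `shared_with_groups` are made-up names for whatever the index already stores. A real search engine would receive the rules as an OR filter in the query; here the filter is applied to an in-memory list to keep the example runnable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    field: str         # metadata field that already exists in the index
    values: frozenset  # a document matches if the field contains any of these


def fetch_rules(user: str) -> list[Rule]:
    """Hypothetical access service call: returns rules, not per-document answers."""
    groups = {"alice": {"eng"}, "bob": {"finance"}}.get(user, set())
    return [
        Rule("owner_group", frozenset(groups)),         # data from the user's groups
        Rule("shared_with_users", frozenset({user})),   # shares towards this user
        Rule("shared_with_groups", frozenset(groups)),  # shares to the user's groups
    ]


# Hypothetical index contents; only ordinary metadata, no per-user ACL expansion.
DOCS = [
    {"id": "d1", "text": "quarterly report", "owner_group": ["finance"],
     "shared_with_users": ["alice"], "shared_with_groups": []},
    {"id": "d2", "text": "quarterly roadmap", "owner_group": ["eng"],
     "shared_with_users": [], "shared_with_groups": []},
    {"id": "d3", "text": "quarterly budget", "owner_group": ["finance"],
     "shared_with_users": [], "shared_with_groups": []},
]


def matches_rules(doc: dict, rules: list[Rule]) -> bool:
    """A document is visible if at least one rule matches (logical OR)."""
    return any(rule.values & set(doc.get(rule.field, [])) for rule in rules)


def search(term: str, user: str) -> list[str]:
    rules = fetch_rules(user)  # one call per query, independent of result size
    return [d["id"] for d in DOCS if term in d["text"] and matches_rules(d, rules)]
```

The point of the design is that the access service is called once per query rather than once per matching document, and the index only needs metadata it would likely carry anyway.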