How performant are combined indexes in a query?

I saw this query in "Getting started with FQL, Fauna’s native query language - part 2" and thought it was really neat because it is so simple to write:

Map(
  Paginate(
    Difference(
      Match(Index("all_Planets_by_type"), "TERRESTRIAL"),
      Match(Index("all_Planets_by_color"), "BLUE"),
      Match(Index("all_Planets_by_color"), "RED")
    )
  ),
  Lambda("planetRef", Get(Var("planetRef")))
)

I was wondering how performant this is when the planets collection has 1 million documents. I already know about the TTL and history_days of indexes; what I am not sure about is how combining multiple indexes in one query like this works.

Performance of index operations varies with the size of the index and the cardinality of the searched terms, not to mention the processing load in the service. For example, searching an index with 10 entries is notably faster than searching an index with 1M entries where the cardinality is similar for all terms. In an index with 1M entries where only one entry has the term blue and 999,999 have the term red, searching for blue is notably faster than searching for red.

In your sample query, assuming that the cardinality of terrestrial, blue, and red is similar, three index lookups should take about three times as long as one index lookup. That should be no big deal. If there is a problem, it is the number of entries in the result set of each index lookup that Difference has to evaluate.

If you Paginate(Match(Index("all_Planets_by_color"), "RED")), and there are 1M entries in the result set, Paginate stops evaluating entries when it has filled a page (default 64). In your query, Difference (inside the Paginate) must evaluate all 1M entries.
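To make the contrast concrete, here is a toy Python model, not Fauna's actual implementation. It assumes, as described above, that Paginate stops after one page while Difference has to look at every entry of every input. The names `CountingIndex`, `paginate`, and `difference` are made up for illustration, and the index size is scaled down from 1M to 100,000.

```python
from itertools import islice

N = 100_000  # scaled down from the 1M entries in the example


class CountingIndex:
    """A sorted list of refs that counts how many entries get read."""

    def __init__(self, entries):
        self.entries = list(entries)
        self.reads = 0

    def __iter__(self):
        for entry in self.entries:
            self.reads += 1
            yield entry


def paginate(source, size=64):
    # Stops pulling entries as soon as the page is full.
    return list(islice(source, size))


def difference(first, *others):
    # Assumption: every entry of every input has to be looked at.
    excluded = set()
    for other in others:
        excluded.update(other)
    return [entry for entry in first if entry not in excluded]


# Case 1: paginate a single large match.
red_only = CountingIndex(range(N))
paginate(red_only)
single_reads = red_only.reads

# Case 2: paginate a difference of three matches.
terrestrial = CountingIndex(range(N))
blue = CountingIndex(range(0, N, 2))
red = CountingIndex(range(0, N, 3))
page = paginate(difference(terrestrial, blue, red))
combined_reads = terrestrial.reads + blue.reads + red.reads

print(single_reads)    # 64
print(combined_reads)  # 183334
```

In this model the page size is the same in both cases, but the second case reads every entry of all three inputs before the first page can be returned.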

When dealing with large indexes, the rule of thumb is to reduce the matching subsets as much as possible before performing potentially expensive operations on them. Sometimes, that means placing Paginate inside the Difference, which can distort the results because Difference then only sees one page of each set rather than the whole set.
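A small Python sketch of that distortion, again as a toy model rather than real FQL: the lists stand in for sorted index entries, and `paginate` and `difference` are hypothetical helpers that mimic taking the first page and subtracting sets.

```python
def paginate(entries, size=64):
    """Return only the first page of a sorted list of refs."""
    return entries[:size]


def difference(first, *others):
    """Entries of `first` that appear in none of `others`."""
    excluded = set().union(*others)
    return [entry for entry in first if entry not in excluded]


terrestrial = [4, 5, 6, 7, 8, 9]
blue = [1, 2, 3, 4, 5, 6]

# Paginate outside: subtract the full sets, then take one page.
correct = paginate(difference(terrestrial, blue), size=3)

# Paginate inside: subtract one page from another page.
distorted = difference(paginate(terrestrial, size=3), paginate(blue, size=3))

print(correct)    # [7, 8, 9]
print(distorted)  # [4, 5, 6]
```

With Paginate on the inside, the first page of blue (1, 2, 3) never overlaps the first page of terrestrial (4, 5, 6), so three blue planets end up in the result even though less data was processed.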

If the all_Planets indexes only have entries for planets in our solar system, you have very small indexes, so the number of entries involved in Difference processing is small and not a problem.
