@wallslide’s answer provides the solution you should be looking for. I just want to expand on what he said here:
Agreed, it should be pretty rare to ONLY use Filter
. Like Wallslide said, a full table scan for this query is going to be very inefficient.
The docs page you link to tries to help you find the balance between creating an index for everything and using Filter
to fill in the gaps.
So you can create a single Index to look up the result in O(1) time.
CreateIndex({
name: "my_collection_by_texture_background-color",
terms: [
{ field: ["data", "texture"] },
{ field: ["data", "background-color"] }
]
}
// and use it like
Paginate(Match(
Index("my_collection_by_texture_background"),
"Solid",
"Terracotta"
))
Okay, so now you have a query for matching by texture and background. What if you also need to query just by texture? Now what if you also need to query just by background? What if you need to also sort the values?
Indexes provide the most efficient way to read data, so it is tempting to naively create another Index that perfectly matches every possible query pattern. While that’s fine if you only have a few query patterns, your Write cost and latency can suffer if you start adding dozens of indexes. At some point, you will want to find a balance between adding more indexes and falling back on Filter
, and the subject docs page tries to help demonstrate one way to do that.
Fauna doesn’t use a query planner, because we have a consumption-based pricing model and the black box of a query-planner would make predicting costs impossible. YOU are the query planner*. So the goal is then to take a set of query patterns, choose which indexes can be used to narrow down the data set most efficiently, and use Filter
to complete the query (if you are not lucky enough that your chosen indexes are sufficient by themselves).
Example
In your case, suppose that texture and background-color are very common filter requirements, but not necessarily at the same time. You might want to have indexes for each
CreateIndex({
name: "my_collection_by_texture",
terms: [
{ field: ["data", "texture"] },
]
}
CreateIndex({
name: "my_collection_by_background-color",
terms: [
{ field: ["data", "background-color"] }
]
}
Let’s assume texture is typically more specific than background-color: in that case, you might choose to only use the texture index here:
// Map(
Paginate(
Filter(
Match(Index("my_collection_by_texture_background"), "Solid"),
Lambda(
"ref",
Equals("Terracotta", Select(["data", "background-color"], Get(Var("ref"))))
)
)
),
// Lambda("ref", Get(Var("ref")))
// )
Other Notes on Filter
Filter
inside of Paginate is “lazy”. That is, this query won’t necessarily scan the entire set (index match on texture). The query will iterate through the set, until it can satisfy the page size requested (default = 64). That means two things:
- If there is a lot of overlap between the index set and what you are filtering for (i.e. not many documents rejected by the filter), then the read cost approaches that of reading a single index.
- If there is little overlap (i.e. most documents are rejected by the filter), then the query will have to read a lot of documents, up to the entire set, in order to fulfill the page size requested or at least as much as it can.