There is definitely a learning curve for FQL. I feel your pain, trying to pick that up at the same time as GraphQL, and neither Fauna nor GraphQL likely feel very native to C#. Add to that the fact that the recommended way to do text search is to use an undocumented function…
One of the first things you are trying is search, which is probably one of the hardest things to get right. But I get that it’s absolutely critical to prove that you can do it for your application before becoming too invested.
I know you said you went through the links, but please take another look at the StackOverflow answer. It begins by showing an example with `Filter`, similar to what you have done, but it goes on to recognize that this is absolutely not feasible for larger data sets. It’s too expensive and also unpredictable to `Paginate` over several pages: for example, you might get different numbers of results for each page, since `Filter` is executed after retrieving each page. And the big thing, which you noticed, is that if you `Paginate` and then `Filter`, it costs read ops for each and every result.
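To make the cost concrete, here is the shape of query that runs into this (a sketch in FQL shell syntax; the `tasks` collection and `name` field are made-up names):

```
Filter(
  // Paginate retrieves a page of refs first...
  Paginate(Documents(Collection("tasks")), { size: 100 }),
  // ...then Filter runs Get on each one, so you pay a read op per
  // document retrieved, not per document that actually matches.
  Lambda(
    "ref",
    ContainsStr(Select(["data", "name"], Get(Var("ref"))), "fire")
  )
)
```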
The answer to this is writing your own index, which can use bindings in some way to provide useful search results. For an SQL `WHERE` clause, the database is going to try to be clever and plan out as efficient a query as possible before defaulting to scanning the entire table. Fauna, on the other hand, does exactly what you tell it to do. So when you told it to `Filter`, that’s what it did: your query instructed Fauna to scan the whole collection.
This is good and bad. Bad, because it’s more work for us, learning to build specific indexes and being careful with our queries. Good, because every query is deterministic: it’s always possible to estimate the cost of a query beforehand (if you also understand what your data looks like). It also means that it is harder to do something expensive by accident. For example, the default page size is 64, so you have to be explicit about requesting more. There is no query planner that will decide to scan the entire table on your behalf.
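For example (a sketch; the collection name is made up), nothing here is implicit:

```
// Returns at most 64 refs: the default page size.
Paginate(Documents(Collection("tasks")))

// You must ask explicitly for a bigger page.
Paginate(Documents(Collection("tasks")), { size: 500 })
```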
I won’t rehash everything in the SO answer and other links. But I’ll try to hit the highlights.
- Setting array fields as an index term creates an index entry for each element of the array.
- Index bindings can compute new fields based on the Document’s data.
- Index bindings can create array fields.
- Different bindings can provide different kinds of search.
- `NGram` can be used to create bindings for partial-word search or fuzzy search.
- Don’t give up on the examples.
From the docs:

> When a document has a field containing an array of items, and that field is indexed, Fauna creates distinct index entries for each item in the field’s array. That makes it easy to search for any item in the array.
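For example (a sketch; the collection, field, and index names are assumptions), an index whose term is an array field:

```
// Given documents shaped like:
//   { data: { name: "task 1", tags: ["urgent", "backend"] } }
CreateIndex({
  name: "tasks_by_tag",
  source: Collection("tasks"),
  terms: [{ field: ["data", "tags"] }]
})

// Fauna wrote one index entry per tag, so either term finds the document:
Paginate(Match(Index("tasks_by_tag"), "urgent"))
```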
You can create your own computed fields and index them using “bindings”. The docs have several different examples of how they can be used (not just in full-text/fuzzy search).
> The `source` field defines one or more collections that should be indexed. It is also used to define zero or more fields that have associated binding functions. The binding functions compute the value for the specified field while the document is being indexed.
**Using bindings for search**
Slightly simplified from the SO answer, here is a binding that lets you do a full-word search (the binding lowercases each word, so lowercase your search term as well). The `FindStrRegex` function is key:
```
Map(
  FindStrRegex(Select(["data", "name"], Var("task_document")), "[^\\ ]+"),
  Lambda("result", LowerCase(Select(["data"], Var("result"))))
)
```
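Putting that binding into a complete index looks roughly like this (a sketch based on the SO answer; the `tasks` collection and `tasks_by_words` index names are assumptions):

```
CreateIndex({
  name: "tasks_by_words",
  source: [
    {
      collection: Collection("tasks"),
      fields: {
        words: Query(
          Lambda(
            "task_document",
            Map(
              FindStrRegex(
                Select(["data", "name"], Var("task_document")),
                "[^\\ ]+"
              ),
              Lambda("result", LowerCase(Select(["data"], Var("result"))))
            )
          )
        )
      }
    }
  ],
  terms: [{ binding: "words" }]
})

// Searching is then a cheap index lookup. Lowercase the query term
// yourself, since the binding lowercased the indexed words:
Paginate(Match(Index("tasks_by_words"), LowerCase("Fire")))
```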
`NGram` is what you want for partial-word ("exact contains") search or fuzzy search. The SO answer covers using `NGram` for both of these cases.
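As a rough illustration (`NGram` is undocumented, so treat the exact signature and behavior as an assumption based on the SO answer), a trigram binding for "contains" search could look like:

```
// NGram(value, 3, 3) emits the 3-character substrings of value,
// e.g. "fire" -> ["fir", "ire"]. Indexing those lets you Match any
// 3-character substring of the field; longer search terms require
// combining trigrams, which the SO answer shows how to do.
ngrams: Query(
  Lambda(
    "task_document",
    NGram(LowerCase(Select(["data", "name"], Var("task_document"))), 3, 3)
  )
)
```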
Don’t give up on the examples. The `Obj` functions make things tedious, but the FQL functions are otherwise directly translatable to the C# driver.