Get documents if field doesn't exists

Folks, I’ve been trying to get all fields for my messages that haven’t been read yet (so I can display how many messages are new).
I’m using this query:

Sum(
  Map(
    Paginate(Documents(Collection('messages'))),
    Lambda((x) =>  If(ContainsField('readAt', Select('data', Get(x))), 0, 1))
  )
)

It does the trick, but isn’t there any better solution to do so?
As you can tell, I’m new to FQL.

Thanks in advance!

Since the values you are summing are either 0 or 1, you could perhaps filter and use the Count function. The most efficient way to filter would be with an Index. Filter is an option, but would require a full scan of the Collection (similar to your example).

We can create an Index that sorts your messages Collection by the readAt field. For those Documents that do not have a readAt field, the value will be null. null values are sorted last, per Fauna’s type precedence. We can use that to only search for the null values.

IMPORTANT: this will work because our Index has defined multiple values. If all of an Index’s entries terms or values resolve to null, then no entry will be created. This is one reason we also add the Document’s Ref as a value.

Example Index

CreateIndex({
  name: "messages__readAt_asc",
  source: Collection("messages"),
  // no terms required
  values: [
    { field: ["data", "readAt"] },
    { field: ["ref"] }
  ]
})

Using the Index

Now we can query for all messages without a readAt field. We use the Range function to start reading the Index at null values.

Paginate(
  Range(
    Match(Index("messages__readAt")), 
    null, 
    []
  )
)

// example result
{
  data: [
    [null, Ref(Collection("messages"), "328373908847722562")],
    [null, Ref(Collection("messages"), "328373912910954562")]
  ]
}

If we want to simply know how many there are, then you can use Count.

Count(
  Range(
    Match(Index("messages__readAt")), 
    null, 
    []
  )
)

// example result
2

Additional Notes

It is recommended that you set the history_days setting for your messages Collection as low as possible, if not all the way to 0 (warning: setting to null means infinite history).

I am going to guess that every message starts in an unread state, then is later updated with a readAt field. This means that history is going to build up, which could mean your queries may take longer and read may be more expensive if you don’t clean up the history.

Since there is history for messages with readAt: null, the Index has to scan through all of them to find the active ones. This is intentional, because Fauna’s temporality features allow you to query the Index at any point in time. However, if you don’t need to query a snapshot in the past, it will not help you to keep the history around.

2 Likes

Note that the documentation includes a cookbook recipe for discovering documents that are missing fields by using Difference: Missing field value :: Fauna Documentation

If you take @ptpaterson 's advice to set readAt to a timestamp, each readAt value would almost certainly be distinct, so the Difference approach would be unhelpful, at best.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.