How often are you polling?
Wiping the entire collection and then re-uploading everything can kill you in write-operation costs. That said, you would have to work hard to hit the free tier limit of 50k write ops/day, especially if the DB is otherwise read-only.
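For contrast, the naive wipe-and-re-upload looks something like this (a sketch; the collection name 'Item' and the shape of pollData are assumptions), and it costs one write op per deleted document plus one per re-created document:

// Delete everything currently in the collection (one write op each)...
q.Map(
  q.Paginate(q.Documents(q.Collection('Item')), { size: 100000 }),
  ref => q.Delete(ref)
)
// ...then re-create a document for every polled item (one more write op each).
q.Map(q.Var('pollData'), item => q.Create(q.Collection('Item'), { data: item }))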
Overly-complicated alternative
If:
- you neeeeeeeed to reduce write ops, or
- you have other collections that link to this data and you do not want to destroy the existing references,

then you can efficiently add and remove new and old items using the Difference function. Compare some unique identifier in your polled data against an index of those ids in the Fauna data to determine which ids are new (or stale, for that matter):
Difference(Var('array_of_polled_ids'), Var('array_of_existing_ids'))
Reverse the arguments to get the ids that exist in Fauna but not in the polled data; those are the documents that need to be deleted:
Difference(Var('array_of_existing_ids'), Var('array_of_polled_ids'))
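Acting on that set might look like this (a sketch; 'itemsById' is a hypothetical index whose terms are [['data', 'id']], so it resolves an id back to its document):

// Sketch: delete every document whose id no longer appears in the polled data.
q.Foreach(
  q.Difference(q.Var('array_of_existing_ids'), q.Var('array_of_polled_ids')),
  id => q.Delete(q.Select('ref', q.Get(q.Match(q.Index('itemsById'), id))))
)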
Use Intersection
To update existing data, if the polled data contains changed documents rather than just new ones:
Intersection(Var('array_of_polled_ids'), Var('array_of_existing_ids'))
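A sketch of acting on that set, again assuming the hypothetical 'itemsById' index from above (looking up the matching polled item is elided for brevity):

// Sketch: for each id present on both sides, fetch the existing
// document's ref and overwrite its data with the polled version.
q.Foreach(
  q.Intersection(q.Var('array_of_polled_ids'), q.Var('array_of_existing_ids')),
  id => q.Update(
    q.Select('ref', q.Get(q.Match(q.Index('itemsById'), id))),
    { data: { /* fields from the matching polled item */ } }
  )
)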
Obtaining the array_of_existing_ids is one read op, and reading each of the Sets above is just one more read op each.
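The example below assumes an index like idsOfItems that returns only the id values, so the whole id list comes back in a single paginated read. A sketch of how you might define it (the 'id' field name inside data is an assumption):

// Sketch: an index over 'Item' whose values are each document's data.id.
q.CreateIndex({
  name: 'idsOfItems',
  source: q.Collection('Item'),
  values: [{ field: ['data', 'id'] }]
})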
Overly-complicated Example
const faunadb = require('faunadb')
const q = faunadb.query

q.CreateFunction({
  name: 'createUniqueItems',
  body: q.Query(
    q.Lambda(
      ['pollData'],
      q.Let(
        {
          // Pull the unique id out of each polled item
          pollDataIds: q.Map(q.Var('pollData'), item => q.Select('id', item)),
          // One paginated read of the idsOfItems index yields every existing id
          existingIds: q.Select(
            'data',
            q.Paginate(
              q.Match(q.Index('idsOfItems')),
              { size: 100000 }
            )
          ),
          // Ids present in the polled data but not yet in Fauna
          newIds: q.Difference(q.Var('pollDataIds'), q.Var('existingIds')),
          // Keep only the polled items whose id is new
          newData: q.Filter(q.Var('pollData'), item =>
            q.Any(
              q.Map(q.Var('newIds'), id =>
                q.Equals(id, q.Select('id', item))
              )
            )
          ),
          // One write op per genuinely new document
          newItems: q.Map(q.Var('newData'), itemData =>
            q.Create(q.Collection('Item'), { data: itemData })
          )
        },
        q.Var('newItems')
      )
    )
  )
})
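Registering and calling it might look like this (a sketch; the client secret and the shape of pollData are assumptions):

// Register the UDF once with the CreateFunction expression above, then
// call it on every poll. pollData is assumed to be an array of
// { id, ... } objects; wrapping it in [ ] passes the whole array as
// the single 'pollData' argument.
const client = new faunadb.Client({ secret: 'YOUR_FAUNA_SECRET' })

client
  .query(q.Call(q.Function('createUniqueItems'), [pollData]))
  .then(newItems => console.log('created', newItems.length, 'new documents'))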
Cheers!