As part of research into moving part of our traffic to Fauna, I am looking into our notification feed.
Just to make sure we are on the same page: a notification feed is the screen in many apps that shows a list of the top X events that happened, with a seen/not-seen indicator on each one.
Our current implementation using Postgres is a table like this one:
The notification_id column is a foreign key to a table containing the text and other info about the notification. So, for example, to display the notification feed for user_id 19456, we select from the table where user_id=19456, order by row_id desc, limit 15, and join with the notifications table on notification_id.
The notification feed supports these main features:
GetFeed(userId) - Gets the top 15 notifications for a specific user id. To start with, we support fetching only the last 15; fetching more won't be possible.
GetCountUnSeen(userId) - Gets the number of unseen notifications, across all of the user's notifications, not just the top 15.
SetSeen(userId, timestamp) - Sets is_seen=true for all the rows of the userId that are at or before the given timestamp.
FanoutNotification(notificationId, [userIds]) - Inserts new rows of (notificationId, userId) for each user in the array. The userIds array can be large, for example 400,000 entries.
Can those operations be built with Fauna and perform well? GetFeed() should return within around 500 ms since it's called from the UI. FanoutNotification() can be slower; it's no problem if fanning out a notification takes 10 minutes.
Seems to me that this is a great use case for FaunaDB: joins + OLTP-style fast updates + big (or slightly bigger) data.
Make an index with values to sort on and terms to filter on the user id.
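As a sketch (the index name 'feed_by_user' and the field names 'userId' and 'createdAt' are assumptions, not from your schema), such an index could look like:

CreateIndex({
  name: 'feed_by_user',
  source: Collection('notifications_feed'),
  terms: [{ field: ['data', 'userId'] }],
  values: [
    { field: ['data', 'createdAt'], reverse: true },
    { field: ['ref'] }
  ]
})

With reverse: true on the first value the index returns newest-first, so paginating with size 15 gives you the latest 15 entries directly.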
Make sure you also put the notification ref in the values of the aforementioned index, then Map() over the results of that index and use Get() on those notification references. You could also put the 'row_id' (which in FaunaDB would just be the 'ref' of this document) in the index values, Get() that to fetch the whole document, then Select() the notification reference and Get() that as well (this will be slightly pricier). Tip: use Let() to structure your query (see examples here) for such joins; you can easily do complex joins in one query.
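A hedged sketch of GetFeed, assuming a hypothetical index 'feed_by_user' with 'userId' as term and [createdAt (descending), ref] as values, and assuming the feed document stores the notification reference under data.notificationId:

Map(
  Paginate(Match(Index('feed_by_user'), 'someUserId'), { size: 15 }),
  Lambda(
    ['createdAt', 'feedRef'],
    Let(
      { feedDoc: Get(Var('feedRef')) },
      {
        feed: Var('feedDoc'),
        notification: Get(Select(['data', 'notificationId'], Var('feedDoc')))
      }
    )
  )
)

This returns each feed row joined with its notification document in a single query.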
Create an index with the terms 'userId' and the seen flag, and use Count() on it.
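Assuming an index named 'notifications_by_user_and_seen' with those two terms (names hypothetical), GetCountUnSeen could be as simple as:

Count(Match(Index('notifications_by_user_and_seen'), 'someUserId', false))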
Complex transactions are also possible in FaunaDB. One thing you might run into is pagination: if the number of 'seen' rows is greater than 100,000 (the biggest page), you'll have to check whether there is an after cursor and update subsequent pages. I don't consider this a limitation personally; it's generally a good idea not to let users run gigantic update statements, as these might block other transactions.
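In pseudo FQL (index name hypothetical), the cursor flow looks like this: request a page, and if the result contains an 'after' cursor, pass it into the next Paginate call from your driver code until no cursor is returned.

Paginate(Match(Index('notifications_by_user_and_seen'), 'someuser', false), { size: 100000 })

Paginate(Match(Index('notifications_by_user_and_seen'), 'someuser', false), { size: 100000, after: cursorFromPreviousPage })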
Just take care of the request size limit, which is 1 MB (see here); you might have to break up an array into a few requests.
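For FanoutNotification, one hedged sketch is to split the userIds array into chunks on the client (to stay under the 1 MB request limit) and send one query per chunk. The collection and field names below, the placeholder user ids, and the notification document id '1234' are all assumptions for illustration:

Map(
  ['userId1', 'userId2', 'userId3'],
  Lambda(
    'userId',
    Create(Collection('notifications_feed'), {
      data: {
        userId: Var('userId'),
        notificationId: Ref(Collection('notifications'), '1234'),
        isSeen: false,
        createdAt: Now()
      }
    })
  )
)

Each chunk is one transaction, so a 400,000-user fanout becomes a series of these requests, which fits your relaxed 10-minute budget.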
@databrecht thanks for the detailed answer! Regarding SetSeen(): if a user has 100,000 rows that belong to them in the notification_feed table (most of them already seen=true) and we call SetSeen(), won't it incur 100,000 updates when all we need is the last couple of rows set to true?
Indexes are your friend (disclaimer: I didn't test the queries since I have to leave my desk in about 5 minutes). Given that you have an index with two terms (user and seen), something in the vein of:
Map(
  Paginate(Match(Index('notifications_by_user_and_seen'), 'someuser', false)),
  Lambda(['notificationRef'], Update(Var('notificationRef'), { data: { isSeen: true } }))
)
You could also update the info that is linked to the notification by first mapping over the notification refs, getting them, getting the linked info, and so on, updating whatever you want. It's much like how you would do it in a regular programming language, since FQL is procedural, not declarative. FQL is literally made for these kinds of more complex transactions (hence why it is procedural). Forget the idea that a database, because it's scalable, has to be very limited in how you can query it; FaunaDB's FQL is quite powerful.
If you only want to get those between a certain range (e.g. the time the notification came in), you add the creation time to the values of your index and wrap Range() around the Match() (values are for three things: range queries, sorting, and the index's result values).
Map(
  Paginate(
    Range(
      Match(Index('notifications_by_user_and_seen'), 'someuser', false),
      Time(...), Time(...)
    )
  ),
  Lambda(['createdAt', 'notificationRef'], Update(Var('notificationRef'), { data: { isSeen: true } }))
)
Of course, I can't write it for you, so I'll only explain it in pseudo FQL code for now (I might have missed a bracket here and there). If you do get blocked on how to do something, let us know.
So you probably want to read up on these specifically:
@databrecht I am having a hard time figuring out how to create the index for GetFeed(userId). I need an index that lets me get the last 15 notifications for a user, ordered by their creation time descending.
So 'terms' should include the userId, but how should I choose 'values' so that I can sort by creation date descending? Should I add a new field 'createdAt' to the document? (The existing ts property can change when updating a row, and I would not want to sort on it.)
Just for reference, this is what a document from the notifications_feed collection looks like: