Best practice for SaaS multi-tenancy

danbars · September 15, 2020, 8:02pm

Hi, I’m new to Fauna.
I’m building a DB for a SaaS. Users are authenticated outside of Fauna (Firebase). All requests are made from client to server using REST, and server to Fauna using FQL. The server has access to userId (parsed from authorization token) which is not represented as entity in Fauna.
I want to be able to define ownerId for each record, and then make sure that each user is only reading/updating/creating their own data.

Create - I just added ownerId field by the server into each doc
Read - I’ve created indexes that always filters by ownerId in addition to whatever else I needed
Update - this is where I’m stuck - let’s say the client is trying to update entity with id 123. They send the revised JSON to the server. The server now wants to update entity with id 123, but only if ownerId of that entity is the user’s id. In SQL I’d do UPDATE foo SET .... WHERE fooId='123' and ownerId='xxx' ....
What’s the best practice to achieve this with Fauna if users are managed externally?

Note: I’ve read about multi-tenancy DB and thought perhaps I should create a child DB for each user. But then I realized that child DBs do not inherit the schema of their parent DB, so it seem to be not practical to maintain a separate schema for each customer.

Thanks!

Luigi_Servini · September 16, 2020, 7:10am

Hi @danbars and welcome!

Let suppose you have a document like this:

data:
   { fooId: '123',
     ownerId: 'user1',
     orderId: 12345,
     description: 'This is my first order' } }

and fooId is unique across the collection, you can create an index like this:

CreateIndex({name:'refByDocId',source:Collection('SaaS'),unique:true,terms:[{field:['data','fooId']}]})

and an UDF like below:

CreateFunction({
  name: "updateSaaSDoc",
  role: null,
  body: Query(
    Lambda(['fooId','ownerId','payload'],
      Let(
        {
          doc: Get(Match('refByDocId',Var('fooId'))),
          ref: Select(['ref'],Var('doc')),
          owner: Select(['data','ownerId'],Var('doc'))
        
        },
        If(Equals(Var('owner'),Var('ownerId')),Update(Var('ref'),{data:Var('payload')}),"Owner mismatch")
      )
    )
  )
}
)

At this point, you can call the function and passing three parameters:

fooId
ownerId
updated document

Call(Function('updateSaaSDoc'),['123','user1',{description:"order updated"}])
{ ref: Ref(Collection("SaaS"), "276802994553487873"),
  ts: 1600239463950000,
  data:
   { fooId: '123',
     ownerId: 'user1',
     orderId: 12345,
     description: 'order updated' } }

By calling the function with owner mismatch, you get:

Call(Function('updateSaaSDoc'),['123','user2',{description:"order updated"}])
'Owner mismatch'

Of course, this is a very basic version of what you can do/check with FQL. For instance, you can add document existence by adding:

Count(Match('refByDocId','123'))
1

In that case, document exist (and you have 1 since the Id is unique across the collection, or

Count(Match('refByDocId','1234'))
0

the document does not exist.

In case there is something wrong in my way of thinking, let me know and if it is possible, provide more details.

Luigi

danbars · September 16, 2020, 7:26am

Hi @Luigi_Servini, thank you very much for the detailed reply. Very helpful!
Using UDF to fetch & update was the missing part for me.

Followup question -
Is this function atomic? Assuming I’d like to use the same technique to prevent concurrent modification. I’d add revision field to the doc, then the UDF could be

CreateFunction({
  name: "updateSaaSDoc",
  role: null,
  body: Query(
    Lambda(['fooId','ownerId','clientRevision','payload'],
      Let(
        {
          doc: Get(Match('refByDocId',Var('fooId'))),
          ref: Select(['ref'],Var('doc')),
          owner: Select(['data','ownerId'],Var('doc')),
          revision: Select(['data','revision'],Var('doc'))
        },
        If(and(Equals(Var('owner'),Var('ownerId')), 
               Equals(Var('clientRevision'),Var('revision'))),
           Update(Var('ref'),{data:Var('payload')}),
           "Owner or revision mismatch")
      )
    )
  )
}
)

Will this work? I mean will get&update be atomic?

Note: of course I have to update the revision to +1 before I update

Luigi_Servini · September 16, 2020, 9:05am

Hi @danbars,

At the time a query starts, Fauna get a consistent snapshot of the database and execute the query against that snapshot. If something changes in the meanwhile, the whole transaction is aborted and restarted.
Getting a client revision might be a good idea in any case, and you can modify the function and get all in one shot:

CreateFunction({
  name: "updateSaaSDoc",
  role: null,
  body: Query(
    Lambda(['fooId','ownerId','clientRevision','payload'],
      Let(
        {
          doc: Get(Match('refByDocId',Var('fooId'))),
          ref: Select(['ref'],Var('doc')),
          owner: Select(['data','ownerId'],Var('doc')),
          revision: Select(['data','revision'],Var('doc'))
        },
        If(and(Equals(Var('owner'),Var('ownerId')), 
               Equals(Var('clientRevision'),Var('revision'))),
           Update(Var('ref'),{data:Merge(Var('payload'),Add(Var('revision'),1))}),
           "Owner or revision mismatch")
      )
    )
  )
}
)

hope this help.

Luigi

databrecht · September 16, 2020, 9:17am

Indeed, you can just as well create collections that have a ‘user key’. It’s a tradeoff. Some like to have data in different databases for each client because the shape of the data might be different for other clients or like the client separation. If that’s not something you need it’ll be easier to keep your data in one database. Especially if at some point there is a relation or a write that is ‘cross-client’. In the child DBs scenario, linking objects of different clients would be hard and iirc writes accross multiple databases is not possible which means such a write will be inconsistent and will have to be done in two steps.

Regarding passing ‘the owner’. Make sure to look into the Identity() function in case the user you are currently connected with would be the owner.

danbars · September 16, 2020, 9:53am

Thanks, that’s helpful

danbars · September 16, 2020, 12:32pm

thanks, @databrecht.
I think creating database per client is not the right way for me. Maybe for B2E SaaS this could make sense as enterprise clients have other requirements.
Calling this “multi tenancy” in Fauna docs was somewhat misleading (for me at least). I originally thought that each child database inherits the parent db schema, so I can define everything once, and get a guaranteed sandbox for each customer.
Having this would be a real killer feature because it would guarantee that there’s real separation between customers’ data, while I can still access everything from the parent DB if I ever need cross tenant data.
In other words, it’s like the “user key” solution, only built-in by the DB.

databrecht · September 16, 2020, 4:59pm

I agree, I opened a concern a while back that it’s missleading and I should take action and rewrite or extend that article. But I didn’t get to it yet

Topic		Replies	Views
Looking for advice on a schema Help pricing , schema	4	110	June 4, 2024
Multi-tenancy database Help	2	317	June 2, 2021
Does the multi-tenancy model support syncing schemas and aggregating across child databases? Help multi-tenancy	3	1038	November 9, 2020
Child database maintenance Help best-practices	6	408	January 5, 2022
Replicate db settings between tenants Help graphql , dashboard	7	535	January 11, 2022

Best practice for SaaS multi-tenancy

Related topics