Hi @tehcromic and welcome!
My response is two-fold: 1) you can use a custom primary key in Fauna to manage relations (with some work), but 2) you’ll want to consider whether it’s worth it from a performance/cost perspective. I will still describe what using a primary key looks like, and then compare it to what is currently more idiomatic.
Regardless of the current state of Fauna, you still have great feedback! I will move this topic to the Feature Requests category. There, you and other folks can vote on it to highlight your desire for the feature. The topic can continue to stand as a place of discussion and use cases for adding custom primary keys to Fauna.
Quick comparison
The document’s ID in Fauna is THE key for that document. When you Get a document with its ID (by calling Get(Ref(Collection("coll"), ID))), the database can go straight to your document in storage and fetch it for you. Reads in Fauna can be optimized for the fact that all IDs are 64-bit integers.
If I understand correctly, RethinkDB uses the primary key as a means to shard, index and store data. Provided the primary key, ReDB can go straight to the document in the primary index and fetch it. In this way, ReDB has the advantage of convenience, since the document itself is stored in the primary index.
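To make the difference concrete, here is a toy in-memory model in plain JavaScript. It is not Fauna or RethinkDB code: the `docsById` and `idByEmail` maps and the hop counter are made up for illustration and say nothing about real storage or billing. It only shows that a lookup by document ID takes one hop, while a lookup by a business key has to go through an index first.

```javascript
// Toy model only: hypothetical in-memory maps, not Fauna's storage engine.
const docsById = new Map([
  ["101", { id: "101", email: "me@fauna.com" }],
]);

// Hypothetical "primary index": maps a business key (email) to a document ID.
const idByEmail = new Map([["me@fauna.com", "101"]]);

let hops = 0;

// Direct lookup: the ID leads straight to the document.
function getById(id) {
  hops += 1; // fetch the document
  return docsById.get(id);
}

// Business-key lookup: consult the index first, then fetch the document.
function getByEmail(email) {
  hops += 1; // index lookup
  const id = idByEmail.get(email);
  return id === undefined ? undefined : getById(id);
}

hops = 0;
getById("101");
const directHops = hops;

hops = 0;
getByEmail("me@fauna.com");
const indexedHops = hops;

console.log({ directHops, indexedHops }); // { directHops: 1, indexedHops: 2 }
```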
How to model primary key/primary index in Fauna
You can create an Index in Fauna for whatever business ID you need. Granted, you do have to be more explicit about it.
With Fauna, you can store a primary key value in a relationship field and resolve that relationship using your “primary index”.
define indexes
CreateIndex({
  name: "users_primary",
  source: Collection("users"),
  terms: [{ field: ["data", "email"] }] // email as PK
})
CreateIndex({
  name: "cars_primary",
  source: Collection("cars"),
  terms: [{ field: ["data", "VIN"] }] // VIN as PK
})
CreateIndex({
  name: "cars_by_owner",
  source: Collection("cars"),
  terms: [{ field: ["data", "owner"] }] // foreign key
})
create some data
Create(Collection("users"), {
  data: {
    /* ... */
    email: "me@fauna.com" // PK
  }
})
Create(Collection("cars"), {
  data: {
    /* ... */
    VIN: "WBAFB3345YLH46720",
    owner: "me@fauna.com" // store user PK
  }
})
query for the data
// get owner of a car
// costs 3 Read Ops
Let(
  {
    car: Get(Match(Index("cars_primary"), "WBAFB3345YLH46720")), // 1 Read Op for reading the index
    owner_email: Select(["data", "owner"], Var("car"))
  },
  Get( // 1 Read Op for `Get`ing the document
    Match(Index("users_primary"), Var("owner_email")) // 1 Read Op for reading the index
  )
)
// get all cars owned by user
// costs 1 + N Read Ops
Let(
  {
    user_email: "me@fauna.com",
  },
  Map(
    Paginate(Match(Index("cars_by_owner"), Var("user_email"))), // 1 min Read Op for reading the Index
    Lambda("ref", Get(Var("ref"))) // 1 Read Op for each car document to `Get`
  )
)
Note that the example requires an extra layer of indirection when getting the owner of the car, since the owner’s email stored on the car must first be resolved through the users_primary index. It would be much more efficient to use the related document’s Ref, since that is a pointer directly to the document.
Alternative
It is indeed often helpful to reach for your data via a human-friendly value. My recommendation is then to:
- manage your relations in Fauna using document Refs and Indexes on the Refs
- add indexes on your business id(s) that enable user-friendly queries
- expand your query results using the efficient relations, the code for which does not need to handle the Fauna ids, just the fields in which they are stored.
We’ve already compromised somewhat in our example, since the Indexes return document Refs. But now consider if we stored the user Ref in the car, rather than just the user email.
Create(Collection("cars"), {
  data: {
    /* ... */
    VIN: "WBAFB3345YLH46720",
    owner: Ref(Collection("users"), "101") // store user Ref
  }
})
We save on read operations by avoiding a second index when we get the owner of a car:
// get owner of a car
// costs 2 Read Ops
Let(
  {
    car: Get(Match(Index("cars_primary"), "WBAFB3345YLH46720")), // 1 Read Op for reading the index
    owner: Select(["data", "owner"], Var("car"))
  },
  Get(Var("owner")) // 1 Read Op for `Get`ing the document
)
We add a small cost to fetch the user Ref when getting all cars for a given user email:
// get all cars owned by user
// costs 2 + N Read Ops
Let(
  {
    user: Get(Match(Index("users_primary"), "me@fauna.com")), // 1 Read Op for `Get`ing the document
    user_ref: Select("ref", Var("user"))
  },
  Map(
    Paginate(Match(Index("cars_by_owner"), Var("user_ref"))), // 1 min Read Op for reading the Index
    Lambda("ref", Get(Var("ref"))) // 1 Read Op for each car document to `Get`
  )
)
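The read-op counts quoted in the query comments can be tallied side by side. The sketch below is plain JavaScript, not FQL. The function name `estimatedReadOps` is made up, and the figures are copied from the comments in this post rather than measured against Fauna's billing, so treat them as rough estimates.

```javascript
// Figures copied from the query comments in this post; n is the number of
// car documents returned. These are estimates, not measurements.
function estimatedReadOps(n) {
  return {
    // owner stored as the user's email (custom primary key)
    emailKey: { ownerOfCar: 3, carsOfUser: 1 + n },
    // owner stored as the user's Ref
    refKey: { ownerOfCar: 2, carsOfUser: 2 + n },
  };
}

const ops = estimatedReadOps(5);
console.log(ops.emailKey); // { ownerOfCar: 3, carsOfUser: 6 }
console.log(ops.refKey);   // { ownerOfCar: 2, carsOfUser: 7 }
```

By these figures, storing the Ref saves one read op each time you go from a car to its owner, and costs one extra read op when you start from an email and list that user's cars. Which side wins depends on which of the two queries your application runs more often.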