Data Modelling for Simple App

I’m creating what I hope will be a ‘simple’, but performant and scalable app. Reading the recommended article on data modelling for product catalog tells me that getting the ‘performant and scalable’ part right may not be so simple, but that these objectives can be achieved with an appropriate data model.
This is my current schema:

type User {
  active: Boolean!
  username: String!
  description: String
  email: String
  mobile: String
  playedAs: [Player!]! @relation
}

type Player {
  active: Boolean!
  rank: Int!
  ranking: Ranking! @relation
  challenger: User @relation
}

type Ranking {
  active: Boolean!
  rankingname: String!
  rankingdesc: String
  player: Player @relation
}

type Mutation {
  createNewUser(active: Boolean!, username : String!, password : String!, description: String, email: String, mobile: String): loginResult! @resolver(name: "create_new_user")
  loginUser(username: String!, password: String!): loginResult! @resolver(name: "login_user")
}

type Query {
  allUserNames: [String!]! @resolver(name: "all_user_names")
  allPlayerUIDs: [String!]! @resolver(name: "all_player_uids")
  allPlayerRanks: [Int!]! @resolver(name: "all_player_ranks")
  allPlayerChallengerUIDs: [String!]! @resolver(name: "all_player_challenger_uids")
  allPlayers: [Player] @resolver(name: "all_players")
  allRankings: [Ranking] @resolver(name: "all_rankings")
  allUsers: [User] @resolver(name: "all_users")
  gotPlayersByRankingId (rankingid: String!): [Player] @resolver(name: "got_players_byrankingid")
  gotRankingIdsByPlayer (uid: String!): [String] @resolver(name: "got_rankings_byplayerid")
}

type loginResult @embedded
{
  token : String
  logginUser : User
}

I would like there to be an unlimited number of Users, Players and Rankings, and for each ranking to potentially contain up to 600K players (although this would be very rare and most rankings would contain fewer than 20 players, especially initially).
I would like to be able to query a list of UserRankings or UserPlayers and to be able to create a new UserRanking from a single click in the app (that and ranking re-sorting on a result are probably the most complex operations in the app). I have read the Fauna docs re: Data Modelling.
I’m attempting to avoid analysis paralysis but at the same time do not wish to oversimplify data modelling now and regret having done so later on.
Do the Fauna docs alone cover everything I need to understand and implement these business requirements effectively or do I need to read over and get to grips with the more complex nuances like these from StackOverflow?
Any comments, recommendations, requests for/links to more information etc. gratefully received.
Thanks …

In an attempt to further clarify my question above I would like to add the following:
Suppose I had a single ranking with a very large number of competitors (as above, say 600K) within a Player collection consisting of millions of players (let's say 10 million members of other rankings as well as the one in question), and that collection has a relation to the User collection (as per the schema). From a performance perspective, is the correct approach just to have an index on the Player collection, i.e. does the index alone resolve performance issues?
Or is there more I need to do in terms of the db design to forestall bottlenecks at a later date? Thanks …

Indexes can’t combine data from different documents into a single entry. So an index can’t look up a reference to another document, get that document and use its data in a calculation. If you are going to index your Player collection to generate a ranking, all the data you need to generate that ranking needs to be found in the Player documents.

If that won’t work for you, you could create a Collection like PlayerRankInfo that stores data from multiple Collections together in a single document, by ‘manually’ syncing the data that you write to Player documents (and others) to the PlayerRankInfo documents as well. Or you could create a cron job using an external service which runs a query that periodically generates the ranking information and writes it out to a Rank Collection that is then indexed.
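
As a rough illustration of the ‘manual’ syncing idea, the sketch below updates a Player document and its PlayerRankInfo copy in one query. A Fauna query runs as a single transaction, so both writes succeed or fail together. The document IDs, the rank value, and the idea that the PlayerRankInfo document reuses the Player's ID are all placeholders I made up for the example, not something taken from your schema.

```
Let(
  {
    // ASSUMPTION: placeholder IDs; the PlayerRankInfo copy shares the Player's ID
    playerRef: Ref(Collection("Player"), "PLAYER_ID"),
    infoRef: Ref(Collection("PlayerRankInfo"), "PLAYER_ID")
  },
  Do(
    // Write the new rank to the source document ...
    Update(Var("playerRef"), { data: { rank: 7 } }),
    // ... and mirror it to the denormalized copy in the same transaction.
    // Update fails if the copy does not exist, which aborts both writes.
    Update(Var("infoRef"), { data: { rank: 7 } })
  )
)
```

Every write path that touches the mirrored fields would need to go through a query like this, which is the main cost of the approach.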

@wallslide
So if I create a collection UserRankings and an index on it ‘all_UserRankings’ then would it be difficult for me to refine this query:

Map(
  Paginate(Match(Index("all_UserRankings"))),
  Lambda(
    "userRef",
    Let(
      { userDoc: Get(Var("userRef")) },
      { userRanking: Var("userDoc") }
    )
  )
)

which gives me:

{
  data: [
    {
      userRanking: {
        ref: Ref(Collection("UserRankings"), "288674516289192453"),
        ts: 1611560322890000,
        data: {
          userRef: Ref(Collection("User"), "283120055707763201"),
          rankingRef: Ref(Collection("Ranking"), "282953512300577285"),
          rankingType: "Other"
        }
      }
    }
  ]
}

further to access user and/or ranking data fields held in those collections?
Do you have reservations about my attempting this in this way?
Thanks …

Since you seem to have a pretty straightforward rank that is just a simple number and doesn’t require any cross-document calculations, you can ignore my above advice about what to do in that case.


I’m having a hard time understanding your data model. From your description it seems like you want to have multiple Players grouped under a ranking, but you only have them mapped 1-to-1. I can’t tell if you want to have separate groups of players for each rank type, or if you really have a single rank calculated across all players.

Maybe this is what you were getting at?

type Player {
  active: Boolean!
  rank: Int!
  ranking: Ranking! @relation
  challenger: User @relation
}

type Ranking {
  active: Boolean!
  rankingname: String!
  rankingdesc: String
  players: [Player!] @relation
}

Let’s say that it is a single rank across all players. In that case, to get all Players in order of rank, you can just create an index on the Player collection that returns three values: rank, ranking, and challenger. rank needs to be first in the values array so that the index is sorted by rank, since indexes are ordered by the values they return. Then do a Map over the results, Get the Player document from the values returned by your Match (which means the values also need to include the document’s ref), and return those results.

However, if you had different leaderboards (modeled by the Ranking collection), each with its own separate group of ranked Players, then you could again create an index on the Player collection with the same returned values, but this time add the ranking property as a term so that you can include that in your Match call to get the ranks for the group of players in a specific Ranking. And from there do the same Map and return operation.
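
A minimal sketch of what that second index might look like, pared down to two values (rank for sorting, plus the document ref so the Player can be fetched afterwards). The index name is just something I made up, and the field paths assume the Player documents store their data under data.rank and data.ranking as in your schema.

```
CreateIndex({
  // ASSUMPTION: hypothetical index name
  name: "players_by_ranking_sorted_by_rank",
  source: Collection("Player"),
  // The term lets Match() pick out a single leaderboard
  terms: [{ field: ["data", "ranking"] }],
  // Entries are sorted by the values in order, so rank comes first
  values: [
    { field: ["data", "rank"] },
    { field: ["ref"] }
  ]
})
```

Reading one page of a leaderboard in rank order could then look something like this (the page size of 50 is arbitrary):

```
Map(
  Paginate(
    Match(
      Index("players_by_ranking_sorted_by_rank"),
      Ref(Collection("Ranking"), "282953512300577285")
    ),
    { size: 50 }
  ),
  // Each index entry is a [rank, ref] pair; fetch the full Player document
  Lambda(["rank", "playerRef"], Get(Var("playerRef")))
)
```

Because the term narrows the read to one Ranking, the size of the rest of the Player collection should not matter much for this query; a 600K-player leaderboard would simply take more pages to walk through.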

@wallslide
Apologies if I haven’t been clear about the data model.
It’s the second scenario: ‘different leaderboards (modeled by the Ranking collection), each with its own separate group of ranked Players,’

I don’t require cross-document calculations (rank updates can take place within Player documents), but I do need to look up a reference to other documents, get those documents and use their data values (e.g. ‘username’ from the User collection, as that’s how players in the list will be recognized by the app user).

So it appears I do need a UserPlayer collection as you initially advised (I referred to UserRankings above as I believed it was a simpler use case, however I have essentially the same issue on both collections (obtaining the values from the refs, not just the refs, from a query)).

However, if a UserPlayer collection requires documents that are not just references to the underlying User and Player collections, but contain all the relevant data, then it appears I may as well just query the Player collection with the selected ranking as a term from my app and handle the processing (combining Users and Players into a single set (a ranking)) in the app (this is how the app is currently designed, so not a problem at that end (although I don’t know re: scalability/performance for large datasets))?

For myself, currently at least, another factor is that I’m not sure how to implement ‘manually’ syncing data or a cron job using an external service (although perhaps that’s not as difficult as I may imagine).

So my questions are:

  1. Is it ‘difficult’ to implement ‘manually’ syncing data or a cron job (given my current level of competency) and should I anyway still attempt this approach?

  2. Should I keep it simple from an FQL perspective and just make 2 separate queries from my app, giving me a list of players in a given ranking and a list of the related users, then combine the lists client side?

  3. Or is this the kind of operation that FQL was designed for and with a little more knowledge and application I could achieve much better performance/scalability by writing the correct query?

If the answer is 3, then below is my best attempt so far.
The query below will match players, but I will also need a users_by_member_ranking index.
I know how to write a query on one index, e.g.:

Map(
  Paginate(
    Match(
      Index("players_by_member_ranking"),
      Ref(Collection("Ranking"), "282953512300577285")
    )
  ),
  Lambda("player", Get(Var("player")))
)

(If relevant) For option 3, how would I incorporate a query on the second index (users_by_member_ranking)?
Thanks again for your patient assistance …

If you haven’t already found it, this gist from PT Paterson on nested FQL queries may help get you there.

(I am, myself, not much help here, but am also interested in the correct approach.)
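
For what it’s worth, a nested query along those lines might look like the sketch below. It reuses the players_by_member_ranking index from the post above and, instead of querying a second index, follows the User ref stored on each Player document. I am assuming that ref lives at data.challenger (going by the schema) and that the index returns plain Player refs; adjust the paths if your documents differ.

```
Map(
  Paginate(
    Match(
      Index("players_by_member_ranking"),
      Ref(Collection("Ranking"), "282953512300577285")
    )
  ),
  Lambda(
    "playerRef",
    Let(
      {
        playerDoc: Get(Var("playerRef")),
        // ASSUMPTION: the related User ref is stored at data.challenger;
        // fall back to null when the field is missing
        userRef: Select(["data", "challenger"], Var("playerDoc"), null)
      },
      {
        rank: Select(["data", "rank"], Var("playerDoc")),
        // Only fetch the User document when a ref is actually present
        username: If(
          IsNull(Var("userRef")),
          null,
          Select(["data", "username"], Get(Var("userRef")))
        )
      }
    )
  )
)
```

This returns one { rank, username } object per player in a single round trip, at the cost of two document reads per player.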