UDF to get an aggregated average split by multiple categories across nested documents

Hello Folks,

I’m new to FaunaDB. I’m using it as a back end for a simple Angular application that collects survey responses from users. It’s been tremendously easy to work with so far. I had my GraphQL API up and running in a few minutes!

However, I’m now at a point were I need to do a little bit more. Specifically I want to return some summarised scores for the survey to be used in charts and graphs. I’ve been trying to achieve this with a UDF. I’ve followed along the online tutorials (how to query nested documents and how to aggregate data). I’ve found the FQL language quiet verbose and the online tutorials to be far too simple to help me so I’m turning here for help.

My schema is below where its made up of:

  • BenchmarkResponse: Top level document
    → BenchmarkResponsesOrgDetail: information about the organization
    → benchmarkResponsesProgramDetail: information about the organization data function
    → BenchmarkResponsesScore: a key value pairing of the users response for each of 24 questions included in the survey. The answer can be null or a number between 1 and 6.

The schema is shown below where most fields have their own GraphQL type that are all just enums.

Schema:

type BenchmarkResponse {
  benchmarkResponsesOrgDetails: BenchmarkResponsesOrgDetail
  benchmarkResponsesProgramDetails: BenchmarkResponsesProgramDetails
  benchmarkResponsesScores: BenchmarkResponsesScore
}

type BenchmarkResponsesOrgDetail {
  orgName: String
  orgIndustry: OrgIndustry
  orgFinanceSubIndustry: OrgFinanceSubIndustry
  orgSize: OrgSize
  orgHQCountry: OrgHQCountry
  orgAreaOfOperation: OrgAreaOfOperation
}

type BenchmarkResponsesProgramDetails {
  programScope: ProgramScope
  programYearsInOperation: ProgramYearsInOperation
  programReportsTo: ProgramReportsTo
  programFundingModel: ProgramFundingModel
  programModelUtilized: YesNo
  programDataScienceEstablished: ProgramDataScienceEstablished
  programOversightOfAnalytics: ProgramOversightOfAnalytics
}

type BenchmarkResponsesScore {
  Question_1: Int
  Question_2: Int
  Question_3: Int
  Question_4: Int
  Question_5: Int
  Question_6: Int
  Question_7: Int
  Question_8: Int
  Question_9: Int
  Question_10: Int
  Question_11: Int
  Question_12: Int
  Question_13: Int
  Question_14: Int
  Question_15: Int
  Question_16: Int
  Question_17: Int
  Question_18: Int
  Question_19: Int
  Question_20: Int
  Question_21: Int
  Question_22: Int
  Question_23: Int
  Question_24: Int
}

I want to write a UDF that will return the average of each score across the responses split by various fields. I then want to use this UDF as a custom resolver for a query. Below is the GraphQL schema for what I want the GraphQl response to look like.

type BenchmarkDetails{
  orgIndustry: OrgIndustry
  orgFinanceSubIndustry: OrgFinanceSubIndustry
  orgHQCountry:OrgHQCountry
  orgAreaOfOperation: OrgAreaOfOperation
  orgSize: OrgSize
  programDataScienceEstablished: ProgramDataScienceEstablished
  programOversightOfAnalytics: ProgramOversightOfAnalytics
  programYearsInOperation: ProgramYearsInOperation
  programModelUtilized: YesNo
  programFundingModel: ProgramFundingModel
  programReportsTo: ProgramReportsTo
  programScope: String
  AVG_Question_1: Float
  AVG_Question_2: Float
  ...
  AVG_Question_24: Float
}

I’ve tried various different things to do this over the last couple of days. To be honest I;ve just gotten myself tied up in a series of Map, Query, Let and Lambda statements. I was able to create a query to join the nested documents. I was also able to write a query to aggregate a single question score split by one of the categories. I haven’t been able to get even close to bringing this together into one query.

The SQL equivalent of what I am trying to do is below.

select 
a.orgIndustry,
a.orgFinanceSubIndustry,
...
b. programDataScienceEstablished,
b.programOversightOfAnalytics,
...
AVG(c.Question_1),
AVG(c.Question_2),
...
from a join b join c
group by 
a.orgIndustry,
a.orgFinanceSubIndustry,
...
b. programDataScienceEstablished,
b.programOversightOfAnalytics,
...

This the function that I tried to use to consolidate the documents but I couldn’t get further than this.

CreateFunction({
  name: "allbenchmarks",
  body: Query(Lambda(["size", "afterCursor", "beforeCursor"],
    Map(Paginate(Match(Index("allBenchmarkResponses"))), Lambda("bmRef", Let({
        responseDoc: Get(Var("bmRef")),
        orgRef: Select(
            ["data", "benchmarkResponsesOrgDetails"], Var("responseDoc")),
        orgDoc: Get(Var("orgRef")),
        progRef: Select(
            ["data", "benchmarkResponsesProgramDetails"], Var("responseDoc")),
        progDoc: Get(Var("progRef")),
        scoreRef: Select(
            ["data", "benchmarkResponsesScores"], Var("responseDoc")),
        scoreDoc: Get(Var("scoreRef"))
    }, {
        orgIndustry: Select(["data", "orgIndustry"], Var("orgDoc")),
        orgFinanceSubIndustry: Select(
            ["data", "orgFinanceSubIndustry"], Var("orgDoc")),
        orgHQCountry: Select(["data", "orgHQCountry"], Var("orgDoc")),
        orgAreaOfOperation: Select(
            ["data", "orgAreaOfOperation"], Var("orgDoc")),
        orgSize: Select(["data", "orgSize"], Var("orgDoc")),
        programDataScienceEstablished: Select(
            ["data", "programDataScienceEstablished"], Var("progDoc")),
        programOversightOfAnalytics: Select(
            ["data", "programOversightOfAnalytics"], Var("progDoc")),
        programYearsInOperation: Select(
            ["data", "programYearsInOperation"], Var("progDoc")),
        programModelUtilized: Select(
            ["data", "programModelUtilized"], Var("progDoc")),
        programFundingModel: Select(
            ["data", "programFundingModel"], Var("progDoc")),
        programReportsTo: Select(["data", "programReportsTo"], Var("progDoc")),
        programScope: Select(["data", "programScope"], Var("progDoc")),
        DCAM_BM_2019_1: Select(["data", "DCAM_Question_1"], Var("scoreDoc"))
    })))
  ))
})

Here is a sample document (with a few extra docs).

{
  "ref": Ref(Collection("BenchmarkResponse"), "287396098850423304"),
  "ts": 1610341128140000,
  "data": {
    "benchmarkResponsesScores": Ref(Collection("BenchmarkResponsesScore"), "287396098842037768"),
    "benchmarkResponsesProgramDetails": Ref(Collection("BenchmarkResponsesProgramDetails"), "287396098842034696"),
    "benchmarkResponsesOrgDetails": Ref(Collection("BenchmarkResponsesOrgDetail"), "287396098842036744"),
  }
}

{
  "ref": Ref(Collection("BenchmarkResponsesScore"), "287396098842037768"),
  "ts": 1610341128140000,
  "data": {
    "Question_3": 3,
    "Question_20": 4,
    "Question_11": 3,
    "Question_6": 4,
    "Question_10": 4,
    "Question_15": 4,
    "Question_9": 3,
    "Question_2": 3,
    "Question_21": 4,
    "Question_24": 3,
    "Question_5": 4,
    "Question_14": 3,
    "Question_23": 3,
    "Question_18": 3,
    "Question_13": 4,
    "Question_8": 5,
    "Question_17": 4,
    "Question_22": 4,
    "Question_12": 4,
    "Question_7": 3,
    "Question_4": 2,
    "Question_19": 5,
    "Question_16": 3,
    "Question_1": 3
  }
}

{
  "ref": Ref(Collection("BenchmarkResponsesProgramDetails"), "287396098842034696"),
  "ts": 1610341128140000,
  "data": {
    "programDataScienceEstablished": "In_Process_of_Being_Established",
    "programOversightOfAnalytics": "Partial",
    "programYearsInOperation": "A_Less_Than_One_Year",
    "programModelUtilized": "No",
    "programFundingModel": "Centralized",
    "programReportsTo": "COO",
    "programScope": "Global_or_Enterprise_Wide"
  }
}

{
  "ref": Ref(Collection("BenchmarkResponsesOrgDetail"), "287396098842036744"),
  "ts": 1610341128140000,
  "data": {
    "orgName": "Test",
    "orgIndustry": "Finance_Insurance_and_Banking",
    "orgAreaOfOperation": "Regional",
    "orgHQCountry": "Albania",
    "orgSize": "Tier_2",
    "orgFinanceSubIndustry": "Insurance"
  }
}

Thanks for reading and the help if you can offer it.

Best
Michael