How to sort while filtering by many-to-one relationship

SleepingRobot · September 30, 2021, 5:07am

With a GraphQL schema similar to below, containing a many-to-one relation, I have a resolver UDF that I want to return a page of Review documents that:

Are filtered by their user field, and…
Are filtered by the inclusion of at least one Tag in their tags field that has a given text subfield, and…
Are sorted by their created field

(Basically, I need sorted reviews for a given user and tag)

type Review {
	user: User!
	created: Time!
	text: String!
	tags: [Tag] @relation
}

type Tag {
	review: Review! @relation
	text: String!
}

I tried to use a couple indexes like these…

CreateIndex({
  name: "recent_reviews_by_user",
  source: Collection("Review"),
  terms: [
    { field: ["data", "user"] }
  ],
  values: [
    { field: ["data", "created"], reverse: true },
    { field: ["ref"] }
  ]
})

CreateIndex({
  name: "tag_reviews_by_review_and_tag",
  source: Collection("Tag"),
  terms: [
    { field: ["data", "review"] },
    { field: ["data", "text"] }
  ],
  values: [
    { field: ["data", "review"] }
  ]
})

and Join the indexes together, like…

Paginate(
  Join(
    Match(
      Index("recent_reviews_by_user"),
      Ref(Collection("User"), "some user ID")
    ),
    Lambda(
      ["created", "reviewRef"],
      Match(Index("tag_reviews_by_review_and_tag"), [Var("reviewRef"), "some tag"])
    )
  )
)

but I’m guessing since the tag_reviews_by_review_and_tag index only returns a default ref value, the sorting performed by the recent_reviews_by_user index is lost in the Join

If someone has a suggestion for a better approach to achieve the result I’m after, it’d be greatly appreciated! Currently I’m feeling quite stuck unfortunately…

SleepingRobot · September 30, 2021, 6:10am

Aaaand it turns out I completely forgot about the aptly named Filter function

Switching to the below worked for me!

I’ll mark this as resolved, but if any Fauna experts want to weigh in within the next couple days on whether this is the most appropriate / efficient approach to solve this problem I’d really appreciate the feedback (as I clearly still do not have the firmest of grasps on what I’m doing ).

Paginate(
  Filter(
    Match(
      Index("recent_reviews_by_user"),
      Ref(Collection("User"), "some user ID")
    ),
    Lambda(
      ["_", "reviewRef"],
      ContainsValue(
        true,
        Map(
          Select(["data", "tags"], Get(Var("reviewRef"))),
          Lambda(
            ["tagRef"],
            Equals(Select(["data", "text"], Var("tagRef")), "some tag")
          )
        )
      )
    )
  )
)

ptpaterson · September 30, 2021, 7:09pm

Awesome, glad you got it! Quick notes about using Filter directly on a Set (inside of Paginate).

The filter operation will continue to be performed on the entire set until it returns enough Documents for the Page size (default is 64). If the results you want are sparse, this can mean A LOT of extra compute operations. And if you Get the documents in the filter function, it means a Read Op for every document that you actually end up skipping over. This kinda defeats the purpose of an Index.

Some permissions can be tricky. Since the Filter is working on all of the documents you need permission to read all of the documents. If you have a predicate function for read permission the transaction will fail once you come to a document that you don’t have permission for.

Using Filter may be fine for your use case. I can see that especially since it sounds unlikely for users to start creating many thousands of reviews. And cost-wise, you’ll end up reading all

method without `Filter`

I think you were super close to a solution without Filter. You can try this and then compare how the two solutions perform.

What I did here is

don’t worry about sorting at first when getting user’s reviews
get the reviews only with certain tags just like you did before!
add sorting with an additional index after that
remove the time-part of the page after pagination, since GraphQL doesn’t want that.

Let(
  {
    // these can be arguments for GraphQL resolver if you want
    id: '311182466752708674',
    tag: 'some tag',

    // Rebuild a ref from GraphQL since it will only send an ID value
    userRef: Ref(Collection('User'), Var('id')),

    // this index is already there from importing the schema
    userReviewsSet: Match(Index('review_user_by_user'), Var('userRef')),

    // use Join to get only the reviews with certain tags
    taggedReviewsSet: Join(
      Var('userReviewsSet'),
      Lambda(
        'reviewRef',
        Match(Index('tag_reviews_by_review_and_tag'), [
          Var('reviewRef'),
          Var('tag'),
        ])
      )
    ),

    // use Join with another Index to implement sorting
    sortedSet: Join(
      Var('taggedReviewsSet'),
      Index('reviews_by_ref__created_asc')
    ),

    page: Paginate(Var('sortedSet')),

    // GraphQL wants the Documents, so Get them
    page_for_gql: Map(Var('page'), Lambda(['time', 'ref'], Get(Var('ref'))))
  },

  // If your GraphQL resolver is not set to paginate: true, then only return data.
  Select("data", Var('page_for_gql'))
)

Here’s the extra index:

CreateIndex({
  name: "reviews_by_ref__created_asc",
  source: Collection("Review"),
  terms: [
    {
      field: ["ref"]
    }
  ],
  values: [
    {
      field: ["data", "created"]
    },
    {
      field: ["ref"]
    }
  ]
})

SleepingRobot · October 1, 2021, 2:10am

That was extremely informative, thank you so much as always, Paul. If you’re on buymeacoffee.com or similar, you’ll have to let me know because I’d love to pay you back in some small way for all the stellar support!

I’m getting the hang of writing FQL, but optimizations like this are still murky in my brain and this really helped me connect some important dots. I’ve rewritten my code using the approach you’ve outlined, and it’s working great!

With that said, I’m sorry to add on but do you think you could help me with one last snag I left out of my original question for simplicity’s sake?

In addition to filtering the reviews by tags [many-to-one], I’m also trying to figure out how I can support further filtering the reviews by an extra (optional) media type field [one-to-many].

Here’s a more complete example schema highlighting this.

type Review {
	user: User!
	created: Time!
	media: Media! @relation
	text: String!
	tags: [Tag] @relation
}

type Media {
	externalID: String!
	type: MediaType!
	reviews: [Review] @relation
}

enum MediaType {
	FILM
	SHOW
}

type Tag {
	review: Review! @relation
	text: String!
}

So if my UDF is provided with an additional mediaType variable equal to either "FILM" or "SHOW", what would be the best way to further filter down the set to only include reviews with media fields referencing Media docs with that type value?

I’m guessing I’d need to do this between steps 2. and 3. from your reply, but I’m not sure the most efficient way to go about it. My gut instinct was to try running Reduce on taggedReviewsSet, but I’m not sure what reducing a Set looks like (as far as I can tell, Fauna’s documentation only shows examples with arrays). I also feel like rather than iterating the set, there might be a more efficient way to utilize an index + Join again, despite the [one-to-many] relation making it more difficult for me to wrap my head around how I could get away with this.

Apologies if you’ve already given me the tools to work this out on my own at this point and I’m just being slow on the uptake

Cheers,
Ross

ptpaterson · October 1, 2021, 1:38pm

To filter by the media type you can get the Set of All Reviews for a given MediaType and then use Intersection.

// ...
    mediaRef: Select(0, Paginate(Match(Index('media_by_type'), Var('mediaType')))),
    // "media_reviews_by_media" should be generated from GQL schema
    reviewsByMediaSet: Match(Index('media_reviews_by_media'), Var('mediaRef')),
    // Intersection returns only those elements that are in both Sets
    reviesAllFiltersSet: Intersection(
      Var('taggedReviewsSet'),
      Var('reviewsByMediaSet')
    ),
    sortedSet: Join(
      Var('reviesAllFiltersSet'),
      Index('reviews_by_ref__created_asc')
    ),
// ...

For more Set operations like Intersection, check out the FQL cheat sheet.

You may want to allow users to provide a variable amount of filters (provide null for some terms) For tips on how to mix and match different terms with FQL you can check out this previous topic. SELECT FROM myIndex WERE a=1 AND b=2 AND c=3

SleepingRobot · October 1, 2021, 5:01pm

Ah! That makes sense.

Thank you so much again

system · October 3, 2021, 5:01pm

This topic was automatically closed 2 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
How to Sort by Count Help graphql , udfs	3	686	July 8, 2021
Index sorting using relationship? Help fql , indexes	2	413	May 4, 2021
Need help with pagination for a sorted index and UDL resolver Help graphql , udfs , functions	9	425	May 23, 2021
Paginate with dynamic relationship count Help graphql , fql , udfs	3	642	August 7, 2021
FQL: Query order by nested document value Help graphql , fql , udfs , functions	4	740	October 7, 2020

How to sort while filtering by many-to-one relationship

method without Filter

Related topics

method without `Filter`