Using pagination and cursors: how to refresh existing results for use in grids

I'm finding cursors and the Java Fauna SDK painful to work with. After a lot of reading of the docs and the forums, plus trial and error, I kept hitting the same confusing cursor structure and overlapping page data referenced here. I gave up on trying to track just the ref ID from the after/before pagination details, in hopes of not having to serialize the whole thing and push it down to the client that ultimately uses it in the grid. I finally got it working for the grid's first page, next page, previous page, and last page requirements, but only by serializing the whole after/before cursor down to the client and back to the backend on each request.

Now I'm struggling with the refresh requirement. If I'm on page 5, for example, and want to get the latest data while staying on page 5, how do I do that with cursors and a stateless approach using the Java Fauna SDK? (By the way, a lot of the helpful methods on cursor and other objects are private.) Help here is greatly needed, please.

Also, as an observation: this has been one of the most confusing and cumbersome pagination efforts I've dealt with in 15 years, across multiple relational databases and many NoSQL DBs, including DynamoDB, Couchbase, etc. I really hope there's something on the roadmap to improve this. Ideally it would be a simple offset number that the index can leverage (i.e., if I read 20 records at a time and I'm now on page 5 of the grid, I know how many records in I am and can pass that number as an offset to get the next 20), in combination with the size limit. That would be easy to build dynamically, without serializing and deserializing cursors and a lot of wasted code and logic that affects performance.

I’m sorry that you are struggling to get a working solution. Can you provide some of the queries that you have developed so far? We often find that specific details of a query help to find a solution.

Fauna is a consistent database. Unlike the query model used by traditional relational databases, there is no cached temporary result for a query that can be paginated. When you receive a page of results, fetching the next page necessarily gets the latest values, since they might have changed between the first and second queries.

Typically, you don’t have to track details within a cursor. Simply use the value of the after or before cursor (depending on which direction you want to move) in subsequent queries.

For example:

        // Assumes the driver's usual imports, e.g.:
        //   import static com.faunadb.client.query.Language.*;
        //   import com.faunadb.client.types.Value;
        //   import com.faunadb.client.types.Field;
        Value page1 = client.query(
            Map(
                Paginate(Documents(Collection("People")))
                .size(3),
                Lambda("ref", Get(Var("ref")))
            )
        ).get();
        System.out.println("Page1:");
        System.out.println(page1);

        Value page2 = client.query(
            Map(
                Paginate(Documents(Collection("People")))
                .size(3)
                .after(page1.get(Field.at("after"))),
                Lambda("ref", Get(Var("ref")))
            )
        ).get();
        System.out.println("Page2:");
        System.out.println(page2);

With the “People” documents created in the Indexing tutorial, the output is:

Page1:
{after: [ref(id = "323053725221388800", collection = ref(id = "People", collection = ref(id = "collections")))], data: [{ref: ref(id = "323053725206708736", collection = ref(id = "People", collection = ref(id = "collections"))), ts: 1644346890610000, data: {first: "Alan", last: "Perlis", age: 97, degrees: ["BA", "MA", "PhD"], letter: "A"}}, {ref: ref(id = "323053725210903040", collection = ref(id = "People", collection = ref(id = "collections"))), ts: 1644346890610000, data: {first: "Alan", last: "Turing", age: 107, degrees: ["BA", "MA", "MS", "PhD"], letter: "B"}}, {ref: ref(id = "323053725217194496", collection = ref(id = "People", collection = ref(id = "collections"))), ts: 1644346890610000, data: {first: "Grace", last: "Hopper", age: 119, degrees: ["BA", "MA", "PhD"], letter: "C"}}]}
Page2:
{before: [ref(id = "323053725221388800", collection = ref(id = "People", collection = ref(id = "collections")))], after: [ref(id = "323053725233971712", collection = ref(id = "People", collection = ref(id = "collections")))], data: [{ref: ref(id = "323053725221388800", collection = ref(id = "People", collection = ref(id = "collections"))), ts: 1644346890610000, data: {first: "Leslie", last: "Lamport", age: 80, degrees: ["BS", "MA", "PhD"]}}, {ref: ref(id = "323053725225583104", collection = ref(id = "People", collection = ref(id = "collections"))), ts: 1644346890610000, data: {first: "Marvin", last: "Minsky", age: 92, degrees: ["BA", "PhD"], letter: 1}}, {ref: ref(id = "323053725229777408", collection = ref(id = "People", collection = ref(id = "collections"))), ts: 1644346890610000, data: {first: "Stephen", last: "Cook", age: 81, degrees: ["BS", "PhD"], letter: "F"}}]}

Notice that in the first page, only the after cursor is present, since there are only pages following the current page in the result set. In the second page, both the before and after cursors are present, since there is additional data (from the perspective of page 2) in either direction.

All that changed between the first page and second page queries is the addition of .after(page1.get(Field.at("after"))) to the second page query. That injects the value of the after cursor from the first page's results into the query for the second page.

If you are concerned that page x has refreshed since you first fetched it, and you want to keep the pagination consistent in your UI, keep track of the cursors returned from each page, and then re-run page x’s query.
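As a concrete sketch of this approach (the class and method names here are hypothetical, not part of the Fauna driver): the backend records, per user session, the serialized after cursor that produced each page. Refreshing page 5 then means looking up page 5's cursor and re-running the same Paginate query with it, which returns the latest data for that slice of the set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a per-session cache mapping page numbers to the
// serialized "after" cursor that was used to fetch that page. To refresh
// page N, look up the cursor recorded for page N and re-run the same
// Paginate query with it.
public class PageCursorCache {
    // page number -> serialized "after" cursor that opened that page
    // (page 1 needs no cursor, so it is stored as null)
    private final Map<Integer, String> cursors = new HashMap<>();

    public void record(int page, String serializedAfterCursor) {
        cursors.put(page, serializedAfterCursor);
    }

    // Returns the cursor to use when refreshing the given page,
    // or null when the page is the first one (no cursor needed).
    public String cursorFor(int page) {
        return cursors.get(page);
    }

    public static void main(String[] args) {
        PageCursorCache cache = new PageCursorCache();
        cache.record(1, null);                       // first page: no cursor
        cache.record(2, "[\"323053725221388800\"]"); // cursor that opened page 2
        // Refreshing page 2 means re-running Paginate(...).after(<this cursor>)
        System.out.println(cache.cursorFor(2));
    }
}
```

The cache itself can live anywhere a stateless backend can reach (e.g., a session store); the essential point is only that each page number maps to the cursor that opened it.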

We definitely want to improve the developer experience when using our drivers. If you have any suggested code samples that would notably improve the experience, please add them here.

I’m getting an after cursor that looks something like this for some of my indexes, while others look like your sample:

after: [
    123123123,
    Ref(Collection("Sales"), "266869347503309324"),
    Ref(Collection("Sales"), "266869347503309324")
  ]

And like your sample, for some indexes I get an after cursor that contains only one ref. While I may understand the reason, as explained in the link I found in the forum for a similar struggle, the lack of consistency doesn’t make it any easier to develop in a dynamic way.

I don’t understand how I wouldn’t have to track the details within the cursor. I have to send (serialize) them from the stateless backend to a JavaScript frontend over a RESTful API call so they can be consumed in a grid, and then sent back to the backend on back/forward/first-page/last-page/refresh requests so that I can get the proper page.

Typically, you don’t have to track details within a cursor. Simply use the value of the after or before cursor (depending on which direction you want to move) in subsequent queries.

The frontend JavaScript client app’s grid has a refresh-current-page feature. This is an existing app that is being upgraded, and the decision was to use Fauna as the DB that the Java backend communicates with on behalf of the frontend, so we can’t remove functionality. But that’s exactly what I’m asking: how to do a refresh of a specific page, as there’s no documentation around that which I could find.
So if I’m understanding what you’re saying, then in addition to tracking the after and before cursors on each request (which have to be serialized down to the frontend JavaScript client), I now have to add logic to a completely stateless environment to keep the cursor of each request. Ouch. I was afraid of that. That makes even more useless data to send back and forth to/from the client on each page.

If you are concerned that page x has refreshed since you first fetched it, and you want to keep the pagination consistent in your UI, keep track of the cursors returned from each page, and then re-run page x’s query.

I don’t have any code samples, but as I suggested in my original post, a consistent after/before cursor (not sometimes an array of 2 elements, sometimes 3, sometimes x) would be a start. And if it’s enough to pass in the JSON-looking syntax with just the ref ID for after/before, then I could store and pass just the ref ID to the client, the client could pass it back to the backend, and the backend could build the same-looking JSON object dynamically, substituting the ref ID. That would already be less to serialize and easier to build dynamically, if offsets are not possible.

For example, a lot of frontend grids have functionality to skip to a specific page, e.g., the 101st page out of, say, 500 pages. I see no way of doing that with the current pagination without a lot of additional effort from the developer.

Or what if I want to leverage the frontend grid’s column filtering (>, <, AND, OR, CONTAINS, NOT CONTAINS, EQUALS, NOT EQUALS) and column sorting as part of pagination? In a frontend/backend approach that has to build the queries dynamically, this seems very difficult with the current implementation. I cringe at the amount of backend code (at least with your Java SDK) that this appears to require.

We definitely want to improve the developer experience when using our drivers. If you have any suggested code samples that would notably improve the experience, please add them here.

the lack of consistency doesn’t make it any easier to develop in a dynamic way.

Fauna’s cursors involve the structure of the index involved in the query. If the index specifies no values fields, only the covered document’s reference is included. Any fields that you do specify in values are included, in that order, in the cursor and the covered document’s reference is appended to the end.

If you want all of your cursors to look the same, then you have to define your indexes to return the same number and type of elements.

It might be helpful to think of indexes as both a traditional index (via the terms definition), and a view (via the values definition). The view effectively aggregates index entries that conform to the defined structure in a way that can be sorted and paginated. Since Fauna does not use a tabular data structure, there is no “nth row” to return, so the cursors have a notably different structure than you’re used to.

I’m sorry if your impression of this is “a bug”, but it is actually a feature. Cursors work exactly like the Range function. The tuples in an index entry provide prefix-based searching into an ordered set. You could construct your own cursor that involves just the first field from the index’s values definition, and you’ll still get results, but the boundary for the range will be less precise than if you had specified all of the fields. So if your index’s values field specifies the price, product, and ref fields, you could use a cursor that just specifies the price field and pagination would work pretty well, with only a few overlaps.

I don’t understand how I wouldn’t have to track the details within the cursor

I’m not sure why you feel you have to track the individual elements in a cursor. If you serialize a cursor so that a UI can receive and send them, serialize the whole cursor. Or serialize the prefix that makes the most sense for your pagination queries.
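One common pattern, sketched here under the assumption that you can render the cursor to a JSON string (the helper names are hypothetical, not driver APIs): treat the whole cursor as an opaque token. Base64-encode its JSON form before sending it to the grid, and decode it when the grid sends it back, so the frontend never has to understand, or vary with, the cursor's internal structure.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: the backend serializes the cursor's JSON form,
// Base64-encodes it into an opaque token for the grid, and decodes the
// token back into JSON when the grid returns it, to rebuild the
// .after()/.before() argument for the next query.
public class CursorToken {
    public static String encode(String cursorJson) {
        return Base64.getUrlEncoder().withoutPadding()
                     .encodeToString(cursorJson.getBytes(StandardCharsets.UTF_8));
    }

    public static String decode(String token) {
        return new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A cursor with one values field plus the document ref, as JSON
        String cursor = "[123123123, {\"@ref\":{\"id\":\"266869347503309324\",\"collection\":\"Sales\"}}]";
        String token = encode(cursor);
        System.out.println(token);
        System.out.println(decode(token).equals(cursor)); // true: round-trips intact
    }
}
```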

Because there is no result cache, the size of a set is not known until a query has been processed. As such, there’s no way to jump to a specific page number in the middle of a result set. It’s easy to get the first page: provide no cursor. It’s easy to get the last page: specify a before cursor of null. Per Fauna’s sorting precedence, null always sorts last, so a before of null is guaranteed to be after the last entry in any result set.
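The first-page/last-page rules can be illustrated with an in-memory simulation (this is plain Java standing in for Fauna's behavior, not driver code): a before of null acts as a boundary past the end of the sorted set, so the last page is simply the final size entries.

```java
import java.util.Arrays;
import java.util.List;

// In-memory illustration (not driver code): paginating backward from a
// "before" cursor. A before cursor of null stands for "past the end",
// because null sorts after every other value in Fauna's ordering, so it
// yields the last page of the set.
public class LastPage {
    static List<String> pageBefore(List<String> sorted, String before, int size) {
        int end = (before == null) ? sorted.size() : sorted.indexOf(before);
        return sorted.subList(Math.max(0, end - size), end);
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("doc1", "doc2", "doc3", "doc4", "doc5");
        System.out.println(pageBefore(docs, null, 2));   // [doc4, doc5]: the last page
        System.out.println(pageBefore(docs, "doc3", 2)); // [doc1, doc2]
    }
}
```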

If your grid involves a “jump to page x”, you would, indeed, have to walk through all of the pages, capture the cursors, and use those cursors to re-run any particular page’s query.
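The walk itself can be sketched with the same kind of in-memory simulation (plain Java, not driver code), where each page's cursor is simply the last entry of that page:

```java
import java.util.ArrayList;
import java.util.List;

// In-memory illustration: "jump to page x" with cursors means walking
// forward page by page, capturing each page's after cursor as you go.
// Here the "cursor" is the last item of the previous page.
public class WalkToPage {
    static List<String> page(List<String> sorted, String after, int size) {
        int start = (after == null) ? 0 : sorted.indexOf(after) + 1;
        return sorted.subList(start, Math.min(start + size, sorted.size()));
    }

    public static void main(String[] args) {
        List<String> data = new ArrayList<>();
        for (int i = 1; i <= 10; i++) data.add(String.format("doc%02d", i));

        List<String> cursors = new ArrayList<>(); // cursors.get(i) opens page i+2
        String after = null;
        for (int p = 1; p <= 3; p++) {            // walk forward to page 3
            List<String> pg = page(data, after, 3);
            after = pg.get(pg.size() - 1);        // capture the "after" cursor
            cursors.add(after);
        }
        // Re-running page 3 later: use the cursor captured at the end of page 2
        System.out.println(page(data, cursors.get(1), 3)); // [doc07, doc08, doc09]
    }
}
```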

And if it’s enough to pass in the JSON looking syntax with just the ref ID for after/before

The ref id, alone, is not enough to be used as a cursor. At the very least, you would have to use it to recompose the original document reference.

Or what if I want to leverage the frontend grid’s column filtering that does >, <, AND, OR, CONTAINS, NOT CONTAINS, EQUALS, NOT EQUALS, and column sorting as part of the pagination

Fauna forces you to be much more deliberate about your searching. It’s a shared service, not a bespoke database server hosted per client. We need to implement reasonable restrictions to prevent table scans (and other performance-degraded query patterns) from blocking everyone else’s queries.

As there is no cached result, there is no ad-hoc sorting. You can use Filter, Any, All, Reduce to create various subsets of a set, and you can use Intersection, Union, and Difference to perform AND, OR, or NOT logic. But the primary way to configure sorting is the values definition of an index. That does mean that you end up creating a lot of indexes. I’m afraid that any existing app that depends heavily on SQL semantics is likely going to need an overhaul to work with Fauna.

I hope that helps. But I also recognize that perhaps Fauna is not the best data API for your use case.

I’d say that’s the wrong takeaway from what I’m saying. I’m not trying to implement something that isn’t a fairly common use case. Even in what seems to be Fauna’s more common use case of straight JavaScript to Fauna (at least that’s the impression from reading the forum), without an additional API backend handling things, the point stands: it needs to work more easily, in this case with grids. The way pagination is designed, and the need to support a grid’s column filtering, isn’t exactly a niche use case, but it currently makes some very common grid uses very complex without hard-coding each and every variation. From my viewpoint as a developer, and as someone with influence on which DB gets chosen, it’s got to be easier to move to, not harder.

Even the things that aren’t technically possible, like jumping around pages, can be done as you described, but the dev has to do it all by hand. That could clearly be implemented as part of the SDK too, with the docs warning about what happens behind the scenes (walking page after page based on the size limit and filter conditions), so that it’s easier.

And sure, Filter, Any, All, Reduce, Intersection, Union, Difference, etc. can all be used to deal with the grid’s column filtering. But can you appreciate the amount of coding/hard-coding that would have to be done to account for all those permutations? And that’s just for one grid. What if an app has multiple grids, each with its own queries? Ouch again.

Or are you really suggesting that using Fauna and expecting the data API to work for grids isn’t a good use case?

I don’t understand how I wouldn’t have to track the details within the cursor

I’m not sure why you feel you have to track the individual elements in a cursor. If you serialize a cursor so that a UI can receive and send them, serialize the whole cursor. Or serialize the prefix that makes the most sense for your pagination queries.

OK, so again, let’s assume I get an after cursor that looks like this. If you’re saying I don’t need to serialize it entirely, what part of it do I need to serialize? Don’t I need to know the value 123123123, the name of the collection, and the ref ID? And how would I know that the index requires two terms, so that I can re-create the cursor to go to the next page when the frontend sends back the cursor I fully serialized? Can you give a concrete example of the minimum I would need to serialize in this case, if not the whole cursor below?

after: [
    123123123,
    Ref(Collection("Sales"), "266869347503309324"),
    Ref(Collection("Sales"), "266869347503309324")
  ]

Or are you really suggesting that using Fauna and expecting the data API to work for grids isn’t a good use case?

What I’m telling you is how Fauna works. Grids were developed based on SQL query patterns, and Fauna works differently. As you say, development work would be required. Compared to SQL, Fauna is very new, so the availability of pre-built solutions is far lower.

When you say “ouch”, that seems to me to express sufficient frustration that you won’t entertain the idea of doing that development work, so perhaps Fauna is not the best choice for you. If that’s not the case, I’m happy to help you work out amenable solutions.

OK, so again, let’s assume I get an after cursor that looks like this. If you’re saying I don’t need to serialize it entirely, what part of it do I need to serialize?

It’s not about the cursor, exactly, but your query pattern and your tolerance for potentially repeated items.

Since you’re using a grid, the expected partition between one page and the next most often needs to be precise: when page x ends with entry y, page x+1 starts with entry y+1. In that case, you should serialize the entire cursor.

If your use case is tolerant of repeated elements, such as an entry at the end of one page appearing on the beginning of the next, you could trim elements from the cursor. The minimum number of elements is 1.

As for a concrete example, suppose your index covers documents describing people, and its values definition returns the first_name and age fields. As you paginate through the result set, the before and after cursors would contain the structure [first_name, age, ref], where ref is the unique identifier for the covered document. That combination of elements provides an exact boundary between one entry in the result set and the next.

Now, suppose you adjust a cursor to only contain [first_name, age]. That cursor would work, but the boundary now is based just on the first name and age. If there are multiple people with the same first name and age, then you can’t distinguish between them: it’s imprecise. So, even if page 4 ended with two Alan, 34 entries, page 5 could begin with 5 Alan, 34 entries, and you wouldn’t be able to tell if these were 5, 4, or 3 additional entries.

Similarly, if the cursor was adjusted to only contain [first_name], then the cursor is saying "start with the first entry that has first_name equal to a particular first name, regardless of age." There could be a lot of Alan entries.
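The precision trade-off can be simulated in plain Java (an illustration of the concept, not the driver's actual comparison logic). With entries sorted as [first_name, age, ref] tuples, a full cursor resumes exactly one entry past the boundary, while a cursor truncated to the [first_name, age] prefix resumes at the first entry matching that prefix, re-emitting duplicates:

```java
import java.util.Arrays;
import java.util.List;

// Illustration: why a truncated cursor is imprecise. A full cursor
// identifies one exact entry to resume after; a prefix cursor only
// identifies the first entry >= that prefix, so entries sharing the
// prefix are repeated on the next page.
public class CursorPrecision {
    static final class Entry {
        final String firstName; final int age; final String ref;
        Entry(String firstName, int age, String ref) {
            this.firstName = firstName; this.age = age; this.ref = ref;
        }
    }

    static int resumeIndex(List<Entry> sorted, Entry cursor, boolean fullCursor) {
        for (int i = 0; i < sorted.size(); i++) {
            Entry e = sorted.get(i);
            boolean samePrefix = e.firstName.equals(cursor.firstName)
                                 && e.age == cursor.age;
            if (fullCursor) {
                // exact boundary: resume just after this one entry
                if (samePrefix && e.ref.equals(cursor.ref)) return i + 1;
            } else {
                // prefix boundary: resume at the first matching prefix
                if (samePrefix) return i;
            }
        }
        return sorted.size();
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
            new Entry("Alan", 34, "101"),
            new Entry("Alan", 34, "102"),
            new Entry("Alan", 34, "103"),
            new Entry("Bea",  40, "104"));

        Entry lastOnPage = entries.get(1); // page ended with ("Alan", 34, "102")
        System.out.println(resumeIndex(entries, lastOnPage, true));  // 2: precise
        System.out.println(resumeIndex(entries, lastOnPage, false)); // 0: entries repeat
    }
}
```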

The description of the Range function covers the concept pretty well, and includes helpful illustrations.

So you have to ask yourself, based on the data that you have, what should pagination through a result set look like? The answer to that question will help you decide if, or by how much, a cursor can be truncated.

The simplest approach is to serialize the entire cursor: you get the precision you likely require and you don’t need to reason about the cursor structure itself.