Sort results by a computed value

Is there a way to sort the results of a paginated set by a computed value? For example, my data looks like:

{
  type: 'expenses',
  amount: 123,
  createdAt: 'some date'
}

Now, I want to fetch all data in this collection and sort it by createdAt and difference. Note that difference does not exist in the data; it will be computed on the fly using a value provided by the user. So:

Given:

Data

[
  {
    type: 'expenses',
    amount: 500
  },
  {
    type: 'expenses',
    amount: 100
  },
  {
    type: 'expenses',
    amount: 1000
  }
]

value

5

I should receive the following data in the following order (assuming ascending order):

[
  {
    type: 'expenses',
    amount: 100,
    difference: 95
  },
  {
    type: 'expenses',
    amount: 500,
    difference: 495
  },
  {
    type: 'expenses',
    amount: 1000,
    difference: 995
  }
]

There is no native FQL Sort function, though there is a recent feature request for one already: Native Sort Function

Are you suggesting adding to that a Lambda argument to determine the order? That would be handy! Here might be one example. What do you think?

Sort(
  items,
  Lambda(["first", "second"], Subtract(Var("first"), Var("second")))
)

also…

Index bindings can sort by computed value

If you need to sort values before pagination, you can include index bindings in the values and your index will sort on them.

Avoid computing with FQL if possible

FQL is not a great choice for general-purpose computing. It is very powerful when you need to operate close to the data, and can be critical if that compute needs to be part of the transaction.

However, if you find yourself performing a lot of compute after your data has been paginated, you might consider if it is truly necessary. There are most certainly cases where it will be. For other cases, it might be best to return a page as is and transform client-side.

The case where it is likely not necessary is performing a transformation that does not affect the underlying Set.

If you need to sort values before pagination, you can include index bindings in the values and your index will sort on them.

It’s not really possible to do this on the index level at the moment, because I can’t do something like:

CreateIndex({
  name: 'sampleIndex',
  source: {
    collection: Collection('sampleCollection'),
    fields: {
      computedField: Query(
        Lambda(
          ['document', 'userInputArgument'], // <--- notice userInputArgument has to be given to this lambda
          Subtract(
            Select(['data', 'amount'], Var('document')),
            Var('userInputArgument')
          )
        )
      )
    }
  },
  values: [
    { binding: 'computedField' } // sort by the computed field
  ]
})

Notice that I passed an argument to the binding. I can’t find that in the documentation, so I’m not sure how it can be achieved. I need that userInputArgument to compute the difference, right?

Reason why I want it to be tied as a condition to the pagination

Consider the following data:

[
  {
    amount: 1000,
    createdAt: 'some date'
  },
  {
    amount: 594,
    createdAt: 'some date'
  },
  {
    amount: 995,
    createdAt: 'some date'
  },
  {
    amount: 104,
    createdAt: 'some date'
  },
  {
    amount: 607,
    createdAt: 'some date'
  },
  {
    amount: 7896,
    createdAt: 'some date'
  },
  {
    amount: 107,
    createdAt: 'some date'
  },
  {
    amount: 96,
    createdAt: 'some date'
  },
  {
    amount: 12,
    createdAt: 'some date'
  },
  {
    amount: 876,
    createdAt: 'some date'
  },
  {
    amount: 346,
    createdAt: 'some date'
  }
]

and userInput of 5 to compute the difference.

By default, the data sits in the collection in the order it was created, much like how array.push() works (correct me if I’m wrong).

So if I Paginate with a size of 3 and only sort AFTER the data has been fetched, Paginate is not aware of the sort.

What will happen is I will get these results:

[
  {
    amount: 1000,
    createdAt: 'some date'
  },
  {
    amount: 594,
    createdAt: 'some date'
  },
  {
    amount: 995,
    createdAt: 'some date'
  }
]

and then sort them to get:

[
  {
    amount: 594,
    difference: 589,
    createdAt: 'some date'
  },
  {
    amount: 995,
    difference: 990,
    createdAt: 'some date'
  },
  {
    amount: 1000,
    difference: 995,
    createdAt: 'some date'
  }
]

This is a problem, because Paginate never looked at the rest of the data when it built this page. The next page (after sorting) will look like:

[
  {
    amount: 104,
    difference: 99,
    createdAt: 'some date'
  },
  {
    amount: 607,
    difference: 602,
    createdAt: 'some date'
  },
  {
    amount: 7896,
    difference: 7891,
    createdAt: 'some date'
  }
]

So the data as a whole is now incorrectly sorted. Why? Because pages 1 and 2 combined look like:

[
  {
    amount: 594,
    difference: 589,
    createdAt: 'some date'
  },
  {
    amount: 995,
    difference: 990,
    createdAt: 'some date'
  },
  {
    amount: 1000,
    difference: 995,
    createdAt: 'some date'
  },
  {
    amount: 104,
    difference: 99,
    createdAt: 'some date'
  },
  {
    amount: 607,
    difference: 602,
    createdAt: 'some date'
  },
  {
    amount: 7896,
    difference: 7891,
    createdAt: 'some date'
  }
]

Notice how they are no longer in the right order?

The correct order would have been:

Page 1:

[
  {
    amount: 12,
    difference: 7,
    createdAt: "some date",
  },
  {
    amount: 96,
    difference: 91,
    createdAt: "some date",
  },
  {
    amount: 104,
    difference: 99,
    createdAt: "some date",
  }
]

Page 2

[
  {
    amount: 107,
    difference: 102,
    createdAt: "some date",
  },
  {
    amount: 346,
    difference: 341,
    createdAt: "some date",
  },
  {
    amount: 594,
    difference: 589,
    createdAt: "some date",
  }
]

So combining the 2 pages:

[
  {
    amount: 12,
    difference: 7,
    createdAt: "some date",
  },
  {
    amount: 96,
    difference: 91,
    createdAt: "some date",
  },
  {
    amount: 104,
    difference: 99,
    createdAt: "some date",
  },
  {
    amount: 107,
    difference: 102,
    createdAt: "some date",
  },
  {
    amount: 346,
    difference: 341,
    createdAt: "some date",
  },
  {
    amount: 594,
    difference: 589,
    createdAt: "some date",
  }
]

They are still in the correct order.
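
To make the difference concrete, here is a small plain-JavaScript simulation of the two approaches, using the sample amounts above. It does not touch Fauna at all; the helper names (withDifference, byDifference, paginate) are made up for illustration, and slicing an array is only a stand-in for Paginate.

```javascript
// Sample amounts in insertion order, as in the example above.
const amounts = [1000, 594, 995, 104, 607, 7896, 107, 96, 12, 876, 346];
const userInput = 5;
const pageSize = 3;

// Hypothetical helpers, for illustration only.
const withDifference = (amount) => ({ amount, difference: amount - userInput });
const byDifference = (a, b) => a.difference - b.difference;
const paginate = (items, size, page) => items.slice(page * size, (page + 1) * size);

// Approach 1: paginate first, then sort each page on its own.
const sortAfter = [0, 1].flatMap((page) =>
  paginate(amounts, pageSize, page).map(withDifference).sort(byDifference)
);

// Approach 2: sort the whole set first, then paginate.
const sorted = amounts.map(withDifference).sort(byDifference);
const sortBefore = [0, 1].flatMap((page) => paginate(sorted, pageSize, page));

console.log(sortAfter.map((d) => d.amount));  // [594, 995, 1000, 104, 607, 7896]
console.log(sortBefore.map((d) => d.amount)); // [12, 96, 104, 107, 346, 594]
```

Only the second approach gives pages that stay ordered when they are concatenated, which is why the sort has to happen before pagination.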

In MySQL, this would translate to roughly something like:

SELECT *, amount - 5 AS difference FROM your_table_name ORDER BY difference ASC;

Forgive me if the syntax is incorrect; I’ve forgotten SQL.

For more context

This is just dummy data that represents what I want to do. In reality, I have a collection of events or meetups that I fetch and display to the user. I want to sort them in ascending order by the distance between the user’s current location and the event’s venue. That distance is the computed field. I made a simpler version of it for this topic to remove unnecessary complications.

Indexes do not have user inputs, as you found. You would have to have a separate index for each input value and that wouldn’t scale very well.

In this case, since you are performing an operation on a value that preserves the same order, you can index/sort on the amount field. The results will be sorted correctly and then you can perform your user defined operation on it after Pagination.
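
As a rough illustration of that point in plain JavaScript (not FQL): subtracting a constant never changes the relative order of the amounts, so a set that is already sorted on amount can be paginated first and decorated with the difference afterwards. The sorted array and the getPage helper below are stand-ins for an index sorted on amount and for Paginate; both names are hypothetical.

```javascript
const amounts = [1000, 594, 995, 104, 607, 7896, 107, 96, 12, 876, 346];
const userInput = 5;
const pageSize = 3;

// Stand-in for an index sorted on amount: the order is fixed ahead of time
// and does not depend on the user's input.
const indexedByAmount = [...amounts].sort((a, b) => a - b);

// Stand-in for Paginate: take one page from the already-sorted set.
const getPage = (page) =>
  indexedByAmount.slice(page * pageSize, (page + 1) * pageSize);

// Compute the difference only after the page has been fetched.
const addDifference = (amount) => ({ amount, difference: amount - userInput });

const page1 = getPage(0).map(addDifference);
const page2 = getPage(1).map(addDifference);

console.log(page1.map((d) => d.difference)); // [7, 91, 99]
console.log(page2.map((d) => d.difference)); // [102, 341, 589]
```

This only works because amount - userInput is order-preserving. A computation such as Math.abs(amount - userInput) is not, so the same shortcut would not apply to it.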

That doesn’t help with the general problem: you can’t operate like that on a Set before you Paginate.

In my real case, I’m not sure which field to sort by. I have lat and lng, and sorting on lat and lng doesn’t really sort by distance, right?

Distance to another latitude and longitude? No. Not without creating an index for each input value – less a location search and more a “how close am I to a known, never changing location?” kinda search.

I’m not the one to ask for help about geohashes, but there should be ways for you to compute a value that you can then sort by, even Range over. Maybe you could Intersection with a latitude range and a longitude range?
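
A minimal sketch of the latitude-range plus longitude-range idea, in plain JavaScript rather than FQL. The venues, the 0.5 degree window, and the inRange and intersection helpers are all assumptions for illustration; they only mimic what two range reads and an Intersection would return.

```javascript
// Hypothetical venues; coordinates are illustrative only.
const venues = [
  { id: 'a', lat: 37.2, lng: -122.1 },
  { id: 'b', lat: 37.4, lng: -121.0 },
  { id: 'c', lat: 39.0, lng: -122.0 },
  { id: 'd', lat: 36.8, lng: -121.9 },
];

// Stand-in for a range read over one sorted field: returns the matching ids.
const inRange = (items, field, min, max) =>
  new Set(items.filter((v) => v[field] >= min && v[field] <= max).map((v) => v.id));

// Stand-in for Intersection: keep the ids present in both sets.
const intersection = (left, right) => [...left].filter((id) => right.has(id));

const user = { lat: 37, lng: -122 };
const halfWidthDegrees = 0.5; // assumed search window in degrees, not a true distance

const latMatches = inRange(venues, 'lat', user.lat - halfWidthDegrees, user.lat + halfWidthDegrees);
const lngMatches = inRange(venues, 'lng', user.lng - halfWidthDegrees, user.lng + halfWidthDegrees);

const candidates = intersection(latMatches, lngMatches);
console.log(candidates); // ['a', 'd']
```

Note that this yields a bounding box of candidates, not a list ordered by distance. The candidates would still need to be sorted by their actual distance afterwards, but over a much smaller set.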

But if we’re going to get into the weeds about geohashes then we should move to a new topic, or this existing one: Geohashes in FaunaDB

Yes kind of like that I think, in my case I simply did it myself since I don’t have much choice.

try looking at this:

There’s a sample query there:

SELECT id, ( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * 
cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * 
sin( radians( lat ) ) ) ) AS distance FROM your_table_name HAVING
distance < 25 ORDER BY distance LIMIT 0 , 20;

Notice the ORDER BY distance: that’s basically what I’m trying to achieve.

Cool! Native support for geospatial types and queries would be awesome. Geohashes are on the roadmap. If you would like to get more focus on it, make sure to vote up the feature request, and update that topic with your use case.

Note that even though SQL made that look easy, I think we can be pretty confident that query took a full table scan to complete. Even though it limits to 20 rows, it still had to calculate everything to order it. You can do the same thing with Fauna – it just won’t scale well. Then again, neither will that SQL query.
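
To show why the scan is unavoidable, here is roughly what that SQL query does, sketched in plain JavaScript. The distance formula mirrors the SQL expression (spherical law of cosines, 3959 as the Earth's radius in miles); the events and the nearest helper are hypothetical. The distance has to be computed for every item before anything can be ordered or limited.

```javascript
const EARTH_RADIUS_MILES = 3959; // same constant as the SQL query above
const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Spherical law of cosines, mirroring the SQL expression.
function distanceMiles(from, to) {
  const cosAngle =
    Math.cos(toRadians(from.lat)) * Math.cos(toRadians(to.lat)) *
      Math.cos(toRadians(to.lng) - toRadians(from.lng)) +
    Math.sin(toRadians(from.lat)) * Math.sin(toRadians(to.lat));
  // Clamp to [-1, 1] so rounding error cannot make acos return NaN.
  return EARTH_RADIUS_MILES * Math.acos(Math.min(1, Math.max(-1, cosAngle)));
}

// Hypothetical events; coordinates are illustrative only.
const events = [
  { id: 'a', lat: 40.0, lng: -122.0 },
  { id: 'b', lat: 37.1, lng: -122.0 },
  { id: 'c', lat: 37.0, lng: -121.5 },
  { id: 'd', lat: 37.0, lng: -122.0 },
];

function nearest(user, items, maxMiles, limit) {
  return items
    .map((item) => ({ ...item, distance: distanceMiles(user, item) })) // touches every item
    .filter((item) => item.distance < maxMiles) // HAVING distance < maxMiles
    .sort((x, y) => x.distance - y.distance)    // ORDER BY distance
    .slice(0, limit);                           // LIMIT
}

const result = nearest({ lat: 37, lng: -122 }, events, 25, 20);
console.log(result.map((e) => e.id)); // ['d', 'b']
```

The limit is applied last, so it reduces the size of the response but not the amount of work done.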

Yes I’m pretty sure it did, it needs to do that in order to do the order by part… I’m probably just gonna leave that part out for now.