What in best solution to stores dates ranges and how to know that given date belongs to the range

We build booking service and dates ranges is the main frequently requested information. And we have difficulties to find not expensive solution to operate with dates ranges.
At now we store date range by 2 fields dateFrom and dateTo.

Let’s look at a small example which is causing the problem:
Suppose we have booking (order) from 2021-07-01T07:00:00Z to 2021-07-10T07:00:00Z. User request vehicle status, provide dates range 2021-07-05T07:00:00Z and 2021-07-07T07:00:00Z (between dateFrom and dateTo).
What is solution to return right status in this case “not available”.

in documentation we find next solution to work with dates, which based on use “q.Range”:
indexes:

{
      name: 'orders_by_vehicle_and_dateFrom',
      serialized: true,
      source: q.Collection('orders'),
      terms: [{ field: ['data', 'vehicle'] }],
      values: [
        { field: ['data', 'dateFrom'] },
        { field: 'ref' },
      ],
 }
{
      name: 'orders_by_vehicle_and_dateTo',
      serialized: true,
      source: q.Collection('orders'),
      terms: [{ field: ['data', 'vehicle'] }],
      values: [
        { field: ['data', 'dateTo'] },
        { field: 'ref' },
      ],
 }

Then we doing request (simplified) :

q.Range(
  q.Match(q.Index('orders_by_vehicle_and_dateFrom'), q.Ref(q.Collection('vehicles'), vehicle.id)),
  [q.Time( 2021-07-05T07:00:00Z), q.Ref(q.Collection('vehicles'), vehicle.id)],
  [q.Time( 2021-07-07T07:00:00Z), q.Ref(q.Collection('vehicles'), vehicle.id)],
)
AND (Omitting different conversions Paginate, Select, Distinct, etc...)
q.Range(
   q.Match(q.Index('orders_by_vehicle_and_dateTo'), q.Ref(q.Collection('vehicles'), vehicle.id)),
   [q.Time(2021-07-05T07:00:00Z), q.Ref(q.Collection('vehicles'), vehicle.id)],
   [q.Time( 2021-07-07T07:00:00Z), q.Ref(q.Collection('vehicles'), vehicle.id)],
)

This solution not work because we looking dateFrom and dateTo beween 2021-07-05T07:00:00Z, 2021-07-07T07:00:00Z, and order where dateFrom: Time(“2021-07-01T07:00:00Z”), dateTo: Time(“2021-07-10T07:00:00Z”) never will be matched.

Another solution is collect all orders for given vehicle and then filter it to find which orders belong to given dates range like (Part of the request):

q.Filter(q.Var('vehicleOrders'),
  q.Lambda('vehicleOrder',
    q.Or(
      q.And(
        q.GTE(q.Var('from'), q.Select(['data', 'dateFrom'], q.Get(q.Var('vehicleOrder')))),
        q.LTE(q.Var('to'), q.Select(['data', 'dateTo'], q.Get(q.Var('vehicleOrder')))),
      ),
      q.And(
         q.LTE(q.Var('from'), q.Select(['data', 'dateFrom'], q.Get(q.Var('vehicleOrder')))),
         q.GTE(q.Var('to'), q.Select(['data', 'dateFrom'], q.Get(q.Var('vehicleOrder')))),
      ),
      q.And(
         q.LTE(q.Var('from'), q.Select(['data', 'dateTo'], q.Get(q.Var('vehicleOrder')))),
         q.GTE(q.Var('to'), q.Select(['data', 'dateTo'], q.Get(q.Var('vehicleOrder')))),
      ),
      q.And(
         q.LTE(q.Var('from'), q.Select(['data', 'dateFrom'], q.Get(q.Var('vehicleOrder')))),
         q.GTE(q.Var('to'), q.Select(['data', 'dateTo'], q.Get(q.Var('vehicleOrder')))),
      ),
)))

This solution work but looking very expensive, because we collect all orders for vehicle for all time, it may be many documents.

The questions is
"How to solve this problem with in right way, to economy resources?" or
"How correctly determine that a date belongs to a range?" or
“How to store date ranges correctly?”

Range operates on a list of values. The list of values that you are providing is either the dateFrom times or the dateTo times. For a single order, that means that Range can only consider a single entry. So, you either get a result or you don’t.

However, getting a matching result for dateFrom or dateTo, since you’re handling those separately, only tells you that there’s an order transition in the desired period, not how big it might be (does it extend outside of the desired period? You can’t know with only one boundary). You have to consider the timespan between dateFrom and dateTo all at once.

Let’s unpack the problem a bit.

By comparing time periods, the only possible answers are:

  1. desired period does not overlap with any existing order (“available”)
  2. desired period does overlap with an existing order (“partial availability”)
  3. desired period is contained within an existing order (“not available”)

Answers 1 and 3 are exactly what you want, but answer 2 is a bit fuzzy: it means that the vehicle is not available for the entirety of the desired period. Maybe that’s what you are looking for.

My impression is that you want to know available/not available for every date in the desired range, which requires iterating over every step in the desired range. I’d like to talk about the steps as dates to make the discussion easier.

If that’s correct, you’ll want to do something like this:

  1. Create a new index on orders that provides both the dateFrom and dateTo values, so that the ordered period can be considered:

    CreateIndex({
      name: "orders_by_vehicle",
      source: Collection("orders"),
      terms: [
        { field: ["data", "vehicle"] }
      ],
      values: [
        { field: ["data", "dateFrom"] },
        { field: ["data", "dateTo"] },
        { field: ["ref"] },
      ]
    })
    
  2. Build an array of datetimes that cover the desired time span:

    This is easy to do in the host language, but is a bit trickier in FQL since there is no For or While loop capability, so we have to use recursion. Here’s a UDF that creates an array of date-based timestamps from the starting datetime to the ending datetime:

    CreateFunction({
      name: "spreadDays",
      body: Query(
        Lambda(
          ["start", "end", "acc"],
          Let(
            {
              current: TimeAdd(Var("start"), 1, "day"),
              list: If(IsArray(Var("acc")), Var("acc"), [Var("start")])
            },
            If(
              LTE(Var("current"), Var("end")),
              Call("spreadDays", [
                Var("current"),
                Var("end"),
                Append(Var("current"), Var("list"))
              ]),
              Var("list")
            )
          )
        )
      )
    })
    

    Recursion works up to a call depth of 200, so the generated array can contain up to that many dates.

    Also note that we pass in a 0 as a dummy parameter. Inside, if the parameter is not an array, we create an array that includes the start date.

    When we run it with your desired range, we get:

    > Call(
      "spreadDays",
      [
        Time("2021-07-05T07:00:00Z"),
        Time("2021-07-11T07:00:00Z"),
        0
      ]
    )
    [
      Time("2021-07-05T07:00:00Z"),
      Time("2021-07-06T07:00:00Z"),
      Time("2021-07-07T07:00:00Z"),
      Time("2021-07-08T07:00:00Z"),
      Time("2021-07-09T07:00:00Z"),
      Time("2021-07-10T07:00:00Z"),
      Time("2021-07-11T07:00:00Z")
    ]
    
  3. Create a UDF to answer the question: is a specific vehicle available on a specific date?

    CreateFunction({
      name: "isVehicleAvailableOnDate",
      body: Query(
        Lambda(
          ["vehicle", "date"],
          Let(
            {
              taken: Filter(
                Match(
                  Index("orders_by_vehicle"),
                  Var("vehicle")
                ),
                Lambda(
                  ["from", "to", "ref"],
                  And(
                    GTE(Var("date"), Var("from")),
                    LTE(Var("date"), Var("to"))
                  )
                )
              )
            },
            Equals(Count(Var("taken")), 0)
          )
        )
      )
    })
    

    This function finds the set of orders that match the specific vehicle, and filters that set to the orders that contain the specified date (“taken” aka “unavailable”). If the taken set contains any orders, the vehicle is unavailable. This also works if there are no orders for the specified vehicle, which means that it is available.

  4. Put it all together. For this call, I’m setting the end date of the desired period outside of the order period so that you can see the availability transition:

    Map(
      Call("spreadDays",
        [
          Time("2021-07-05T07:00:00Z"),
          Time("2021-07-12T07:00:00Z"),
          0
        ]
      ),
      Lambda(
        "date",
        {
          date: Var("date"),
          available: Call(
            "isVehicleAvailableOnDate",
            [
              Ref(Collection("vehicles"), "1"),
              Var("date")
            ]
          )
        }
      )
    )
    [
      { date: Time("2021-07-05T07:00:00Z"), available: false },
      { date: Time("2021-07-06T07:00:00Z"), available: false },
      { date: Time("2021-07-07T07:00:00Z"), available: false },
      { date: Time("2021-07-08T07:00:00Z"), available: false },
      { date: Time("2021-07-09T07:00:00Z"), available: false },
      { date: Time("2021-07-10T07:00:00Z"), available: false },
      { date: Time("2021-07-11T07:00:00Z"), available: true },
      { date: Time("2021-07-12T07:00:00Z"), available: true }
    ]
    

I don’t know if this is the most efficient way to answer your question, but it is fairly performant:

{
  'x-compute-ops': '3',
  'x-read-ops': '6',
  'x-byte-read-ops': '1',
  'x-byte-write-ops': '0',
  'x-write-ops': '0',
  'x-query-time': '61',
  'x-query-bytes-in': '302',
  'x-query-bytes-out': '476',
  'x-storage-bytes-read': '90',
  'x-storage-bytes-write': '0',
  'content-length': '476',
}

The reads are so small because we’re reading from an index that has only one indexed document, and despite calling the isVehicleAvailableOnDate function multiple times, the index read(s) are cached for the duration of the query.

Hi Yuri,

Did Ewan’s answer resolve your question? Or do you need any additional assistance?

Cory

I was a little busy with another task. Now I’m starting to implement it as Iwan advises, if there are problems or questions, I’ll write it down here, thanks.