Correct way to implement simple rate limiting?

Hi all!

I’m trying to add simple rate limiting to one of my API endpoints running on Netlify Functions. Since I’m not an expert in Fauna (or databases in general), I’m looking for feedback on my approach so I can learn about the best practices here.

In my early days with Fauna, I issued a separate query for each step of an operation, something like:

Get a list of all documents that match a condition, receive the response in the client, filter the required data on the client side, send the updated info back to Fauna, and so on. I then read somewhere (can’t find where) that it’s better to do all of this inside a single Fauna query because:

  1. It will be faster than sending multiple requests
  2. Fauna will count this as 1 transaction (as opposed to triggering multiple transactions in my above approach) - however I’m not 100% sure about this. Please let me know if my example below indeed counts as just 1 transaction. (A rough sketch of what I mean by the two approaches follows right after this list.)
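
To make the comparison concrete, here is roughly what I mean by the two approaches, as called from a Netlify Function with the faunadb JavaScript driver. This is only a sketch: the helper names (filterOnTheClient, oneBigFqlExpression) are placeholders, not my real code.

// assumes: const q = require('faunadb').query and a configured client

// Earlier approach: every step is its own client.query() call, i.e. its own
// network round trip and its own transaction
const page = await client.query(
  q.Paginate(q.Match(q.Index('emailLogIndex'), email))
)
const refsToTouch = filterOnTheClient(page.data) // filtering happens client-side
await client.query(
  q.Map(refsToTouch, q.Lambda('ref', q.Update(q.Var('ref'), { data: { time: q.Now() } })))
)

// Single-query approach: the filtering and the write are expressed in FQL and
// sent in one client.query() call, so Fauna runs the whole thing as one query
await client.query(oneBigFqlExpression) // e.g. the Let(...) expression shown further down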

Before I share the example, let me explain my setup. I have created a collection (accessLog) with history turned off and TTL set to 1 day, with the following document structure:

{
  data: {
    email: 'foo@bar.com',
    ip: '192.168.0.1',
    time: Now()
  },
  ref: 'some ref',
  ts: 1234
}

I have 2 indices - 1 for email, 1 for IP. The email index has a unique constraint.
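
For completeness, the collection and the two indexes were created with roughly the following (written from memory with the faunadb JS driver, so the exact options may differ slightly from what I actually ran):

// Collection with history disabled and a 1-day TTL
await client.query(
  q.CreateCollection({ name: 'accessLog', history_days: 0, ttl_days: 1 })
)

// Index on data.email with a unique constraint
await client.query(
  q.CreateIndex({
    name: 'emailLogIndex',
    source: q.Collection('accessLog'),
    terms: [{ field: ['data', 'email'] }],
    unique: true
  })
)

// Index on data.ip
await client.query(
  q.CreateIndex({
    name: 'ipLogIndex',
    source: q.Collection('accessLog'),
    terms: [{ field: ['data', 'ip'] }]
  })
)

With that in place, I’m performing the following query: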

Let(
  {
    // Refs of any existing log documents matching this email or IP
    logs: Select(
      'data',
      Paginate(
        Union(
          Match(Index('emailLogIndex'), email),
          Match(Index('ipLogIndex'), ip)
        )
      )
    )
  },
  If(
    // No matching log entry yet: record this access
    Equals(Count(Var('logs')), 0),
    Create(Collection('accessLog'), {
      data: { email, ip, time: Now() }
    }),
    // Otherwise, inspect every matching log entry
    Map(
      Var('logs'),
      Lambda(
        'ref',
        Let(
          { doc: Get(Var('ref')) },
          If(
            // Recorded time is at most 1 whole day before now: reject
            LTE(TimeDiff(Select(['data', 'time'], Var('doc')), Now(), 'day'), 1),
            Abort('429'),
            // Otherwise refresh the recorded time
            Update(Select('ref', Var('doc')), { data: { time: Now() } })
          )
        )
      )
    )
  )
)

To explain what I’m doing:

I’m creating a variable named logs, which is an array consisting of the Union of the results of the match by email and any number of matches by IP. Note that the email can sometimes be an empty string.

Then, if the length of the logs array is 0 (meaning no results were found), I create a new document with the required data (in the format shared above). If even a single result was found (array length greater than 0), I map over the logs array with the following function (Lambda):

I store the document in a variable named doc. Then I check whether the time recorded in data.time of the document is less than 1 day ago. If yes, I abort the request with 429 as the message. If not, I update the document with the new time. Note that I do not expect the “if not” branch to be triggered here because, as per my understanding, the document should no longer exist once the 1-day TTL has passed.
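
For context, this is roughly how I call the query from the Netlify Function and turn the Abort into an HTTP 429 response. The handler is simplified, rateLimitQuery is just a stand-in for the Let(...) expression above, and the way I dig the message out of the driver error is my assumption about the error shape:

// netlify/functions/rate-limited-endpoint.js (simplified sketch)
const faunadb = require('faunadb')
const q = faunadb.query

// secret name is a placeholder
const client = new faunadb.Client({ secret: process.env.FAUNA_SECRET })

exports.handler = async (event) => {
  const { email, ip } = JSON.parse(event.body || '{}')
  try {
    // rateLimitQuery(email, ip) returns the Let(...) expression shown above
    await client.query(rateLimitQuery(email, ip))
    return { statusCode: 200, body: 'ok' }
  } catch (err) {
    // Abort('429') surfaces as a failed query; inspecting the driver error
    // this way is an assumption on my part
    const description =
      err.requestResult?.responseContent?.errors?.[0]?.description
    if (description === '429') {
      return { statusCode: 429, body: 'Too many requests' }
    }
    return { statusCode: 500, body: 'Unexpected error' }
  }
}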

My questions are:

  1. Is this an efficient query, or could it be made better?
  2. Does anyone else recommend a different approach altogether?

> Fauna will count this as 1 transaction (as opposed to triggering multiple transactions in my above approach) - however I’m not 100% sure about this. Please let me know if my example below indeed counts as just 1 transaction.

Each Fauna query that you execute is transactional. Provided that your query “fits” within the transaction limits, you can do whatever you need to do. Note that billing is not per transaction, but is based on the number of read, write, and compute operations that your transaction(s) consume.

> Is this an efficient query, or could it be made better?

Your query either creates an accessLog entry or updates an existing one. For low-volume updates, that would work fine. For high-volume updates, you’ll run into contention issues. Contention occurs when two or more transactions attempt to update the same document (or documents) within a short time period. When contention occurs, you’ll get an error. A naive solution would be to try the query again. The retried query might succeed, or it might cause additional contention.
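
For completeness, a naive retry might look something like the following. This is only a sketch: detecting contention via the HTTP 409 status on the driver error is an assumption to verify against the driver version you use, and retrying does not remove the underlying problem.

// Naive retry on contention: re-run the query a few times with a short delay.
// Under sustained load this just adds more competing transactions.
async function queryWithRetry(client, expr, attempts = 3) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await client.query(expr)
    } catch (err) {
      const status = err.requestResult?.statusCode
      const isLastAttempt = i === attempts - 1
      if (status !== 409 || isLastAttempt) throw err // 409 = contended transaction (assumed)
      await new Promise((resolve) => setTimeout(resolve, 100 * (i + 1)))
    }
  }
}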

> Does anyone else recommend a different approach altogether?

Avoiding contention takes some work, but it can certainly be done. The documentation includes a tutorial that describes a similar scenario that can cause contention and provides a solution based on the Event Sourcing pattern.
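
Very roughly, the idea is to stop updating a single shared document per user and instead write an append-only event per request, deriving the rate-limit decision from those events. A minimal sketch of that shape (the accessEventsByEmail index and maxPerDay value are made up, and it leans on your existing 1-day TTL instead of an explicit time filter; the tutorial covers the details this glosses over):

If(
  // Count this email's access events; with ttl_days: 1 on the collection,
  // only roughly the last day's events should still be present
  GTE(Count(Match(Index('accessEventsByEmail'), email)), maxPerDay),
  Abort('429'),
  // Record the request as a new event document; nothing is ever updated,
  // so concurrent requests don't contend over a shared document
  Create(Collection('accessLog'), { data: { email, ip, time: Now() } })
)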
