Very high storage even though I have fewer than 100 documents in the database

Let(
  {
    collections: Collections(),
    collectionsCount: Count(Var('collections'))
  },
  {
    collectionsCount: Var('collectionsCount'),
    dataCountInAllCollections: Select(
      ['data', 0],
      Reduce(
        Lambda(
          ['acc', 'current'],
          Add(
            Var('acc'),
            Var('current')
          )
        ),
        0,
        Map(
          Paginate(
            Var('collections')
          ),
          Lambda(
            ['ref'],
            Count(Documents(Var('ref')))
          )
        )
      )
    )
  }
)

Running this query will tell you how many collections you have and how many documents you have in total. For my database, it returns:

{
  collectionsCount: 9,
  dataCountInAllCollections: 37
}

So I have 9 collections and 37 documents in total, but the dashboard says that my database uses ~16.47 MB. It’s in the US region. All of the data are simply numbers and strings.

Even if you have history_days set to zero, document versions are created and stored until the document garbage collector sweeps through your data as a background task.

Storage is affected by

  • Documents
  • Document history
  • Indexes
  • Index history

Index history often surprises people so I think it’s important to note. Any time a change to a document would result in the index being updated (term or value fields changed) the history of the Index is updated. This history is stored since you can use the At function to query an index at any moment in time.

NOTE: Documents uses a built-in Index, and that index counts towards storage. There are no terms involved, so the Documents indexes will only store entries for create and delete events.

You can use the Events function on any reference, including Set references. That means you can use it to enumerate the history for your Indexes.

Paginate(Events(Match(Index("users_by_email"), "jane@doe.com")))

You posted a similar question in Slack; is that for the same database? Either way, we can use that as an example. In that question, you said there are 20 documents and ~7MB storage. But we can also see in that question that there are 703 Write Ops recorded. In that case, storage might be represented by

  • 703 document versions
  • plus Index entries from the built-in Documents indexes for creating and deleting documents
  • plus Index entries for each update that changes any term or value field.
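To make the list above concrete, here is a rough counting sketch in plain JavaScript. The 703 writes and 20 documents come from the Slack question; the number of deletes, the number of user-defined indexes, and the assumption that every non-create write changes an indexed field are hypothetical, and the one-entry-per-change rule is only a simplification for illustration, not how Fauna actually accounts for storage.

```javascript
// Rough counting model only. The field names and the one-entry-per-change
// rule are assumptions for illustration, not Fauna's storage accounting.
function estimateStoredEntries({ writes, creates, deletes, indexedUpdates, userIndexes }) {
  // One document version per write.
  const documentVersions = writes;
  // The built-in Documents index only records create and delete events.
  const documentsIndexEntries = creates + deletes;
  // Assume each update that touches a term or value field adds one
  // history entry to every user-defined index covering that field.
  const userIndexEntries = indexedUpdates * userIndexes;
  return {
    documentVersions,
    documentsIndexEntries,
    userIndexEntries,
    total: documentVersions + documentsIndexEntries + userIndexEntries,
  };
}

const estimate = estimateStoredEntries({
  writes: 703,         // from the Slack question
  creates: 20,         // 20 documents exist, so at least 20 creates
  deletes: 0,          // hypothetical
  indexedUpdates: 683, // hypothetical worst case: every non-create write hits an indexed field
  userIndexes: 2,      // hypothetical
});
console.log(estimate);
```

Even with only 20 live documents, a model like this ends up with thousands of stored entries until garbage collection removes the old ones, which is why the document count alone does not predict storage.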

So, in order to get a more complete picture of how your database is using storage, we would need to know how many indexes you have on your 9 collections, and what terms are used. Then you can query for the history per Index per term.

  • Yes, it’s the same database and it grew 2x in size in just a few days.
  • No, there are not 703 write ops recorded; as of today there are 225 write ops. BUT checking it NOW shows ~4.16MB rather than the ~16MB it showed yesterday. So I suppose garbage collection has run. Should I assume that it left some more documents in the history to be garbage collected, OR is this now the real size of the database? Either way, it’s way too big: as you can see, some of my other databases are less than 100kb, and there’s one that’s almost 600kb which I believe contains more than 100 documents.

This has to be a bug.

Another weird behavior is this:

[Screenshot: stats for the last 7 days, displaying 19 MB]

[Screenshot: stats for the last 30 days, displaying 4 MB]

Why is that?

I had the same problem when transferring data from the classic region to the US zone.
My collections are created with history_days = 0. I checked this for all collections.
These are 2 databases with the same data, but the database in the US region is continuously growing in storage size. I also have this problem with other databases in the US region.

@aprilmintacpineda The difference between the last 30 days and the last 7 days is the effect of averages. The last 7 days may have been higher, so when averaged out over the last 30, the result is lower.
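As a small worked example of that averaging effect, here is a sketch in plain JavaScript. The daily values are made up for illustration (23 quiet days at 1 MB followed by 7 days at 16 MB); they are not your actual dashboard data, and the dashboard's exact calculation may differ.

```javascript
// Hypothetical daily storage readings in MB, oldest first:
// 23 quiet days followed by 7 days with much higher storage.
const dailyStorageMb = [...Array(23).fill(1), ...Array(7).fill(16)];

// Arithmetic mean of a list of numbers.
function mean(values) {
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

const last7DaysAvg = mean(dailyStorageMb.slice(-7));   // only the recent, high days
const last30DaysAvg = mean(dailyStorageMb.slice(-30)); // high days diluted by quiet days

console.log({ last7DaysAvg, last30DaysAvg });
```

With these made-up numbers the 7-day average is 16 MB while the 30-day average is 4.5 MB, even though both windows end on the same day.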

For both you and @henull, consider the number/frequency of writes, the number of Indexes that you have, and the shape of those Indexes. As I said before, even if you have history_days set to zero, the history will still accumulate until garbage collection sweeps through your data. It can take hours or many days for this to happen.
