Index building stuck (> 50000 documents)

Hello !

I have been trying to create a new index on a big collection (>50000 documents).
I have managed to create a few indexes on it by waiting more than 8 hours (and retrying).
But most of the time it does not work, and I guess that is just because of the number of documents.
There are no bindings involved, just terms / values.

CreateIndex({
 name: 'myIndex',
 source: Collection("myBigCollection"),
 terms: [{ field: ["data", "user"] }],
 values: [{ field: ["data", "createdAt"] }, { field: ["ref"] }],
})

The only thing I think might work for me at this point is to create a new collection, recreate all the needed indexes, then read all the documents and re-create them in the new collection.
I was doing this, but it is really expensive, and I forgot some indexes so I had to do it all over again :frowning:

How can we figure this out? Is there anything we can do to help the index builds along? Should there be dedicated billing for this? Over time, some features need new indexes on big collections.

Thank you !

@n44ps Can you please confirm whether your index is still not active?

Hey @Jay-Fauna !
I re-created the index :

{
  name: "findUserProgressesByCreatedAt",
  unique: false,
  serialized: true,
  source: "userProgresses",
  terms: [
    {
      field: ["data", "user"]
    }
  ],
  values: [
    {
      field: ["data", "createdAt"]
    },
    {
      field: ["ref"]
    }
  ]
}

And these are the current events for it:

Paginate(Events(q.Index("findUserProgressesByCreatedAt")))

{
  data: [
    {
      ts: 1605782796810000,
      action: "create",
      document: Index("findUserProgressesByCreatedAt"),
      data: null
    }
  ]
}

>> Time elapsed: 13ms

Any advice for these use cases?
Thanks :slight_smile:

Hi @n44ps, I am assuming userProgresses has more than 128 documents. If so, the index will be built by a background asynchronous task. Can you please paste the output of Get(Index("findUserProgressesByCreatedAt"))? Meanwhile, I will take a look at the system. Recreating the same index multiple times because it is not active does not help, and I would not recommend it.
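
Since the build runs in the background, one option is to poll the index document and wait for its active field to become true, rather than deleting and recreating the index. The following is only a minimal sketch, assuming the faunadb JavaScript driver: client is assumed to be a driver Client, q its query builder, and the helper name waitForIndexActive with its intervalMs / timeoutMs options is hypothetical, not part of the driver.

```javascript
// Sketch only: poll an index until its `active` flag is true.
// Assumptions: `client` is a faunadb Client and `q` is the driver's query
// builder, e.g. const { Client, query: q } = require('faunadb').
// `intervalMs` and `timeoutMs` are hypothetical tuning knobs.
async function waitForIndexActive(
  client,
  q,
  indexName,
  { intervalMs = 60 * 1000, timeoutMs = 8 * 60 * 60 * 1000 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    // Same query as Get(Index("...")) in the shell.
    const index = await client.query(q.Get(q.Index(indexName)));
    if (index.active === true) return index;
    if (Date.now() >= deadline) {
      throw new Error(`Index "${indexName}" is still not active after ${timeoutMs} ms`);
    }
    // Wait before asking again; the index is never deleted or recreated here.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

It could be called as waitForIndexActive(client, q, "findUserProgressesByCreatedAt"). If it times out, that is the moment to report the index here rather than to rebuild it.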

Yes, as I wrote at the beginning, it has more than 50000 documents.

{
  ref: Index("findUserProgressesByCreatedAt"),
  ts: 1605794007700000,
  active: true,
  serialized: true,
  name: "findUserProgressesByCreatedAt",
  unique: false,
  source: Collection("userProgresses"),
  terms: [
    {
      field: ["data", "user"]
    }
  ],
  values: [
    {
      field: ["data", "createdAt"]
    },
    {
      field: ["ref"]
    }
  ],
  partitions: 1
}

And now it is active! <3
You did some black magic here, because it means it was built in only a few hours.

Okay, I re-created it lots of times by deleting the index that was still building. I won't do that again and will come here instead!

Thanks a lot !

Awesome. I wish I knew that magic. :slight_smile:
This is the right place to report it, or you can send an email to support@fauna.com, if you see latency in index building (a few hours is the max SLA for huge collections). When in doubt, reach out to us and we will help you. Recreating the index would only aggravate the problem and delay it further.
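
To avoid accidental re-creation from a setup or migration script, the create call can be guarded so that it only runs when the index does not exist yet. This is a rough sketch rather than an official recipe: it assumes the faunadb JavaScript driver, with client and q passed in as above, and ensureIndex is a hypothetical helper name.

```javascript
// Sketch only: create the index if it is missing, otherwise return the
// existing definition untouched (even if it is still building).
// Assumptions: `client` is a faunadb Client, `q` is the driver's query builder.
function ensureIndex(client, q, params) {
  return client.query(
    q.If(
      q.Exists(q.Index(params.name)),
      q.Get(q.Index(params.name)), // already there: do not delete or recreate
      q.CreateIndex(params)        // missing: start the one and only build
    )
  );
}
```

Here params would be the same object that is passed to CreateIndex, for example with name, source, terms and values as in the index definition earlier in this thread.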
