Transaction exceeded limit of 16MB with 56kb file

For some reason when I am trying to upload the documents of a 56kb JSON file, I get the following error:

‘Transaction exceeded limit: got 17057299, limit 16777216’

However, the file I am uploading is only 56kb, not 16MB. I am only using a simple Create function for the upload, so this is a very strange error.

Also, when I try to upload an individual document by copying and pasting the object into the Fauna client, the whole page becomes unresponsive and shuts down. Is there a limit to how big an individual document can be? Is there a way to remove this limit if it exists? I am guessing it's a limitation on String size, as I am trying to store paragraphs of information. Is there no way for me to store long strings?

Either way, this is the wrong error to show for a 56kb file.

The limits are documented here: Limits | Fauna Documentation

Somehow, parsing your JSON document is causing Fauna to exceed the transaction size limit. Is your file valid JSON? Can you show us some/all of the query where you are getting this error?

The documentation does not outline exactly what is going on with this issue. The file itself is only 56kb and does not match any of the limitations in the documentation provided.

The query is a simple Create query that worked with larger JSON files. The only difference is the size of the individual documents: the larger JSON file had fields with fewer than 100 characters of text.

This is one of the objects in my JSON file that fails to load in the Fauna client (so I know it's not GraphQL causing the error). I am building a vaccine app. Yes, the file is valid JSON.

{
    "catNumber": 2,
    "category": "What COVID 19 vaccines are there and how do these vaccines work?",
    "userType": "All",
    "question": "How many doses of the vaccine do I have to get?",
    "answer": {
        "paragraphs": [
            "One dose for the J&J vaccine.",
            "Two doses for the Pfizer and Moderna vaccines. For the two-dose vaccines:"
        ],
        "bulletPoints": [
            [],
            [
                "The first shot helps the immune system recognize the virus, and the second shot strengthens the immune response. You need both to get the best protection.",
                "The recommended time period between the first and second doses is 21 days for the Pfizer vaccine and 28 days for the Moderna vaccine.",
                "It is important to get both doses of the same vaccine and not to mix-and-match, or the vaccine might not work as well."
            ]
        ],
        "links": []
    },
    "translations": []
}

I was using the most bare-bones query possible. I don't believe this matters, as the document failed to create in the Fauna client as well, but here you go. (Here collName is just the collection name and localData points to my JSON file.)

Foreach(
    localData,
    Lambda(
        "document",
        Create(Collection(collName), {
            data: Var("document"),
        })
    )
)

My current theory is that Fauna has some hidden limitation to string size that hasn’t been explicitly explained anywhere.

Hi @kdilla301. The limit is for the entire transaction.

Is localData an array of 56 kB documents? If you try to create 300 documents at a time, for example, you’ll get 300 * 56 kB > 16 MB

No, the entire localData file is 56kb. The error is unrelated to the query. If you take the document I posted and try to upload it via the web Fauna client, it will break as well. It breaks on an individual document. The document example was one of the documents in that JSON file. There are only 52 documents in that file.

I also ran the same exact query on a JSON file that was 7.6 MB and it worked just fine. The problem isn’t the query, it’s how Fauna is handling the individual documents.

Okay, sorry for the confusion, and thank you for clarifying!

If I take the document you posted and upload it via the web client, it works for me. When you say, “upload,” do you mean something other than Create?

What Indexes do you have on the subject Collection? Writes on Indexes are also part of the transaction.

I’ve seen other users report uploading entire chapters of novels as strings without problems. The limit there is the 8MB max Document size.

I was referring to the section to add documents manually via the web client, not the fauna shell. When I tried adding the document here, the web client immediately broke.

However, the Fauna shell had no problem creating the document (I used the example you provided and it worked). I do not have any indexes on this collection. I will check whether updating the JavaScript Fauna client on my local machine changes this, because the Create function I run on my local machine breaks on this document.

I figured out the problem. It was an index I created outside of my schema. The index is meant to create a lot of substrings for a text based search. My script added the index every time before I uploaded the documents. So even when I deleted it from Fauna, the index would always be recreated before I did a document upload.

However, this does pose an even more interesting dilemma. Since this index can grow considerably in size, does this mean that I would not be able to upload more documents to this collection in the future because of how the index is constructed?

My index is probably not written efficiently, but I have not found another way of implementing a robust text search for a given index. I followed this solution I found on Stack Overflow to create a fuzzy search out of substrings of the indexed field.

Here was my index:

CreateIndex({
    name: "search_qa_fuzzy",
    source: [
        {
            collection: Collection("freq_asked_questions"),
            fields: {
                fuzzySubStrings: Query(
                    Lambda(
                        "question",
                        Distinct(
                            Union(
                                Let(
                                    {
                                        indexes: [
                                            0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
                                            10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
                                            20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
                                            30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
                                        ],
                                        indexesFiltered: Filter(
                                            Var("indexes"),
                                            Lambda("l", GT(Var("l"), 0))
                                        ),
                                        ngramsArray: Map(
                                            Var("indexesFiltered"),
                                            Lambda(
                                                "l",
                                                NGram(
                                                    LowerCase(
                                                        Select(["data", "question"], Var("question"))
                                                    ),
                                                    Var("l"),
                                                    Var("l")
                                                )
                                            )
                                        ),
                                    },
                                    Var("ngramsArray")
                                )
                            )
                        )
                    )
                ),
            },
        },
    ],
    terms: [
        {
            binding: "fuzzySubStrings",
        },
    ],
})

I know it is a pretty complex index. I have been searching for a similar solution, but to no avail. The size problem certainly came from the number of substrings the index was producing.

Yep, that will do it :)

Your binding, along with the example document, will create over 1000 index entries. See below.
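To get a feel for that number without touching the database, here is a rough plain-JavaScript sketch that counts the distinct substrings for the example question. The helpers `ngrams` and `distinctGrams` are hypothetical names written for this post; they only imitate what the binding does with NGram and are not part of the Fauna driver.

```javascript
// Plain-JavaScript imitation of character n-grams (an assumption for
// illustration, not Fauna's NGram implementation).
function ngrams(text, size) {
  const grams = [];
  for (let i = 0; i + size <= text.length; i++) {
    grams.push(text.slice(i, i + size));
  }
  return grams;
}

// Collect the distinct grams for every size from minSize to maxSize,
// mirroring the binding's Map over lengths followed by Distinct.
function distinctGrams(text, minSize, maxSize) {
  const seen = new Set();
  for (let size = minSize; size <= maxSize; size++) {
    for (const gram of ngrams(text, size)) {
      seen.add(gram);
    }
  }
  return seen;
}

const question = "How many doses of the vaccine do I have to get?".toLowerCase();

// Sizes 1 through 39, as in the posted index binding.
const allSizes = distinctGrams(question, 1, 39);
// Size 3 only.
const trigramsOnly = distinctGrams(question, 3, 3);

console.log("sizes 1-39:", allSizes.size);
console.log("size 3 only:", trigramsOnly.size);
```

Every one of those entries is written as part of the same transaction as the document, which is how a small document can end up producing a large write.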

It appears you are mixing the two implementations that were shared in the Stack Overflow answer: partial string match and fuzzy search.

How to use ngram efficiently for fuzzy search

NGram(x, 3, 3) should be the only implementation you need.

For longer bodies of text, including entries of fewer than 3 characters is just noise. Most sufficiently long strings will contain every letter or combination of 2 characters, and any hits on the search cannot be reasonably sorted by relevance.

Additionally, there are vastly diminishing returns on grams with length > 3. Beyond a gram size of 3, you get too many strange grams that include the ends of some words and beginnings of other words. Such entries become increasingly unlikely to be searched on. Consider this entry towards the end:

"y doses of the vaccine do i have to g"

For fuzzy search, you are not trying to match full strings; this is another reason not to create grams larger than 3 characters. For fuzzy search, you also break down the search term into its grams and then Union the result of searching by each gram. You can refer back to the SO answer for that.
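As a rough illustration of that idea, here is an in-memory JavaScript sketch rather than an actual Fauna query. The names `trigrams`, `buildIndex` and `fuzzySearch` are made up for this example, and ranking hits by the number of matching grams is an extra assumption on top of the plain Union described above.

```javascript
// Lowercased, distinct 3-character grams of a string.
function trigrams(text) {
  const lower = text.toLowerCase();
  const grams = new Set();
  for (let i = 0; i + 3 <= lower.length; i++) {
    grams.add(lower.slice(i, i + 3));
  }
  return grams;
}

// In-memory stand-in for the index: gram -> Set of document ids.
function buildIndex(docs) {
  const index = new Map();
  for (const [id, text] of Object.entries(docs)) {
    for (const gram of trigrams(text)) {
      if (!index.has(gram)) index.set(gram, new Set());
      index.get(gram).add(id);
    }
  }
  return index;
}

// Break the search term into grams, look each one up, and union the hits.
// Counting matches per document is an assumption used here for ordering.
function fuzzySearch(index, term) {
  const hits = new Map(); // id -> number of matching grams
  for (const gram of trigrams(term)) {
    for (const id of index.get(gram) ?? []) {
      hits.set(id, (hits.get(id) ?? 0) + 1);
    }
  }
  return [...hits.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

const docs = {
  q1: "How many doses of the vaccine do I have to get?",
  q2: "Is the vaccine safe?",
};
const index = buildIndex(docs);

// The misspelled term still shares most of its grams with q1.
console.log(fuzzySearch(index, "vacine doses"));
```

In Fauna the lookup step would be a Match against the index for each gram; the sketch only shows the shape of the logic.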

Using only grams of size 3 creates about 40 entries. See below.

Considerations for exact match search

  • Do not index on substrings of less than 3 characters. Same note as above applies for 1 or 2-character matches.
  • Limit the maximum size of the search term as much as possible.
  • Consider requiring the search terms to be on word boundaries. (This does not need NGram; it is closer to the string-split example.)
  • If the field to be indexed is long, limit the index to just the first n characters of the target field.
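One possible reading of these points, sketched in plain JavaScript: split the field into words and index only prefixes of each word, with a minimum length, a maximum length, and a cap on how much of the field is considered. The function `prefixTerms` and its option names are hypothetical, the default values are arbitrary, and the ASCII-only word split is a simplifying assumption.

```javascript
// Generate word-boundary prefix terms for a field value.
// minLength, maxLength and maxChars are illustrative defaults, not
// values recommended anywhere in this thread.
function prefixTerms(text, { minLength = 3, maxLength = 10, maxChars = 100 } = {}) {
  const words = text
    .toLowerCase()
    .slice(0, maxChars) // only look at the first maxChars characters
    .split(/[^a-z0-9]+/) // ASCII-only word split (an assumption)
    .filter(Boolean);

  const terms = new Set();
  for (const word of words) {
    const longest = Math.min(word.length, maxLength);
    // Words shorter than minLength contribute nothing.
    for (let len = minLength; len <= longest; len++) {
      terms.add(word.slice(0, len));
    }
  }
  return [...terms];
}

const terms = prefixTerms("How many doses of the vaccine do I have to get?");
console.log(terms.length, terms);
```

For the example question this produces far fewer entries than indexing every substring, at the cost of only matching from the start of a word.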

Example index binding result

[
  "h",
  "o",
  "w",
  " ",
  "m",
  "a",
  "n",
  "y",
  "d",
  "s",
  "e",
  "f",
  "t",
  "v",
  "c",
  "i",
  "g",
  "?",
  "ho",
  "ow",
  "w ",
  " m",
  "ma",
  "an",
  "ny",
  "y ",
  " d",
  "do",
  "os",
  "se",
  "es",
  "s ",
  " o",
  "of",
  "f ",
  " t",
  "th",
  "he",
  "e ",
  " v",
  "va",
  "ac",
  "cc",
  "ci",
  "in",
  "ne",
  "o ",
  " i",
  "i ",
  " h",
  "ha",
  "av",
  "ve",
  "to",
  " g",
  "ge",
  "et",
  "t?",
  "how",
  "ow ",
  "w m",
  " ma",
  "man",
  "any",
  "ny ",
  "y d",
  " do",
  "dos",
  "ose",
  "ses",
  "es ",
  "s o",
  " of",
  "of ",
  "f t",
  " th",
  "the",
  "he ",
  "e v",
  " va",
  "vac",
  "acc",
  "cci",
  "cin",
  "ine",
  "ne ",
  "e d",
  "do ",
  "o i",
  " i ",
  "i h",
  " ha",
  "hav",
  "ave",
  "ve ",
  "e t",
  " to",
  "to ",
  "o g",
  " ge",
  "get",
  "et?",
  "how ",
  "ow m",
  "w ma",
  " man",
  "many",
  "any ",
  "ny d",
  "y do",
  " dos",
  "dose",
  "oses",
  "ses ",
  "es o",
  "s of",
  " of ",
  "of t",
  "f th",
  " the",
  "the ",
  "he v",
  "e va",
  " vac",
  "vacc",
  "acci",
  "ccin",
  "cine",
  "ine ",
  "ne d",
  "e do",
  " do ",
  "do i",
  "o i ",
  " i h",
  "i ha",
  " hav",
  "have",
  "ave ",
  "ve t",
  "e to",
  " to ",
  "to g",
  "o ge",
  " get",
  "get?",
  "how m",
  "ow ma",
  "w man",
  " many",
  "many ",
  "any d",
  "ny do",
  "y dos",
  " dose",
  "doses",
  "oses ",
  "ses o",
  "es of",
  "s of ",
  " of t",
  "of th",
  "f the",
  " the ",
  "the v",
  "he va",
  "e vac",
  " vacc",
  "vacci",
  "accin",
  "ccine",
  "cine ",
  "ine d",
  "ne do",
  "e do ",
  " do i",
  "do i ",
  "o i h",
  " i ha",
  "i hav",
  " have",
  "have ",
  "ave t",
  "ve to",
  "e to ",
  " to g",
  "to ge",
  "o get",
  " get?",
  "how ma",
  "ow man",
  "w many",
  " many ",
  "many d",
  "any do",
  "ny dos",
  "y dose",
  " doses",
  "doses ",
  "oses o",
  "ses of",
  "es of ",
  "s of t",
  " of th",
  "of the",
  "f the ",
  " the v",
  "the va",
  "he vac",
  "e vacc",
  " vacci",
  "vaccin",
  "accine",
  "ccine ",
  "cine d",
  "ine do",
  "ne do ",
  "e do i",
  " do i ",
  "do i h",
  "o i ha",
  " i hav",
  "i have",
  " have ",
  "have t",
  "ave to",
  "ve to ",
  "e to g",
  " to ge",
  "to get",
  "o get?",
  "how man",
  "ow many",
  "w many ",
  " many d",
  "many do",
  "any dos",
  "ny dose",
  "y doses",
  " doses ",
  "doses o",
  "oses of",
  "ses of ",
  "es of t",
  "s of th",
  " of the",
  "of the ",
  "f the v",
  " the va",
  "the vac",
  "he vacc",
  "e vacci",
  " vaccin",
  "vaccine",
  "accine ",
  "ccine d",
  "cine do",
  "ine do ",
  "ne do i",
  "e do i ",
  " do i h",
  "do i ha",
  "o i hav",
  " i have",
  "i have ",
  " have t",
  "have to",
  "ave to ",
  "ve to g",
  "e to ge",
  " to get",
  "to get?",
  "how many",
  "ow many ",
  "w many d",
  " many do",
  "many dos",
  "any dose",
  "ny doses",
  "y doses ",
  " doses o",
  "doses of",
  "oses of ",
  "ses of t",
  "es of th",
  "s of the",
  " of the ",
  "of the v",
  "f the va",
  " the vac",
  "the vacc",
  "he vacci",
  "e vaccin",
  " vaccine",
  "vaccine ",
  "accine d",
  "ccine do",
  "cine do ",
  "ine do i",
  "ne do i ",
  "e do i h",
  " do i ha",
  "do i hav",
  "o i have",
  " i have ",
  "i have t",
  " have to",
  "have to ",
  "ave to g",
  "ve to ge",
  "e to get",
  " to get?",
  "how many ",
  "ow many d",
  "w many do",
  " many dos",
  "many dose",
  "any doses",
  "ny doses ",
  "y doses o",
  " doses of",
  "doses of ",
  "oses of t",
  "ses of th",
  "es of the",
  "s of the ",
  " of the v",
  "of the va",
  "f the vac",
  " the vacc",
  "the vacci",
  "he vaccin",
  "e vaccine",
  " vaccine ",
  "vaccine d",
  "accine do",
  "ccine do ",
  "cine do i",
  "ine do i ",
  "ne do i h",
  "e do i ha",
  " do i hav",
  "do i have",
  "o i have ",
  " i have t",
  "i have to",
  " have to ",
  "have to g",
  "ave to ge",
  "ve to get",
  "e to get?",
  "how many d",
  "ow many do",
  "w many dos",
  " many dose",
  "many doses",
  "any doses ",
  "ny doses o",
  "y doses of",
  " doses of ",
  "doses of t",
  "oses of th",
  "ses of the",
  "es of the ",
  "s of the v",
  " of the va",
  "of the vac",
  "f the vacc",
  " the vacci",
  "the vaccin",
  "he vaccine",
  "e vaccine ",
  " vaccine d",
  "vaccine do",
  "accine do ",
  "ccine do i",
  "cine do i ",
  "ine do i h",
  "ne do i ha",
  "e do i hav",
  " do i have",
  "do i have ",
  "o i have t",
  " i have to",
  "i have to ",
  " have to g",
  "have to ge",
  "ave to get",
  "ve to get?",
  "how many do",
  "ow many dos",
  "w many dose",
  " many doses",
  "many doses ",
  "any doses o",
  "ny doses of",
  "y doses of ",
  " doses of t",
  "doses of th",
  "oses of the",
  "ses of the ",
  "es of the v",
  "s of the va",
  " of the vac",
  "of the vacc",
  "f the vacci",
  " the vaccin",
  "the vaccine",
  "he vaccine ",
  "e vaccine d",
  " vaccine do",
  "vaccine do ",
  "accine do i",
  "ccine do i ",
  "cine do i h",
  "ine do i ha",
  "ne do i hav",
  "e do i have",
  " do i have ",
  "do i have t",
  "o i have to",
  " i have to ",
  "i have to g",
  " have to ge",
  "have to get",
  "ave to get?",
  "how many dos",
  "ow many dose",
  "w many doses",
  " many doses ",
  "many doses o",
  "any doses of",
  "ny doses of ",
  "y doses of t",
  " doses of th",
  "doses of the",
  "oses of the ",
  "ses of the v",
  "es of the va",
  "s of the vac",
  " of the vacc",
  "of the vacci",
  "f the vaccin",
  " the vaccine",
  "the vaccine ",
  "he vaccine d",
  "e vaccine do",
  " vaccine do ",
  "vaccine do i",
  "accine do i ",
  "ccine do i h",
  "cine do i ha",
  "ine do i hav",
  "ne do i have",
  "e do i have ",
  " do i have t",
  "do i have to",
  "o i have to ",
  " i have to g",
  "i have to ge",
  " have to get",
  "have to get?",
  "how many dose",
  "ow many doses",
  "w many doses ",
  " many doses o",
  "many doses of",
  "any doses of ",
  "ny doses of t",
  "y doses of th",
  " doses of the",
  "doses of the ",
  "oses of the v",
  "ses of the va",
  "es of the vac",
  "s of the vacc",
  " of the vacci",
  "of the vaccin",
  "f the vaccine",
  " the vaccine ",
  "the vaccine d",
  "he vaccine do",
  "e vaccine do ",
  " vaccine do i",
  "vaccine do i ",
  "accine do i h",
  "ccine do i ha",
  "cine do i hav",
  "ine do i have",
  "ne do i have ",
  "e do i have t",
  " do i have to",
  "do i have to ",
  "o i have to g",
  " i have to ge",
  "i have to get",
  " have to get?",
  "how many doses",
  "ow many doses ",
  "w many doses o",
  " many doses of",
  "many doses of ",
  "any doses of t",
  "ny doses of th",
  "y doses of the",
  " doses of the ",
  "doses of the v",
  "oses of the va",
  "ses of the vac",
  "es of the vacc",
  "s of the vacci",
  " of the vaccin",
  "of the vaccine",
  "f the vaccine ",
  " the vaccine d",
  "the vaccine do",
  "he vaccine do ",
  "e vaccine do i",
  " vaccine do i ",
  "vaccine do i h",
  "accine do i ha",
  "ccine do i hav",
  "cine do i have",
  "ine do i have ",
  "ne do i have t",
  "e do i have to",
  " do i have to ",
  "do i have to g",
  "o i have to ge",
  " i have to get",
  "i have to get?",
  "how many doses ",
  "ow many doses o",
  "w many doses of",
  " many doses of ",
  "many doses of t",
  "any doses of th",
  "ny doses of the",
  "y doses of the ",
  " doses of the v",
  "doses of the va",
  "oses of the vac",
  "ses of the vacc",
  "es of the vacci",
  "s of the vaccin",
  " of the vaccine",
  "of the vaccine ",
  "f the vaccine d",
  " the vaccine do",
  "the vaccine do ",
  "he vaccine do i",
  "e vaccine do i ",
  " vaccine do i h",
  "vaccine do i ha",
  "accine do i hav",
  "ccine do i have",
  "cine do i have ",
  "ine do i have t",
  "ne do i have to",
  "e do i have to ",
  " do i have to g",
  "do i have to ge",
  "o i have to get",
  " i have to get?",
  "how many doses o",
  "ow many doses of",
  "w many doses of ",
  " many doses of t",
  "many doses of th",
  "any doses of the",
  "ny doses of the ",
  "y doses of the v",
  " doses of the va",
  "doses of the vac",
  "oses of the vacc",
  "ses of the vacci",
  "es of the vaccin",
  "s of the vaccine",
  " of the vaccine ",
  "of the vaccine d",
  "f the vaccine do",
  " the vaccine do ",
  "the vaccine do i",
  "he vaccine do i ",
  "e vaccine do i h",
  " vaccine do i ha",
  "vaccine do i hav",
  "accine do i have",
  "ccine do i have ",
  "cine do i have t",
  "ine do i have to",
  "ne do i have to ",
  "e do i have to g",
  " do i have to ge",
  "do i have to get",
  "o i have to get?",
  "how many doses of",
  "ow many doses of ",
  "w many doses of t",
  " many doses of th",
  "many doses of the",
  "any doses of the ",
  "ny doses of the v",
  "y doses of the va",
  " doses of the vac",
  "doses of the vacc",
  "oses of the vacci",
  "ses of the vaccin",
  "es of the vaccine",
  "s of the vaccine ",
  " of the vaccine d",
  "of the vaccine do",
  "f the vaccine do ",
  " the vaccine do i",
  "the vaccine do i ",
  "he vaccine do i h",
  "e vaccine do i ha",
  " vaccine do i hav",
  "vaccine do i have",
  "accine do i have ",
  "ccine do i have t",
  "cine do i have to",
  "ine do i have to ",
  "ne do i have to g",
  "e do i have to ge",
  " do i have to get",
  "do i have to get?",
  "how many doses of ",
  "ow many doses of t",
  "w many doses of th",
  " many doses of the",
  "many doses of the ",
  "any doses of the v",
  "ny doses of the va",
  "y doses of the vac",
  " doses of the vacc",
  "doses of the vacci",
  "oses of the vaccin",
  "ses of the vaccine",
  "es of the vaccine ",
  "s of the vaccine d",
  " of the vaccine do",
  "of the vaccine do ",
  "f the vaccine do i",
  " the vaccine do i ",
  "the vaccine do i h",
  "he vaccine do i ha",
  "e vaccine do i hav",
  " vaccine do i have",
  "vaccine do i have ",
  "accine do i have t",
  "ccine do i have to",
  "cine do i have to ",
  "ine do i have to g",
  "ne do i have to ge",
  "e do i have to get",
  " do i have to get?",
  "how many doses of t",
  "ow many doses of th",
  "w many doses of the",
  " many doses of the ",
  "many doses of the v",
  "any doses of the va",
  "ny doses of the vac",
  "y doses of the vacc",
  " doses of the vacci",
  "doses of the vaccin",
  "oses of the vaccine",
  "ses of the vaccine ",
  "es of the vaccine d",
  "s of the vaccine do",
  " of the vaccine do ",
  "of the vaccine do i",
  "f the vaccine do i ",
  " the vaccine do i h",
  "the vaccine do i ha",
  "he vaccine do i hav",
  "e vaccine do i have",
  " vaccine do i have ",
  "vaccine do i have t",
  "accine do i have to",
  "ccine do i have to ",
  "cine do i have to g",
  "ine do i have to ge",
  "ne do i have to get",
  "e do i have to get?",
  "how many doses of th",
  "ow many doses of the",
  "w many doses of the ",
  " many doses of the v",
  "many doses of the va",
  "any doses of the vac",
  "ny doses of the vacc",
  "y doses of the vacci",
  " doses of the vaccin",
  "doses of the vaccine",
  "oses of the vaccine ",
  "ses of the vaccine d",
  "es of the vaccine do",
  "s of the vaccine do ",
  " of the vaccine do i",
  "of the vaccine do i ",
  "f the vaccine do i h",
  " the vaccine do i ha",
  "the vaccine do i hav",
  "he vaccine do i have",
  "e vaccine do i have ",
  " vaccine do i have t",
  "vaccine do i have to",
  "accine do i have to ",
  "ccine do i have to g",
  "cine do i have to ge",
  "ine do i have to get",
  "ne do i have to get?",
  "how many doses of the",
  "ow many doses of the ",
  "w many doses of the v",
  " many doses of the va",
  "many doses of the vac",
  "any doses of the vacc",
  "ny doses of the vacci",
  "y doses of the vaccin",
  " doses of the vaccine",
  "doses of the vaccine ",
  "oses of the vaccine d",
  "ses of the vaccine do",
  "es of the vaccine do ",
  "s of the vaccine do i",
  " of the vaccine do i ",
  "of the vaccine do i h",
  "f the vaccine do i ha",
  " the vaccine do i hav",
  "the vaccine do i have",
  "he vaccine do i have ",
  "e vaccine do i have t",
  " vaccine do i have to",
  "vaccine do i have to ",
  "accine do i have to g",
  "ccine do i have to ge",
  "cine do i have to get",
  "ine do i have to get?",
  "how many doses of the ",
  "ow many doses of the v",
  "w many doses of the va",
  " many doses of the vac",
  "many doses of the vacc",
  "any doses of the vacci",
  "ny doses of the vaccin",
  "y doses of the vaccine",
  " doses of the vaccine ",
  "doses of the vaccine d",
  "oses of the vaccine do",
  "ses of the vaccine do ",
  "es of the vaccine do i",
  "s of the vaccine do i ",
  " of the vaccine do i h",
  "of the vaccine do i ha",
  "f the vaccine do i hav",
  " the vaccine do i have",
  "the vaccine do i have ",
  "he vaccine do i have t",
  "e vaccine do i have to",
  " vaccine do i have to ",
  "vaccine do i have to g",
  "accine do i have to ge",
  "ccine do i have to get",
  "cine do i have to get?",
  "how many doses of the v",
  "ow many doses of the va",
  "w many doses of the vac",
  " many doses of the vacc",
  "many doses of the vacci",
  "any doses of the vaccin",
  "ny doses of the vaccine",
  "y doses of the vaccine ",
  " doses of the vaccine d",
  "doses of the vaccine do",
  "oses of the vaccine do ",
  "ses of the vaccine do i",
  "es of the vaccine do i ",
  "s of the vaccine do i h",
  " of the vaccine do i ha",
  "of the vaccine do i hav",
  "f the vaccine do i have",
  " the vaccine do i have ",
  "the vaccine do i have t",
  "he vaccine do i have to",
  "e vaccine do i have to ",
  " vaccine do i have to g",
  "vaccine do i have to ge",
  "accine do i have to get",
  "ccine do i have to get?",
  "how many doses of the va",
  "ow many doses of the vac",
  "w many doses of the vacc",
  " many doses of the vacci",
  "many doses of the vaccin",
  "any doses of the vaccine",
  "ny doses of the vaccine ",
  "y doses of the vaccine d",
  " doses of the vaccine do",
  "doses of the vaccine do ",
  "oses of the vaccine do i",
  "ses of the vaccine do i ",
  "es of the vaccine do i h",
  "s of the vaccine do i ha",
  " of the vaccine do i hav",
  "of the vaccine do i have",
  "f the vaccine do i have ",
  " the vaccine do i have t",
  "the vaccine do i have to",
  "he vaccine do i have to ",
  "e vaccine do i have to g",
  " vaccine do i have to ge",
  "vaccine do i have to get",
  "accine do i have to get?",
  "how many doses of the vac",
  "ow many doses of the vacc",
  "w many doses of the vacci",
  " many doses of the vaccin",
  "many doses of the vaccine",
  "any doses of the vaccine ",
  "ny doses of the vaccine d",
  "y doses of the vaccine do",
  " doses of the vaccine do ",
  "doses of the vaccine do i",
  "oses of the vaccine do i ",
  "ses of the vaccine do i h",
  "es of the vaccine do i ha",
  "s of the vaccine do i hav",
  " of the vaccine do i have",
  "of the vaccine do i have ",
  "f the vaccine do i have t",
  " the vaccine do i have to",
  "the vaccine do i have to ",
  "he vaccine do i have to g",
  "e vaccine do i have to ge",
  " vaccine do i have to get",
  "vaccine do i have to get?",
  "how many doses of the vacc",
  "ow many doses of the vacci",
  "w many doses of the vaccin",
  " many doses of the vaccine",
  "many doses of the vaccine ",
  "any doses of the vaccine d",
  "ny doses of the vaccine do",
  "y doses of the vaccine do ",
  " doses of the vaccine do i",
  "doses of the vaccine do i ",
  "oses of the vaccine do i h",
  "ses of the vaccine do i ha",
  "es of the vaccine do i hav",
  "s of the vaccine do i have",
  " of the vaccine do i have ",
  "of the vaccine do i have t",
  "f the vaccine do i have to",
  " the vaccine do i have to ",
  "the vaccine do i have to g",
  "he vaccine do i have to ge",
  "e vaccine do i have to get",
  " vaccine do i have to get?",
  "how many doses of the vacci",
  "ow many doses of the vaccin",
  "w many doses of the vaccine",
  " many doses of the vaccine ",
  "many doses of the vaccine d",
  "any doses of the vaccine do",
  "ny doses of the vaccine do ",
  "y doses of the vaccine do i",
  " doses of the vaccine do i ",
  "doses of the vaccine do i h",
  "oses of the vaccine do i ha",
  "ses of the vaccine do i hav",
  "es of the vaccine do i have",
  "s of the vaccine do i have ",
  " of the vaccine do i have t",
  "of the vaccine do i have to",
  "f the vaccine do i have to ",
  " the vaccine do i have to g",
  "the vaccine do i have to ge",
  "he vaccine do i have to get",
  "e vaccine do i have to get?",
  "how many doses of the vaccin",
  "ow many doses of the vaccine",
  "w many doses of the vaccine ",
  " many doses of the vaccine d",
  "many doses of the vaccine do",
  "any doses of the vaccine do ",
  "ny doses of the vaccine do i",
  "y doses of the vaccine do i ",
  " doses of the vaccine do i h",
  "doses of the vaccine do i ha",
  "oses of the vaccine do i hav",
  "ses of the vaccine do i have",
  "es of the vaccine do i have ",
  "s of the vaccine do i have t",
  " of the vaccine do i have to",
  "of the vaccine do i have to ",
  "f the vaccine do i have to g",
  " the vaccine do i have to ge",
  "the vaccine do i have to get",
  "he vaccine do i have to get?",
  "how many doses of the vaccine",
  "ow many doses of the vaccine ",
  "w many doses of the vaccine d",
  " many doses of the vaccine do",
  "many doses of the vaccine do ",
  "any doses of the vaccine do i",
  "ny doses of the vaccine do i ",
  "y doses of the vaccine do i h",
  " doses of the vaccine do i ha",
  "doses of the vaccine do i hav",
  "oses of the vaccine do i have",
  "ses of the vaccine do i have ",
  "es of the vaccine do i have t",
  "s of the vaccine do i have to",
  " of the vaccine do i have to ",
  "of the vaccine do i have to g",
  "f the vaccine do i have to ge",
  " the vaccine do i have to get",
  "the vaccine do i have to get?",
  "how many doses of the vaccine ",
  "ow many doses of the vaccine d",
  "w many doses of the vaccine do",
  " many doses of the vaccine do ",
  "many doses of the vaccine do i",
  "any doses of the vaccine do i ",
  "ny doses of the vaccine do i h",
  "y doses of the vaccine do i ha",
  " doses of the vaccine do i hav",
  "doses of the vaccine do i have",
  "oses of the vaccine do i have ",
  "ses of the vaccine do i have t",
  "es of the vaccine do i have to",
  "s of the vaccine do i have to ",
  " of the vaccine do i have to g",
  "of the vaccine do i have to ge",
  "f the vaccine do i have to get",
  " the vaccine do i have to get?",
  "how many doses of the vaccine d",
  "ow many doses of the vaccine do",
  "w many doses of the vaccine do ",
  " many doses of the vaccine do i",
  "many doses of the vaccine do i ",
  "any doses of the vaccine do i h",
  "ny doses of the vaccine do i ha",
  "y doses of the vaccine do i hav",
  " doses of the vaccine do i have",
  "doses of the vaccine do i have ",
  "oses of the vaccine do i have t",
  "ses of the vaccine do i have to",
  "es of the vaccine do i have to ",
  "s of the vaccine do i have to g",
  " of the vaccine do i have to ge",
  "of the vaccine do i have to get",
  "f the vaccine do i have to get?",
  "how many doses of the vaccine do",
  "ow many doses of the vaccine do ",
  "w many doses of the vaccine do i",
  " many doses of the vaccine do i ",
  "many doses of the vaccine do i h",
  "any doses of the vaccine do i ha",
  "ny doses of the vaccine do i hav",
  "y doses of the vaccine do i have",
  " doses of the vaccine do i have ",
  "doses of the vaccine do i have t",
  "oses of the vaccine do i have to",
  "ses of the vaccine do i have to ",
  "es of the vaccine do i have to g",
  "s of the vaccine do i have to ge",
  " of the vaccine do i have to get",
  "of the vaccine do i have to get?",
  "how many doses of the vaccine do ",
  "ow many doses of the vaccine do i",
  "w many doses of the vaccine do i ",
  " many doses of the vaccine do i h",
  "many doses of the vaccine do i ha",
  "any doses of the vaccine do i hav",
  "ny doses of the vaccine do i have",
  "y doses of the vaccine do i have ",
  " doses of the vaccine do i have t",
  "doses of the vaccine do i have to",
  "oses of the vaccine do i have to ",
  "ses of the vaccine do i have to g",
  "es of the vaccine do i have to ge",
  "s of the vaccine do i have to get",
  " of the vaccine do i have to get?",
  "how many doses of the vaccine do i",
  "ow many doses of the vaccine do i ",
  "w many doses of the vaccine do i h",
  " many doses of the vaccine do i ha",
  "many doses of the vaccine do i hav",
  "any doses of the vaccine do i have",
  "ny doses of the vaccine do i have ",
  "y doses of the vaccine do i have t",
  " doses of the vaccine do i have to",
  "doses of the vaccine do i have to ",
  "oses of the vaccine do i have to g",
  "ses of the vaccine do i have to ge",
  "es of the vaccine do i have to get",
  "s of the vaccine do i have to get?",
  "how many doses of the vaccine do i ",
  "ow many doses of the vaccine do i h",
  "w many doses of the vaccine do i ha",
  " many doses of the vaccine do i hav",
  "many doses of the vaccine do i have",
  "any doses of the vaccine do i have ",
  "ny doses of the vaccine do i have t",
  "y doses of the vaccine do i have to",
  " doses of the vaccine do i have to ",
  "doses of the vaccine do i have to g",
  "oses of the vaccine do i have to ge",
  "ses of the vaccine do i have to get",
  "es of the vaccine do i have to get?",
  "how many doses of the vaccine do i h",
  "ow many doses of the vaccine do i ha",
  "w many doses of the vaccine do i hav",
  " many doses of the vaccine do i have",
  "many doses of the vaccine do i have ",
  "any doses of the vaccine do i have t",
  "ny doses of the vaccine do i have to",
  "y doses of the vaccine do i have to ",
  " doses of the vaccine do i have to g",
  "doses of the vaccine do i have to ge",
  "oses of the vaccine do i have to get",
  "ses of the vaccine do i have to get?",
  "how many doses of the vaccine do i ha",
  "ow many doses of the vaccine do i hav",
  "w many doses of the vaccine do i have",
  " many doses of the vaccine do i have ",
  "many doses of the vaccine do i have t",
  "any doses of the vaccine do i have to",
  "ny doses of the vaccine do i have to ",
  "y doses of the vaccine do i have to g",
  " doses of the vaccine do i have to ge",
  "doses of the vaccine do i have to get",
  "oses of the vaccine do i have to get?",
  "how many doses of the vaccine do i hav",
  "ow many doses of the vaccine do i have",
  "w many doses of the vaccine do i have ",
  " many doses of the vaccine do i have t",
  "many doses of the vaccine do i have to",
  "any doses of the vaccine do i have to ",
  "ny doses of the vaccine do i have to g",
  "y doses of the vaccine do i have to ge",
  " doses of the vaccine do i have to get",
  "doses of the vaccine do i have to get?",
  "how many doses of the vaccine do i have",
  "ow many doses of the vaccine do i have ",
  "w many doses of the vaccine do i have t",
  " many doses of the vaccine do i have to",
  "many doses of the vaccine do i have to ",
  "any doses of the vaccine do i have to g",
  "ny doses of the vaccine do i have to ge",
  "y doses of the vaccine do i have to get",
  " doses of the vaccine do i have to get?"
]

Example with gram size = 3:

[
  "how",
  "ow ",
  "w m",
  " ma",
  "man",
  "any",
  "ny ",
  "y d",
  " do",
  "dos",
  "ose",
  "ses",
  "es ",
  "s o",
  " of",
  "of ",
  "f t",
  " th",
  "the",
  "he ",
  "e v",
  " va",
  "vac",
  "acc",
  "cci",
  "cin",
  "ine",
  "ne ",
  "e d",
  "do ",
  "o i",
  " i ",
  "i h",
  " ha",
  "hav",
  "ave",
  "ve ",
  "e t",
  " to",
  "to ",
  "o g",
  " ge",
  "get",
  "et?"
]
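For anyone who wants to reproduce the comparison outside Fauna, here is a minimal Python sketch of the same idea. It is not FQL and not how Fauna builds its index internally. The helper names `ngrams` and `matches` are made up for illustration, the lowercasing simply mirrors the lists above, and the 3 to 39 size range is an assumed range for comparison only. The point it shows is that a longer search term is split into trigrams too, so a gram size of 3 does not limit the length of the query.

```python
def ngrams(text: str, size: int) -> list[str]:
    """Return the distinct substrings of length `size`, in first-seen order."""
    lowered = text.lower()
    seen: dict[str, None] = {}  # dict preserves insertion order and drops repeats
    for start in range(len(lowered) - size + 1):
        seen.setdefault(lowered[start:start + size], None)
    return list(seen)


def matches(query: str, indexed_grams: set[str], size: int = 3) -> bool:
    """A query is a candidate match when every one of its grams is indexed."""
    query_grams = ngrams(query, size)
    return bool(query_grams) and all(g in indexed_grams for g in query_grams)


question = "How many doses of the vaccine do I have to get?"
trigrams = ngrams(question, 3)
index = set(trigrams)

print(len(trigrams))                          # 44, the same entries as the list above
print(matches("vaccine do i have", index))    # True: a 17-character query still matches
print(matches("booster", index))              # False: "boo" is not an indexed trigram

# Assumed range for comparison only: one list per gram size from 3 through 39.
wide = sum(len(ngrams(question, n)) for n in range(3, 40))
print(wide, "grams for sizes 3..39 versus", len(trigrams), "for size 3 alone")
```

Note that checking that all trigrams are present only narrows things down to candidates. The grams could appear in the document without being adjacent, so an exact substring check on the results is still worth doing if false positives matter.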

Thank you for this explanation! My previous understanding was pretty much the opposite. I thought that limiting substrings to 3 characters would restrict search whenever the query was longer than 3 characters, so I was building substrings for every potential query length instead, haha. I will adjust my index and save myself the headache!
