Search Index: using StartsWith

Followup question

Instead of NGrams, I am trying to achieve something like startsWith for each word in a sentence.
I am having trouble writing the for loop for this. Could you please help me?

function GenearateStartsWithKeywords(str) {
  const myArr = [];
  for (let i = 0; i < str.length; i++) {
    myArr.push(str.substr(0, i + 1));
  }
  return myArr;
}

async function main(client) {
  const fql = q.Let(
    {
      words: ["Jagadeesh", "Palaniappan"],
      results: q.Map(
        q.Var("words"),
        q.Lambda("eachWord", GenearateStartsWithKeywords(q.Var("eachWord")))
      ),
    },
    q.Var("results")
  );
  const resp = await client.query(fql);
  console.log(resp);
}

I am looking for the output below:

[
 ["J", "Ja", "Jag", "Jaga", "Jagad", "Jagade", "Jagadee", "Jagadees", "Jagadeesh"],
 ["P", "Pa", "Pal", "Pala", "Palan", "Palani", "Palania", "Palaniap", "Palaniapp", "Palaniappa", "Palaniappan"]
]

Could you please help me.

First, you should know that:

  • you CAN use plain JavaScript to compose FQL functions into a query,
  • you CANNOT include plain JavaScript within an FQL query,
  • FQL is highly “functional” in style; there are no “for loops”.

That means that GenearateStartsWithKeywords cannot be written with a plain for loop: when it is called, its argument is an FQL expression, not a JavaScript string. Instead, you need to compose the loop from the available FQL functions.
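To illustrate why the attempt in the question produces nothing useful, here is a small plain-JavaScript sketch. The object `fakeVarExpr` is only a stand-in for whatever `q.Var("eachWord")` returns to JavaScript (its exact shape is an assumption), and `plainPrefixes` is a hypothetical name for the helper from the question.

```javascript
// Stand-in for the value q.Var("eachWord") hands to JavaScript: an object
// describing a variable, not a string. The exact shape is an assumption.
const fakeVarExpr = { var: 'eachWord' };

// The plain-JavaScript prefix helper from the question (renamed for clarity).
function plainPrefixes(str) {
  const out = [];
  // For a real string, str.length is a number and the loop runs.
  // For an expression object, str.length is undefined, so it never runs.
  for (let i = 0; i < str.length; i++) {
    out.push(str.substring(0, i + 1));
  }
  return out;
}

console.log(plainPrefixes('Jag'));       // [ 'J', 'Ja', 'Jag' ]
console.log(plainPrefixes(fakeVarExpr)); // []
```

The loop runs in JavaScript while the query is being built, before the database ever sees a word, which is why the work has to be expressed with FQL functions instead.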

This is very possible to do, though.

(I also could not find a way to convert a string to an array of chars, so the next best thing is to map over a predefined array of lengths.)

This is taken almost directly from an example by @databrecht in Slack, which I believe comes from the Fwitter example app.

function GenearateStartsWithKeywords (str) {
  return q.Let(
    {
      original: str,
      lengths: 
        // Shorten this array if you want fewer search terms.
        // Setting it to [1] creates only the first letter; setting it to [1, 2]
        // creates the first letter and the first two letters.
        // Words longer than the largest length here only get prefixes up to that length.
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
      lengthsFiltered: q.Filter(
        q.Var('lengths'),
        // filter out the ones larger than word length
        q.Lambda('l', q.LTE(q.Var('l'), q.Length(q.Var('original'))))
      )
    },
    q.Map(q.Var('lengthsFiltered'), q.Lambda('l', q.SubString(q.Var('original'), 0, q.Var('l'))))
  )
}
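For experimenting locally without a database, the same filter-then-map logic can be modelled in plain JavaScript. This is only a sketch of the idea, not the FQL itself; `LENGTHS` and `startsWithKeywordsModel` are illustrative names and not part of the Fauna driver.

```javascript
// Plain-JavaScript model of the FQL logic above, for local experimentation only.
const LENGTHS = Array.from({ length: 20 }, (_, i) => i + 1); // [1, 2, ..., 20]

function startsWithKeywordsModel(word, lengths = LENGTHS) {
  return lengths
    // Same role as q.Filter with q.LTE: drop lengths longer than the word.
    .filter((l) => l <= word.length)
    // Same role as q.SubString(original, 0, l): take the first l characters.
    .map((l) => word.substring(0, l));
}

console.log(startsWithKeywordsModel('Jag'));
// [ 'J', 'Ja', 'Jag' ]

// A 25-character word only gets 20 prefixes, because LENGTHS stops at 20.
console.log(startsWithKeywordsModel('a'.repeat(25)).length);
// 20
```

The second call shows the main limitation of the predefined array: anything beyond its largest length is never indexed as a prefix.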

This can then be used when creating a new index:

CreateIndex({
  name: 'thing_by_substrings',
  source: {
    collection: Collection('Thing'),
    fields: {
      substrings: Query(
        Lambda(
          'thing', 
          GenearateStartsWithKeywords(
            Select(['data', 'searchField'], Var('thing'))
          )
        )
      )
    }
  },
  terms: [
    {
      binding: 'substrings'
    }
  ]
})
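Once the index exists, a starts-with search becomes an exact match on one of the stored prefixes. The sketch below assumes `q` is the query namespace of the Fauna JavaScript driver (for example `require('faunadb').query`); it is passed in as a parameter so the snippet has no hard dependency, and `findThingsStartingWith` is a hypothetical helper name.

```javascript
// Builds a query that looks up documents whose searchField starts with `prefix`.
// `q` is assumed to be the Fauna driver's query namespace.
function findThingsStartingWith(q, prefix) {
  // Match compares `prefix` against the terms produced by the 'substrings' binding.
  return q.Paginate(q.Match(q.Index('thing_by_substrings'), prefix));
}

// Hypothetical usage with an already configured client:
//   const page = await client.query(findThingsStartingWith(q, 'Jag'));
//   console.log(page.data);
```

Since the index defines no values, the page should contain references to the matching Thing documents; only prefixes up to the largest entry in the lengths array can match.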

Sorry, I completely missed this question somehow. Great answer, Paul!


Hi @ptpaterson, thank you very much for the response. I also tried to convert the string to chars, but I was unable to succeed.

@databrecht Is there any way to convert a string to a character array?

We don’t have such a function out-of-the-box at this point.
The best you can do is start with an array of numbers like above, then use SubString to get each char out. I have some things to finish, so pseudo-code this time:

str: ... some str....
lengths: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
strings: Map(
   Var('lengths'),
   Lambda('l',
      // SubString positions are 0-based, so the l-th character starts at index l - 1.
      If(
         // check if Length(Var('str')) is big enough
         LTE(Var('l'), Length(Var('str'))),
         SubString(Var('str'), Subtract(Var('l'), 1), 1),
         // else return null
         null
      )
   )
)

The next step, of course, is to remove the nulls with Filter.

That will of course only work for strings that are no longer than that array. We are missing a feature to loop or map over a specific number of elements.
Something like

Map([0 ... Length(Var('something'))], Lambda(..)) 

would be awesome, but at the moment we don’t have anything like that.
Unless our community comes up with something clever, I don’t think I can give you a better solution.


You can use NGram to get a character array, right? NGram('hello world', 1, 1) => ['h','e','l','l','o',' ','w','o','r','l','d']

Re: mapping a specific number of elements, I found a way to do it:

First, Space(n) gives you a string containing n spaces.
Second, NGram(s, 1, 1) gives you a character array (array of n items).
Finally, you can Map() or Reduce() over this character array to repeat a lambda n times.
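Putting those three steps together, the prefixes could be built for a word of any length, without a predefined array of numbers. The following is an untested sketch that has not been run against a live database: `q` is assumed to be the Fauna JavaScript driver's query namespace, passed in as a parameter, and `prefixesOfAnyLength` is a hypothetical helper name. It relies on NGram(Space(n), 1, 1) yielding n items, as described above.

```javascript
// Builds an FQL expression that returns every prefix of `str`.
// `q` is assumed to be the Fauna driver's query namespace.
function prefixesOfAnyLength(q, str) {
  return q.Let(
    { original: str },
    q.Reduce(
      // 'acc' collects the prefixes so far; 'ignored' is one of the n spaces.
      q.Lambda(
        ['acc', 'ignored'],
        q.Append(
          // The next prefix is one character longer than the number collected so far.
          [q.SubString(q.Var('original'), 0, q.Add(q.Count(q.Var('acc')), 1))],
          q.Var('acc')
        )
      ),
      [],
      // One array item per character of the word, used only to drive the loop.
      q.NGram(q.Space(q.Length(q.Var('original'))), 1, 1)
    )
  );
}

// Hypothetical usage with an already configured client:
//   const prefixes = await client.query(prefixesOfAnyLength(q, 'Jagadeesh'));
```

Reduce is used rather than Map because the lambda needs to know how far along it is, and the length of the accumulator provides that counter.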
