Hi there, I’m looking for help validating my approach.
I’m currently implementing full-text search via NGram. Each document in the collection has a field called description, and the goal is to search the collection on that field.
My index:
q.CreateIndex({
  name: 'transactionsByDescription',
  terms: [
    { binding: 'search' }
  ],
  source: {
    collection: q.Collection('transactions'),
    fields: {
      search: q.Query(
        q.Lambda(
          'transaction',
          q.NGram(
            q.LowerCase(
              q.Select(['data', 'description'], q.Var('transaction'))
            ),
            3,
            3
          )
        )
      )
    }
  }
});
Now I’m trying to test the query in the Shell on the dashboard. Currently, the transactions collection contains only one document, whose description is Testing Only. The user provides a search string; in the example below, the search string is Testing.
The NGram produced will be ["tes", "est", "sti", "tin", "ing"]
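To make the trigram list above easy to check locally, here is a small plain-JavaScript sketch that approximates NGram(LowerCase(...), 3, 3) with a sliding window of three characters. The helper name trigrams is hypothetical, and the assumption is that NGram with min 3 and max 3 behaves like this simple window; it is not taken from the Fauna implementation.

```javascript
// Hypothetical local helper: approximates NGram(LowerCase(text), 3, 3)
// by sliding a window of 3 characters over the lowercased string.
function trigrams(text) {
  const lower = text.toLowerCase();
  const grams = [];
  for (let i = 0; i + 3 <= lower.length; i++) {
    grams.push(lower.slice(i, i + 3));
  }
  return grams;
}

console.log(trigrams('Testing')); // [ 'tes', 'est', 'sti', 'tin', 'ing' ]
```

For the search string Testing this yields the same five items, which is why the Map below runs Match five times.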
My approach is to split the search string into NGrams of length 3 as well, then loop over the items and run Match for each one.
Map(
  NGram(
    LowerCase('Testing'),
    3,
    3
  ),
  Lambda(
    'needle',
    Paginate(
      Match(
        Index('transactionsByDescription'),
        Var('needle')
      )
    )
  )
)
Since 5 NGrams are produced, this loops 5 times and results in an array of 5 pages that all contain the same reference:
[
  {
    data: [Ref(Collection("transactions"), "278191047968817665")]
  },
  {
    data: [Ref(Collection("transactions"), "278191047968817665")]
  },
  {
    data: [Ref(Collection("transactions"), "278191047968817665")]
  },
  {
    data: [Ref(Collection("transactions"), "278191047968817665")]
  },
  {
    data: [Ref(Collection("transactions"), "278191047968817665")]
  }
]
This is not the desired result, so I thought of doing:
Distinct(
  Map(
    NGram(
      LowerCase('Testing'),
      3,
      3
    ),
    Lambda(
      'needle',
      Select(
        ['data'],
        Paginate(
          Match(
            Index('transactionsByDescription'),
            Var('needle')
          )
        )
      )
    )
  )
)
This now results in the following array:
[
  [Ref(Collection("transactions"), "278191047968817665")]
]
There are no repeating items, which is the desired result. However, I think my approach is incorrect, particularly because Paginate is inside the Lambda. I want to paginate the results with a page size of 10, and I don’t think this approach will work with that.
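One possible direction for the pagination concern, offered as an untested sketch rather than a confirmed answer: keep Match inside the Lambda so that each needle yields a set, combine those sets with Union, and call Paginate once on the outside with size 10. The function name buildSearchQuery is hypothetical, q is assumed to be the query builder already used in the index definition, and the sketch assumes that Union accepts the array of sets produced by Map.

```javascript
// Hypothetical helper: builds the query expression only, it does not run it.
// `q` is assumed to be the same query builder used for q.CreateIndex above.
function buildSearchQuery(q, searchString, pageSize) {
  return q.Paginate(
    // Union merges the per-needle sets, so each matching ref appears once.
    q.Union(
      q.Map(
        q.NGram(q.LowerCase(searchString), 3, 3),
        q.Lambda(
          'needle',
          // No Paginate here: Match returns a set, not a page.
          q.Match(q.Index('transactionsByDescription'), q.Var('needle'))
        )
      )
    ),
    // Pagination happens once, over the combined set.
    { size: pageSize }
  );
}

// Assumed usage with an existing client:
//   client.query(buildSearchQuery(q, 'Testing', 10))
```

Union returns documents that match any of the trigrams. If only documents containing every trigram should match, swapping Union for Intersection may be closer to the intended search, under the same assumption about passing the mapped array.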