FQL Query for Prefix of Search term

I’m trying to write a query to retrieve documents which are tagged with a prefix
corresponding to possible search strings. To put it concretely:

  • I have transactions with hierarchical categories of the form
    {category: ["Healthcare", "Medecine", "Family Medicine"]}.
  • I’m storing mappings which correspond to a subtree of the hierarchy, such
    • {prefix: ["Healthcare", "Medecine"], color: 'Yellow'}
    • {prefix: ["Shops", "Tobacco"], color: 'Green'}
    • {prefix: ["Bank Fees"], color: 'Red'}
  • When a new transaction comes in, I’m trying to find the colour which matches
    its full category.

For now, I’ve solved the problem by indexing on a string derived from the prefix
arrays ("Healthcare#Medecine"). When I need to retrieve a colour, I try to
index on all possible prefixes of the category like so:

          'Healthcare#Medecine#Family Medicine',
        (x) => Match(Index('category_by_prefix'), x)
      (x) => Exists(x)

This works, but it feels like there ought to be a solution that doesn’t involve
making multiple reads each time. Any ideas?

I would look into using Intersection for the query. I also have a few cases myself were I stringify a list in an index binding so that would be another option that would allow you to lookup by the full category.

I’m not following 100%.
It seems to me though that you could preprocess these prefixed that you want to match on and save them in an index binding?

Like the Ngram solution here… only you would do it with prefixes instead.

Sorry, I’m sure that there’s a better way of explaining myself but I haven’t found it yet!

I actually already have an index binding so that a document with the category “prefix” ["Healthcare", "Medecine"] is indexable as Healthcare#Medecine.

The problem is that Healthcare#Medecine is a category “prefix” - representing not just one category but a whole subtree of categories (Healthcare#Medecine#Family Medecine, Healthcare#Medecine#Alternative Medecine, …).

Given a “full” category (such as [A, B, C, D]) I want to retrieve documents with a matching prefix (such as [A,B] or [A,B,C] but not [A,D], for example). Currently I’m doing it by searching separately for every possible prefix (i.e. ABCD, ABC, AB, and A in this example), but I was hoping that there would be a more efficient solution - possibly including range queries or something.

I also just wanted to add this to be explicit: the possible “postfixes” (not sure that’s a word!) are not known at the time the document is created. By “postfix” I mean any string that can be constructed from a given prefix. e.g. for a prefix “ABC” the possible postfixes would be “ABCD”, “ABCDDD”, “ABCPizza”, etc.

Given that, I can’t really adopt the ngram approach of precomputing every possible search string. Because they’re prefixes though, it intuitively feels like there’s a solution using sorted ranges, but maybe that’s just me being too optimistic!