Why does an array in an index value return as null?

After trying to help @PaulieScanlon in this thread I noticed a weird behavior.

When having an object such as this:

{
  ref: Ref(Collection("TestArrays"), "269156828710961670"),
  ts: 1592946804640000,
  data: {
    slug: "hello-world",
    someArray: [
      {
        name: "Foo"
      },
      {
        name: "Bar"
      }
    ]
  }
}

And this index:

CreateIndex({
  name: "get_all_Things_by_slug",
  source: Collection("TestArrays"),
  terms: [
    {field: ["data", "slug"]}
  ],
  values: [
    {field: ["ref"]},
    {field: ["data", "slug"]},
    {field: ["data", "someArray"]}
  ]
})

The someArray value returns null:

Paginate(Match(Index("get_all_Things_by_slug"), "hello-world"))

{
  data: [
    [Ref(Collection("TestArrays"), "269156828710961670"), "hello-world", null]
  ]
}

Is this because Fauna can't really use an array to sort the results?

It's because we only index scalar values. Objects are not scalars. This is noted in the index documentation, but perhaps it could be clearer.

Ah right.

Thanks @ben !

It makes a lot of sense when considering that indexes are "pre-calculated", so to speak, but somehow I always imagine indexes going to the documents on every execution (which doesn't make any sense, but that's how I tend to picture it).

It's somewhat of an artificial restriction (which is good news: we haven't painted ourselves into any architectural corners on that front). We could do it if we had a universal ordering on objects, for instance, or if we could say "inline this object, but it doesn't participate in sorting". There is ongoing work to design a more expansive use of indexes, but for the moment indexed values have to be scalars.

I also did a quick dig after the previous question.

The docs say, "Index terms are always scalar values…", but I do not see anything about values that suggests this should be the case.

Yeah, that's fair. I'll circle back with docs to make sure this is a bit more obvious.

Hi @pier, this is exactly the issue I ran into. I've tried a few of your suggestions and using a Lambda seems to work. Do you have a more complete example of how you've resolved your issue so I can check it against mine?

I've still got GraphQL issues but I think that's separate to this Fauna array question.

Hey @PaulieScanlon !

Using Lambda is the standard way of getting an array of documents from an array of references provided by an index.
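
As a minimal sketch of that pattern, assuming the get_all_Things_by_slug index from the top of this thread (it returns three values per entry, so the Lambda takes three parameters):

```fql
Map(
  Paginate(Match(Index("get_all_Things_by_slug"), "hello-world")),
  // One parameter per entry in the index's `values`; only the ref is used.
  Lambda(
    ["ref", "slug", "someArray"],
    Get(Var("ref"))
  )
)
```

Each item in the resulting page should then be the full document, including data.someArray, instead of the index tuple with null in the third position. If the index defined no values, each entry would be a bare ref and the Lambda would take a single parameter.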

Like I mentioned in the other thread, if you know in advance you will only ever have a single document for each slug, you can create an index like this:

CreateIndex({
  name: "Things_by_slug",
  source: Collection("Things"),
  terms: [
    {field: ["data", "slug"]}
  ],
  unique: true
})

And query it like this without Lambda to get your document:

Get(Match(Index("Things_by_slug"), "some_slug"))

Just to be clear and complete, it would work if your objects look like:

{
  ref: Ref(Collection("TestArrays"), "269156828710961670"),
  ts: 1592946804640000,
  data: {
    slug: "hello-world",
    someArray: [ "Foo", "Bar"]
  }
}

Arrays are 'unrolled' (if you can call it that), but the values they contain have to be scalars.
So I think a more complete answer is that an indexed value has to be a scalar or an array of scalars.
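
To illustrate, assuming the get_all_Things_by_slug index from the first post is unchanged and the document holds someArray: ["Foo", "Bar"], I would expect the same query to return one entry per array element rather than a single entry with null:

```fql
Paginate(Match(Index("get_all_Things_by_slug"), "hello-world"))

// Expected shape (a sketch, not captured output): one index entry per
// array element, ordered by the index's values.
// {
//   data: [
//     [Ref(Collection("TestArrays"), "269156828710961670"), "hello-world", "Bar"],
//     [Ref(Collection("TestArrays"), "269156828710961670"), "hello-world", "Foo"]
//   ]
// }
```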

I would love to use the ability to store objects/arrays as an index value, even if these non-scalars will not be allowed to participate in sorting.
Is there any timeline as to when such a feature would become available?

Can you describe a use case?

Let's say we have a schema…

type Property {
    label: String
    description: String
    variants: [PropertyVariant!]
}

type PropertyVariant @embedded {
    status: String
    configuration: String
    price: Int
    area: Int
}

Now let us say we have the following property (with 3 variants)

{
    label: "Some label",
    description: "Some description....",
    variants: [
        {
            status: "Ready",
            configuration: "1",
            price: 100,
            area: 100
        },
        {
            status: "Ongoing",
            configuration: "2",
            price: 200,
            area: 200
        },
        {
            status: "Sold",
            configuration: "3",
            price: 300,
            area: 300
        }
    ]
}

Now, let us suppose that I want to get hold of a property which has a variant that satisfies multiple conditions at once. That becomes a problem. For example, let's say I am looking for a configuration of 3 which is not sold, with an area less than or equal to 200.

According to Stack Overflow, querying over multiple conditions requires multiple indexes. So let's create multiple indexes, with the terms set as…

  1. data.variants.status
  2. data.variants.configuration
  3. data.variants.price

And automatically I can see that the Intersect(Join(Match)) method is not going to work. It is going to return this document even if this document does not have a variant that satisfies the condition.

I have a suspicion that if I could store an object (the individual variants) in the index itself along with the property ref, I could use the index in conjunction with a UDF as a custom resolver, to return the property correctly. But it's difficult for me to visualize exactly how this will work without seeing it in front of my eyes.

^^Use case ends here. Actual problem follows below

The obvious alternative, of course, is to create a separate collection for the variants, but for some reason, the graphql endpoint is not processing that correctly.
If I

  1. remove the @embedded directive.
  2. add a 'property: Property!' field to the variant schema definition

it creates the new 'variants' collection, which is ok. It creates the necessary index for the one-to-many relationship, which is ok. But for some reason, the generated schema does not have the new 'property' field in it. And when I add a new property with variants, these variants are inlined in the property document in the property collection, exactly as before, instead of having separate documents in the 'variants' collection.

You can turn almost anything into a serialized string and index that. I have several indexes with bindings that convert objects into an array of strings. Then I have a custom FQL-based function to generate the lookup string. One caveat is that there are a few functions that do not work in bindings (and do not give you an error). Some functions that can be used on both arrays and sets, like Reduce, do not currently work.
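
A rough sketch of what that can look like for the variants example above. The index name properties_by_variant_key, the binding name variantKeys, the "|" separator, the choice of fields and the Property collection name are all assumptions for illustration, not taken from an actual setup:

```fql
CreateIndex({
  name: "properties_by_variant_key",       // hypothetical index name
  source: {
    collection: Collection("Property"),    // assumed collection name
    fields: {
      // Binding: turn each variant object into one string such as "Sold|3|300".
      variantKeys: Query(
        Lambda(
          "doc",
          Map(
            Select(["data", "variants"], Var("doc"), []),
            Lambda(
              "v",
              Concat(
                [
                  Select("status", Var("v")),
                  Select("configuration", Var("v")),
                  ToString(Select("price", Var("v")))
                ],
                "|"
              )
            )
          )
        )
      )
    }
  },
  // The binding returns an array of strings, so each string becomes a term.
  terms: [{ binding: "variantKeys" }]
})

// Exact-match lookup on one serialized variant:
Paginate(Match(Index("properties_by_variant_key"), "Sold|3|300"))
```

Note that this only supports exact matches on the whole serialized variant. A range condition such as "area less than or equal to 200" would still need a different approach.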

Hey @DibyodyutiMondal. I can tell you that we do intend to provide options for that, but we don't give out timelines for features that we might consider in the future. What is important is that we develop the right thing and develop it well. There is a lot to consider about how indexing objects would work. If you just intend to index them lexically without any advanced retrieval features, you can just as well index as Eigil suggested.

It does seem though that a lot of your use case is rather triggered due to other limitations (e.g. with our GraphQL) so it might even serve you better if these limitations are fixed.

I don't really see that automatically. Could you elaborate? If you query on price 200, query on area 300 in another index, then on configuration 2 in another index, and intersect those 3 setrefs (given that they have the same format… which requires you to go back to 'refs'), then you will not return documents that do not satisfy the condition, so I'm quite confused by that statement :) and suspect there is something else that we can clarify that might help you.

Well, it won't work because the document above will have 3 entries in the indexes created above, one for each of the inlined variants, right? Thus, the document is indexed as price 200, as well as indexed for area 300.
The document will therefore satisfy the price 200 && area 300 condition, even if the individual variants (which are the actual objects being indexed) do not satisfy the condition.
In other words, this document will be returned if, for each of the given conditions, that condition is fulfilled by at least one of the inlined variants. Kinda like an OR condition, but not exactly.
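
To make that concrete, here is a sketch with two hypothetical indexes, properties_by_variant_price and properties_by_variant_area, each with a single term on the nested field and no values, so that each Match yields a set of refs. The Property collection name is also an assumption:

```fql
CreateIndex({
  name: "properties_by_variant_price",   // hypothetical name
  source: Collection("Property"),        // assumed collection name
  terms: [{ field: ["data", "variants", "price"] }]
})

CreateIndex({
  name: "properties_by_variant_area",    // hypothetical name
  source: Collection("Property"),
  terms: [{ field: ["data", "variants", "area"] }]
})

// Both sets contain the ref of the sample property: price 200 comes from the
// second variant, area 300 from the third. The intersection therefore keeps
// it, although no single variant has both price 200 and area 300.
Paginate(
  Intersection(
    Match(Index("properties_by_variant_price"), 200),
    Match(Index("properties_by_variant_area"), 300)
  )
)
```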

Had I been indexing a single document, it would not have been a problem, but since I'm indexing an array of objects which all exist as part of a single document, the ref being indexed is the same, resulting in this behaviour.

This could be solved if it were possible to store the array's index as well, to differentiate, or in my case, if I could at least store the object itself on an index field.

Or, if GraphQL would kindly work as expected/intended.

PS: You guys are doing a great job. I'm only interested in the timeline because that'll affect whether I'll tell my client to wait for some features, or build my app slightly differently (read: laboriously) for the time being. It's ok if there isn't one - FaunaDB is new, but it's growing fast, so I'm sure you guys know what you are doing.

Yes, you are right, that's because the intersection happens on the reference in that example. If that is not working for you, then you want to intersect on both the reference and the status (or configuration). That means, instead of getting rid of all values except the reference as is done in that multi-conditional query answer, you would keep both the reference and the status on each intermediate step and intersect on that.

Regardless of being able to index complete objects, I do think there are solutions for your problem. It's however incredibly hard to place oneself in someone else's domain :) so I didn't understand it right away (and maybe I still don't?)