Queries and prices similarities with MySql

Hi, I’d like to know if I make a query with WHERE or LIKE (similar like I do with my MySql) and I made the same in FQL, will I be billed with TRO or TCO?

Another question I have is that with TRO I’m billed with each 4kb chunch I return as a response right? Does it also consider the time it takes to give me the response? Giving an example of this is consider I have a 50GB database and I make a simple WHERE query that gives me a 2kb response, does will I be billed with only 1 TRO for a 2kb response even if I search on a 50GB or 1000GB database?

The equivalent queries you are looking for would be based on indexes. You would be billed with both TRO and TCO according to what is specified in the billing page of the docs.

WHERE example

The FQL for SQL Users page offers an example WHERE query.

Map(
  Paginate(
    Match(Index("dept_by_deptno"), 10)
  ),
  Lambda("X", Get(Var("X")))
)

This would cost n + 1 TRO minimum where n is the number of documents fetched. That is, at least 1 TRO per Get call and at least 1 TRO for the Paginate call. More TROs will be charged if the data read is larger – this is per document or page.

That is, for each time an index is paged you are charged TROs for how big that index page is. And for each document that is read, you are charged for how big that document is. They do not roll up together. For example, if each document read is 4.1kB, then you will be charged 2 TRO’s for each document read; 100 documents at 4.1kB each will require 200 TROs, not 102 TROs.

There is some caching involved, so repeated calls to Get the same Ref, for example, would only count as one read

Compute ops for this example would be about n / 50 TCOs per the billing page.

SQL LIKE

This would be based on a more advanced index, likely with bindings. The Read and Compute ops to read from the index would likely be similar to the WHERE query. But an index with bindings requires additional TCOs while writing documents to the index.

Responses

Note that this has nothing to do with what you send back in your request. If you fetch all of this data, and return a single value, then you still get charged for all of the incurred TROs, TCOs, and TWOs.

Pagination by default.

Again, you get billed for what you fetch. And, Fauna can only read your data with pagination. You will never be surprised by accidentally doing a full table scan and getting charged for reading all 50GB of your Collection.

The default page size is 64, and its max size is 100000. If you fetch only a fraction of your documents (and if you have 50GB then you’d almost definitely only be grabbing a small piece at a time) then you will only be charged for that.

So to sum up and get a clear idea, every time I need to get even a 0 - 4kb of data I will be billed 2 TRO becase the query and the paginate count. Is there a query where consumes just 1 TRO?

Now, what about is I have a collection with 2 indexes, for example if I need to make a “WHERE X = 1 AND Y = 2” (where X and Y have indexes), so if I need to query that I will consume 3 TRO for making that double where condition + the paginate?

Review your query metrics

The cost of your queries will always depend on your specific data. It is best to get into the habit of reviewing the metrics sent back with your queries. You can do this by inspecting the response headers. You can also see a summary of the stats when you execute a query in the dashboard shell.

Reducing Read Ops

You can define values for an index. If all of the data you need is contained within the index, and thus you do not need to Get additional documents, you will only be charged for the read cost of the index.

For this reason, it can be considered a good practice (depending on your use case of course) to put as much information into indexes as possible. This can greatly increase storage, but storage is cheap. This option becomes less appealing when there is a higher mix of write operations, since writing to the index will also be more expensive. Is up to you what the best trade off is.

The following queries could cost one TRO and one TCO, so long as size is small enough.

Get(Ref(Collection("things"), '1'))

// OR

CreateIndex({
  name: "all_things_by_color",
  source: Collection("things"),
  terms: [{ field: ["data", "color"] }],
  values: [
    { field: ["data", "name"] },
    { field: ["data", "description"] },
    { field: ["data", "length"] },
    { field: ["data", "width"] },
    { field: ["ref"] }, // <-- good to include in case you do need more information or want to use elsewhere
  ]
})

Paginate(Match(Index("all_things_by_color"), "blue"))

Cost of Intersection and other Set operations

Regarding multiple WHERE conditions… you need to think about this in terms of FQL. FQL is your precise query plan, so knowing the FQL is a must for estimating the cost.

The Paginate function is not what incurs the cost. It’s actually reading an index. When you Paginate a single Match(Index(...)) it happens to only read one index, once, so it only costs 1 TRO.

Multiple conditions with AND would use Intersection. The Pagination step for Set operations, such as Intersection, Difference, Union etc. cost more because they operate on multiple indexes and possibly need to do so multiple times. Intersection, for example, costs about ([# of indexes] - 1) * 2 to account for the work to compare the indexes

// 2 TROs
Paginate(
  Intersection(
    Match(Index("things_by_a"), 1),
    Match(Index("things_by_b"), 2),
  )
)

// 4 TROs
Paginate(
  Intersection(
    Match(Index("things_by_a"), 1),
    Match(Index("things_by_b"), 2),
    Match(Index("things_by_c"), 3),
  )
)

// 6 TROs
Paginate(
  Intersection(
    Match(Index("things_by_a"), 1),
    Match(Index("things_by_b"), 2),
    Match(Index("things_by_c"), 3),
    Match(Index("things_by_d"), 4),
  )
)

Great! It’s more clear now.

One last question would be what about the traverse queries, I mean if I need to get a user from the user table, then his posts from the post table (with user_id foreign key) and then his likes from the post_like table (with post_id foreign key) and all this starts with a SELECT * FROM users WHERE id = X. In this example, can you please explain in detail how is the calculation in TRO or TCO? (I’m asking about a single user and the get all his posts and his posts_likes)

It depends on how you set up your indexes. If indexes return the default value (the document’s ref), then there will be a read op for every post and every like document, plus one more for each new index page that needs fetched in the process. For example, if there are 10 posts and each post has 10 likes, then

   1 TRO  `Get` the user
   1 TRO  `Paginate(Match(Index("posts_by_user"), Var("userRef"))`
  10 TRO  `Get` each post
  10 TRO  `Paginate(Match(Index("likes_by_post"), Var("postRef"))`
+100 TRO  `Get` each like
--------
 122 TRO

This could be lowered significantly, though, by specifying the values for the indexes and avoiding Get.

For TCOs, again you have to look at the query. Each function call is 1/50th of a TCO. If you need to manipulate the data, for example if you need to Merge the like documents into the post documents and into the user document (one big nested document), then it will take more functions and thus more compute. It’s all of course dependent on how many posts per user and likes per post is in your dataset.

This topic was automatically closed 2 days after the last reply. New replies are no longer allowed.