For my application, I have a collection of surveys completed by users. The ultimate goal is to gather statistics on all submitted surveys, e.g. how many surveys were submitted in total, how many people answered x on a given question, how many people who answered x on one question also answered y on another, and so on.
I thought of two different approaches. Approach 1 would be to index all the surveys, retrieve them all, and generate the statistics by iterating through the list of surveys that comes back in the response. While this would keep all my stats accurate, I believe it would take longer as the application scales and more surveys are returned. If I get more than 100k surveys, the stats will no longer be accurate, because the limit will be hit. At that point, it would also be very computationally expensive to calculate any statistics.
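To make Approach 1 concrete, here is a minimal sketch in plain Python of the client-side pass over the returned surveys. The survey shape (a dict with an "answers" mapping) and the names survey_stats, q1, and q2 are assumptions made up for illustration, not anything from my actual schema or from the database's API.

```python
from collections import Counter

def survey_stats(surveys, question, other_question):
    """One pass over all surveys: total, per-answer counts, and a cross-tab."""
    total = 0
    answer_counts = Counter()   # answer to `question` -> number of surveys
    cross_counts = Counter()    # (answer to `question`, answer to `other_question`) -> count
    for survey in surveys:
        total += 1
        answers = survey["answers"]  # hypothetical field name
        if question in answers:
            answer_counts[answers[question]] += 1
            if other_question in answers:
                cross_counts[(answers[question], answers[other_question])] += 1
    return total, answer_counts, cross_counts

# Hypothetical data standing in for the list returned by the index.
surveys = [
    {"answers": {"q1": "x", "q2": "y"}},
    {"answers": {"q1": "x", "q2": "z"}},
    {"answers": {"q1": "w", "q2": "y"}},
]
total, by_answer, cross = survey_stats(surveys, "q1", "q2")
```

Every call walks the full list, so the cost grows linearly with the number of surveys, which is the scaling concern described above.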
For the second approach, I could create a new collection called survey_statistics with a single document in it. Every time a survey is created, I would increment an integer value on that document. For example, I could have a key such as totalSurveys that starts at 0 and is incremented every time a survey enters the database. This way, retrieving the statistics would be fairly quick, as it only involves reading one document with the calculations already completed. Intuitively, this method seems like the way to go. However, I am concerned that the statistics pulled from this document could at some point become inaccurate. If a survey were ever stored successfully but the function that updates the statistics document failed, the document would contain incorrect information when it is retrieved. Is there any way to ensure that both operations complete? I thought about using the Do() function so that everything occurs sequentially (the survey is stored first, then the stats are updated second). However, I do not know whether Do() ensures that both steps of the process will complete.
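To show the guarantee I am after, here is a minimal in-memory sketch in plain Python of an all-or-nothing write: either the survey is stored and totalSurveys is incremented, or neither change survives. The SurveyStore class and the fail_stats_update flag are hypothetical, invented only to simulate the failure case. This is not the database's API and says nothing about how Do() actually behaves.

```python
import copy

class SurveyStore:
    """In-memory stand-in for the surveys collection plus the single stats document."""

    def __init__(self):
        self.surveys = []
        self.stats = {"totalSurveys": 0}

    def add_survey(self, survey, fail_stats_update=False):
        # Snapshot both pieces of state so a failure in either step can be undone.
        snapshot = (list(self.surveys), copy.deepcopy(self.stats))
        try:
            self.surveys.append(survey)            # step 1: store the survey
            if fail_stats_update:                  # simulated failure between the steps
                raise RuntimeError("simulated failure while updating statistics")
            self.stats["totalSurveys"] += 1        # step 2: update the statistics
        except Exception:
            # Roll back so the stored surveys and the counter never disagree.
            self.surveys, self.stats = snapshot
            raise

store = SurveyStore()
store.add_survey({"q1": "x"})
try:
    store.add_survey({"q1": "y"}, fail_stats_update=True)
except RuntimeError:
    pass  # the failed write left no partial state behind
```

After the failed second call, the store still holds exactly one survey and totalSurveys is still 1, which is the invariant I would want the real database to enforce.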
Any thoughts? If I am not on the right track, is there a more efficient approach that anyone else has come up with?