@ptpaterson thanks again for that link to Consistency without Clocks. I just read through it, and I could be wrong, but I think there is a mistake in the image directly under the line “In the case of T11, the original reads are different from the reads of the correct snapshot:”. It shows Washington DC with “T10’s Buffered Writes”, but shouldn’t that be “T11’s Buffered Writes”?
(Just some thoughts on things - not important to read if you are busy…)
Learning about the distributed transaction log reminded me of the solution I came up with to make sure clients got the most recent data, in a PHP/MySQL application I wrote 15 years ago (and have kept updating with better data syncing ever since). At first I just used a timestamp of the last update, but that was unreliable: reads could skip writes that were not yet complete at the time of the read. I then changed the server code to fetch data from the timestamp minus 5 seconds, which sometimes resulted in duplication, so the server had to keep a per-client cache of what it had already seen (the cache itself being written to the DB and read back on each reconnect) and only pass new entries back to the client. Even that could occasionally miss writes, and it was not serialisable.

What I eventually devised was to write every update to anything in the DB to a log table and, rather than use a timestamp, use the primary key of that table to track what had and had not been read. Since primary keys (in SQL) are assigned in sequence, a gap in the sequence meant a write had not yet been read, so the server could wait a bit and re-read the log until it had a complete block of consecutive primary keys. Once it had a chunk of log entries in sequence, it could read the data they related to and pass that back to the client. Time was no longer important. Of course, this still didn’t necessarily pass the serialisability test, but it was much better.
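In case it helps anyone reading along, here is a rough sketch of that gap-detection idea. I’ve used Python and SQLite just to keep it self-contained (the real thing was PHP/MySQL), and the table and column names (`change_log`, `record_ref`) are made up for the example rather than taken from the actual app:

```python
import sqlite3
import time

# Stand-in for the MySQL log table described above: every write to any table
# also inserts a row here, so the auto-increment id gives an ordering of writes.
SCHEMA = """
CREATE TABLE IF NOT EXISTS change_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    record_ref TEXT NOT NULL
);
"""

def fetch_contiguous_changes(conn, last_seen_id, retries=5, delay=0.2):
    """Return the log entries after last_seen_id whose ids form an unbroken
    sequence. A gap means a write has claimed an id but is not visible yet,
    so wait and re-read the log rather than skip it."""
    for _ in range(retries):
        rows = conn.execute(
            "SELECT id, record_ref FROM change_log WHERE id > ? ORDER BY id",
            (last_seen_id,),
        ).fetchall()
        contiguous = []
        expected = last_seen_id + 1
        for row_id, record_ref in rows:
            if row_id != expected:
                break  # gap in the sequence: an in-flight write we can't see yet
            contiguous.append((row_id, record_ref))
            expected += 1
        if contiguous:
            return contiguous  # safe prefix; anything past a gap waits for the next poll
        time.sleep(delay)  # nothing new and contiguous yet, so wait and retry
    return []

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO change_log (record_ref) VALUES ('orders:42')")
    conn.execute("INSERT INTO change_log (record_ref) VALUES ('users:7')")
    print(fetch_contiguous_changes(conn, 0))  # -> [(1, 'orders:42'), (2, 'users:7')]
```

The caller then advances its last-seen id to the last entry in the returned block, looks up the records those entries refer to, and ships them to the client.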
The latest design I had (which I never put into production, because I came across Fauna) was to use Redis for the log. That would reduce the load on the DB server, since web servers would only query the DB when they saw new log entries in Redis, and if the replicated DB servers were not quite in sync, the log would help with that too.
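Purely as an illustration of that never-built design (the Redis key name, the redis-py client, and the helper function are my own assumptions for the sketch), the idea was roughly:

```python
import redis  # the redis-py client (pip install redis); an assumption for this sketch

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Writer side: after inserting a row into the MySQL log table, publish its id
# to Redis so readers can tell new log entries exist without querying MySQL.
def record_write(log_id: int) -> None:
    r.set("change_log:max_id", log_id)

# Web-server side: only hit the database when Redis says the log has grown.
def maybe_fetch_updates(last_seen_id: int):
    max_id = int(r.get("change_log:max_id") or 0)
    if max_id <= last_seen_id:
        return []  # nothing new, so skip the DB round trip entirely
    # query_log_from_db is a hypothetical helper: the gap-aware log read
    # sketched above, run against the MySQL server.
    return query_log_from_db(last_seen_id)

def query_log_from_db(last_seen_id: int):
    raise NotImplementedError  # placeholder; not part of the original design notes
```

The writer would still insert into the MySQL log table as before; Redis would just let the web servers avoid polling the database when nothing had changed.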
Fauna is one of the most amazing database systems ever put together: transactionally ACID compliant, serialisable, distributed, with non-blocking reads, and it’s fast. I had fully designed a new system with Apache, PHP and SQL (the usual) and was about to start building it when I came across Svelte (and Rich Harris’s demo from a few years back), which led me to Cloudflare, and then to Fauna. My world changed forever.
I am so curious how Fauna has been implemented at the hardware and software level…
This is an aside, but are you going to set up servers in Asia at some point? I’m in Japan, and the service works very well, but I hope that one day you’ll get servers set up here too. Thanks!