Replicache Internals - Behind the scenes of store.subscribe
This is a follow-up to my post giving a basic overview of Replicache. This is what typical subscription code from the earlier post looks like:
import { store } from "./init-store";

const todoId = "123";
const todoSelector = async (tx) => {
  // return the todo, or null if it does not exist
  return (await tx.get(`todo/${todoId}`)) ?? null;
};

// calling subscribe returns an unsubscribe function
const unsub = store.subscribe(todoSelector, (todo) => {
  console.log("my subscribed todo", todo);
});

// you can also get a one-off value without subscribing
const todo = await store.get(todoSelector);

// or for subscribing to all todos
store.subscribe(
  async (tx) => {
    const todos = [];
    for await (const { key, value } of tx.scan({ prefix: "todo/" })) {
      todos.push({ id: key.split("/")[1], ...value });
    }
    return todos;
  },
  (todos) => {
    // this is called whenever the todos change
    console.log("todos", todos);
  },
);
At first glance, this might appear inefficient, since the subscription seems to run for every change in the store. That is not the case. Replicache has a built-in, Reselect-like mechanism for selector functions: a selector is re-run only if the keys it accessed during its last run have changed. Because of this internal key tracking, even though the code above builds a new todos array from the scan, the selector won’t re-run if no todos change, which prevents unnecessary subscription callbacks. To do something like this in the Redux/Zustand world, we would either have to break the selector down (to use Reselect) or use custom equality check functions.
Diving into the codebase
- Each key accessed when the subscription’s body (the first argument to subscribe) runs is added to an array by the transaction object (the tx argument on which get is called).
- The top-level Replicache class has a reference to the SubscriptionsManager class, which contains a set of all subscriptions, each tracking the keys and scans that its body function accessed.
- The body function is invoked initially on add.
- Thereafter, on any change (i.e. on write, refresh and pull), a DiffsMap object is calculated, and only the body functions that accessed keys in the diff are re-evaluated. For this reason, the body must be a pure function.
- Internally, the diff above is calculated on the BTree representation of the data.
- Subscribe only re-fires the onData callback (the second argument) when the result of the body function changes. The default isEqual comparison is a deep JSON check.
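To make the flow above concrete, here is a minimal sketch of key-tracked subscriptions. It is not Replicache's actual code: MiniStore, TrackingTx and put are hypothetical names, everything is synchronous (Replicache's transactions are async), the "diff" is just a list of changed keys rather than a DiffsMap computed over a BTree, and the equality check is a plain JSON.stringify comparison.

```javascript
// Hypothetical, simplified model of key-tracked subscriptions.
// Not Replicache's implementation; synchronous for brevity.

// Wraps the data and records every key and scan prefix the body reads.
class TrackingTx {
  constructor(data) {
    this.data = data;
    this.keys = new Set();
    this.prefixes = [];
  }
  get(key) {
    this.keys.add(key);
    return this.data.get(key);
  }
  scan({ prefix }) {
    this.prefixes.push(prefix);
    return [...this.data.entries()]
      .filter(([key]) => key.startsWith(prefix))
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([key, value]) => ({ key, value }));
  }
}

// Stand-in for a deep JSON check (sensitive to property order).
const isEqual = (a, b) => JSON.stringify(a) === JSON.stringify(b);

class MiniStore {
  constructor() {
    this.data = new Map();
    this.subscriptions = new Set();
  }

  subscribe(body, onData) {
    const sub = { body, onData, keys: new Set(), prefixes: [], last: undefined, hasRun: false };
    this.subscriptions.add(sub);
    this.run(sub); // the body is invoked once when the subscription is added
    return () => this.subscriptions.delete(sub);
  }

  // Re-evaluates the body, refreshes its tracked reads, and calls onData
  // only if the result differs from the previous one.
  run(sub) {
    const tx = new TrackingTx(this.data);
    const result = sub.body(tx);
    sub.keys = tx.keys;
    sub.prefixes = tx.prefixes;
    if (!sub.hasRun || !isEqual(result, sub.last)) {
      sub.hasRun = true;
      sub.last = result;
      sub.onData(result);
    }
  }

  put(key, value) {
    this.data.set(key, value);
    this.fire([key]); // the "diff" here is just the list of written keys
  }

  // Re-runs only the subscriptions whose tracked keys or scans overlap the diff.
  fire(changedKeys) {
    for (const sub of this.subscriptions) {
      const affected = changedKeys.some(
        (key) => sub.keys.has(key) || sub.prefixes.some((p) => key.startsWith(p)),
      );
      if (affected) this.run(sub);
    }
  }
}

const store = new MiniStore();
let bodyRuns = 0;
const seen = [];
store.subscribe(
  (tx) => {
    bodyRuns++;
    return tx.get("todo/123") ?? null;
  },
  (todo) => seen.push(todo),
);
store.put("user/1", { name: "A" }); // untracked key: body is not re-run
store.put("todo/123", { text: "write" }); // tracked key: body re-runs, onData fires
store.put("todo/123", { text: "write" }); // body re-runs, result is equal, onData is skipped
console.log(bodyRuns, seen.length); // 3 2
```

The sketch also shows why the body needs to be pure: the store decides whether to re-run it purely from the keys and scans recorded on the previous run, so a body that read from anywhere else would go stale without the store noticing.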
Final Notes
Replicache’s source code is very well written. While looking more into Replicache, I also came across Noms and a pretty cool data structure it uses called Prolly Trees. A Prolly Tree is a probabilistic tree with the combined benefits of a Merkle tree and a B-Tree. The parent nodes store hashes of their child nodes (which makes it easy to check the equality of two trees). And the same set of keys leads to the same graph irrespective of insertion order (history independence), which makes it act like a sorted set. This is what makes fast diff, sync and merge possible in Noms. I will likely write a detailed post on the data structure in the future. You can read up on Prolly Trees in the Noms docs.
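As a toy illustration of those two properties (parents hashing their children, and history independence), here is a small sketch. It is not how Noms implements Prolly Trees: buildRootHash, chunk and hash are hypothetical helpers, the tree is rebuilt from scratch instead of being updated incrementally, only hashes are kept rather than real nodes, chunk boundaries come from a per-item FNV-1a hash with an assumed minimum chunk size of 2, and a real system would use a cryptographic hash.

```javascript
// Toy sketch of Merkle-style hashing with content-defined chunk boundaries.
// Assumption-laden illustration only, not the Noms implementation.

// FNV-1a: a small non-cryptographic string hash, returned as 8 hex digits.
function hash(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Splits a sequence into chunks. A chunk is closed when it holds at least
// 2 items and the hash of the current item falls on a boundary, so the
// boundaries depend on the content, not on the order in which it arrived.
function chunk(items) {
  const chunks = [];
  let current = [];
  for (const item of items) {
    current.push(item);
    const isBoundary = parseInt(hash(item), 16) % 4 === 0;
    if (current.length >= 2 && isBoundary) {
      chunks.push(current);
      current = [];
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Builds the tree bottom-up and returns the root hash. Each parent is the
// hash of its children's hashes, so equal root hashes indicate equal trees.
function buildRootHash(pairs) {
  let level = [...pairs]
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map((pair) => JSON.stringify(pair));
  if (level.length === 0) return hash("");
  do {
    level = chunk(level).map((node) => hash(node.join("\n")));
  } while (level.length > 1);
  return level[0];
}

const a = buildRootHash([["todo/1", "milk"], ["todo/2", "eggs"], ["todo/3", "tea"]]);
const b = buildRootHash([["todo/3", "tea"], ["todo/1", "milk"], ["todo/2", "eggs"]]);
const c = buildRootHash([["todo/1", "milk"], ["todo/2", "bread"], ["todo/3", "tea"]]);
console.log(a === b); // true: same entries, different insertion order
console.log(a === c); // expected to differ, barring a hash collision
```

Because every level is derived only from the sorted content, comparing two root hashes is enough to tell whether two stores hold the same data, and a real implementation can walk down only into the subtrees whose hashes differ.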