A Wanderer's Descent into the Abyss

Or, the Methods and Madness of the Cartography of Chaos


Tags: crdt, database, systems-design, notetaking, draft

Note. Unfinished draft, mainly just a braindump.

The Road to Hell is Paved with Good Intentions

Christ’s Descent into Hell, by Follower of Hieronymus Bosch

“I am a cage, in search of a bird.”

Franz Kafka

It is only fitting that my journey has led me to this point.

The mind is designed to generate ideas rather than to store them. In my case, this seems doubly true: information has long escaped my grasp with disconcerting ease. I have therefore spent considerable time searching for a chalice of remembrance, to capture what would otherwise be lost. But the perfectionist in me remains dissatisfied with the results to date.

My childhood was spent, as any child’s should be, satisfied. Before me sat a sheet of paper, and in my grasp lay a black pen. The venerable pen and paper remain peerless in the realm of flexibility: nothing compares to them in their capacity to bend to the shape of the mind. Yet it was not merit that shackled my hand to the pen and my gaze to the paper, but ignorant bliss. My concerns were scant; the life of a mere child gave no cause to seek anything mightier.

Alas, the fate of man is to be banished from Eden. A child does not remain a child forever.

As befits a hellish penitentiary seized directly from the depths of Kafka’s personal Tartarus (a place one would otherwise refer to as “high school”), I was soon thrust into a fresh world of hurt. No longer were the faithful pen and paper enough, and the stakes soon became too high for abandon. Thus began my first steps out of the garden, in search of the chalice.

Unsurprisingly, my first foray landed me in the realm of Notion. An honourable first try, it was leaps and bounds ahead of mere pen and paper in its organisational capabilities, with the ability to assign arbitrary, structured attributes to pages and summon them with a mere query into tables. In retrospect, it is also rather convenient compared to my later tools of choice: it is always available (granted, only if you have a decent network connection), and sharing and collaborating do not involve an arcane ritual sacrificing the blood of virgins (the same caveat applies). That, though, comes with the territory of being a restrictive, proprietary tool locked into a server. Just like that, one would see the first cracks forming in the foundations of this marvellous glass castle.

If it has not been made painfully obvious, the Achilles’ heel of Notion was offline access (this would later be introduced in August 2025, long after I had discovered far greener pastures). It is also painfully inflexible: it has been designed with a specific workflow in mind, and it is nigh impossible to bend it to fit any other. Nor can I integrate external tools with it without jumping through a few hoops (why can’t I run ripgrep over my own data again?). Worst of all, it is painfully slow. (This is not just my own lack of patience speaking, though that is partly responsible as well: slow application response times can interrupt one’s flow of thought and focus.)

What Notion did show me, however, is that there is a way: perhaps not back to the Garden, but onwards, to the Promised Land. Where numbers and letters may order themselves neatly in a single-file line. Where text may sing to a tune conducted by queries. Where the Adamic language may once again be spoken.

Taking a bite into the forbidden fruit, one begins to fall. Down, and down, into the abyss.

One Must Imagine Sisyphus Happy

Sisyphus by Titian

“Who are you then?”

“I am part of that power which eternally wills evil and eternally works good.”

Johann Wolfgang von Goethe, Faust, First Part

Of course, it would only be honest to preface the following sections with a disclaimer: this endeavour is almost entirely the fruit of my own internal pedant. I’ve tried almost every piece of note-taking-adjacent software available on the market, and there is nothing but trivial nitpicks that I can raise with a lot of them. If I were to be eternally bound to Obsidian or Emacs Org-mode, I would not be dissatisfied.

How Standards Proliferate – xkcd #927

Why then, one might ask, would I go to such lengths to make yet another system? Simply, because I can. By both endeavour and fortune, I have been granted the knowledge and ability to write my own software; would it not be a waste not to put it to use? After all, the best works are not the children of profit, but passion. May this, then, be of such fruitfulness.

There is no further reason beyond that. After all, “we get into the habit of living before acquiring the habit of thinking” (Albert Camus, The Myth of Sisyphus). Life is but a boulder which all must eternally push uphill. Happiness is choosing the most enjoyable boulder to push. And with this endeavour, I am happy.

A Problem Well-Stated Is a Problem Half-Solved

Melencolia I by Albrecht Dürer

“It is foolish to answer a question you do not understand.”

George Pólya, How to Solve It.

In his book, How to Solve It, Pólya argued that there are four steps to solving a problem. One must first understand the parameters of the given problem, namely the unknown, the data, and the condition.

Only then can we seek a plan, which we do by seeking ideas. Every idea transmutes the state or formulation of the problem, and thus we wish to find some conjuration of ideas which transmutes our initial problem to the desired, solved state. (One may notice parallels between this and machine-assisted theorem proving using languages such as Isabelle or Lean, where a theorem is proven by assembling tactics, which are rules for rewriting the formulation of the problem from one state to another.)

A well-formulated plan is for naught if left unattempted – an obvious next step. Here, Pólya emphasised the verification of each step: one may see that the step is correct, but can they also prove so? Proving can be done in one of two ways, either

With regard to this, a personal remark would be on what we should do in the event of realising that our step is incorrect. A falsifying proof is just a formulation of another, auxiliary problem: given the original problem, our progress on solving it, and our incorrect step, along with the falsifying proof, what change should be made to the incorrect step, or even the plan as a whole, to render it correct? Once this auxiliary problem is solved, we may resume the execution of the now-revised plan.

Finally, upon successfully solving the problem, one must review it alongside the newfound solution, in order to reap the fruits of their efforts when the next problem arrives. Similar to how we verify each step during the execution of the plan, we must now verify the result as a whole. One method by which Pólya suggested we do so is by attempting to derive the results in a different manner:

“And as we prefer perception through two different senses, so we prefer conviction by two different proofs: Can you derive the results differently? We prefer, of course, a short and intuitive argument to a long and heavy one: Can you see it at a glance?”

George Pólya, How to Solve It.

After verification, what is left to do is to ask oneself if the results or methods from this problem can be used for some other problem. These will become the ideas from which we derive our plans for our next problem. In other words, whereas planning involves the application, or specialisation, of our ideas, our review involves the abstraction of our results and methods into generalisable ideas. Likewise, this is analogous to the formulation of computation in $\lambda$-calculus as a term defined inductively, consisting of a variable $x$, an abstraction $\lambda x.\; t$, and an application $t\; s$.
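For reference, the inductive definition alluded to here is conventionally written as a grammar, where $x$ ranges over variables and $t$, $s$ over terms:

$$t, s ::= x \mid \lambda x.\; t \mid t\; s$$

Read through the analogy above, an abstraction $\lambda x.\; t$ plays the part of the review (a particular result generalised over $x$), while an application $t\; s$ plays the part of the plan (a general idea specialised to the problem at hand).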

Hence, we hereby seek to solve the problem: What makes a good note-taking system, and how does one implement it? The remainder of this article will involve the first two steps of problem-solving: we will examine the stated problem, and devise a plan to solve it. Later articles (once the code has been written) will cover the latter two, namely the execution and review steps.

The Abyss Must First Be Measured

The Ancient of Days, by William Blake

“Beware that, when fighting monsters, you yourself do not become a monster. For when you gaze long into the abyss, the abyss gazes also into you.”

Friedrich Nietzsche, Beyond Good and Evil.

We begin with an examination of the problem itself. What is the unknown? Why, it is a specification of what constitutes a good note-taking system, of course. Then, what are our data? Well… nothing. Everything must start from somewhere, I suppose. Let us do so, then, with the construction of an auxiliary element (George Pólya, How to Solve It, Part III. Short Dictionary of Heuristic), in the form of the most fundamental definition of a note-taking system.

I am of the opinion that there are two otherwise equally correct approaches to the specification of a system: either top-down, i.e. from a purely logical view, going down the layers of abstraction to a desired level, or bottom-up, which is its inverse.

Personally, I am more partial to the former, as the logical view is an arguably concrete upper ceiling to the layers of abstraction. The inverse is not necessarily true; while by most definitions, assembly and binary would be the bottom floor of abstraction for control and data respectively, there are vanishingly few cases in which one would need to trouble themselves with such low-level implementation details. Thus, while one would always have to consider the logical implementation of a system, it is on a case-by-case basis that one decides what their floor of abstraction is.

We will therefore begin with the logical definition of a note-taking system. The goal here is to distil the definition down to pure mechanism, without any policy. (As an aside, the separation of mechanism and policy is a core principle in the architecture of microkernels.) In other words, our definition should be able to model any policy or system, e.g. Zettelkasten, Cornell, etc. Sparing all but the most fundamental requirements, a note-taking system is a store of information, into which a user can enter information and from which they can retrieve it.

Here, Pólya suggests introducing suitable notation; one should find it adequate to denote a note-taking system $S$ as a quadruple

$$S = (\Sigma, \Alpha, \oplus, \rho),$$

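where $\Sigma$ is the state space of possible stores, $\Alpha$ is the set of atoms of information, $\oplus$ is the input function, and $\rho$ is the reduction function, with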

\begin{align*}
\oplus&: \Sigma \times \Alpha \to \Sigma \\
\rho&: \Sigma \times \Alpha \to \Sigma.
\end{align*}

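A key difference in the invariants maintained between the input function $\oplus$ and the reduction function $\rho$ is that the output of $\oplus$ is always a superset of the store it was given, whereas the output of $\rho$ is always a subset of it.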

We may observe that there exists a necessary relation between $\Sigma$ and $\Alpha$: since $\Sigma$ is a store of one or more $\Alpha$, it must be some structure of $\Alpha$. That is, $\Sigma$ must be defined with respect to $\Alpha$. (It may be possible to formalise this as a term algebra, such that $\Sigma$ is some term algebra $\mathcal{T}(\Alpha)$ over $\Alpha$.)
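To make the quadruple a little less ethereal, here is a minimal sketch in Python. Everything concrete in it is an assumption made for illustration: atoms are plain strings, a store is a frozenset of atoms, and the reduction function interprets its atom as a substring query. The names `Atom`, `Store`, `oplus`, and `rho` are likewise hypothetical.

```python
from typing import FrozenSet

Atom = str                # assumption: an atom is a plain string
Store = FrozenSet[Atom]   # assumption: a store is an unordered set of atoms


def oplus(store: Store, atom: Atom) -> Store:
    """Input function: return a store that also contains `atom`."""
    return store | {atom}


def rho(store: Store, query: Atom) -> Store:
    """Reduction function: keep only the atoms that contain `query`.

    Treating the atom as a substring query is an illustrative choice only.
    """
    return frozenset(a for a in store if query in a)


empty: Store = frozenset()
notes = oplus(oplus(empty, "lethe: river of oblivion"), "styx: river of oaths")
found = rho(notes, "lethe")

# The input never shrinks the store; the reduction never grows it.
assert empty <= notes
assert found <= notes
```

Note that `rho` receives its query as an atom, just as `oplus` receives its content as one; the sketch makes no distinction between the two beyond how each function chooses to read it.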

As this is our only construct so far, this will be one of our axioms. We must, therefore, rely on intuition to convince ourselves of its correctness. We may do so by drawing comparisons with existing systems and software. Let us begin with the most basic system, the pen and paper. In a pen-and-paper system, there exist two operations, namely writing and reading.

Therefore, we may map the paper to $\Sigma$ as a structure of inscriptions of ink, rendering the latter as our $\Alpha$. Writing, as the addition of ink to the paper, can be mapped to a function taking the paper and an inscription as inputs, and returning a paper with it inscribed: a superset of the original sheet. It is therefore our $\oplus$.

However, it may be less obvious how reading maps to $\rho$. But it may become more apparent if we introduce a new sheet of paper as an auxiliary construct. On it is inscribed some instruction, say, “find the opening paragraph of Beowulf”. One may therefore read said instruction, find the opening paragraph of Beowulf, and inscribe it verbatim on a new sheet of paper. Hence, one would have taken in a piece of paper (the text of Beowulf) and an inscription (the instructions to find its opening paragraph), and output another sheet of paper with the instructed content. That it is a subset of the original text should be apparent.

Mapping this onto a digital system should also be straightforward: instead of inscriptions, it is bytes, and instead of sheets of paper, it is CPU caches, RAM, or hard drives. But then, how does this hold up to the logical abstractions built upon it?

To Drink from the River Lethe

The Waters of the Lethe by the Plains of Elysium by John Roddam Spencer Stanhope

Fade far away, dissolve, and quite forget

What thou among the leaves hast never known.

John Keats, Ode to a Nightingale

One noticeable difference between content on paper and on the computer is the ability to erase. If one makes a mistake, it is a mere keystroke away to delete it. How does this map to our system?

Interestingly, there are two viable mappings for erasure. Both are equally satisfactory with respect to the definition, but have significantly different implications. We may first notice the straightforward mapping to our $\rho$ function, which, given some atom $\alpha \in \Alpha$, returns a strict subset $\sigma'$ of the original store $\sigma$. Deletion may therefore be defined as some function $\text{del}: \mathbb{S} \times \Alpha \to \mathbb{S}$, such that

$$\text{del} = \lambda \alpha\; S.\; (\rho\; (f\; \alpha)\; \sigma, \Alpha, \oplus, \rho) \text{ where } \sigma \in S,$$

where $f$ is some function mapping the atom $\alpha$ to the correct expression for its deletion. Therefore, given some $S = (\sigma, \Alpha, \oplus, \rho)$ and $S' = (\sigma', \Alpha, \oplus, \rho)$ such that $S' = \text{del}\; \alpha\; S$, we have that $\alpha \in \sigma$ but $\alpha \notin \sigma'$.
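A minimal sketch of this first, destructive mapping follows. It carries only the store around rather than the whole quadruple, and it assumes (hypothetically) that atoms are `(kind, payload)` pairs and that `f` produces a `"drop"` expression which `rho` knows how to interpret. None of these names come from the definition above.

```python
from typing import FrozenSet, Tuple

Atom = Tuple[str, str]    # assumption: (kind, payload) pairs
Store = FrozenSet[Atom]


def f(atom: Atom) -> Atom:
    """Map an atom to the (hypothetical) expression asking for its removal."""
    return ("drop", atom[1])


def rho(store: Store, expr: Atom) -> Store:
    """Reduction: this sketch only understands 'drop' expressions."""
    kind, payload = expr
    if kind == "drop":
        return frozenset(a for a in store if a != ("note", payload))
    return store


def delete(atom: Atom, store: Store) -> Store:
    """Destructive deletion: a subset of `store` that no longer holds `atom`."""
    return rho(store, f(atom))


sigma: Store = frozenset({("note", "a"), ("note", "b")})
sigma_prime = delete(("note", "a"), sigma)

assert ("note", "a") in sigma
assert ("note", "a") not in sigma_prime
assert sigma_prime < sigma    # a strict subset: nothing remembers the deletion
```

Once `delete` returns, the old atom survives only for as long as someone keeps a reference to `sigma`; the new store itself holds no trace of it.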

An astute reader may make two observations from the above, namely that

These observations may hint at the construction of an alternative mapping for deletion: why not simply add the atom for the deletion operation into the store? This is plausible given the right structural representation of $\Sigma$, which we will get to later. (Hint: it’s CRDTs.)

As such, we may also map deletion as

$$\text{del} = \lambda \alpha\; S.\; (\oplus\; (f\; \alpha)\; \sigma, \Alpha, \oplus, \rho) \text{ where } \sigma \in S.$$

The consequence of the above construct is that deletion is a reversible operation: one may insert another atom that deletes said deletion atom, or reduce the store with $\rho$ to permanently remove said atom.
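The reversibility can be sketched as follows, under assumptions that go beyond the text: the store is modelled as an append-only tuple, every atom's `id` is its position in that tuple, and a deletion atom may only target an atom that precedes it. The `Atom` fields and the helpers `write`, `delete`, and `visible` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Atom:
    id: int            # assumption: the atom's position in the store
    kind: str          # "note" or "del" (hypothetical kinds)
    body: str = ""
    target: int = -1   # for "del" atoms: id of the atom being deleted


Store = Tuple[Atom, ...]


def oplus(store: Store, atom: Atom) -> Store:
    """Input function: append, never remove."""
    return store + (atom,)


def write(body: str, store: Store) -> Store:
    return oplus(store, Atom(id=len(store), kind="note", body=body))


def delete(target: int, store: Store) -> Store:
    """Non-destructive deletion: record a deletion atom in the store."""
    return oplus(store, Atom(id=len(store), kind="del", target=target))


def visible(store: Store) -> List[str]:
    """Bodies of the notes not struck out by a live deletion atom."""
    dead = set()
    # Walk newest to oldest: by the time an atom is reached, every later
    # atom that could have deleted it has already been seen.
    for atom in reversed(store):
        if atom.kind == "del" and atom.id not in dead:
            dead.add(atom.target)
    return [a.body for a in store if a.kind == "note" and a.id not in dead]


s0 = write("remember the milk", ())
s1 = delete(0, s0)   # atom 1 deletes the note
s2 = delete(1, s1)   # atom 2 deletes the deletion

assert visible(s0) == ["remember the milk"]
assert visible(s1) == []
assert visible(s2) == ["remember the milk"]
```

Since nothing is ever removed in this model, any prefix of the tuple (for instance `s2[:2]`) is itself a valid earlier store.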

To give shape to what would otherwise be a highly abstract notion, we may analogise

Hence, we may have arrived at a satisfactory intuition on the correctness of our construct $S$, but there is another, profound consequence of it: the expressibility of actions as atoms $\alpha \in \Alpha$ grants us the ability to store not just space, but time. With the second construction of $\text{del}$, any action can be reversed at any given time.


It is written in the Aeneid that the weary shades of the dead may only earn their reincarnation by drinking from the river Lethe, washing away their memories: it is only through oblivion that one may be reborn. Oblivion, then, is the conclusion of one life, and a presage to the beginning of the next.

Now, in a secret vale, the Trojan sees

A sep’rate grove, thro’ which a gentle breeze

Plays with a passing breath, and whispers thro’ the trees;

And, just before the confines of the wood,

The gliding Lethe leads her silent flood.

About the boughs an airy nation flew,

Thick as the humming bees, that hunt the golden dew;

In summer’s heat on tops of lilies feed,

And creep within their bells, to suck the balmy seed:

The winged army roams the fields around;

The rivers and the rocks remurmur to the sound.

Aeneas wond’ring stood, then ask’d the cause

Which to the stream the crowding people draws.

Then thus the sire: “The souls that throng the flood

Are those to whom, by fate, are other bodies ow’d:

In Lethe’s lake they long oblivion taste,

Of future life secure, forgetful of the past.

Long has my soul desir’d this time and place,

To set before your sight your glorious race,

That this presaging joy may fire your mind

To seek the shores by destiny design’d.”—

“O father, can it be, that souls sublime

Return to visit our terrestrial clime,

And that the gen’rous mind, releas’d by death,

Can covet lazy limbs and mortal breath?”

Virgil, Aeneid, Book VI.

It’s an apt metaphor for what we seek to do. In a manner that I believe is not too dissimilar from a language model, the human mind is but a finite chalice. There is only so much that we can remember, and to be forced to remember our current endeavour is to be barred from proceeding to our next. To be freed from our endeavour, and to proceed to our next, we must be allowed to forget. Yet to release us from remembrance, another must carry that burden in our stead.

What we seek to build, therefore, is our river Lethe, washing our memories, carrying them in its ebb and flow. We may then journey onwards, our burdens light, and when the time is right, may we be granted another encounter with our parted memories further down its stream, in a time when we need them the most.


With that said, the motivation for declaring the second construction of the deletion operation as a sensible default should be apparent. It is non-destructive: it is always reversible, and its traces are fully recorded in the state of $S$. We will take this a step further and declare that reversible actions should be the default. We want our system to remember.

That is not to say that the first, destructive construction does not have its place. Like the mind, the computer is of finite capacity; it, too, needs to forget. Hence, destructive, irreversible actions are to be judiciously used to cope with the limitations of the computer. If our storage has reached its limits, then we must carefully choose what to let go. But this is ideally to be avoided, as what is let go cannot be retrieved again.
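One way such judicious forgetting might look is a compaction pass: a reduction that permanently drops every deleted note together with the deletion atoms themselves. The sketch below is only an assumption about how this could be done; it models the store as an append-only tuple in which a deletion atom targets an earlier atom by `id`, and the names `Atom` and `compact` are hypothetical.

```python
from typing import NamedTuple, Tuple


class Atom(NamedTuple):
    id: int
    kind: str          # "note" or "del" (hypothetical kinds)
    body: str = ""
    target: int = -1   # for "del" atoms: id of the atom being deleted


Store = Tuple[Atom, ...]


def compact(store: Store) -> Store:
    """Destructive reduction: keep only the notes that are still live."""
    dead = set()
    # Newest to oldest, so a deleted deletion atom is known to be dead
    # before it gets the chance to strike out its own target.
    for atom in reversed(store):
        if atom.kind == "del" and atom.id not in dead:
            dead.add(atom.target)
    return tuple(a for a in store if a.kind == "note" and a.id not in dead)


store: Store = (
    Atom(0, "note", "keep me"),
    Atom(1, "note", "discard me"),
    Atom(2, "del", target=1),
)
compacted = compact(store)

assert compacted == (Atom(0, "note", "keep me"),)
assert len(compacted) < len(store)   # space reclaimed, history lost
```

After compaction there is no deletion atom left to delete, so the discarded note cannot be brought back; this is exactly the trade-off described above.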

A Cartography of State

We have arrived at the most abstract representation of the note-taking system. A suitable next step would be to iteratively examine each component $\Sigma$, $\Alpha$, $\oplus$, and $\rho$ in $S$; in more colloquial terms, we seek to flesh out our design, solidifying it in place.

There are two approaches with which we could iterate through this design, corresponding to the two fundamental graph exploration algorithms: a breadth-first search and a depth-first search. That is, we could either iterate through every component at the current level of abstraction before proceeding to examine their sub-components further, or focus on one sub-component until we reach our most concrete level of abstraction. Here, I will choose to perform the equivalent of a breadth-first search, as I claim that the components in each level depend upon one another. For instance, it should be self-explanatory that the design of the input and reduction functions depends on that of the state $\Sigma$ as their medium of “communication”. Hence, decisions on one component heavily influence the trade-offs of others, and we must consider them thoroughly before proceeding down the layers. I conjecture that each iteration restricts the space of possible designs, and that there exists a threshold at which such a space is sufficiently restricted that the variations will not cause any significant discrepancy in behaviour. It is at this threshold that we will begin the implementation. Of course, this is not in any way a formal or rigorous definition (and I claim that one cannot be constructed with any reasonable amount of effort), hence it is up to our own discretion to determine where that threshold lies. That is, rather unsatisfactorily, we will know it when we get there.

I will arbitrarily select the atoms $\Alpha$ and state space $\Sigma$ as the first components to examine closer. Their design is central with respect to that of the input and reduction functions $\oplus$ and $\rho$, as they form the structure on which these functions operate. Furthermore, these two components are heavily related to each other: note that $\Sigma$ is literally defined in terms of some structuring of $\alpha \in \Alpha$. Our first task is therefore to formalise this relation between $\Alpha$ and $\Sigma$.
