<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[Acko.net]]></title>
  <link href="https://acko.net/atom.xml" rel="self"/>
  <link href="https://acko.net/"/>
  <updated>2026-03-05T12:10:39+01:00</updated>
  <id>https://acko.net</id>
  <author>
    <name><![CDATA[Steven Wittens]]></name>
    
  </author>

  
  <entry>
    <title type="html"><![CDATA[The Database is on Fire]]></title>
    <link href="https://acko.net/blog/the-database-is-on-fire/"/>
    <updated>2020-10-04T00:00:00+02:00</updated>
    <id>https://acko.net/blog/the-database-is-on-fire</id>
    <content type="html"><![CDATA[<div class="c"></div>

<p><img src="https://acko.net/files/on-fire/cover.jpg" style="display: none" alt="cover image" /></p>

<div class="g8 i2"><div class="pad">

<p>Let me tell you a tech story.</p>

<p>Many years ago, I was working on an app with real-time collaboration built into it. It was a neat experimental stack, using early React and CouchDB to their full potential. It synced data live over JSON <a href="https://en.wikipedia.org/wiki/Operational_transformation" target="_blank">OT</a>. It was used to do actual work internally, but the general applicability and potential for re-use was&nbsp;evident.</p>

<p>Trying to sell potential customers on this technology, we ran into an unexpected wall. Our tech looked and worked great in the demo video, no problem there. It showed exactly how it worked, and nothing was faked. We had come up with and coded a credible field&nbsp;scenario.</p>

</div></div>

<div class="g5 mt1">
  <img class="flat" src="https://acko.net/files/on-fire/clip-art-2.png" alt="Two users interacting through a mobile app" />
</div>

<div class="g7 mt1"><div class="pad">

<p>That was in fact the problem. Our demo worked exactly the way everyone else always pretended their apps worked. Namely that information would flow instantly from A to B, even large media files. That everyone would see new records as they came in. That by adopting an app, users could just work together without friction on the exact same things, even if they had a spotty internet connection in the middle of nowhere. This is the sort of thing every After Effects-product-video does&nbsp;implicitly.</p>

<p>Despite the fact that everyone knew what a Refresh button was for, it didn't register at all that the web apps they were asking us to build would normally be subject to its limitations. Or that it would be an entirely different experience if you stopped needing it altogether. What they mainly noticed was that you could "chat" by leaving notes between people, so they wondered how this was different from, say, Slack.&nbsp;Oof.</p>
</div></div>

<div class="g8 i2 mt2"><div class="pad">

<h2>The Design of Everyday Syncs</h2>

<p>If you've been working in software for a while, it can be a bit jarring to remember that most people cannot look at a picture of an interface and infer what it will do when they interact with it. Let alone know what's going to happen behind the scenes. Knowing what <i>can</i> happen is largely a result of knowing what cannot happen, or should not happen. It requires a <a href="https://uxdesign.cc/design-of-everyday-things-chapter-01-summary-and-keypoints-664d426a11e0" target="_blank">mental model</a> not just of what the software is doing, but how its different parts are arranged and can&nbsp;communicate.</p>

<p>A classic example is the user who is staring at a <i>spinner.gif</i> for 20 minutes, wondering when the work will finally be done. The developer would know the process is probably dead and that the gif will never go away. Its animation mimes that work is happening, but it's not connected to the state of the work. This is the sort of thing where some techies like to roll their eyes at just how misinformed a user can be. Notice though, which one of them is pointing at a spinning clock and saying it's actually standing&nbsp;still?</p>

<p class="tc mt3 mb3"><img class="auto sized flat" src="https://acko.net/files/on-fire/spinner.gif" alt="An animated activity spinner" width="48" height="48" style="transition: .25s ease-in-out opacity; opacity: 0.75; cursor: pointer" onclick="this.style.opacity = 0.75 - this.style.opacity" /></p>

<p>This is the value proposition of real-time in a nutshell. These days real-time databases are still mostly underused, and still regarded with suspicion. Most of them lean heavily into NoSQL style, which tends to bring up Mongo-fueled benders people would rather forget. For me though, it meant getting comfy on the CouchDB, as well as learning to design schemas that somebody other than a bureaucrat could correctly fill in. I think my time was better&nbsp;spent.</p>

<p>The real topic of this post is what I'm using these days. Not by choice, but through uncaring and blindly applied corporate policy. Hence a Totally Fair and Unbiased comparison of two closely related Real-Time Database Products By&nbsp;Google™.</p>

<div class="tc mt3 mb3">
  <div class="flex center">
    <img class="inline flat" src="https://acko.net/files/on-fire/firebase.svg" alt="Firebase Real-Time Database" title="Real-Time Database" width="128" height="128" />
    <big><big>+</big></big>
    <img class="inline flat" src="https://acko.net/files/on-fire/firestore.svg" alt="Firebase Cloud Firestore" title="Cloud Firestore" width="128" height="128" />
    <big><big>=</big></big>
    <img class="inline flat" src="https://acko.net/files/on-fire/firebase.png" alt="Firebase" title="Firebase Suite" width="128" height="128" />
  </div>
  <p><i>Anyone need more stickers?</i></p>
</div>

<p>They both have fire in their name. One I regard with fondness. The other is a different kind of fire. If I'm hesitant to name them, it's because once I do, you will run into the first big problem, their&nbsp;names.</p>

<p>One is called the <i>Firebase Real-Time Database</i> and the other is the <i>Firebase Cloud Firestore</i>. They are both products in Google's <i>Firebase suite</i>. The APIs are respectively <code>firebase.database(…)</code> and <code>firebase.firestore(…)</code>.</p>

<p>This is because <i>Real-Time Database</i> is just the original <i>Firebase</i>, before Google bought it in 2014. They then decided to build a <i>copy</i> of Firebase on their big data whatsit, as a parallel offering, and named it Firestore with a cloud. Hope you're still following. If not, no worries, I have tried to rewrite this part a dozen times&nbsp;already.</p>

<p>Cos really you have a <i>Firebase</i> in your Firebase, and a <i>Firestore</i> in your Firebase, or at least you do if you want to be able to make sense of Stack Overflow from a few years&nbsp;ago.</p>

<p>If there is a Raspberry award for worst naming in software products, this surely qualifies. The mental hamming distance between these names is so small it will trip up even seasoned engineers, whose fingers will type one thing as their lips speak another. It will royally fuck up plans made with the best of intentions, fulfilling the prophecy of having your database be on fire. I'm not joking in the slightest. The person who came up with this naming scheme has blood, sweat and tears on their&nbsp;hands.</p>

<p>I will call them <i>Firebase</i>⚾️ and <i>Firestore</i>🧯. The baseball is for Fire<i>base</i>, and the fire extinguisher is for Fire<i>store</i>, because you need&nbsp;one.</p>

</div></div>

<div class="c"></div>

<div class="wide mt3 mb3">
  <img src="https://acko.net/files/on-fire/cover-wide.jpg" alt="A man chasing bugs with a butterfly net in a forest that's on fire" />
</div>

<div class="c"></div>

<div class="g8 i2"><div class="pad">

<h2>Pyrrhic Victory</h2>

<p>You might think <span class="nowrap">Firestore🧯</span> is the <i>replacement</i> for <span class="nowrap">Firebase⚾️</span>, a next-generation successor, but that would be false. <span class="nowrap">Firestore🧯</span> is patently unsuitable as a <span class="nowrap">Firebase⚾️</span> replacement. Somebody has seemingly engineered away almost everything that is interesting about it, and messed up most of the rest in various&nbsp;ways.</p>

<p>A cursory glance at the two offerings will confuse you though, because it seems they do the same thing, through mostly the same APIs and even the very same DB session. The differences are subtle and will only be discovered after careful comparative study of the sparse docs. Or when you're trying to port code that works perfectly fine on <span class="nowrap">Firebase⚾️</span> over to work with <span class="nowrap">Firestore🧯</span>. When you discover that your database interface catches fire as soon as you try to do a mouse drag in real-time. Again, not&nbsp;kidding.</p>

<p><span class="nowrap">Firebase⚾️</span>'s client is polite in that it buffers changes and auto-retries with last-write-wins best-effort consistency. <span class="nowrap">Firestore🧯</span> however has a throttle of 1 write per document per user per second, enforced by the server. It's up to you to implement a rate limiter around it yourself when you use it, even though you're just trying to build your app. So <span class="nowrap">Firestore🧯</span> is a real-time database without a real-time client, wearing the API of one as a skin&nbsp;suit.</p>
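<p>For a sense of the chore involved, here is a minimal sketch of such a rate limiter, not anything the SDK ships. The names <code>createThrottledWriter</code>, <code>writeFn</code> and <code>minIntervalMs</code> are made up for illustration, and <code>writeFn</code> stands in for whatever call actually persists a document. It coalesces rapid updates per document, last write wins, so that at most one write per interval goes out.</p>

```javascript
// Hypothetical helper, not part of any Firebase SDK: coalesces rapid updates
// per document so that at most one write per `minIntervalMs` reaches `writeFn`.
function createThrottledWriter(writeFn, minIntervalMs = 1000) {
  const state = new Map(); // docId -> { lastWrite, pending, hasPending, timer }

  function flush(docId) {
    const s = state.get(docId);
    s.timer = null;
    if (!s.hasPending) return;
    const value = s.pending;
    s.pending = undefined;
    s.hasPending = false;
    s.lastWrite = Date.now();
    writeFn(docId, value);
  }

  return function write(docId, value) {
    let s = state.get(docId);
    if (!s) {
      s = { lastWrite: -Infinity, pending: undefined, hasPending: false, timer: null };
      state.set(docId, s);
    }
    // Last write wins: a newer value replaces any value still waiting.
    s.pending = value;
    s.hasPending = true;
    if (s.timer !== null) return; // a flush is already scheduled
    const wait = Math.max(0, s.lastWrite + minIntervalMs - Date.now());
    if (wait === 0) flush(docId);
    else s.timer = setTimeout(() => flush(docId), wait);
  };
}

// Usage with a fake backend and a shortened 20 ms interval for the demo.
const sent = [];
const write = createThrottledWriter((docId, value) => sent.push([docId, value]), 20);
write("cursor", { x: 1 }); // goes out immediately
write("cursor", { x: 2 }); // held back, then overwritten
write("cursor", { x: 3 }); // sent once the interval has passed
```

<p>Note what this costs you: intermediate states are simply dropped, and error handling and retries are still left as an exercise.</p>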

<p>This is also where you start to see the first hints of the reason for <span class="nowrap">Firestore🧯</span>'s existence. Those words "best-effort consistency" tend to make Grumpy Database Admins' few remaining hairs stand up. I could be wrong about this, but I suspect that somebody high up enough at Google looked at <span class="nowrap">Firebase⚾️</span> after they bought it and simply said: "No. Dear God, No. This is unacceptable. Not on my&nbsp;watch."</p>

</div></div>

<div class="g6">

  <p class="tc"><img src="https://acko.net/files/on-fire/graybeard.jpg" alt="A gray beard monk reading from a tome" /></p>

</div>

<div class="g6"><div class="pad">

<p>They emerged from their chambers and declared:</p>

<p><i>"One big JSON document? No. You will split your data into separate documents, each no more than 1MB in&nbsp;size."</i></p>

<p>This seems like a limit that will not survive first encounter with any sufficiently motivated user base. You know it's true. Like, we have a few 1500+ slide presentations at my current job, which is Perfectly&nbsp;Normal.</p>

<p>Under this constraint, you will be forced to accept that one "document" in your database bears no resemblance to anything a user might refer to as a document. This is the kind of thing that will cause subtle misery from day 1 in attempting to match end-user requirements, and never&nbsp;cease.</p>

</div></div>

<div class="g8 i2"><div class="pad">

<p><i>"Arrays of arrays that can contain other things recursively? No. Arrays will contain only objects, or numbers of a fixed size, as God&nbsp;intended."</i></p>

<p>So if you were hoping to put GeoJSON in your On<span class="nowrap">Firestore🧯</span>, you will discover that's not possible. With anything remotely not 1-dimensional. Hope you like Base64 and/or JSON inside&nbsp;JSON.</p>
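<p>A rough sketch of the JSON-inside-JSON workaround, under my own assumptions: the helper names <code>packGeometry</code> and <code>unpackGeometry</code> and the field name <code>coordinatesJson</code> are invented here, not anything prescribed by the docs.</p>

```javascript
// Hypothetical helpers: stash the nested coordinate arrays in a JSON string,
// so the stored object no longer contains arrays of arrays.
function packGeometry(geometry) {
  const { coordinates, ...rest } = geometry;
  return { ...rest, coordinatesJson: JSON.stringify(coordinates) };
}

// Reverse the packing after reading the object back.
function unpackGeometry(stored) {
  const { coordinatesJson, ...rest } = stored;
  return { ...rest, coordinates: JSON.parse(coordinatesJson) };
}

const polygon = {
  type: "Polygon",
  coordinates: [[[0, 0], [4, 0], [4, 3], [0, 0]]],
};
const stored = packGeometry(polygon);     // safe to store, opaque to queries
const restored = unpackGeometry(stored);  // same shape as `polygon` again
```

<p>The price is that the database can no longer see inside the geometry, so you cannot query or partially update it.</p>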

<p><i>"Import and Export JSON via HTTP, command line tools or the admin panel? No. You will only export and import data to and from Google Cloud Storage. At least I think that's what it's called these days. When I say 'you', I am only talking to those of you with Project Owner permissions. Everybody else go away and file a&nbsp;ticket."</i></p>

<p class="mt3">You see, FireBae🤵⚾️'s data model is straightforward to describe. It contains one big enormous JSON document, which maps JSON keys to URL paths. If you <code>HTTP PUT</code> this to your ⚾️ at <code>/</code>:</p>

<pre><code>{
  "hello": "world"
}</code></pre>
<div class="c"></div>

<p>Then <code>GET /hello</code> will return <code>"world"</code>. For the most part this works as you'd expect. A ⚾️ collection of objects <code>/my-collection/:id</code> is equivalent to a JSON dictionary <code>{"my-collection": {...}}</code> at the root, whose contents are available at <code>/my-collection</code>:</p>

<pre><code>{
  "id1": {...object},
  "id2": {...object},
  "id3": {...object},
  // ...
}</code></pre>
<div class="c"></div>

<p>This works fine as long as each insert has a non-colliding ID, for which you are provided a standard solution.</p>
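<p>The path-to-key mapping described above can be modeled as a plain tree lookup. This is only a sketch of the data model, not the real client or REST API: <code>getAtPath</code> and <code>setAtPath</code> are names I made up, and returning <code>null</code> for a missing path is a modeling choice.</p>

```javascript
// Hypothetical model of "one big JSON document addressed by URL paths".
function splitPath(path) {
  return path.split("/").filter(Boolean);
}

// Read the value stored at a path, or null when nothing is there.
function getAtPath(root, path) {
  let node = root;
  for (const key of splitPath(path)) {
    if (node === null || typeof node !== "object" || !(key in node)) return null;
    node = node[key];
  }
  return node;
}

// Return a new tree with `value` stored at `path`, leaving `root` untouched.
function setAtPath(root, path, value) {
  const keys = splitPath(path);
  if (keys.length === 0) return value;
  const [head, ...rest] = keys;
  const base = root !== null && typeof root === "object" ? root : {};
  return { ...base, [head]: setAtPath(base[head], "/" + rest.join("/"), value) };
}

// "PUT" a document at the root, then add one object to a collection.
let db = setAtPath({}, "/", { hello: "world" });
db = setAtPath(db, "/my-collection/id1", { title: "first" });

const greeting = getAtPath(db, "/hello");
const item = getAtPath(db, "/my-collection/id1");
```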

<p>In other words, ⚾️ is 100% JSON compatible (*) and plays nice with HTTP, like CouchDB. But you mostly use it through its real-time API, which abstracts away the websockets, auth and subscriptions. The admin panel does both, allowing real-time edits and JSON import/export. If you embrace the same in your code, it's quite astonishing how much specialized code disappears once you realize JSON patch and diff can solve 90% of your persistent state handling&nbsp;chores.</p>
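<p>To make the diff half of that concrete, here is a minimal sketch under my own assumptions. <code>diffTree</code> is a hypothetical helper that compares two JSON trees and returns a flat map of path to new value, with <code>null</code> marking a removed key. It treats arrays and primitives as opaque values.</p>

```javascript
// Hypothetical helper: compare two JSON trees and collect the changes as a
// flat map of "/path/to/key" -> new value, using null for removed keys.
function diffTree(before, after, prefix = "", out = {}) {
  const isObj = (v) => v !== null && typeof v === "object" && !Array.isArray(v);
  if (isObj(before) && isObj(after)) {
    const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
    for (const key of keys) {
      const path = prefix + "/" + key;
      if (!(key in after)) out[path] = null;              // removed
      else if (!(key in before)) out[path] = after[key];  // added
      else diffTree(before[key], after[key], path, out);  // maybe changed
    }
  } else if (JSON.stringify(before) !== JSON.stringify(after)) {
    out[prefix || "/"] = after; // leaf or array replaced wholesale
  }
  return out;
}

const patch = diffTree(
  { hello: "world", notes: { id1: { text: "a" }, id2: { text: "b" } } },
  { hello: "world", notes: { id1: { text: "c" } } }
);
// `patch` holds two entries: the changed text and the removed note.
```

<p>The point is that one generic function like this can stand in for a pile of hand-written "save this field" handlers.</p>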

<p>🧯's data model is JSON-like, but diverges in a few critical ways. No arrays of arrays was already mentioned. The model for sub-collections is to have them be first-class concepts, separate from the JSON document holding them. Because there is no readily available serialization for this, getting data in and out requires a specialized code path. You must build your own scripts and tools to handle your own collections. The admin panel only allows you to make small edits, one field at a time, and does not have any import/export.</p>

<p>They took a real-time NoSQL database and turned it into a slow, auto-joined not-SQL with a dedicated not-JSON column. <i>Something something&nbsp;GraftQL</i>.</p>

<p class="tc mt3 mb3"><img src="https://acko.net/files/on-fire/dude-at-desk.jpg" alt="A developer sitting at a computer that's on fire" /></p>

<h2>Hot Java</h2>

<p>If 🧯 was supposed to be more reliable and scalable, then the irony is that the average developer will end up with a less reliable solution than if they had just used ⚾️ out of the box. The kind of software that Grumpy Database Admin has designed for requires a level of effort, and caliber of engineering, that just isn't realistic for the niche they are supposedly good at. It's similar to how HTML5 Canvas is not at all a Flash substitute, if the tools and player aren't there. What's more, it is steeped in an attitude towards data purity and sterile validation that simply doesn't match how the average business user actually <i>likes to get work done</i>, which is that everything is optional because everything is a draft until the very&nbsp;end.</p>

<p>The main flaw in ⚾️ is that the client was created a few years too soon, before you could reasonably expect most web developers to know what immutability was. As a result, ⚾️ assumes you are mutating your data, and will not benefit from any immutability you feed it. It will also not reuse data in the snapshots it gives you, making diffs much harder. For larger documents, its mutable-diff based transactional mechanism is just inadequate. C'mon guys, we have <code>WeakMap</code> in JavaScript now. It's&nbsp;cool.</p>
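<p>Here is a small sketch of what I mean, with hypothetical names throughout (<code>updateIn</code>, <code>countLeaves</code>); none of this is the ⚾️ client's API. If updates copy only the objects along the changed path, untouched subtrees keep their identity, and a <code>WeakMap</code> keyed on those objects lets derived work be skipped for everything that didn't change.</p>

```javascript
// Hypothetical immutable update: copies only the objects along `keys`,
// so every untouched subtree keeps its identity.
function updateIn(tree, keys, value) {
  if (keys.length === 0) return value;
  const [head, ...rest] = keys;
  return { ...tree, [head]: updateIn(tree[head] ?? {}, rest, value) };
}

// Cache derived results per subtree. Entries are keyed by object identity
// and can be garbage collected together with the subtree.
const cache = new WeakMap();
let computed = 0; // counts how many objects were actually processed
function countLeaves(node) {
  if (node === null || typeof node !== "object") return 1;
  if (cache.has(node)) return cache.get(node);
  computed++;
  let total = 0;
  for (const child of Object.values(node)) total += countLeaves(child);
  cache.set(node, total);
  return total;
}

const v1 = {
  slides: { s1: { title: "Intro" }, s2: { title: "Outro" } },
  meta: { owner: "me" },
};
const v2 = updateIn(v1, ["slides", "s1", "title"], "Hello");

countLeaves(v1); // processes all five objects in v1
const before = computed;
countLeaves(v2); // only the copied objects along the changed path are redone
const recomputed = computed - before;
```

<p>The same identity check (<code>===</code>) is what makes diffing two snapshots cheap, which is exactly what you lose when every snapshot is a fresh copy.</p>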

<p>If you give your data the right shape, and keep your trees shallow, you can work around this. But I do wonder whether ⚾️ wouldn't get a whole lot more interesting to people if they just released a really good client API based on leveraging immutability, coupled with serious practical advice on schema design. Instead it's like they tried to fix what wasn't broken, and made it&nbsp;worse.</p>

</div></div>

<div class="g5 mt2 mb2"><img src="https://acko.net/files/on-fire/dude-on-fire.jpg" alt="A developer on fire" /></div>

<div class="g7 mt2"><div class="pad">
  
<p>I don't know the full intent behind building 🧯. Speculating about motives inside the Borg cube is part of the fun. The juxtaposition of these two extremely similar but discapable databases is a relative rarity in the scene. It's almost as if somebody thought <i>"</i>⚾️<i> is just a feature we can emulate on </i><span class="nowrap"><i>&lt;Google </i>☁️<i>&gt;"</i></span> but hadn't yet discovered the concept of gathering real-world requirements, or coming up with useful solutions for all of them. <i>"Let the developers worry about that. Just make the UI pretty... Can we add more&nbsp;fire?"</i></p>

<p>I do know a thing or two about data structures. I can certainly see that "everything is one big JSON tree" is attempting to abstract away any sense of large-scale structure from a database. Expecting the software to just cope with any kind of dubious schema fractal is madness. I really don't need to imagine how bad these things can get, for I have done hostile code audits, and <i>seen things you wouldn't believe</i>. But I also know what good schemas look like, <a href="https://acko.net/blog/apis-are-about-policy">how you can leverage them</a>, and <a href="https://acko.net/blog/software-development-as-advanced-damage-control/">why you should</a>. I can imagine a world where 🧯 seemed absolutely like the right thing to do and where the people who did it think they did a good job. But it's not the one I live&nbsp;in.</p>

</div></div>

<div class="g8 i2"><div class="pad">

<p>⚾️'s support for querying is poor by anyone's standards, borderline non-existent. It certainly could have used improvement, or even a rethink. But 🧯 isn't much better, as it is limited to the same 1-dimensional indices you get in basic SQL. If you want the kinds of queries humans like to do on messy data, that means full text search, multi-range filters, and arbitrary user-defined ordering. Anything that's basic SQL if you squint is itself too limited. Besides, the only SQL queries people can run in production are the fast ones. What you want is a dedicated indexing solution with grown up data structures. For everything else, there ought to at least be incremental map-reduce or&nbsp;something.</p>

<p>If you ask Google's docs about this, you will be helpfully pointed in the direction of things like BigTable and BigQuery. All these solutions however come with so much unadulterated enterprise sales speak, that you will quickly turn back and look&nbsp;elsewhere.</p>

<p>The last thing you need with a real-time database is something made by and for people who work on an executive&nbsp;schedule.</p>

<p class="mt3"><i>(*) Just kidding, there is no such thing as <a href="http://seriot.ch/parsing_json.php" target="_blank">100% JSON&nbsp;compatible</a>.</i></p>


</div></div>

<div class="c mt2"></div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Summer of Code - Ajax Functionality for Drupal]]></title>
    <link href="https://acko.net/blog/summer-of-code-ajax-functionality-for-drupal/"/>
    <updated>2005-09-01T00:00:00+02:00</updated>
    <id>https://acko.net/blog/summer-of-code-ajax-functionality-for-drupal</id>
    <content type="html"><![CDATA[<div class="g8 i2 first"><div class="pad"><aside class="r m1"><img class="natural" src="/files/soc2006/soc.png" alt="" /></aside><p>
This last summer I was sponsored by Google as part of their <a href="http://code.google.com/summerofcode.html">Summer of Code</a> program to work on Drupal. My goal was to introduce various <a href="http://en.wikipedia.org/wiki/AJAX" title="Asynchronous JavaScript and XML">AJAX</a> functionalities to <a href="http://www.drupal.org/">Drupal</a>.
</p>

<p>
The official project description was:
<blockquote><em>"Drupal has recently begun to find meaningful ways to introduce AJAX functionality with the goal of improving the user experience. Work with Drupal's usability experts to identify the next steps and help implement new dynamic functions based on interaction with the XMLHttpRequest object."</em></blockquote>
</p>

<p>
I focused on the following Ajax-powered features:
<ul>
<li><strong>Inline Editing of posts</strong>: Though I built a working prototype module, I decided not to develop this feature further because it is not flexible enough to work as a generic Drupal module. It would break on too many configurations and has limited usefulness anyhow.</li>
<li><a href="http://drupal.org/node/28483">Uploading of files</a>: allows you to attach files to Drupal nodes (with upload.module) without having to reload the page.</li>
<li><a href="http://drupal.org/node/30150">Sorting tables inside a page</a>: this changes the sort order of a table without reloading the entire page. It is not client-side sorting as you'd expect at first sight: because most tables in Drupal are spread across multiple pages, client-side sorting is not very useful.</li>
<li><strong>Switching between multiple pages</strong>: this was implemented on top of the sorting functionality, and only works on paged tables (this covers most of the useful pagers though).</li>
<li><a href="http://acko.net/yay-progress">Progressbar widget</a>: a typical progressbar that fetches the status from the server through Ajax.</li>
</ul>
</p>

<p>
The resulting code can be found in <a href="http://cvs.drupal.org/viewcvs/drupal/contributions/sandbox/unconed/soc/">my sandbox</a> in the Drupal contributions repository. Note however that most of the code is in patches against the (rapidly changing) Drupal HEAD, so they are likely to go out of date soon.
</p>

<p>
The file uploader is now already part of the Drupal HEAD, and at least the tablesorter is sitting in the patch queue being reviewed. I will try and keep them up to date.
</p>

<p>
A big thanks goes to Google for organising the Summer of Code!</p></div></div>
]]></content>
  </entry>
  
</feed>
