The Technical Architecture of the Quantum Cats

Quantum Cats is a collection of 3333 Ordinals Inscriptions that evolve over time to reveal different artwork. It is the first ever collection of Inscriptions designed to evolve over time, and it was created in a time of high fees and an unpredictable future fee market. This is not an article about the aesthetic virtues of the artwork (I think they look cool) or reasons to participate in the market for them; this is an article about the technical implementation of Quantum Cats. I think the engineering challenges we faced and the techniques we implemented to meet those challenges are interesting and potentially useful both to future Ordinals creators and to other Bitcoin application developers generally.

Before getting into the technical nitty-gritty of Quantum Cats, it’ll be useful to understand the experience we were trying to create. Ordinals users hold inscriptions (digital collectables that are implemented in the Ordinals protocol and are transferred with Bitcoin transactions) in self-custody Bitcoin wallets. These wallets have coin control and transaction construction features that allow for the transfer of specific ordinals, as well as the signing of more complex transaction types (such as trustless offers and swaps on ordinals marketplaces). We wanted to create an Inscription collection that would evolve over time: adding or changing attributes or traits of the Cats.

The artwork for Inscriptions is published on-chain in the witness of a Taproot transaction, in a special encoding called an Envelope; ordinals-aware software parses transactions looking for this envelope in order to find inscriptions. That means that any particular inscription’s data is immutable and cannot be changed once it’s been published (short of a re-org). However, there are a couple of different ways that we can deliver the experience of changing artwork, even though the artwork never actually changes (and in fact, having access to the old artwork is great if you like it more!).

Recursion is an ordinals feature where one inscription can reference the content of another. For example, you can inscribe an HTML page and have it include images that are in other inscriptions. Ordinals software renders HTML pages in iframes, so you can have an ordinal’s content be built up client-side from multiple inscriptions. HTML inscriptions cannot include content from the broader web, only from other inscriptions or a small set of other endpoints provided by the ordinals software (for example, there is an endpoint to fetch the current bitcoin block height). This means that recursive inscriptions are all still on-chain; they are just decomposed, which allows for composability and re-use of common components. For example, all the Quantum Cats with a red background can refer to a single inscription containing the red background, instead of all of them needing to put the same data on-chain.

When one inscription refers to another, it does so by its Inscription ID. An Inscription ID is made up of the Bitcoin transaction ID in which the inscription data is revealed, the letter i, and then the index of the inscription created in that transaction. For example, the inscription 4b31771df21656d2a77e6fa18720a6dd94b04510b9065a7c67250d5c89ad2079i0 is the first inscription created in the bitcoin transaction 4b31771df21656d2a77e6fa18720a6dd94b04510b9065a7c67250d5c89ad2079. That means that if you inscribe an image (like a png) and then inscribe an HTML page that includes the inscription ID of the image in an img tag, you can have the HTML inscription render the content of the image inscription. If the HTML inscription refers to an image inscription that is not actually on-chain (yet), then the ordinals server will return a 404 (not found) error, which the HTML inscription can quietly swallow. If we pre-sign image inscriptions (but don’t broadcast them to the Bitcoin network), we can obtain their future inscription IDs (because they are just a transaction ID and an index), and include those inscription IDs in HTML inscriptions that we do broadcast. When someone views the HTML inscription, it is able to render the content of its references that are on-chain, but will not be able to render the presigned but not yet broadcast components. As more components are published, the HTML inscription will automatically be able to render them. This is the core mechanism that the Quantum Cats collection uses to evolve its artwork: presigned transactions for traits that are progressively revealed over time. As we’ll see, fee management and market dynamics introduced complexities that meant Quantum Cats needed some additional layers of indirection and features, but presigned transactions with pre-computed transaction IDs are the key feature of Bitcoin that made the collection possible.
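As a small illustration of the ID format described above, here is a minimal Python sketch that splits an Inscription ID into its transaction ID and index. The names `InscriptionId`, `parse_inscription_id`, and `format_inscription_id` are hypothetical helpers for this article, not part of any ordinals software.

```python
import re
from typing import NamedTuple


class InscriptionId(NamedTuple):
    txid: str   # 64 hex characters: the transaction that reveals the inscription
    index: int  # which inscription within that transaction (0 = first)


# <64 hex chars> + "i" + <decimal index>
_PATTERN = re.compile(r"([0-9a-f]{64})i(\d+)")


def parse_inscription_id(text: str) -> InscriptionId:
    """Split an inscription ID string into its txid and index."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not an inscription ID: {text!r}")
    return InscriptionId(match.group(1), int(match.group(2)))


def format_inscription_id(txid: str, index: int = 0) -> str:
    """Build the ID a presigned (not yet broadcast) inscription will have."""
    return f"{txid}i{index}"
```

Because the ID is only a txid plus an index, `format_inscription_id` can be applied to a signed transaction before it is ever broadcast, which is what makes it possible to reference presigned inscriptions.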

Even though the contents of a presigned but unrevealed inscription are unknown before the transaction is broadcast, the same inscription ID will always have the same content. This created a problem: even though people can’t tell what a future trait will be (like a background or a body trait), they would be able to count the number of times that a particular inscription ID occurred, tell which future traits were more or less rare, and trade Cats on their future evolutions. We really wanted evolutions to be surprising and fun, and not knowing ahead of time what future evolutions will do to the relative rarity of different Cats is part of that fun. So, we introduced a layer of indirection: every Cat refers to presigned (but unrevealed) “Layer Connectors” that map a Cat by its unique ID to presigned artwork. That means, for example, that every Cat refers to the same Layer Connector for its initial background image. It is only once this Layer Connector is broadcast to the network that people can learn which backgrounds are more or less common. This technique also allowed for space savings: since every Cat refers to identical layer-connectors, the HTML for the Cat to import the layer connectors can be inscribed once and then referred to by each of the 3333 Cat inscriptions. In fact, each Cat inscription was reduced down to 109 bytes: just a unique Cat ID and a script tag to import the logic that fetches the common set of Layer Connectors, looks up the unique artwork for each layer by Cat, and renders that artwork. Moving the mapping of each Cat to its artwork out of the individual Cat inscriptions and into a common inscription, and adding the layer of presigned indirection, not only solved the information leak about relative rarity in traits, but also saved approximately 5 BTC in inscription costs!
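To make the indirection concrete, here is a minimal Python sketch of the lookup, not the actual browser code. It models "what is on-chain right now" as a dictionary, and the connector layout (a `"mapping"` from Cat ID to artwork ID) and all IDs are assumptions made up for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the chain: inscription ID -> published content.
Chain = Dict[str, dict]


def fetch(chain: Chain, inscription_id: str) -> Optional[dict]:
    """Return the content, or None to model the 404 for an unbroadcast ID."""
    return chain.get(inscription_id)


def visible_artwork(chain: Chain, connector_ids: List[str], cat_id: str) -> List[str]:
    """Artwork IDs this Cat can render right now, in layer order."""
    artwork_ids = []
    for connector_id in connector_ids:
        connector = fetch(chain, connector_id)
        if connector is None:
            continue  # presigned but not broadcast yet: swallow the 404
        art_id = connector["mapping"].get(cat_id)
        if art_id is not None and fetch(chain, art_id) is not None:
            artwork_ids.append(art_id)
    return artwork_ids
```

Every Cat carries the same list of connector IDs, so nothing about an individual Cat's future traits can be counted until the connector itself is broadcast. Once it is, `visible_artwork` starts returning the extra layer without any change to the Cat.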

With this introduction of Layer-Connector inscriptions and the factoring of rendering logic to a common component, there are now 4 kinds of assets being inscribed:

  • Actual artwork for each trait in the Cat (a background image, or a body, or the eyes)
  • A layer-connector that maps a Cat by its ID to a specific artwork asset. This mapping happens once per “layer” (background, body, eyes, mouth, etc.)
  • The core dispatch and rendering logic. We call this the “Dispatcher”. It is responsible for fetching a layer connector, looking up the artwork for the Cat in the layer connector, fetching that artwork asset, and then rendering it to a canvas in order. This successive rendering in order is why we model the artwork as a layer.
  • The individual Cat that is distributed to a collector. This is 109 bytes and includes a unique ID and a reference to the dispatcher, which contains all the rendering code

In Quantum Cats, there are several hundred artwork assets, 40 layers (meaning 40 layer-connectors), 1 dispatcher, and 3333 cats. The 3333 Cat inscriptions refer to the inscription ID of the Dispatcher, which refers to the inscription IDs of the 40 layer-connectors, each of which refers to one or more inscription IDs of artwork assets. We presigned these assets in the reverse order: first the artwork to get their inscription IDs, then we rendered those into layer-connectors and presigned those to get their inscription IDs, then rendered the Dispatcher and presigned it, and then finally assembled the individual Cat inscriptions.
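The bottom-up ordering can be sketched as follows. This is an illustration under loud assumptions: `presign` is a stand-in that derives an ID from the content alone (a real inscription ID comes from the signed reveal transaction), and the JSON layouts are invented for the example.

```python
import hashlib
import json
from typing import Dict, List, Tuple


def presign(content: bytes) -> str:
    """Stand-in for presigning: a deterministic fake ID derived from content.
    A real ID is the txid of the signed reveal transaction plus an index."""
    return hashlib.sha256(content).hexdigest() + "i0"


def build_collection(
    artwork_by_layer: Dict[str, Dict[str, bytes]],  # layer -> trait name -> image
    assignments: Dict[str, Dict[str, str]],         # layer -> cat ID -> trait name
    cat_ids: List[str],
) -> Tuple[str, List[str], Dict[str, str]]:
    connector_ids = []
    for layer, images in artwork_by_layer.items():
        # 1. Artwork first, so its IDs exist before anything refers to them.
        art_ids = {name: presign(data) for name, data in images.items()}
        # 2. Then the layer connector, which embeds those artwork IDs.
        mapping = {cat: art_ids[assignments[layer][cat]] for cat in cat_ids}
        connector = json.dumps({"layer": layer, "mapping": mapping}, sort_keys=True)
        connector_ids.append(presign(connector.encode()))
    # 3. Then the dispatcher, which embeds the connector IDs.
    dispatcher_id = presign(json.dumps({"connectors": connector_ids}).encode())
    # 4. Finally each Cat: just its own ID plus a reference to the dispatcher.
    cats = {c: json.dumps({"id": c, "dispatcher": dispatcher_id}) for c in cat_ids}
    return dispatcher_id, connector_ids, cats
```

Because each level embeds the IDs of the level below it, changing a single artwork asset would change every ID above it, which is why the order has to run from artwork up to Cats.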

Inscription IDs include a Bitcoin transaction ID. Bitcoin transaction IDs are a function of their inputs, outputs, version, and locktime. That means that if we spend the UTXO that funds a presigned transaction on some other transaction, then we will never be able to re-create that same transaction ID again, and we will break our presigned inscription reference! To avoid this, we created a UTXO to fund every presigned transaction, and then maintained a database to track which UTXO was assigned to fund which presigned transaction. We also had automated sanity checks to assert that no two inscriptions spent the same UTXO, that every inscription commit transaction only spent its assigned UTXO, and that the total inputs and outputs of all transactions (including fees) were what we expected. These checks ran whenever the system touched wallets or keys, and gave us confidence that nothing was being signed that shouldn’t be. Additionally, we used segregated wallets for different asset inscription types, to add further protection against a bug causing a UTXO to be double-assigned. We also built a test harness that ran through all of the presigning and publication of inscriptions on regtest and then validated that the data that ended up on-chain matched what was in our control-plane database.
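A minimal sketch of the first two sanity checks might look like the following. The data shapes (a dictionary from a transaction label to its assigned outpoint, and another from the label to the inputs it actually spends) are assumptions for illustration, not the schema of our database.

```python
from collections import Counter
from typing import Dict, List, Tuple

Outpoint = Tuple[str, int]  # (txid, output index) identifying a UTXO


def check_assignments(
    assigned: Dict[str, Outpoint],         # presigned tx label -> its funding UTXO
    tx_inputs: Dict[str, List[Outpoint]],  # presigned tx label -> inputs it spends
) -> List[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    # Check 1: no UTXO is assigned to more than one presigned transaction.
    for outpoint, count in Counter(assigned.values()).items():
        if count > 1:
            problems.append(f"{outpoint} is assigned to {count} transactions")
    # Check 2: every transaction spends exactly its assigned UTXO and nothing else.
    for label, inputs in tx_inputs.items():
        expected = assigned.get(label)
        if inputs != [expected]:
            problems.append(f"{label} spends {inputs}, expected only {expected}")
    return problems
```

Running a check like this before any signing step, and refusing to sign when the list is non-empty, is one way to get the "nothing is signed that shouldn't be" property described above.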

Presigning transactions in this way meant that we had to pre-commit to the fees that each inscription would pay. We can’t know what fee rates will be when we eventually reveal these evolutions, so we decided to presign the transactions with a reasonable fee rate and then build tooling to bump the fees in the future if we presigned too low. (If we presigned a fee higher than needed, we would just have to live with it, so part of the analysis here was picking a fee rate we were comfortable with even if it turned out we overpaid.) Other than using a transaction accelerator service (paying a miner out of band to include a transaction in a block even if it pays below-market fees), there are two techniques to increase the effective fee rate of a transaction: Replace-by-fee (RBF) and Child-Pays-For-Parent (CPFP). RBF involves re-spending the inputs of a transaction in a new transaction that pays a higher fee. The replacement is a different transaction with a different transaction ID, and because our application relies on pre-committed transaction IDs, this was not an option. CPFP involves spending the unconfirmed output of a transaction in a new transaction that pays a higher fee than the “parent”. In order for miners to capture the fees from this “child” transaction, they have to include both the parent and the child as a package. The effective fee rate ends up being the total fees paid divided by the total virtual size of the package (all the transactions together). Since the parent transaction is unperturbed, this was exactly the fee-bumping mechanism that we needed.
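The package arithmetic is simple enough to write down. In this sketch the function names and the sizes in the usage note are hypothetical; only the formula (total fees divided by total virtual size) comes from the description above.

```python
import math


def package_fee_rate(parent_fee: int, parent_vsize: int,
                     child_fee: int, child_vsize: int) -> float:
    """Effective fee rate (sat/vB) of a parent + child package."""
    return (parent_fee + child_fee) / (parent_vsize + child_vsize)


def cpfp_child_fee(parent_fee: int, parent_vsize: int,
                   child_vsize: int, target_rate: float) -> int:
    """Fee (in sats) the child must pay so the package reaches target_rate.
    Returns 0 when the parent already pays enough and no bump is needed."""
    package_fee = math.ceil(target_rate * (parent_vsize + child_vsize))
    return max(package_fee - parent_fee, 0)
```

For example, assuming a 400 vB parent presigned at 65 sat/vB (26,000 sats) and a 110 vB child, reaching 150 sat/vB requires 150 × 510 = 76,500 sats for the package, so the child has to pay 50,500 sats.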

One remaining wrinkle is that we had potentially hundreds of transactions that would need to be fee-bumped. In addition to the difficulty of accurately bumping tens or hundreds of unconfirmed transactions by hand, there are also relay policies that prevent a package of more than 101 kvB (virtual kilobytes) or more than 25 transactions from being relayed through the network. That means that if we needed to CPFP 50 transactions, we’d want to do them all in parallel, rather than serially. To accomplish this, we built tooling that would:

  • Look at a list of unconfirmed transactions and for each one calculate the cost to CPFP-bump that transaction to a target fee rate
  • Aggregate those amounts as outputs in a new transaction that spent from a single input to all of the UTXOs needed to bump the target transactions in parallel
  • Prompt the operator to send the total amount of bitcoin required (it calculated fees for the splitting transaction as well) to a single address
  • Once the deposit was received, it would broadcast the transaction to split the deposit into one UTXO for each transaction that needed to be bumped
  • It would then construct and broadcast CPFP transactions for each of the stuck transactions
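The planning half of those steps can be sketched as below. This is a simplified illustration, not our tooling: `StuckTx`, `plan_fanout`, and the size parameters are hypothetical, each funding output is assumed to cover exactly the child's fee, and dust limits, change, and the split transaction's own effect on package limits are ignored.

```python
import math
from typing import Dict, List, NamedTuple, Tuple


class StuckTx(NamedTuple):
    txid: str
    fee_sats: int  # fee the presigned transaction already pays
    vsize: int     # its virtual size in vB


def plan_fanout(
    stuck: List[StuckTx],
    target_rate: float,       # desired package fee rate, sat/vB
    child_vsize: int,         # assumed size of each CPFP child, vB
    split_base_vsize: int,    # assumed size of the split tx without outputs, vB
    split_output_vsize: int,  # assumed size added per split output, vB
) -> Tuple[Dict[str, int], int, int]:
    """Return (funding output per stuck txid, split tx fee, total deposit)."""
    outputs: Dict[str, int] = {}
    for tx in stuck:
        package_fee = math.ceil(target_rate * (tx.vsize + child_vsize))
        child_fee = package_fee - tx.fee_sats
        if child_fee > 0:  # skip transactions that already pay enough
            outputs[tx.txid] = child_fee
    split_vsize = split_base_vsize + split_output_vsize * len(outputs)
    split_fee = math.ceil(target_rate * split_vsize)
    return outputs, split_fee, sum(outputs.values()) + split_fee
```

The third return value is the amount the operator would be prompted to deposit. Each entry in `outputs` then becomes one output of the split transaction and funds one independent CPFP child, so the bumps happen in parallel rather than as one long chain.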

We tested this system on regtest, bumping up to 300 transactions at a time. We also had an opportunity to use it when we needed to bump the fees of several layer-connector reveal transactions on mainnet! Each output of the “split” transaction funds the CPFP that bumps an inscription reveal transaction from 65 to 150 sat/vB. You can see it here: https://mempool.space/tx/2ec4a8708524faf9901c69da8518b632ec31762730218d3b38ff40954cee882f

The art assets made up ~90% of the total data for the project. What we wanted to do was opportunistically publish all or as much of the art as we could when fees were low. But we also didn’t want people to see the art before the cats were ready to evolve. So, we decided to encrypt the artwork and then publish the decryption key for the artwork with the layer connector (which contains the mapping needed for a Cat to fetch its trait). This decoupled the data publication step from the trait reveal: we could take advantage of a time of lower fees to do the bulk data publication, while still being able to show the world the artwork at a time that made sense for the collection. The mechanics here are straightforward: before presigning artwork assets, all of the artwork for a particular layer (again, think background or eyes or mouth) is encrypted with a per-layer encryption key. That encrypted artwork is used in a presigned inscription as a stream of bytes. Then the encryption key is rendered into the layer connector (which again is presigned). When the dispatcher fetches a layer connector, it reads the mapping of Cat-ID -> art asset, and also the decryption key for that layer. When it fetches the art asset, it gets it as a byte array, uses browser cryptography libraries to decrypt it into a PNG, and then finally writes it to the canvas.
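The flow can be sketched in a few lines. The article does not name the cipher, so `toy_cipher` below is a deliberately simple stand-in (an XOR keystream built from SHA-256) that only illustrates the publish-now, reveal-later split; it is not real encryption and is not what the browser code uses. All function names are hypothetical.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy XOR keystream (SHA-256 in counter mode). Illustration only, NOT real crypto.
    XOR is its own inverse, so the same function encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def encrypt_asset(layer_key: bytes, png: bytes) -> bytes:
    """Inscribed early, while fees are low: a fresh nonce followed by ciphertext."""
    nonce = secrets.token_bytes(16)
    return nonce + toy_cipher(layer_key, nonce, png)


def decrypt_asset(layer_key: bytes, blob: bytes) -> bytes:
    """Run by the dispatcher once the layer connector has revealed the key."""
    nonce, ciphertext = blob[:16], blob[16:]
    return toy_cipher(layer_key, nonce, ciphertext)
```

In this model the layer connector would carry something like `{"key": layer_key.hex(), "mapping": {...}}`, so broadcasting the connector is the single event that both assigns traits to Cats and makes the already-published artwork readable.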

Putting this all together, each Quantum Cat is a small inscription that fetches a common inscription that contains dispatch, decryption, and rendering code. That code fetches as many layer-connectors as are available on-chain (some of them won’t be, because they are presigned but not yet broadcast). It then uses the inscription IDs and decryption keys in these layer connectors to fetch encrypted artwork in other inscriptions, decrypts it, and then renders it to a canvas. When we need to broadcast these presigned inscriptions, we use bulk parallel CPFP transactions to bump them up to the correct fee rate without having to commit up-front to too high a fee. The net result of all of this is that users have a Quantum Cat in their wallet that evolves new traits and attributes over time, while still having all of its assets be immutable on Bitcoin.

There are other aspects of the project that we haven’t covered here – how the browser code manages intermittent failures when fetching all these assets, how you handle curation of an evolving collection, how we managed the UTXO creation process for all the presigned assets in the first place (that one’s easy: it’s the same fan-out UTXO splitting code described above for funding the CPFP UTXOs). But I hope you find the above discussion interesting and helpful in either an inscription project or another project involving presigned transactions. 

This is a guest post by Rijndael. Opinions expressed are entirely their own and do not necessarily reflect those of BTC Inc or Bitcoin Magazine.

Source: bitcoinmagazine.com