The Chirping Machine

b0rg
Coinmonks

--

Pau Klee’s Die Zwitscher-Maschine – a group of machines telling each other the truth

One way you could describe a distributed ledger is a network of machines which tell each other the truth.

In a distributed ledger networked nodes send messages back and forth that look this:

Nodes send and receive these truth statements, verify if they are true, and collect them into a list of true statements which we call the ledger:

This statement is true: ______ and this statement is true: _____ and this statement is true: _____ and this statement is true: _____ and this statement is true: _____ and this statement is true: _____ and this statement is true: _____ and …

The contents of these messages are usually statements about debts, like “Kendra owes Vinicius $40.” But these statements could really be anything: the sky is blue, the peacock’s tail is bright, there was a gentle rain this afternoon. In the context of a distributed ledger, it doesn’t matter if these statements are true in our (human) reality, only that they are true with respect to the reality agreed upon by the nodes and recorded in the ledger. A node can verify if a message is “true” by checking that it does not conflict with any statements already in the ledger.

For example, the two statements “All cats have six legs” and “All cats have eight legs” cannot both be true so only one should be included in a ledger. A distributed ledger must ensure that all nodes eventually agree on the exact same ledger and that the ledger does not contain any conflicting statements — a kind of distributed truth telling machine.

The Matryoshka Principle

The first Matryoshka dolls, carved by Vasily Zvyozdochkin in 1890

In the IOTA protocol there is a basic rule that if I want other nodes to accept my message, I have to publicly verify the truth of two or more messages I’ve received from others. So, in IOTA a message would look like this:

The first part of the message is a new statement which I am adding to the ledger, and the second part is are some verified messages of the same format received from other nodes. Because of this recursive property messages eventually expand into something like this:

The Tangle, as seen from inside a single message
The same Tangle, as seen from outer space

Because a message attests to the truth of the messages it references and the messages they reference and so on, every message ends up containing the entire history of the Tangle, at least as seen from a certain point of view, at a certain moment in time. When a node sends a message it is actually saying “This statement is true: ____ and all the messages in the history of ledger referenced by this message back to the first message ever sent are also true.” Every message in the Tangle is a direct view to the beginning of the Tangle like branching rivers that all lead back to the sea.

Of course, practically speaking, nodes cannot send around messages that contain millions or billions of nested messages. So nodes just prove they are referencing this history by including in each message the hash of the messages it references (which contain the hashes of the messages they reference, and so on). This compresses the entire history of the ledger into a very small amount of information which can be easily verified.

I Promise To Pay The Bearer On Demand The Sum Of One Pound

One pound is a promise to pay the bearer on demand the sum of one pound.

When dealing with a cryptocurrency the statements in the ledger make no reference to the weather or exotic animals. They are just statements about who owes how much money to whom. In the same way that a pound (or any fiat currency) is just a promise to pay someone a pound, digital money is a promise to pay someone digital money. These promises are the statements recorded in the ledger and they look like this:

C owes D $20

The money which C owes D is itself just a reference to another message (or messages): the message containing the statement which promised the $20 to C in the first place. With this in mind, a ledger update can also be represented like this:

C owes D $20 from [message]

or

and in the end messages start to look like this:

It should be pointed out that in the UTXO model you can promise someone a portion of the money which is owed to you, and then promise the remainder back to yourself, or combine many different promises into a new promise. But the main thing is that digital money, like real money, is just a promise to give someone money which is itself another, older promise to give someone money.

In the same way that a messages contains the entire past of the ledger, a single transaction, or UTXO update, references the entire history of the funds that it spends, something like the cumulative endorsements on the back of a check, all the way back to the first message which created all the funds in existence.

The Order of Things

Some interesting properties emerge in the Tangle due to the recursive nature of its messages. For instance, old parts of the ledger become extremely hard to change. A node could try to send a message that references another version of the Tangle, whose past is different from that seen by the other nodes, but this message would conflict with the perception of all the other nodes, and they would not consider the message to be true. For example: if you tried to change any of the nested images on the cover of Pink Floyd’s Ummagumma, it would be visible to anyone who had one of the hundreds of thousands of other copies of the album. How would you convince all those people that yours is the real version?

As the word “past” already suggests, another important property emerges from this simple requirement that a message validates two earlier messages: the ledger has at least a partial order of events, a sense of time. It may be impossible (or at least difficult) to determine the order of two messages which are not connected by a reference, but it is always possible to say that a message comes after any messages it verifies and before any messages which verify it. It would also be impossible to issue a new message that is already referenced by a message in the ledger.

The Tangle is partially ordered

These two properties of (partial) order and an immutable past are fundamental properties of any distributed ledger. They are deeply connected to one another in the sense that “time” is meaningless if the past can be arbitrarily changed, if there were no sequential relationship between events. Seen this way, a message in a distributed ledger is not just a statement about the truth, but also an event¹ which must be causally consistent with the reality established by the events which have preceded it in time. In the same way I cannot drink from a glass which was was just shattered on the floor, a distributed ledger prevents a double spend, and reproduces a kind of minimal expression of causality.

The Double Spend

The Mona Lisa after it was stolen

The story goes that when Vincenzo Peruggia stole the Mona Lisa in 1911 he brought it to a master forger Yves Chaudron who produced several perfect copies which were in turn sold to various buyers while Chaudron was paid with the original. Some have speculated that the original was lost and a group art dealers and historians, unable to agree on which of the perfectly executed copies was the original, secretly chose one and destroyed the others so that we could all carry on with our lives without the unbearable thought that the Mona Lisa is in fact a copy without an original².

With this in mind, imagine Beatrice broadcasts a ledger update like this:

Beatrice owes Cecilio $20 from [Ariadne owes Beatrice $20 from […]]

When nodes receive this update they check to see if it’s true by looking in the ledger for the message where Ariadne promised those $20 to Beatrice, and to see if Beatrice has not already promised those $20 to someone else. If both of these things are true, the message does not conflict with the reality recorded in the ledger. The statement from Beatrice can be recorded and used as a reference for future messages³.

By the time all the other nodes in the network have made this update to the ledger, Beatrice issues a new update, trying spend the same funds:

Beatrice owes Dalia $20 from [Ariadne owes Beatrice $20 from […]]

When other nodes receive this message they perform the same checks as before and see that Beatrice received funds from Ariadne. But they will also see a message that Beatrice already sent these exact funds to Cecilio. Because the new message conflicts with the reality recorded in the ledger it will be rejected.

However, in certain circumstances, it could happen that Beatrice sends both messages at almost exactly the same time and some nodes see one or the other message first. Some nodes would decide the first message is true and others the second. Each of these conflicting opinions represents a valid but mutually exclusive version of the ledger, two possible futures for the reality it records.

Although no one else was in the room when it happened, the experts who gathered to validate the famous painting were said to have taken a vote to choose which of the Mona Lisas to save from destruction. In the same way, the question of consensus in a distributed ledger is about finding a way for all the nodes in the network to agree on a single version of the ledger when presented with a conflict, to establish a shared perception of reality.

Deciding whether the cat lives or dies

Two possible realities

One simple way to resolve a conflict would be to ask everyone else in the network which version of the ledger they prefer and then have everyone follow the majority vote.

The first problem with with this approach, a problem for all distributed ledgers, is that someone can set up a large number of nodes and cast any number of votes to influence consensus or prevent consensus altogether — forcing the network to split into two irreconcilable realities.

A solution to the so-called Sybil attack is to require nodes prove access to a limited resource in order to cast a vote in the consensus process. For instance, in Bitcoin the miner who can prove access to a certain amount of hashing power is given permission to write to the ledger and in doing so cast its vote for the longest chain. In IOTA, someone who can prove access to a certain amount of tokens is granted a vote proportional to the amount of tokens they hold. This voting right is called mana and can be delegated to any node participating in the network. With mana the right to vote is expensive and it is impossible for anyone to cast an arbitrarily large number of votes.

Unlike many other distributed ledgers like Bitcoin or Ethereum which elect a single block producer to write an update to the ledger, IOTA is a leaderless protocol. Every node participating in the network, no matter the amount of voting rights it holds, can participate in the consensus process.

This leads to another problem with simply asking every node in the network for its opinion on a conflict: every node would have to send its opinion to every other node. In a small network this would be fine, but as soon as there are thousands or tens of thousands of nodes casting votes, most of the network’s bandwidth will be consumed with communicating these votes whenever a conflict arises, and nodes will be burdened with the computational cost of verifying the signatures of all the votes they receive. A leaderless protocol needs a voting mechanism that minimizes the volume and size of messages used to communicate the vote.

Fast Probabilistic Consensus is a peered reviewed leaderless consensus protocol with robust security guarantees which will be implemented in the IOTA protocol later this year. Instead of querying all nodes in the network for their opinion, nodes request opinions from a random sub-sample of nodes. In the case of a conflict, nodes repeatedly update their opinions based on the opinions they randomly select from others and eventually agree on which version of the ledger to follow. Because nodes with a lot of mana will be queried more often they can also publish their opinions directly in the tangle, further minimizing the message overhead. Although this is a highly simplified description of the FPC this voting protocol has provable security guarantees and scales with an arbitrarily large set of validators.

Choose your own adventure

Choose whichever reality you prefer

Another possible and slightly more experimental solution outlined here (multiverse consensus, a.k.a. on-tangle-voting) would be to simply consider all messages themselves as votes. If I see two conflicting messages, I can vote for one or the other message by issuing new messages that references whichever of the conflicting messages I prefer — the one I saw first or the one with more votes. Because every message is a statement “this entire history of the ledger is true,” by choosing to reference one message in a conflict set a node is casting vote, weighted by its mana, on the version of the ledger-reality it prefers.

If I cast a vote for one reality and later see that most of the other nodes are voting for a different reality I can switch sides and begin issuing messages in the version of the ledger which more nodes voted for (weighted by their mana). Once 51% of the mana in the network (if it is controlled by honest nodes) converges on one side of the conflict it will become impossible for an attacker to prevent the network from achieving consensus, as long as all nodes eventually receive the same messages. The advantage of this approach is that the message overhead is zero and it is potentially very difficult to attack.

One possible attack vector would be for a malicious node to rapidly switch sides in the conflict, voting for one reality and then the other in such a way that neither reality stabilizes with over 50% of the vote — turning the Tangle into something like Schrödinger’s zombie cat, neither living nor dead. Although it’s probably extremely difficult to execute, a meta-stability attacked could theoretically be carried out by any node regardless of the amount of mana it holds, but just timing its votes exactly right.

I believe, however, there may be a simple solution to this problem based on the fundamental idea stated at the beginning of this piece that every node must verify the truth of the statements it receives from other nodes.

This Statement is False

When a node casts a vote for a reality in a conflict set, it is not only making a statement about the truth of the messages it references (these messages are true…), it is also making a second order statement: “This is the reality I perceive to have the most approval weight.”

In order to prevent a node from switching its opinion arbitrarily, one could use the Tangle itself to verify that a node is casting an “honest” vote — i.e. switching from the reality it perceives to have fewer votes to the one it perceives to have more votes.

If we consider the approval weight observable in the past cone of a node’s vote for a given reality as its “perceived reality weight,” we can then exclude from consensus any “dishonest” votes which express an opinion change from a heavier reality to a lighter one. So when a node updates its opinion from Reality A to Reality B the vote will only “count” (i.e the approval weight of the node will be applied to Reality B) if the node’s perceived reality weight of B is greater than its perceived reality weight of A plus its own weight.

The blue node voted for Reality A first, and then changed its opinion by voting for Reality B. The honesty of its vote can be verified by looking at the Tangle.

In the above diagram, at time = 1 the Blue node must have perceived Reality A to have, at a minimum, the approval weight visible in the past cone of its first vote plus its own approval weight. The Blue node would also have to have seen that Reality B has the approval weight in the past cone of its second vote. Other nodes can easily validate the honesty of the Blue node’s opinion change by comparing these two values.

Performing this simple validation would mean that an attacker who attempts to switch sides arbitrarily would be forced to vote for increasingly heavy parts of each reality in the conflict set, and would at most, and with great difficulty, only be able to delay consensus for some period of time.

This idea only extends the protocol level principle that a node should verify the statements of other nodes, since a vote for a given reality is equivalent to the statement “This is the heavier reality.” As I’ve tried to show throughout this piece, it is as if the simple requirement that nodes “tell the truth” in their own statements and about the statements of others, leads almost inexorably to consensus as an emergent property of the network.

Beyond Bitcoin

A description of longest-chain-wins consensus from the Bitcoin whitepaper

The breakthrough of Nakamoto consensus was to design a distributed consensus protocol that scales in a permission-less network by having participants vote for the version of the ledger they prefer by simply gossiping new updates to the ledger.

However, Bitcoin was designed to have slow confirmation times in order to prevent lots of block producers from adding new blocks at the same time. If dozens of blocks were issued in parallel the network would see many different chains, competing versions of the ledger, and would never be able to agree on one longest chain. In some sense, by limiting confirmation times to enforce linear updates to the ledger, Bitcoin was designed to prevent a Tangle from emerging out of the blockchain.

Considered this way the drawback of Bitcoin was that it did not go far enough. IOTA is unique in many regards, but especially in this fact that it expands on the principles of Nakamoto consensus to build a fee-less, leaderless protocol of parallelized ledger updates, that scales with an arbitrarily large set of validators. Because all nodes collaborate in consensus mining races are eliminated and the networks computational power is used to secure the network in the most efficient and decentralized way possible.

These are some of the reasons I’m interested in this project, one of the few which tries to expand on the innovations of Bitcoin.

  1. If you think of ledger updates as events in a causal fabric then the messages which communicate them are something like observation-events which record an observer’s encounter with this event. The more observers that see an event, or see others seeing the event, the more the event itself becomes solidified into the causal fabric of the ledger-reality. In this sense the distinction between an event and the observation of an event seems to collapse at the most granular level of the ledger’s causal fabric. A conflict is what happens when two causally possible but mutually exclusive events are registered by different observers in the network. Consensus means that observers all agree to “know” one event in a conflict set, which is another way of saying that this event has “happened.” Stated by way of a question: if a ledger update is issued in the forest and no one is around to hear it, does it make a sound?
  2. This part of the story was entirely fabricated for dramatic effect.
  3. This is why consensus does not depend on preserving the entire history of the ledger for all time. In order to come to consensus nodes only to keep track of the messages which prove the current ledger state. Once a new state is confirmed nodes can prune old information which is not needed to detect a conflict. A distributed ledger is a truth telling machine, not an infinite archive.

--

--