Multisig: A Revolution Incomplete


The author is involved in the Ethereum project. The author has no relationship with BitGo, Bitrated, Codius or CryptoCorp.

Update: Ben Davenport from BitGo has replied, saying that they already have an API that lets their service be used as a pure CryptoCorp-like oracle and that they intend to provide a browser extension soon. I commend them for their rapid response and dedication to sound security practices. GreenAddress also provides an API allowing applications to plug into their services.

After several years of improving infrastructure and technology, it seems that multisignature wallet technologies are finally making headway in the Bitcoin world. Greenaddress.it and BitGo have emerged as primary contenders in the space, and the latter has recently raised $12 million in venture capital funding and boasts that it stores over $100 million in BTC. The growing emergence of multisig is a very welcome sight in the Bitcoin ecosystem: the benefits have been known, and the components in the Bitcoin protocol have been available, for nearly two years, and now mainstream consumers can finally start enjoying the fruits. Particular benefits of multisig include consumer-merchant escrow applications, allowing for an open and free marketplace for arbitrators to make Bitcoin commerce relatively safe and fraud-free in areas where such protections are necessary, as well as the personal use-case of savings wallets, protecting users from the loss or compromise of any single one of their private keys.

At a time when consumer protection has come to the forefront as a concern, with the looming BitLicense regulations imposing a heavy level of restriction on cryptocurrency businesses, multisig additionally offers promise as an alternative to a regulation-centric approach to consumer protection – instead of trying to make absolutely sure that each individual business is trustworthy, we can set up systems to maximally remove single points of failure and rely primarily on safety-in-numbers. However, as potentially revolutionary a technology as multisig is, it is also at risk of being overhyped and misrepresented, as some businesses may try to claim the branding benefit of having their addresses start with a “3” without taking the effort of actually being trust-free. The point of this article will be to define a concept of “good multisig” – technologies that actually remove the need for trust in individuals and promote consumer protection, and “bad multisig” – mere cryptoeconomic security theater, and try to determine the dividing line between the two.

The Client Side Revolution

Before we get to actual multisig, we must first dissect one particular technology that is being used by a number of companies in order to seemingly enhance security and reduce the need for trust: the client-side web app. Before the client-side web app took hold, there were two dominant paradigms in Bitcoin clients. First, there are the desktop clients, programs that you download straight to your computer. The benefit of desktop clients is that the users hold the private keys themselves on their own machines, so there is no need to trust any third parties to store one’s funds. However, this comes at a usability cost: the user needs to download an application. Second, there are the server-side web wallets, where a third party holds the bitcoins for you, and gives you access to conveniently deposit and withdraw using an account much like a Google or Facebook account without downloading any software. This has a high degree of usability, but at the cost of requiring trust.

Client-side web apps are an elegant third solution: although access to the website is still done using a webapp, with no inconvenience of downloading software required, the private keys are stored and transactions are signed on the client side inside the web browser using Javascript. Hence, although the application has the same level of convenience as the web interface provided by a trusted server-side wallet, the server does not have access to your private keys and you do – a seeming best of both worlds. The most popular client-side wallet right now is probably blockchain.info.

Now, let us evaluate the merits of this paradigm. Client-side Javascript is certainly not without its critics; there is even an entire article by Matasano entitled “Javascript Cryptography Considered Harmful”. Although the piece is quite extreme in its negation of absolutely any benefits of client-side browser-based cryptography, it does make valid points – particularly, that when you download browser Javascript you are still trusting the source. That is to say, if blockchain.info or a rogue employee of blockchain.info wanted to, or if a government extorted them to, they could simply send you browser code that would take your private key and sign a transaction sending all your funds to their address, and you would never know the difference until it is too late. Now, if one takes this argument to the extreme, one may argue that even with a downloadable client it is possible to distribute a version that steals your private keys, but intuitively it seems obvious that this is much less of a problem in that case – particularly because you are only downloading the software once.

So how much are you trusting the software provider in each case? Let’s break down just what would need to happen for a successful exploit to take place in each case:

  • Desktop client, built from source – the provider, or an attacker who hacked into the provider’s systems, would need to submit a patch to the client’s repository on Github including a backdoor, and you would need to download the client before someone inside or outside the organization would scan the source code and notice
  • Desktop client, binary (the option normal people use) – the provider, or an attacker who hacked into the provider’s systems, would need to compile and publish a version of the client including a backdoor, and you would need to download before someone inside the organization detects the malfeasance (it’s too hard to decompile binaries to detect backdoors until it’s too late in most cases, so it would have to be internal, although in the long term once an exploit is found it is possible to isolate it)
  • Client-side browser webapp – an attacker would need to slip a version of the client including a backdoor into the content delivery network; everyone who logs in between that moment and the time when the malicious version is taken offline is vulnerable
  • Server-side browser webapp – an attacker would need to access the site’s cold wallet, at which point absolutely every user’s account would be compromised

Hence, we can see a hierarchy of security, where the lower down you go the less secure you are and the more you need to trust. One particular distinction is the difference between a short-term and long-term attacker: is it the company that’s evil, or is it simply someone jumping into their servers through some exploit for a few minutes or hours before they notice? Against long-term attackers, only downloading older versions directly from an open source repository can help you; against short-term attackers binary desktop applications work reasonably well, and even client-side browser webapps may limit the tragedy to a small number of users.

In general, though, there is a fundamental divide between the desktop cases and the browser cases. In the first two, if security is set up correctly, an attacker who gains access for a short period of time is not a problem at all, because the issue can be corrected in time; in the latter two, even brief access is a problem. Hence, client-side browser-based apps provide only a partial gain of security over server-based wallets.

How can the problem be fixed? The simplest approach is to move from client-side webpages to browser extensions. This solves the problem almost completely; from a security perspective, a browser extension is almost exactly equivalent to a desktop application run inside of an interpreted environment like Java or Python. However, this does come at the cost of adding an extra step – the user must download a browser extension instead of just trusting the server, and for this reason even if sites like blockchain.info offer a secure browser extension version of themselves most people still use the website.

Note that all of the above is certainly not an indictment of client-side browser Javascript; all it says is that client-side browser Javascript is not that much more secure or trust-free than the approach where the server holds all of your money. There are reasons other than security and trust to write a client-side browser Javascript cryptocurrency application; the biggest one is likely convenience, since the more is done in-browser the less infrastructure you as the application developer need to manage yourself. The ether sale webapp uses client-side Javascript for this exact reason (convenience of development and robustness against denial-of-service attacks); of course you are trusting Ethereum when using the app, but that’s not a problem because you are trusting Ethereum to develop the platform anyway. Thus, if we admit that we are trusting providers like blockchain.info, we can say that their use of client-side cryptography is legitimate. For multisig providers, however, the story is completely different…

The Fused Multisig Wallet

The previous discussion about client-side security is important because it brings to attention an important, and sometimes forgotten, ingredient in security when dealing with cryptographic protocols: the security of the source code itself. Although a cryptographic protocol such as Bitcoin may be theoretically trust-free, in reality almost no one is technically competent enough to evaluate the entirety of the code for themselves. The kinds of clever exploits developed in the Underhanded C contest show quite well how difficult it is to make sure a piece of software is completely attack-free; as a result, for pretty much everyone except the original author of a program, protocols do still require a certain amount of trust.

In the case of multisig, what we are trying to do is explicitly eliminate the need to trust any single entity. In general, there are two ways in which multisig is implemented. The first we will call extended 2-of-2. The basic concept of 2-of-2 is simple. One key is held by the user, perhaps via a password-derived brainwallet, or a randomly generated key held encrypted in browser storage or a client-side application, and another key is held by the server. When the user wants to sign a transaction, they log onto the wallet on their computer, and then produce a transaction sending funds from their address to the desired destination and sign it with their private key. Then, the transaction is sent to the server. The server then does some fraud-detection check; for example, it may send a confirmation code to the user’s phone number, and ask the user to type it in. If successful, the server signs the transaction and sends it off.

By default, however, this scheme is fragile. If your computer is hacked, or you forget your password, then you lose access to the wallet and the server can do nothing about it. Similarly, if the company maintaining the server suffers an accident or mishap or disappears, you lose access. The extension to 2-of-2 is the patch to this problem. Essentially, every time your client produces a new transaction, it actually produces two transactions: one sending the funds as desired, and then a second sending the remaining funds after the first is finished to some backup address controlled by you. The server signs both transactions, but publishes only the first – the second is returned to you so you have a way to recover your funds even if the server disappears. Note that the address is 2-of-2, so the server has no way to invalidate that transaction without your consent. One particular point to keep in mind is that the server should be the first entity signing the transactions, not the second; otherwise, the server can maliciously sign only the first transaction and not the second and then disappear, leaving the user permanently in limbo.
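The ordering argument above can be made concrete with a toy model. The sketch below is only an illustration: signatures are modelled as names in a set rather than real cryptography, and the names `Tx` and `run_spend` are hypothetical, not part of any wallet's API. It assumes that a dishonest server is one that signs the payment but withholds its signature on the recovery transaction.

```python
from dataclasses import dataclass, field


@dataclass
class Tx:
    """Toy transaction: a label plus the set of parties that have signed it."""
    label: str
    signatures: set = field(default_factory=set)

    def is_valid(self) -> bool:
        # A 2-of-2 spend needs both signatures before the network accepts it.
        return {"user", "server"} <= self.signatures


def run_spend(server_signs_first: bool, server_honest: bool):
    """Return (payment, recovery) after one round of the extended 2-of-2 scheme."""
    payment = Tx("pay the recipient")
    recovery = Tx("send the remainder to the user's backup address")

    if server_signs_first:
        # The server commits first; a dishonest server signs only the payment.
        payment.signatures.add("server")
        if server_honest:
            recovery.signatures.add("server")
        # The user countersigns only if BOTH transactions carry the server's signature.
        if all("server" in tx.signatures for tx in (payment, recovery)):
            payment.signatures.add("user")
            recovery.signatures.add("user")
    else:
        # The user signs both transactions up front and hands them over.
        payment.signatures.add("user")
        recovery.signatures.add("user")
        payment.signatures.add("server")
        if server_honest:
            recovery.signatures.add("server")

    return payment, recovery
```

In this model, `run_spend(server_signs_first=False, server_honest=False)` yields a valid payment and an invalid recovery transaction, which is the limbo described above. With the server signing first, the same dishonest server produces no valid transaction at all, so the funds do not move.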

The second scheme is simple 2-of-3. There are three keys: your key, the server’s key, and a backup key held by you in some secure offline location. Just as above, you sign, the server sends your phone a confirmation code, you supply the code on your computer, and the server signs; that’s all there is to it. If you lose your password, you can use your backup key and the server’s key to send a transaction to a new wallet; if you or the server get hacked then the attacker still has only 1 of 3, and if the server is malicious or disappears they only have 1 of 3 and you have 2 of 3. Similar logic applies for the 2-of-2 case, except that the cases that involve you losing your key or the server disappearing are instead handled by applying the emergency transaction. Thus, we have two slightly different but in many ways equivalent protocols for creating a multisig setup with no single point of failure…
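The 2-of-3 failure cases can be checked by brute force. The following is a minimal sketch under simplifying assumptions: keys are represented by their holders' names, "losing" a key means nobody can use it, and a thief obtains exactly one key. The function names are hypothetical.

```python
KEYS = frozenset({"user", "server", "backup"})
THRESHOLD = 2  # 2-of-3


def can_spend(held_keys) -> bool:
    """True if the holder controls at least THRESHOLD of the wallet's keys."""
    return len(KEYS & set(held_keys)) >= THRESHOLD


def recoverable_after_losing(lost_key: str) -> bool:
    """Can the funds still be moved with the two keys that remain?"""
    return can_spend(KEYS - {lost_key})


def attacker_succeeds_with(stolen_key: str) -> bool:
    """Can someone holding only this one key move the funds?"""
    return can_spend({stolen_key})


for key in sorted(KEYS):
    print(key, recoverable_after_losing(key), attacker_succeeds_with(key))
```

Every key is recoverable when lost and useless on its own when stolen. Note that recovering from a lost user key or a lost backup key still relies on the server cooperating; only the case where the server itself disappears is handled by the user alone.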

… until we start thinking about the software code. One popular multisig wallet, BitGo, currently presents itself primarily as a client-side-Javascript web application; hence, we can analyze BitGo using the same analysis that we used to analyze blockchain.info (note: I am not picking on BitGo specifically; it is simply one of the most prominent and well-funded, and other alternatives usually work in exactly the same way). If an attacker takes control of BitGo’s servers, then they have the ability to start feeding users a faulty web application. Additionally, if an attacker takes control of BitGo’s servers, they can also apply the second signature. Hence, BitGo continues to be a single point of failure.

Now, one could reasonably argue that (1) BitGo is a trustworthy company and so they are not too likely to act maliciously themselves and (2) the presence of multisig means that the attacker has to hack BitGo in two ways and not one. However, this does not bypass the primary point, which is that a centralized server-side wallet can achieve the exact same guarantees without the complexity of having the user store keys by simply adding an extra layer of multisig or secret sharing to their cold wallet. Hence, this kind of client-side-browser multisig wallet setup can be considered to be entirely cryptoeconomic security theater. This is not saying that BitGo is not secure; compared to most alternatives it is a good choice. Rather, this is simply saying that the “multisig” aspect is not providing precisely the guarantee of security that some people think it does.

The philosophical reason why combining browser Javascript and multisig is so problematic is that browser-based Javascript multisig wallet providers, and also many desktop-based providers, are setting up a protocol that is immune to any single point of failure from a protocol standpoint, but they are then immediately sacrificing that advantage in reality by playing two roles in the protocol at the same time: the client and the server. The problem seems fundamental, and given how crucial the client is to any interaction perhaps even unresolvable; as we saw above, no matter how you download a piece of software, unless you have the time to properly review every line of code you are trusting the provider. At first glance, it seems like there is no way around this issue. However, as we will now see, there is, and the solution once again lies in multisig – this time done right.

Multisig Unfused

The real-life multisig implementation that I believe best exemplifies my vision of the correct way of doing things is the one that is being built by CryptoCorp. CryptoCorp’s approach to multisig is fundamentally different: rather than trying to take the PayPal route (really, most pre-crypto businesses’ route) of treating the interface and the security provider as part of the same package, CryptoCorp generalizes and abstracts the role of the interface and makes its core offering only the security provider. That is to say, CryptoCorp is spending the bulk of its resources specifically developing advanced features and algorithms for its signing oracle server, and lets any other wallet provider integrate with them in order to provide a compatible interface. At the Texas Bitcoin conference in March, CryptoCorp showed a working prototype of a modified Electrum wallet; now, they are working with over ten wallet providers to help integrate support for their server.

Of course, one question is, what is so special about CryptoCorp’s server? An app to do second-factor phone verification with Google Authenticator and sign transactions can be written in NodeJS in a few days; I even did it myself. Where CryptoCorp shines is in the advanced algorithms that it uses to filter out fraudulent transactions. Just paying $3 for a coffee? The CryptoCorp oracle doesn’t even bother asking for confirmation. Paying $500 for a laptop? It might check a bit more stringently. $50,000 for a car? Prepare for something pretty close to KYC verification. Unless the recipient’s address is known to belong to the well-established BitPremier, in which case it might send the transaction through with less hassle simply because it knows that you can always request a refund from them if you make a mistake – and if the recipient’s address has been linked to a hacker syndicate, it might ask for verification even at $3.
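The tiered behaviour described above might look roughly like the following. This is a hypothetical sketch, not CryptoCorp's algorithm: the dollar cutoffs, the `Check` levels, the address sets and the rule "trusted recipients drop one level, flagged recipients get at least a second factor" are all assumptions chosen to match the examples in the text.

```python
from enum import IntEnum


class Check(IntEnum):
    NONE = 0           # sign without asking
    SECOND_FACTOR = 1  # e.g. a confirmation code sent to the user's phone
    IDENTITY = 2       # something close to KYC verification


# Placeholder address lists; a real oracle would maintain these from its own data.
TRUSTED_RECIPIENTS = {"trusted-merchant-address"}
FLAGGED_RECIPIENTS = {"flagged-address"}


def required_check(amount_usd: float, recipient: str) -> Check:
    """Pick a verification level from the amount and what is known about the recipient."""
    # Assumed cutoffs: under $50 is trivial, $10,000 and up is high value.
    if amount_usd < 50:
        base = Check.NONE
    elif amount_usd < 10_000:
        base = Check.SECOND_FACTOR
    else:
        base = Check.IDENTITY

    if recipient in FLAGGED_RECIPIENTS:
        # Suspicious recipients always trigger at least a second factor.
        return Check(max(base, Check.SECOND_FACTOR))
    if recipient in TRUSTED_RECIPIENTS:
        # Well-established recipients can be refunded from, so relax one level.
        return Check(max(base - 1, Check.NONE))
    return base
```

For instance, `required_check(50_000, "trusted-merchant-address")` returns `Check.SECOND_FACTOR` rather than `Check.IDENTITY`, while `required_check(3, "flagged-address")` still asks for a second factor.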

So why is CryptoCorp’s approach the better one? From my above criticism of the usual approach to multisig, the answer is obvious: the entity building the oracle and the entity maintaining the software are completely separate. In fact, with CryptoCorp you can actually be relatively safe even if your software turns out to be completely taken over by an attacker. You can use independent tools to verify that the addresses that you’re seeing are legitimate (and not fake multisigs where all keys are actually controlled by the attacker), the client cannot send transactions unilaterally, and if the client tries a more subtle attack like changing the outputs on the transaction or changing the amounts then the oracle will catch that. Thus, there actually is no single point of failure, and the trust-free promise of cryptocurrency is finally achieved.
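One way an oracle could catch a client that alters outputs or amounts is to compare the transaction it is asked to sign against what the user approved through a channel the client does not control. The sketch below assumes exactly that; the article does not specify the mechanism, and `oracle_should_sign` and its parameters are hypothetical names.

```python
def oracle_should_sign(tx_outputs, confirmed_payments, wallet_addresses):
    """Decide whether the oracle adds its signature.

    tx_outputs         -- (address, amount) pairs in the transaction submitted by the client
    confirmed_payments -- (address, amount) pairs the user approved out of band
    wallet_addresses   -- addresses belonging to the same wallet (change outputs)
    """
    # Ignore change going back to the wallet; everything else must match exactly.
    external = sorted((addr, amt) for addr, amt in tx_outputs
                      if addr not in wallet_addresses)
    return external == sorted(confirmed_payments)


approved = [("merchant", 500)]
wallet = {"my-change-address"}

honest_tx = [("merchant", 500), ("my-change-address", 1500)]
tampered_tx = [("merchant", 500), ("attacker", 1500)]

print(oracle_should_sign(honest_tx, approved, wallet))
print(oracle_should_sign(tampered_tx, approved, wallet))
```

A compromised client that redirects the change, swaps the recipient or inflates the amount produces a transaction whose external outputs no longer match the approved list, so the oracle in this model declines to sign.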

It is important to note that CryptoCorp is not the only company that is doing things in this kind of generalized and highly modularized way. Codius, the oracle-based smart contract platform from Ripple, is approaching the issue in exactly the same way, and from a consumer protection standpoint so is Bitrated, with its open marketplace of buyers, sellers and arbitrators – although Bitrated still falls slightly short by being a browser-based webapp and not a client-side application or browser extension, or better yet a protocol with multiple compatible implementations.

Decentralization-Friendly Business Culture

The road to making all crypto-businesses look more like CryptoCorp, Codius or Bitrated, and less like PayPal, will be a long one. In the tech business community, there is strong pressure in favor of creating an end-all ecosystem rather than a single component and building a “moat” so as to make your product indispensable for your users – ideals that are the exact opposite of the principles of commoditization, generalization and separation of concerns that are so important for a robust decentralized ecosystem. Corporate profits go up massively if you have a secure position and a monopoly, and yet in the land of secure cryptocurrency applications being modular and easily replaceable is the name of the game.

However, we must note that CryptoCorp has managed to overcome this barrier, shedding the stigma of being “just a signing oracle” by being a really, really good signing oracle – and the wallets that CryptoCorp works with have done the exact same thing. Even exchanges, perhaps the ultimate commodity business with hundreds of them around the world, still manage to differentiate themselves. In the case of arbitration on services like Bitrated, arbitrators can choose to specialize in different industries, follow different commercial norms (e.g. on acceptable standards of product quality, the extent to which consumers are expected to read purchase agreements, etc.) and have well-optimized risk models that allow them to charge lower fees.

Additionally, perhaps running a multisig oracle does not need to be a business; like DNS servers, the task can simply be commoditized and done by a combination of larger companies, nonprofit research groups and hobbyists. Such a situation will be arguably preferable, as each entity running each node will have a stable independent source of income and much more reputation to lose, so we can expect oracles to both disappear and cheat much less frequently. But ultimately, if the community demands true decentralization, the market will configure itself somehow to provide it; the only thing that remains is an organized effort to complete the switch.