The Role of Bitcoin Nodes: Do Full Nodes Running in Data Centers Benefit the Bitcoin Network?

Since its launch, Bitcoin Classic's node count has steadily increased. The latest release of the alternative Bitcoin implementation even topped the charts, with almost 3,000 Bitcoin Classic 0.12 nodes reachable on the network.

But a closer look at these statistics reveals some odd details.

First, IP data suggests that many Bitcoin Classic nodes might not really be many Bitcoin Classic nodes at all. Instead, a single node could use multiple IP addresses to spoof the total node count. This possibility appears more likely in light of the observation that very few Bitcoin Classic nodes seem to be replacing existing Bitcoin Core nodes, indicating that these are new nodes rather than node operators actually making the switch.

That said, it is theoretically possible that many new users are simply firing up (and shutting down) Bitcoin Classic nodes in the same geographical area simultaneously.

However, it is certain that the lion’s share of all Bitcoin Classic nodes is hosted in data centers, predominantly by Amazon Web Services and Choopa. That's unsurprising, as dedicated websites offer such services, and this practice is encouraged by Bitcoin Classic supporters. Recent data analysis, moreover, shows that a vast majority of nodes in these data centers are almost certainly paid for by a relatively small group of people.

This raises the question: Is it useful to run full nodes from data centers at all? Does the sharp increase of Bitcoin Classic nodes in any way benefit Bitcoin, Bitcoin Classic or both?

Let’s take a look at why anyone would want to run a full node.

Validation

Perhaps the most important reason to operate a full node is validation.

With a full node, users can check whether transactions are valid according to all of Bitcoin's rules. Using nothing but the open source software, node operators can verify that any bitcoins they receive are legitimately mined, correctly signed and more. This is what makes Bitcoin a trustless solution.
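One small piece of this validation can be sketched in a few lines. The Python below is a minimal illustration, not the code any node software actually runs: it covers only the proof-of-work check on an 80-byte block header, and the function names are chosen here for illustration. The sample header is Bitcoin's genesis block.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Bitcoin hashes block headers with SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' field of a header into the full target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))

def block_hash_hex(header: bytes) -> str:
    """Block hash in the byte order block explorers display."""
    return double_sha256(header)[::-1].hex()

def header_meets_target(header: bytes) -> bool:
    """True if the header's hash does not exceed the target it claims."""
    if len(header) != 80:
        raise ValueError("a block header is exactly 80 bytes")
    bits = int.from_bytes(header[72:76], "little")
    hash_value = int.from_bytes(double_sha256(header), "little")
    return hash_value <= bits_to_target(bits)

# Genesis block header, field by field (all fields little-endian).
GENESIS_HEADER = bytes.fromhex(
    "01000000"                                   # version
    + "00" * 32                                  # previous block hash
    + "3ba3edfd7a7b12b27ac72c3e67768f61"
      "7fc81bc3888a51323a9fb8aa4b1e5e4a"         # merkle root
    + "29ab5f49"                                 # timestamp
    + "ffff001d"                                 # bits (compact target)
    + "1dac2b7c"                                 # nonce
)

if __name__ == "__main__":
    print(block_hash_hex(GENESIS_HEADER))
    print(header_meets_target(GENESIS_HEADER))
```

A real full node checks far more than this (signatures, double spends, block size, coin issuance and so on), but the principle is the same: every rule is checked locally rather than taken on trust.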

This also makes validation a popular yardstick for decentralization, embodied in the “cost of node-operation.” The cheaper it is to validate, the more people can do it, which increases Bitcoin’s decentralization.

Plus, if a user can validate with his own full node, there’s a privacy benefit, as there is no need to share address information with any third party.

Technically, however, these arguments hold up only when the node is actually at the physical location of its operator. If a user needs to trust a data center to feed correct information from the node, the solution is theoretically no longer really trustless. Admittedly, in practice users always trust hardware and software to a certain extent; trusting a data center might be an acceptable risk for most.

Perhaps more importantly, validation serves a genuine purpose only when it is used to verify incoming transactions. Many Bitcoin nodes operated from data centers, however, aren't used for transacting at all and therefore don’t provide the benefits of validation.

Conclusion: Operating a Bitcoin Classic node from a data center provides questionable validation if it’s used for transacting, and no meaningful validation whatsoever if it’s not.

Consensus

Overlapping with the previous point (but harder to measure), full nodes also influence Bitcoin's networked consensus process.

A full node adds “weight” to the set of rules it applies: Whoever wants to transact with that node (and the operator behind it) will need to adhere to its rules. As more nodes apply the same rules, these rules are “strengthened” through their collective network effect.

This might be the main reason many Bitcoin Classic nodes are coming online. They serve as a type of vote, signaling that users are willing to switch to a 2-megabyte block size limit.

However, insofar as the Bitcoin network has anything resembling votes, these are not counted per node. Instead, nodes essentially “vote” through their economic “weight.” As the operator behind a node offers more value to the network – think of important merchants, big buyers, large exchanges and more – their economic weight increases.

This means it makes no difference how many nodes someone runs; whether an important merchant uses one node or one hundred, his total economic weight doesn't change. And, therefore, his influence in Bitcoin’s networked consensus process doesn’t change either.

And again, most nodes operated from data centers probably don't add any economic weight at all. They are not actually used for transacting.
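The difference between counting nodes and counting economic weight can be made concrete with a toy model. The sketch below is purely illustrative: the operator names, the rule-set labels and the numeric "economic value" units are all hypothetical, and Bitcoin has no such formal tally. It only shows that weight attaches to the entity behind a node, so duplicating nodes changes nothing.

```python
from collections import defaultdict

def ruleset_weight(nodes, economic_value):
    """Sum economic value per rule set, counting each operator once.

    nodes: iterable of (operator, ruleset) pairs, one pair per running node.
    economic_value: dict mapping operator -> value in hypothetical units.
    Operators missing from economic_value are treated as weight 0,
    i.e. their nodes are not used for transacting.
    """
    operators_by_ruleset = defaultdict(set)
    for operator, ruleset in nodes:
        operators_by_ruleset[ruleset].add(operator)  # a set ignores duplicates
    return {
        ruleset: sum(economic_value.get(op, 0) for op in operators)
        for ruleset, operators in operators_by_ruleset.items()
    }

if __name__ == "__main__":
    value = {"merchant": 100}                    # hypothetical units
    nodes = [("merchant", "rules-A")]            # one node, used for transacting
    nodes += [("hobbyist", "rules-B")] * 100     # 100 data center nodes, unused
    print(ruleset_weight(nodes, value))          # {'rules-A': 100, 'rules-B': 0}
```

In this model a raw node count would show rule set B ahead by 100 to 1, while the weight that matters for consensus sits entirely with rule set A.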

Conclusion: Operating Bitcoin Classic nodes from data centers does not add any meaningful weight to Bitcoin’s consensus process if they’re not used for transacting. They could add weight if they are used for transacting, but running more than one node per economic entity is pointless for consensus.

Decentralization

Other than the cost of node operation, another popular yardstick for measuring decentralization is the number of doors that need to be knocked on to control or shut down Bitcoin.

Since full nodes serve as Bitcoin's backbone, it's beneficial to have many of them online ... but only if they are operated by many different people as well, and preferably in distant geographical regions.

If more than one node is operated from a single data center, it’s the operator of that data center who has ultimate control over all of them. As such, only one door needs to be knocked on to control all nodes in that data center.
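The "doors" yardstick can be expressed as a simple count. The following is a rough sketch with made-up host labels: it assumes each node is tagged with whoever has ultimate physical control over it, so all nodes in one data center share a label and each home node has its own.

```python
from collections import Counter

def doors_to_knock(hosts):
    """Number of distinct parties with ultimate control over the nodes."""
    return len(set(hosts))

def largest_share(hosts):
    """Fraction of all nodes that the single biggest host controls."""
    counts = Counter(hosts)
    return max(counts.values()) / len(hosts)

if __name__ == "__main__":
    # Hypothetical network: two data centers with 50 nodes each, 10 home nodes.
    hosts = ["dc-1"] * 50 + ["dc-2"] * 50 + [f"home-{i}" for i in range(10)]
    print(len(hosts))             # 110 nodes
    print(doors_to_knock(hosts))  # 12 doors
    print(largest_share(hosts))
```

Adding another 50 nodes to "dc-1" would raise the node count but leave the number of doors unchanged, while increasing the share controlled through a single door.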

Conclusion: Operating more than one Bitcoin Classic node per data center does not provide any meaningful decentralization.

Relaying

Perhaps the most important task of a full node, from a technical network perspective, is relaying transactions and blocks to other nodes.

Moreover, if many nodes are controlled by few people, or are all at the same physical location, this effectively presents a single point of failure. If these nodes represent a significant chunk of the Bitcoin network, and suddenly disappear offline or start relaying corrupted data, it could even be a temporarily destabilizing factor.

However, there is a scenario in which running nodes from data centers could serve a purpose. If a Bitcoin Classic hard fork happens, but almost no non-Classic nodes switch to accept bigger blocks, the Bitcoin Classic nodes operated in data centers could help relay these blocks to nodes that do accept them. As such, running Bitcoin Classic nodes even before a hard fork occurs could signal to miners that their potential bigger blocks will be relayed. (That said, miners would presumably be more worried about the split in the network in the first place than about relay potential in case of such a split.)

Conclusion: Operating Bitcoin Classic nodes from data centers does not currently provide any meaningful contribution to Bitcoin’s relay process, and could even pose a small risk. There exists a scenario in which operating Bitcoin Classic nodes from data centers could slightly help Bitcoin Classic, but this advantage seems mostly theoretical.

Bootstrapping

Whenever a new Bitcoin node comes online, it needs to sync with the rest of the network. This requires the node to download (and verify) the complete blockchain, for which it needs to connect to fully synced nodes. Running a node from a data center can help.

In this case, however, many of the new Bitcoin Classic nodes operated from data centers have blockchain pruning enabled. They get rid of all blockchain data older than a couple of days. As such, they are of no use to syncing nodes.
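The "couple of days" figure follows from simple arithmetic. The sketch below assumes the 288-block minimum that Bitcoin Core's pruning mode retains and the 10-minute target block interval; the window a particular node actually keeps depends on its configured prune target, and the helper names are illustrative.

```python
MIN_BLOCKS_TO_KEEP = 288             # assumed minimum kept by a pruned node
TARGET_BLOCK_INTERVAL_MINUTES = 10   # Bitcoin's target time between blocks

def retained_window_hours(blocks_kept=MIN_BLOCKS_TO_KEEP):
    """Approximate age of the oldest block a pruned node still stores."""
    return blocks_kept * TARGET_BLOCK_INTERVAL_MINUTES / 60

def can_serve_block(height, tip_height, blocks_kept=MIN_BLOCKS_TO_KEEP):
    """True if a node keeping only the last `blocks_kept` blocks has `height`."""
    return 0 <= tip_height - height < blocks_kept

if __name__ == "__main__":
    print(retained_window_hours())            # 48.0 hours, i.e. two days
    print(can_serve_block(0, 400_000))        # False: genesis block is long gone
    print(can_serve_block(399_900, 400_000))  # True: within the last 288 blocks
```

A node syncing from scratch starts at block 0, which is exactly the data such a pruned node no longer has.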

The Bitcoin Classic nodes that don’t have blockchain pruning enabled do upload blockchain data to syncing nodes. But that was never really a bottleneck or a problem in need of solving. And if it ever becomes a problem, it will be easy to take care of, indeed by spinning up full nodes from data centers.

Conclusion: Operating Bitcoin Classic nodes from data centers could serve some useful data to syncing nodes, but network health benefits are negligible.

SPV Hosts

Last, full nodes serve as hosts for Simplified Payment Verification (SPV) clients, such as mobile wallet apps. Since SPV clients don't store the entire blockchain themselves, they connect to full nodes that do, and request the data they need.
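To illustrate the kind of data an SPV client asks for, here is a minimal sketch of a Merkle branch check, the mechanism described for simplified payment verification in the Bitcoin white paper. It is a simplified model rather than any wallet's actual code: the function names are illustrative, the sample "transaction IDs" are dummy hashes, and leaves are combined in raw byte order. The full node builds the branch from a complete block; the client only needs the branch and the Merkle root from the block header.

```python
import hashlib

def dsha(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin's Merkle trees."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def _next_level(level):
    """Pair up hashes, duplicating the last one if the count is odd."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    """Root of a non-empty list of leaf hashes (full node side)."""
    level = list(leaves)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]

def merkle_branch(leaves, index):
    """Sibling hashes proving that leaves[index] is in the tree (full node side)."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        branch.append(level[index ^ 1])  # sibling sits next to us in the pair
        level = _next_level(level)
        index //= 2
    return branch

def verify_branch(leaf, index, branch, root):
    """Recompute the root from one leaf and its branch (SPV client side)."""
    h = leaf
    for sibling in branch:
        h = dsha(sibling + h) if index & 1 else dsha(h + sibling)
        index >>= 1
    return h == root

if __name__ == "__main__":
    txids = [dsha(bytes([i])) for i in range(5)]  # dummy transaction hashes
    root = merkle_root(txids)
    branch = merkle_branch(txids, 4)
    print(verify_branch(txids[4], 4, branch, root))  # True
```

The client never sees the other transactions in the block, which is why it depends on full nodes that still hold them.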

However, Bitcoin nodes with blockchain pruning enabled, the new Bitcoin Classic nodes among them, are of limited use to SPV clients, since they might not be able to provide all the data requested.

Moreover, hosting SPV clients was never really a bottleneck or a problem in need of solving either. And if it ever becomes a problem, it will be easy to take care of.

Conclusion: Operating Bitcoin Classic nodes from data centers could serve some useful data to SPV nodes, but network health benefits are negligible.

Thanks to James Hilliard and Blocktrail CTO Ruben de Vries for added suggestions.