Why We Can’t Trust Bitcoin Mining Hash Rate Data From China

The changing proportion of Bitcoin network hash rate emanating from China can be explained by the inherent flaws in how this data is collected.

China has “re-emerged” as a major bitcoin mining hub in 2022, representing more than 20% of the Bitcoin network’s hash rate, according to new data from Cambridge’s Centre for Alternative Finance (CCAF). That is a sharp reversal from CCAF’s October 2021 data update, which indicated that “mining operations in mainland China have effectively dropped to zero.”

So, what caused this purported enormous whiplash in mining activity from a CCAF-reported high of 75% in September 2019, to 0%, and now back to 20%? Since July 2021, Bitcoin’s hash rate has grown at a steady pace, paring its losses from China’s original ban and continuing to set record highs in recent months. But what happened in China? And is the new CCAF data an accurate representation of the state of bitcoin mining?

This article aims to provide additional context to the CCAF data and explain why the data, although an important effort to try to quantify trends in the mining industry, is not reliable.

There Was Never 0% Hash Rate In China

Analysis of China’s massive resurgence in mining activity is premised on its prior state of having absolutely no mining activity whatsoever, which is entirely false. When CCAF first released its data last year, showing no mining activity in China, the project’s lead was careful to qualify it as the region’s “reported” share of hash rate, which could theoretically differ from its actual share. Other researchers, mining industry leaders and this author knew the 0% number to be inaccurate and said so publicly.

CCAF researchers dismissed these claims from actual miners as “difficult to verify,” preferring to lean on their own methodology. But CNBC reporter MacKenzie Sigalos took these claims seriously, and she later reported on the active underground mining scene in China. Ironically, the reporting by Sigalos was cited by CCAF researchers in their latest blog post with updated China mining analysis.

With a precipitous drop in total hash rate, a coinciding drop in bitcoin’s price and constant media attention paid to the future of mining after China’s ban, data that claimed 0% of hash rate was coming from China fit the narrative. But the data wasn’t accurate, and miners knew it. So why was the 0% number ever published?

Mining Data Is Hard To Collect And Cambridge Bitcoin Mining Data Is Flawed

Data is only as reliable as the methodology used to collect it, and for CCAF mining data, the assumptions in the methodology clearly demonstrate inherent problems with the data collection. These structural difficulties in fact compromise the reliability of the data as it’s presented.

One key failure is the methodological assumption that a mining facility’s IP addresses accurately indicate the geographic location of its hash rate. Consider an unlikely but feasible scenario where a miner based in Mexico uses a proxy with an IP address in Germany in January, switches to a proxy in Australia in April, and then uses an IP address based in Romania in July. CCAF’s broken methodology would assume that this miner physically moved to all three of these locations throughout the year, which would be a logistical nightmare and economic impossibility for any miner.
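The Mexico scenario can be made concrete with a minimal sketch. It is not CCAF’s actual pipeline: the function name, the data layout and the 50 PH/s figure are all hypothetical and chosen only for illustration. The point is that when hash rate is attributed by the apparent country of an IP address, the miner’s real location never appears in the output.

```python
from collections import defaultdict

# Hypothetical observations: (month, apparent country of the IP, hash rate in PH/s).
# The miner never leaves Mexico; only the proxy location changes.
OBSERVATIONS = [
    ("January", "Germany", 50.0),
    ("April", "Australia", 50.0),
    ("July", "Romania", 50.0),
]
TRUE_COUNTRY = "Mexico"


def attribute_by_ip(observations):
    """Attribute hash rate to whichever country the IP address appears to be in."""
    by_month = {}
    for month, ip_country, hash_rate in observations:
        shares = by_month.setdefault(month, defaultdict(float))
        shares[ip_country] += hash_rate
    # Convert the inner defaultdicts to plain dicts for easier inspection.
    return {month: dict(shares) for month, shares in by_month.items()}


attributed = attribute_by_ip(OBSERVATIONS)
for month, shares in attributed.items():
    print(f"{month}: attributed to {shares}, actually located in {TRUE_COUNTRY}")
```

Under these assumptions, every month’s hash rate is credited to a country the miner never operated in, and Mexico is credited with nothing.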

Some industry commentators defend CCAF’s research by asserting that slightly inaccurate data is better than no data at all. This idea is so laughably illogical it barely deserves mentioning. And CCAF so heavily caveats and qualifies its own data that its reliability is minimal at best. For example, in multiple places on its data dashboard, the CCAF qualifies its data for Germany and Ireland by indicating, “To our knowledge, there is little evidence of large mining operations in Germany or Ireland that would justify these figures. Their share is likely significantly inflated due to redirected IP addresses via the use of VPN or proxy services.”

Put differently, the data is not reliable.

To be clear, the problems with CCAF’s methodology are not its own doing. Mining data is exceptionally difficult to collect accurately. Similar mining data sets built by the newly launched Bitcoin Mining Council also drew public criticism over the accuracy of their methodology. If anything, the continued work by CCAF to report mining data serves to expose many of the unavoidable issues with collecting accurate and representative data from bitcoin mining.

Mining Pools Can Lie

CCAF also relies on self-reporting by miners to aid its research on the geographic distribution of hash rate. The obvious problem with this exchange of information is that miners can lie. This point was made publicly on Twitter by Ethan Vera, co-founder of Luxor Mining, when he tweeted, “…the mining pools submitting data to Cambridge lied. They showed 0 hashrate in China when that clearly wasn’t the case.”

The political motivation for miners to lie is obvious. What miner would willingly report full or even partial mining activity in the world’s most aggressively anti-mining region? Any incentive to self-report mining in China is simply non-existent. And, as mentioned previously, the 0% hash rate statistic perfectly fits the ongoing narrative of a full mining exodus from China.

CCAF's estimated mining hash rate from China versus the U.S.

Moreover, data is submitted to CCAF voluntarily, and Cambridge’s research team has very few cross-checks available. The team is left to simply trust the answers it is given, which is an unreliable research method.
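A minimal sketch shows how much weight self-reported figures carry when nothing cross-checks them. The three pools, their hash rates and their China fractions below are entirely hypothetical, and the weighting scheme is an assumption made for illustration rather than a description of CCAF’s method.

```python
def china_share(pool_reports):
    """Hash-rate-weighted share attributed to China across reporting pools.

    pool_reports: list of (pool hash rate, fraction of that hash rate in China).
    """
    total = sum(rate for rate, _ in pool_reports)
    in_china = sum(rate * fraction for rate, fraction in pool_reports)
    return in_china / total


# Hypothetical pools: (hash rate in EH/s, true fraction of hashers in China).
actual = [(40.0, 0.5), (30.0, 0.25), (30.0, 0.0)]

# The same pools, each reporting zero activity in China.
reported = [(rate, 0.0) for rate, _ in actual]

actual_share = china_share(actual)
reported_share = china_share(reported)
print(f"actual: {actual_share:.1%}, reported: {reported_share:.1%}")
```

With no independent check on the submitted fractions, the published estimate simply mirrors what the pools choose to say, whatever the underlying activity is.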

China’s Resurgence Is Logistically Improbable

Considering the real-world logistics of orchestrating China’s resurgence in hash rate corroborates the unreliability of the CCAF’s latest data. It is simply an operational impossibility that nearly half of the hash rate that left China one year ago decided to abandon its newly secured mining facilities elsewhere in the world and relocate back to China.

Even for the non-trivial number of miners that opted to simply move their machines into storage instead of suffering the hassle of relocating internationally, there are few indications that the majority of this group has chosen to, or is able to, fully redeploy its hardware.

And China certainly has not reversed its mining ban.

The hash rate that is currently online in China has always been in China. Miners have known this and publicly spoken about it. But structural limitations for academics researching this dynamic have resulted in previously inaccurate and persistently unreliable data about this hash rate.

Conclusion

Even though this article has criticized the data published by the CCAF somewhat harshly, the CCAF is not responsible for the foundational reasons why its data is unreliable. Collecting mining data is difficult, especially when key metrics are easily manipulated or misrepresented, and the work by CCAF demonstrates these difficulties.

Of course, China is unlikely to ever regain its former share of the global bitcoin hash rate market. Industry leaders and academics alike can agree on this. Chinese officials are still confiscating mining hardware by the hundreds and thousands of rigs, and many large-scale miners have permanently relocated to other parts of the world. But the Chinese underground mining industry will never be extinguished.

This is a guest post by Zack Voell. Opinions expressed are entirely their own and do not necessarily reflect those of BTC Inc or Bitcoin Magazine.