The role of decentralized networks in a data-abundant, hyperconnected world

As centralized systems cannot keep data processing efficient, we need to turn to decentralized networks in order to succeed.

When it comes to computer data storage, it can seem like we are running out of numbers. If you are old enough, you may remember when diskette storage was measured in kilobytes in the 1980s. If you are a little younger, you are probably more familiar with the thumb drives denominated in gigabytes or hard drives that hold terabytes today.

Humanity’s unfathomable data footprint

But we are now producing data at an unprecedented rate. As a result, we will need to grasp numbers so large that they seem almost beyond human comprehension. To get a sense of the new realm we are entering, consider this: Market intelligence firm IDC estimates that the total global creation and consumption of data amounted to 59 zettabytes in 2020, or 59 trillion gigabytes in old money.
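
As a quick check of that conversion, here is a minimal sketch. It assumes decimal (SI) units, where a zettabyte is 10^21 bytes and a gigabyte is 10^9 bytes; the function name is purely illustrative.

```python
# Assumption: decimal (SI) prefixes. Binary prefixes would give slightly different figures.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_ZETTABYTE = 10**21


def zettabytes_to_gigabytes(zettabytes: int) -> int:
    """Convert a whole number of zettabytes to gigabytes using SI prefixes."""
    return zettabytes * BYTES_PER_ZETTABYTE // BYTES_PER_GIGABYTE


gigabytes_2020 = zettabytes_to_gigabytes(59)
print(f"{gigabytes_2020:,} GB")  # 59,000,000,000,000 GB, i.e. 59 trillion
```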

Yet while the total volume of data in existence is now at an almost unfathomable scale, the rate at which it is growing is even more striking. Back in 2012, IBM calculated that 90% of the world’s data had been created in the previous two years. Since then, the exponential growth in global data volume has continued apace, and the trend looks set to continue. Indeed, IDC projects that over the next three years, humanity will create more data than it did during the previous three decades.

The obvious question is: What has changed? Why are we suddenly producing much more data than ever before? Of course, smartphones are part of the story. Everyone now effectively carries a mobile computer in their pocket, one that dwarfs the power of desktop computers of previous generations. These machines are constantly tethered to the internet and continuously receive and transmit data, even when idle. The average American Generation Z adult unlocks their phone 79 times a day, roughly once every 13 minutes of waking time. The always-on nature of these devices has contributed to the avalanche of new data produced, with 500 million new tweets, 4,000 terabytes of Facebook posts and 65 billion new WhatsApp messages fired out into cyberspace every 24 hours.

Smartphones are just the tip of the iceberg

Smartphones are merely the most visible manifestation of the new data reality, however. Whereas you might assume that video platforms such as Netflix and YouTube constitute the lion’s share of global data, in fact, the entire consumer share amounts only to approximately 50%, and this percentage is projected to gradually fall in the coming years. So, what makes up the rest?

The rise of the Internet of Things and connected devices has been further expanding our global data footprint. Indeed, the fastest year-on-year growth is taking place in a category of information known as embedded and productivity data. This is information derived from sensors, connected machines and automatically generated metadata that exists behind the scenes, beyond the visibility of end-users.

Take autonomous vehicles, for example, which use technologies such as cameras, sonar, LIDAR, radar and GPS to monitor the traffic environment, chart a route and avoid hazards. Intel has calculated that the average autonomous vehicle using current technologies will produce four terabytes of data per day. To put that in perspective, a single vehicle will produce a volume of data each day equivalent to that produced by almost 3,000 people. Furthermore, it will be critically important that this data is stored securely.

First, it will be useful for scheduling service intervals and diagnosing technical problems efficiently. It could also be used as part of a decentralized system to coordinate traffic flow and minimize energy consumption in a specific city. Finally, and probably most importantly in the short run, it will be essential for settling legal disputes in the event of injuries or accidents.

Autonomous vehicles are just a tiny part of the overall story. According to McKinsey & Company, the percentage of businesses that use IoT technology increased from 13% to 25% between 2014 and 2019, with the overall number of devices projected to reach 43 billion by 2023. From industrial IoT to entire smart cities, the future economy will have a hugely increased number of connected devices producing potentially highly sensitive, or even critical, data.

Is the end in sight for Moore’s Law?

There are two factors to consider, and both point to the increasing utility of decentralized networks. Firstly, while we have more data than ever before to tackle global challenges, such as climate change, financial instability and the spread of airborne viruses like COVID-19, we may be approaching a hard technical boundary in terms of how much of this information can be processed by centralized computers in real time. While data volumes have grown exponentially in recent years, processing power has not increased at the same rate.

In the 1960s, Intel co-founder Gordon Moore made the observation now known as Moore’s Law, which in its familiar form holds that the number of transistors on a microchip doubles about every two years, with computing power increasing at a corresponding rate. But Moore himself conceded that it was not a scientific law; it was more of a transitory statistical observation. In 2010, he acknowledged that as transistors are now approaching the size of atoms, computer processing power will reach a hard technical limit in the coming decades. After that, more cores can be added to processors to increase speed, but this will increase the size, cost and power consumption of the device. To avoid a bottleneck effect, therefore, we will need to find new ways of monitoring and responding to data.
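
To make the doubling concrete, here is a small sketch of what a two-year doubling period implies if it holds exactly. The function name and the chosen time spans are illustrative assumptions, not figures from Moore or Intel.

```python
def transistor_growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Factor by which a transistor count grows if it doubles every doubling_period_years."""
    return 2 ** (years / doubling_period_years)


# Idealized growth over a few illustrative time spans.
for years in (2, 10, 20):
    print(f"After {years} years: x{transistor_growth_factor(years):,.0f}")
```

Under that idealized assumption, 20 years of doubling gives roughly a thousandfold increase, which is why a hard physical limit on transistor size matters so much.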

The second factor to consider is cybersecurity. In an increasingly interconnected world, millions of new devices are going online. The data they provide will potentially influence things like how electrical grids are controlled, how healthcare is administered, and how traffic is managed. As a result, edge security, meaning the security of data that resides outside of the network core, becomes paramount. This poses a complex challenge for cybersecurity experts, as the many different combinations of devices and protocols create new attack surfaces and opportunities for man-in-the-middle intrusions.

Learning from networks in nature

If centralized processing is too slow and insecure for the data-abundant economies to come, what is the alternative? Some experts have been looking for inspiration in the natural world, arguing that we should move from a top-down to a bottom-up model of monitoring and responding to data. Take ant colonies, for example. While each individual ant has relatively modest intelligence, collectively, ant colonies manage to create and maintain complex, dynamic networks of foraging trails that can connect multiple nests with transient food sources. They do this by following a few simple behaviors and responding to stimuli in their local environment, such as the pheromone trails of other ants. Over time, evolution has unearthed instincts and behaviors on an individual level that produce a system that is highly effective and robust on a macro level. If a trail is destroyed by wind or rain, the ants will find a new route, without any individual ant even being aware of the overall objective to maintain the network.
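
The pheromone mechanism can be illustrated with a toy simulation. This is a deliberately simplified sketch, not a model of real ant behavior: the trail lengths, the evaporation rate, the number of ants and the function name are all assumptions chosen for illustration.

```python
import random


def simulate_trails(lengths, ants_per_round=50, rounds=100, evaporation=0.1, seed=0):
    """Return the final pheromone level on each trail.

    Each ant picks a trail with probability proportional to its pheromone,
    then deposits an amount inversely proportional to that trail's length.
    """
    rng = random.Random(seed)
    pheromone = [1.0] * len(lengths)  # every trail starts out equally attractive
    for _ in range(rounds):
        deposits = [0.0] * len(lengths)
        for _ in range(ants_per_round):
            # Local decision: the ant senses only pheromone, never the trail length.
            choice = rng.choices(range(len(lengths)), weights=pheromone)[0]
            deposits[choice] += 1.0 / lengths[choice]
        # Evaporation lets the colony forget trails that stop being reinforced.
        pheromone = [(1 - evaporation) * p + d for p, d in zip(pheromone, deposits)]
    return pheromone


# Two hypothetical trails to the same food source: one short, one twice as long.
levels = simulate_trails([1.0, 2.0])
print("short trail:", round(levels[0], 1), "long trail:", round(levels[1], 1))
```

No ant in this sketch compares the two trails or knows which is shorter, yet the shorter trail ends up carrying most of the pheromone because it is reinforced more per trip.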

What if this same logic could be applied to organizing computer networks? Similar to ant colonies, in a blockchain network, many nodes of modest processing power can combine to produce a global outcome greater than the sum of its parts. Just as instincts and behavior are crucial in nature, the rules governing how nodes interact are critical in determining how successful a network will be at achieving macro-level goals.

Aligning the incentives of each decentralized actor in a mutually beneficial network took thousands of years for nature to master. It is unsurprising, therefore, that it is also a difficult challenge for the human designers of decentralized networks. But while the genetic mutations of animals are essentially random in terms of their potential benefit, we have the advantage of being able to purposely model and design incentives to achieve common overall goals. This was at the forefront of our minds: The objective was to eliminate all perverse incentives for individual actors that erode the utility and security of the network as a whole.

By carefully designing incentive structures in this way, decentralized networks can greatly strengthen edge security. Just as the pathfinding network of an ant colony will continue to function even if a single ant gets lost or dies, a decentralized network is similarly robust, remaining fully functional even when individual nodes crash or go offline. Furthermore, no single node needs to process or understand all the data in its totality for the network as a whole to be able to respond to it. This way, some researchers believe we can create an economic incentive structure that automatically detects and responds to common challenges in a decentralized way.
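
The robustness claim can be sketched in a few lines. The example below is a simplified illustration, not a blockchain or any particular network's protocol: it ignores incentives, trust and consensus entirely, and the `Node` class, the shard layout and the readings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    shards: dict          # shard id -> list of readings held by this node
    online: bool = True

    def partial_sums(self):
        """Summarize only the shards this node holds."""
        return {shard_id: sum(values) for shard_id, values in self.shards.items()}


def network_total(nodes):
    """Combine per-shard summaries from whichever nodes are currently online."""
    summaries = {}
    for node in nodes:
        if node.online:
            # A shard reported by several replicas is counted only once.
            summaries.update(node.partial_sums())
    return sum(summaries.values())


shards = {"a": [1, 2], "b": [3, 4], "c": [5, 6]}
# Assumed layout: each shard is replicated on two of the three nodes.
nodes = [
    Node("n1", {"a": shards["a"], "b": shards["b"]}),
    Node("n2", {"b": shards["b"], "c": shards["c"]}),
    Node("n3", {"c": shards["c"], "a": shards["a"]}),
]

total_before = network_total(nodes)
nodes[0].online = False  # one node crashes
total_after = network_total(nodes)
print(total_before, total_after)  # 21 21
```

No node here ever sees all three shards, and the network-wide total survives the loss of any one node because every shard has a second replica.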

Conclusion

The volume of data we are producing is exploding, and our ability to monitor and respond to it using centralized computer networks is approaching its limits. For this reason, decentralized networks are uniquely suited to the challenges ahead. A lot of research, testing and experimentation remains to be done, but the fundamental robustness and utility of the underlying technology have been demonstrated. As we move toward a data-abundant, hyperconnected world, decentralized networks could play an important role in deriving the maximum economic and societal benefit from the Internet of Things.

The views, thoughts and opinions expressed here are the author’s alone and do not necessarily reflect or represent the views and opinions of Cointelegraph.

Stephanie So is an economist, policy analyst and co-founder of Geeq, a blockchain security company. Throughout her career, she has applied technology within her specialist disciplines. In 2001, she was the first to use machine learning on social science data at the National Center for Supercomputing Applications. More recently, she researched the use of distributed networking processes in healthcare and patient safety in her role as a senior lecturer at Vanderbilt University. Stephanie is a graduate of Princeton University and the University of Rochester.