
The global appetite for data is driving up the power consumed by computing. Neo4j's Jim Webber explains why the cloud has solved many problems, but created new ones.
- The growing demand for data and analysis is creating a new source of emissions, much of which is not yet being addressed.
- Growing use of software as a service (SaaS) and of remote servers is driving up demand, and the associated power consumption is going to become an increasing issue for corporates.
- Alternative approaches, such as graph technology, can dramatically reduce this impact, providing a useful tool for managing power demand.
IT is one of the worst contributors to the greenhouse gas (GHG) emissions threatening our future. And I don't just mean the mining of cryptocurrency. While it is true that the energy bill associated with mining bitcoin is alarmingly high (the White House estimates U.S. crypto-asset activity accounts for 0.4% to 0.8% of U.S. GHG emissions, equivalent to all the emissions from burning railroad diesel fuel), it is the everyday use of data centres that we need to look at.
Enterprise IT alone accounts for 1.5% of the planet's total energy consumption, and it is estimated that in 2020 alone, data centres were responsible for producing around 300 million metric tons of CO2, equivalent to 0.6% of total GHG emissions.
Why do we have so many data centres? In part, it’s because of the macro trend towards the cloud. It’s an approach that entails shifting IT provisioning from a company’s internal hardware and computer rooms to expansive, centralised server facilities. Use of the cloud has brought many benefits. As a cloud user, you can lease computing resources per your requirements and not worry about maintaining your internal computer farms.
The problem is that while the transition reduced the day-to-day costs of supplying access to business applications and storing data, we've started to think of these services as both free and infinite. With the cloud, throwing thousands of servers and huge software tools at a problem has become the norm, simply because we can.
There are use cases where this approach is effective. However, too often, we resort to brute force, neglecting the possibility of more refined ways of cracking the problem that could be executed on one laptop.
So while big-scale cloud is undoubtedly a powerful tool, employing it indiscriminately as a battering ram for every problem is inefficient. There is no such thing as a free lunch, and every server still consumes power, be it on-premise or in an Azure data hotel.
Reducing our reliance on excessive server infrastructure, and with it our electricity use, could make a substantial dent in the sector's carbon footprint.
Use the right tool for the job
Cloud is a multi-purpose tool that can handle a wide range of tasks, but certain jobs can be done more efficiently and effectively with a specialised one. If you've got a Swiss Army knife and I've got a power drill, you can complete more tasks than me, but I can make holes in walls a lot more efficiently than you.
Programmers need to be reminded to choose the right tool for the job rather than defaulting to large-scale cloud processing. Some IT problems, even at the highest enterprise level, can be solved more efficiently with a single thread on a laptop running the right software than with vast cloud infrastructure, as the sketch below illustrates.
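To make that concrete, here is a minimal, hypothetical sketch in Python: a single-threaded script that aggregates a large web-server log, line by line, on a laptop. The file name and log layout are assumptions for illustration; the point is that streaming the data keeps memory use flat, so a job like this needs no cluster at all.

```python
# Minimal sketch: single-threaded aggregation of a large log file on a laptop.
# The file name and field layout are hypothetical; streaming line by line
# keeps memory flat, so no server farm is needed for this kind of job.
from collections import Counter

def count_status_codes(path: str) -> Counter:
    counts = Counter()
    with open(path, "r", encoding="utf-8") as log:
        for line in log:                 # streams the file, never loads it all
            fields = line.split()
            if len(fields) > 8:          # e.g. a common access-log layout
                counts[fields[8]] += 1   # HTTP status code field in that layout
    return counts

if __name__ == "__main__":
    print(count_status_codes("access.log").most_common(5))
```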
If we adopted that approach, we'd bypass the complexities of managing and maintaining a server farm environment, including all the power, hardware purchasing and oversight involved. As a result, we would curb the GHG emissions it takes to keep that infrastructure online.
Independent of environmental concerns, I suspect a reckoning is coming over our overuse of the cloud. Organisations are receiving ever larger bills from the “hyperscalers”, prompting them to appreciate that the cloud is not as cheap as they originally thought. Cloud providers are offering discounts to help rationalise CIOs' spend. However, this discounting encourages an unvirtuous cycle in which the default response to every problem becomes deploying a 1,000-server solution.
And in the meantime, there are far less computationally expensive alternatives available for processing data.
Scale alone is not the solution
An excellent example is Adobe's Behance social media network, a kind of Twitter for the creative community. The problem was that two different cloud architectures had failed to deliver the desired functionality for the application.
The first version needed 120 servers and around 20 terabytes of disk space. Ultimately, that approach didn't provide sufficient business value or operational simplicity. The system was then moved to another big-data cloud database that needed 48 servers, but Adobe had to increase its on-disk footprint to 50 terabytes. The system still had large infrastructure requirements yet did not meet user needs.
Adobe then moved to graph technology, a different kind of database. A graph database stores data as a network of nodes and relationships, so connections can be uncovered far more quickly than in a traditional relational database. With this approach, Adobe needed just three servers for the same workload, with substantially improved functionality.
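To give a flavour of what that looks like in practice, here is a minimal sketch of querying connected data in a Neo4j graph database from Python. The connection details, node labels and relationship types are illustrative assumptions, not Adobe's actual Behance schema; the point is that a single Cypher pattern walks the relationships that a relational design would have to reach through multiple joins.

```python
# Minimal sketch of querying connected data in a graph database (Neo4j) from
# Python. The connection details, labels and relationship types below are
# illustrative only; they are not Adobe's actual Behance schema.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# One Cypher pattern follows the relationships directly: "projects appreciated
# by the people a given user follows", a traversal that would need several
# joins in a relational schema.
QUERY = """
MATCH (me:User {name: $name})-[:FOLLOWS]->(other:User)-[:APPRECIATED]->(p:Project)
RETURN p.title AS title, count(other) AS appreciations
ORDER BY appreciations DESC
LIMIT 10
"""

with driver.session() as session:
    for record in session.run(QUERY, name="alice"):
        print(record["title"], record["appreciations"])

driver.close()
```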
Moving to a better form of underlying software resulted in 40 times less hardware, making a significant dent in the energy needed to run the service and the carbon produced by the data centre.
Lightweight technology, such as graph software, needs far less data centre power and cooling for the same or better functionality. It’s an approach that can be used as a tool for saving IT costs and lowering the sector’s carbon footprint.
Graph software is not a silver bullet for all our climate worries, but being smarter about your cloud spending can help your IT bill and the planet.
Jim Webber is Chief Scientist at graph database and analytics leader Neo4j