AI in the Data Center

By John Diamond of Evoque Data Center Solutions

There's an old axiom in the data center business: three things enter a data center and three things exit. Power, data, and people enter; heat, data, and people exit. Managing the inputs against the outputs is where the greatest gains in productivity can be achieved even as costs come down, and it's an area where artificial intelligence is increasingly playing a role.

Some background:  data processing equipment consumes power with surprising inefficiency.  Most of the energy supplied to a computer chip is rejected as heat.  In fact, very little power is used to process ones and zeros.  Supplying the data center with clean stable power, then removing the heat generated from the chips, represents the greatest opportunity for immediate improvement.

Data center efficiency is measured using Power Usage Effectiveness (PUE), the ratio of total facility power to the power drawn by the data processing equipment itself. In the mainframe days, the energy required to deliver power to the data processing equipment and to cool it was often greater than the equipment's own draw. Historically, mainframe data centers had a PUE greater than 2.0, meaning for every watt of power drawn by data processing equipment, at least another watt was required to feed the infrastructure that powers and cools the data center.

Using the same model, if a hyperscale data center drawing 50 million watts (50 megawatts) operates at a PUE of 2.0, only 25 MW reaches the data processing equipment; the other 25 MW goes to the supporting infrastructure, at a significant cost in both natural resources and money. The effort to whittle PUE from 2.0+ down to something on the order of 1.2 has saved tens of millions of dollars a year in operating cost, and AI has been influential in creating the efficiencies needed to achieve that goal.
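
To make that arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the 50 MW figure is the total facility draw and uses an illustrative electricity price of $80 per megawatt-hour; both are assumptions for illustration only, not figures from any facility described above.

    # Back-of-the-envelope PUE arithmetic; all figures are illustrative assumptions.
    # PUE = total facility power / power drawn by the data processing equipment.
    total_facility_mw = 50.0        # total draw of the hypothetical hyperscale site
    pue_old, pue_new = 2.0, 1.2     # legacy PUE vs. improved PUE
    price_per_mwh = 80.0            # assumed electricity price in $/MWh

    it_load_mw = total_facility_mw / pue_old           # 25 MW reaches the IT equipment
    overhead_old_mw = total_facility_mw - it_load_mw   # 25 MW of power/cooling overhead
    overhead_new_mw = it_load_mw * (pue_new - 1.0)     # 5 MW overhead at PUE 1.2

    saved_mw = overhead_old_mw - overhead_new_mw       # 20 MW no longer drawn
    saved_per_year = saved_mw * 8760 * price_per_mwh   # MW x hours/year x $/MWh

    print(f"Overhead drops from {overhead_old_mw:.0f} MW to {overhead_new_mw:.0f} MW")
    print(f"Roughly ${saved_per_year / 1e6:.0f}M per year at ${price_per_mwh:.0f}/MWh")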

It's important to understand how the dynamics of data, power and heat are closely coupled. The more data that enters or exits a data center, the more power is consumed and, with some delay, the more heat must be rejected. If we continuously monitor data traffic, we can use integral and derivative control loops to shape how the power and cooling infrastructure responds to data center loads, but that's a tall order. An individual, or even a team of individuals, cannot manually monitor, relay and adjust each of the power delivery and heat rejection components to operate optimally for every given condition. As AI becomes more affordable at the commercial level, data processing facilities can now take advantage of it and realize enormous benefit.
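
As a minimal sketch of what such a control loop looks like, the following Python class implements a basic proportional-integral-derivative (PID) controller that trims a cooling command against a temperature setpoint. The gains, setpoint and sample interval are hypothetical placeholders, not values from any real facility.

    # Minimal PID loop sketch: cooling output is trimmed continuously against a
    # temperature setpoint as the IT load (and therefore heat) rises and falls.
    class CoolingPID:
        def __init__(self, kp, ki, kd, setpoint_c):
            self.kp, self.ki, self.kd = kp, ki, kd
            self.setpoint_c = setpoint_c
            self.integral = 0.0
            self.prev_error = 0.0

        def update(self, measured_temp_c, dt_s):
            """Return a cooling command (0-100 %) from the latest temperature sample."""
            error = measured_temp_c - self.setpoint_c          # positive when too warm
            self.integral += error * dt_s
            derivative = (error - self.prev_error) / dt_s
            self.prev_error = error
            output = self.kp * error + self.ki * self.integral + self.kd * derivative
            return max(0.0, min(100.0, output))                # clamp to fan-speed range

    # Example: one sample every 30 seconds against a 24 C supply-air setpoint.
    loop = CoolingPID(kp=12.0, ki=0.05, kd=2.0, setpoint_c=24.0)
    for temp in (24.2, 24.8, 25.5, 25.1, 24.6):
        print(f"{temp:.1f} C -> fan command {loop.update(temp, dt_s=30.0):.1f} %")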

Here's a simple illustration of how AI can be applied. Consider a single Computer Room Air Conditioner (CRAC), managed by a person:

Many data centers use multiple, redundant CRACs to provide data center cooling.  If a technician monitors the amperage going to a CRAC, they observe how much power it draws at any given moment.  By monitoring it continuously, the technician begins to correlate the power draw against the current conditions, developing a "feel" for how the equipment responds to data center loads.  At some point, the tech may want to slow the CRAC fan speed down to conserve energy during light loads or possibly even turn the unit off.  If that same technician is able to monitor the 50 other CRAC units operating in the data center, and can correlate which units are operating most effectively and which units are unnecessary at any given moment, they can tune each individual unit to deliver the most effective cooling strategy for any condition.

Obviously, that's a lot to ask of one person. Then expand the model to include not just the amps but also inlet and outlet temperatures, airflow, ambient conditions, and input from the chiller plant and pumps as well as the electrical distribution system, and you have an opportunity to streamline energy consumption at every phase of the data center's operation, from the moment information enters until it leaves. But that's a workload far beyond the capability of one person, or even a team. It's where AI increasingly plays a key role.
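
A toy sketch of that fleet-level decision might look like the following Python code, which ranks CRAC units by cooling delivered per kilowatt drawn and keeps the most effective units running until the estimated load is covered. The readings, constants and the ranking rule are all hypothetical; this is not any vendor's algorithm.

    # Toy fleet-level decision: rank CRACs by cooling delivered per kW drawn, then
    # idle the least effective units while delivered cooling still covers the load.
    from dataclasses import dataclass

    @dataclass
    class CracReading:
        name: str
        power_kw: float        # measured amperage x voltage
        supply_temp_c: float   # outlet (cold) air temperature
        return_temp_c: float   # inlet (warm) air temperature
        airflow_m3s: float     # measured airflow

    AIR_HEAT_CAPACITY = 1.2    # kJ per cubic metre per K, rough constant for air

    def cooling_kw(r: CracReading) -> float:
        """Approximate heat removed: airflow x air heat capacity x delta-T."""
        return r.airflow_m3s * AIR_HEAT_CAPACITY * (r.return_temp_c - r.supply_temp_c)

    def plan_fleet(readings, required_cooling_kw):
        """Keep the most efficient units running until the load is covered."""
        ranked = sorted(readings, key=lambda r: cooling_kw(r) / r.power_kw, reverse=True)
        keep, delivered = [], 0.0
        for r in ranked:
            if delivered >= required_cooling_kw:
                break
            keep.append(r.name)
            delivered += cooling_kw(r)
        return keep

    fleet = [
        CracReading("CRAC-01", 8.0, 16.0, 27.0, 6.0),
        CracReading("CRAC-02", 9.5, 17.0, 24.0, 5.0),
        CracReading("CRAC-03", 7.5, 16.5, 28.0, 6.5),
    ]
    print("Run:", plan_fleet(fleet, required_cooling_kw=120.0))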

Specific products using AI to produce significant results and benefits are emerging in the data center market. Products like Vigilent Machine Learning utilize input from the building monitoring system, ambient conditions and individual CRAC units. Vigilent's algorithm processes this information to speed up and slow down individual CRAC units as needed. It even includes a unique feature that turns units off to simulate failure, learns how the system responds, and then makes appropriate adjustments to the fleet of operating units.

Other companies are increasingly involved in this sector.  Brainbox AI's autonomous software works with large air handling units, operating air dampers and flow valves as well as fan speed to optimize heating and cooling.  Vie Technologies' Prediction-as-a-Service integrates vibration accelerometers and temperature sensors to monitor rotating equipment and analyze the equipment's signature under various loads.  Its algorithm predicts end of life and the need for maintenance. 

The benefits delivered by these AI-based technologies cover far more than just end-of-life predictions. They also apply the concept of performance-based maintenance, an extension of the age-old principle, "If it ain't broke, don't fix it." If a piece of equipment is operating within its normal signature for a given set of conditions and fulfilling its intended function, it is classified as "normal" and does not require maintenance or repair. If a piece of equipment drifts outside of its normal signature, however, the software can generate an exception report calling attention to that individual component. AI gives the technician the liberty to focus on the one unit that is underperforming, rather than spreading their attention across the hundred pieces of equipment that are working.
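
As a minimal sketch of that exception-report idea, the following Python function compares a unit's latest reading to the band of "normal" behavior learned from its recent history and raises a report only when the reading drifts outside that band. The statistics and the three-sigma threshold are illustrative assumptions.

    # Performance-based maintenance sketch: flag a unit only when it drifts outside
    # the band of normal behavior learned for comparable conditions.
    from statistics import mean, stdev

    def exception_report(unit, history, latest, sigma_limit=3.0):
        """Return a message if `latest` falls outside the unit's normal signature."""
        mu, sigma = mean(history), stdev(history)
        if abs(latest - mu) > sigma_limit * sigma:
            return (f"{unit}: reading {latest:.1f} outside normal band "
                    f"{mu:.1f} +/- {sigma_limit * sigma:.1f}; inspect")
        return None  # operating within its signature: no maintenance action

    # Example: fan motor current (amps) under a comparable load over recent weeks.
    history = [11.8, 12.1, 12.0, 11.9, 12.2, 12.0, 11.7, 12.1]
    for unit, latest in [("CRAC-07 fan", 12.0), ("CRAC-12 fan", 13.9)]:
        report = exception_report(unit, history, latest)
        print(report or f"{unit}: normal")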

In addition to the algorithm-based products already mentioned, there are products designed to optimize the thermal management of central cooling plants, which effectively perform a continuous heat balance across the heat exchanger and adjust motor speeds and valve positions to maximize efficiency.  AI is also utilized in water chemistry analyzers that apply biocides and corrosion inhibitors to control the fouling of heat transfer surfaces. 
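
A simplified version of that continuous heat balance, assuming a water-to-water heat exchanger between two loops, might look like the following; the flows, temperatures and the 15 percent alarm threshold are illustrative.

    # Heat balance sketch across a water-to-water heat exchanger. Heat given up by
    # the warm loop should match heat picked up by the cool loop; a persistent gap
    # points to instrument drift or unaccounted losses that would undermine any
    # optimization built on the readings. Values are illustrative.
    WATER_CP = 4.186  # kJ per kg per K

    def heat_kw(flow_kg_s, t_in_c, t_out_c):
        """Q = mass flow x specific heat x temperature difference."""
        return flow_kg_s * WATER_CP * abs(t_out_c - t_in_c)

    hot_side_kw = heat_kw(flow_kg_s=40.0, t_in_c=14.0, t_out_c=9.0)    # heat given up
    cold_side_kw = heat_kw(flow_kg_s=50.0, t_in_c=7.0, t_out_c=10.7)   # heat picked up

    imbalance = abs(hot_side_kw - cold_side_kw) / hot_side_kw
    print(f"Hot side: {hot_side_kw:.0f} kW, cold side: {cold_side_kw:.0f} kW")
    print(f"Imbalance: {imbalance:.1%}" + ("  <- investigate" if imbalance > 0.15 else ""))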

On the power delivery side of the house, thermal infrared monitoring systems and corona detectors provide long-term trending that can accurately predict degradation and help protect power continuity.  As mentioned earlier, power monitors can report the overall load...not just the data processing platform, but the supporting infrastructure as well.  AI is being used with data center batteries, applying the correct charging current to extend battery life and ensure the health of the center's uninterruptible power supply (UPS), a key element at any center.  Similar to the mechanical heat balance, a strategy of monitoring UPS input and output voltage and amperage, as well as the battery charger, confirms that the UPS is operating at its highest efficiency and guards against a potential breakdown within its integrated circuits.
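
As a sketch of that UPS monitoring strategy, the following Python snippet computes efficiency as output power over input power (a single-phase simplification that ignores power factor) and flags a drift below the baseline. The voltages, currents and thresholds are illustrative.

    # UPS efficiency monitoring sketch: efficiency is output power over input power,
    # and a slow downward drift for the same output is flagged for investigation.
    def ups_efficiency(v_in, a_in, v_out, a_out):
        """Efficiency = output power / input power (single-phase simplification)."""
        return (v_out * a_out) / (v_in * a_in)

    samples = [
        # (input volts, input amps, output volts, output amps)
        (480.0, 210.0, 480.0, 200.0),
        (480.0, 212.0, 480.0, 200.0),
        (480.0, 223.0, 480.0, 200.0),   # input creeping up for the same output
    ]

    baseline = ups_efficiency(*samples[0])
    for v_in, a_in, v_out, a_out in samples:
        eff = ups_efficiency(v_in, a_in, v_out, a_out)
        flag = "  <- drifting, check rectifier and charger" if eff < baseline - 0.02 else ""
        print(f"Efficiency: {eff:.3f}{flag}")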

AI, of course, does not stand alone:  it can be coupled with technology already in use in the power and aviation industries.  Strain gauges and pressure sensors can monitor structural load changes or movements.  Strain gauges can also confirm and monitor the torque applied to bolts, verify piping gasket compression and alert operators to potential leaks before they occur.  Deploying IoT sensors at these devices, and using AI to monitor their performance, gives users an enhanced ability to use the full range of the information available and can help avoid costly malfunctions and downtime.
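
A toy illustration of trend-based alerting on such a sensor: fit a line to recent readings from a hypothetical gasket-compression strain gauge and estimate when the trend will cross an alert threshold, so someone can act before a leak develops. The readings and the threshold are invented for the example.

    # Trend-based alerting sketch: fit a least-squares line to daily readings and
    # extrapolate to the alert threshold to estimate how many days remain.
    def days_until_threshold(readings, threshold):
        """Return projected days until the downward trend crosses the threshold."""
        n = len(readings)
        xs = range(n)
        x_mean, y_mean = (n - 1) / 2, sum(readings) / n
        num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, readings))
        den = sum((x - x_mean) ** 2 for x in xs)
        slope = num / den
        if slope >= 0:
            return None  # compression not decaying; nothing to predict
        return (threshold - readings[-1]) / slope

    # Daily gasket-compression readings (microstrain), slowly relaxing toward the limit.
    readings = [520.0, 518.0, 517.0, 515.0, 514.0, 512.0]
    days = days_until_threshold(readings, threshold=490.0)
    print(f"Projected to reach alert threshold in ~{days:.0f} days" if days else "Stable")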

There's more: the team at Future Facilities has created a "digital twin," a model of the user's actual data center.  The program gives planners the opportunity to adjust power and cooling delivery to specific computer racks and preview how a given environment will respond to added data processing equipment, as well as where that equipment is best placed.  It also gives designers of raw data center space a sandbox to model how an environment will respond to high-power-density equipment and what the effect will be on adjacent data environments.  In effect, AI is being used to validate an infrastructure's ability to handle any given hardware deployment.
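
The capacity-validation idea can be illustrated with a deliberately simplified Python stand-in for such a model (the real products are backed by detailed thermal and power models): check a planned rack against the modeled power and cooling headroom of its row. The data structure, numbers and 90 percent safety margin are hypothetical.

    # Simplified capacity check against a modeled row: allow a new rack only if both
    # power and cooling stay under a safety margin. Purely illustrative values.
    from dataclasses import dataclass

    @dataclass
    class Row:
        name: str
        power_capacity_kw: float     # breaker/busway limit for the row
        cooling_capacity_kw: float   # modeled cooling available to the row
        current_load_kw: float       # racks already deployed

    def can_deploy(row: Row, new_rack_kw: float, margin: float = 0.9) -> bool:
        """Allow the deployment only if power and cooling both stay under the margin."""
        projected = row.current_load_kw + new_rack_kw
        return (projected <= margin * row.power_capacity_kw and
                projected <= margin * row.cooling_capacity_kw)

    row_a = Row("Row A", power_capacity_kw=300.0, cooling_capacity_kw=280.0, current_load_kw=230.0)
    for rack_kw in (15.0, 40.0):
        verdict = "OK" if can_deploy(row_a, rack_kw) else "needs more power/cooling or another row"
        print(f"Adding a {rack_kw:.0f} kW rack to {row_a.name}: {verdict}")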

As AI matures inside the data center, we will move from product-based applications to an integrated approach.  Think of a system that instantaneously monitors data flow and prepares the power and cooling systems to adapt to a changing condition; imagine a system that is alert to the whole environment and can adapt to ensure high reliability and availability for the data processing platform with precision efficiency.  The goal is a "lights out" data center; the reality is that we're moving in that direction far more quickly than virtually anyone could have predicted even a few years ago.

##

About the Author


John Diamond leads the design and construction team at Evoque Data Center Solutions (www.evoquedcs.com).  John's experience spans more than 30 years: managing high-reliability, high-availability technology centers at nuclear power plants, managing high-profile enterprise technology centers for Wall Street firms, and consulting on the design, construction, operation and analysis of technology centers for market leaders like Adobe, eBay, Google, and more.

Published Wednesday, September 23, 2020 7:38 AM by David Marshall