Because UPSs play a vital role in protecting your data centre’s critical load, they must themselves be well maintained and capable of uncompromised performance. To achieve this, it’s essential to monitor UPS status at both a tactical and a strategic level, and to have a plan for responding to the data gathered.
What to monitor, and how to do it
Examples of tactical information include reported battery temperature, voltage and resistance values; any excessive levels warn users of impending fault conditions, allowing corrective action to be taken. More strategic information would include reports on power consumption, where power is being consumed, and variations in processing load. This informs longer-term planning for the allocation and distribution of power within the data centre, and can provide opportunities for reducing wasted capacity.
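As a minimal sketch of the tactical side, the check below compares reported battery values against an allowed range and returns a warning for anything outside it. The field names and limits are hypothetical placeholders chosen for illustration; real limits would come from the battery and UPS documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    low: float
    high: float


# Hypothetical limits for illustration only; real values come from
# the battery datasheet and the UPS vendor's guidance.
LIMITS = {
    "temperature_c": Limit(15.0, 30.0),
    "voltage_v": Limit(12.0, 14.4),
    "resistance_mohm": Limit(0.0, 6.0),
}


def check_battery(reading: dict[str, float]) -> list[str]:
    """Return one warning per value that is missing or out of range."""
    warnings = []
    for name, limit in LIMITS.items():
        value = reading.get(name)
        if value is None:
            warnings.append(f"{name}: no reading")
        elif not (limit.low <= value <= limit.high):
            warnings.append(f"{name}: {value} outside {limit.low}-{limit.high}")
    return warnings
```

An empty list means every reported value is within its range; any entries are candidates for corrective action before a fault develops.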
Monitoring of environmental variables, particularly temperature and humidity, is essential to ensure that cooling strategies remain effective. It’s also important to monitor less obviously process-related factors, such as security and access, to protect the well-being of the UPS and other data centre equipment.
While the availability of such information is important, collecting it isn’t always easy. Proliferation of equipment over time, plus the advent of virtualisation, can make it difficult to understand how processing load is being distributed across the data centre server population over any given period. Fortunately, solutions in the form of data centre infrastructure management (DCIM) packages exist. These provide access to accurate, actionable data about a data centre’s current state and future needs; critically, they can also exchange information with building management systems to provide a more comprehensive, higher-level overview of data centre status.
Monitoring and control of UPSs must be part of any DCIM strategy. This increasingly involves an element of remote communications, as many organisations’ data infrastructure now includes ‘edge’ micro data centres, so-called because they are located out at the edge of an enterprise, geographically distant from any operations centre.
However, in KOHLER Uninterruptible Power’s (KUP’s) experience, only the monitoring aspect of such strategies should be automated; this is preferable to implementing a fully automated system with closed-loop control and two-way communications.
One-way communications – usually the safest policy
Firstly, there can be a mistrust of two-way communications systems; they are seen as a security risk in some organisations. There has been more than one instance of communications equipment manufacturers being blacklisted over concerns about data misuse and associated security risks.
Even without these concerns, KUP’s experience has shown that if remote access and control of a UPS is too generally available, damage can be inflicted either by carelessness or malign intent. Better security can be promoted by using one-way communications solutions. The UPS remains closely monitored, but the reaction to a fault should be a phone call or email to alert an authorised technician located near the UPS. This makes it easier to limit access to only the appropriate personnel.
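The one-way pattern can be sketched as follows: a fault produces an outbound email to the technician responsible for the site, and nothing in the path accepts commands back to the UPS. The site name, addresses and on-call roster here are hypothetical, and this is an illustration under those assumptions rather than a description of any particular product.

```python
from email.message import EmailMessage

# Hypothetical on-call roster: site name -> authorised technician's address.
ON_CALL = {"edge-site-01": "technician@example.com"}


def build_alert(site: str, fault: str) -> EmailMessage:
    """Build an outbound-only fault notification for the site's technician.

    The function only reports; it exposes no way to control the UPS.
    """
    msg = EmailMessage()
    msg["From"] = "ups-monitor@example.com"  # assumed sender address
    msg["To"] = ON_CALL[site]
    msg["Subject"] = f"UPS fault at {site}"
    msg.set_content(
        f"Fault reported: {fault}\nPlease attend site and investigate."
    )
    return msg
```

The resulting message could then be handed to the standard library's smtplib, for example with SMTP.send_message, using whatever mail relay the organisation already trusts.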
Nevertheless, once such a warning has been flagged, an appropriate response is essential. Technicians need to arrive on site, even if the site is remote, within an agreed timeframe, and equipped with the training, documentation, equipment and parts needed to effect any repairs.
Choosing the right UPS vendor is essential
This means that, when evaluating potential UPS vendors, it’s essential to look beyond the system’s functionality, performance and reliability. A review of the vendor’s service capability is equally important. KUP, for example, has a nationwide service team that is fully equipped as described above.
Although not strictly part of a remote monitoring or control strategy, provision of an effective scheduled maintenance strategy is also essential. By ensuring that the batteries and other UPS components are in top condition, UPS life will be extended. Additionally, dependence on any remote control strategy – however it’s implemented – will be reduced, along with exposure to failure.