New edge data centre at The Wellcome Sanger Institute supports Genomic Research for the betterment of mankind
Schneider Electric and Efficiency IT
Edge Project of the Year
Entry Description
The Wellcome Sanger Institute is one of the world’s leading research facilities focused on genomic discovery. The DNA sequencing machines at the core of its research generate vast quantities of data, the analysis of which drives efforts to improve human health in several challenging areas, including cancer, malaria and other pathogens. The volume of data and the speed at which it is generated require an onsite data centre with massive processing capacity, as a cloud option could not deliver the necessary low latency.

Wellcome Sanger has just brought into operation a fourth data hall within its existing facility. With 400 racks consuming 4 MW of power, it is now the largest genomic research data centre in Europe. The new hall is designed to be highly scalable, as the data demands of genomic research are only likely to increase rapidly, and efficient, as Wellcome Sanger is committed to reducing its PUE rating to 1.4. This effort is being realised with technology from Schneider Electric, including power distribution, UPS and its EcoStruxure IT management software.
At the core of Sanger’s technical infrastructure are its DNA sequencing machines, a fleet of highly complex and advanced scientific instruments that generate vast quantities of data, which must then be analysed within its on-premises data centre.

The nature of genomic research, a cutting-edge and evolving area of science, means that the demand for data-processing capacity is only likely to increase over time – genomic data is soon set to become the biggest source of data on the planet.

The rapid pace of development of genomic research, and its exploitation by medical science, will continue to drive growth at the data centre. By way of example, sequencing the first human genome took 13 years; the same task can now be achieved in less than an hour. The amount of genomic data generated in the last 18 months is equal to the amount that was generated in the first 18 years of research. Every year, the sequencers generate data more quickly as the technology improves. IT has to keep up with that demand.

The human body is made up of trillions of cells, and as Sanger sequences them it will also be gathering more genomic data from a greater number of machines. Advances in technology mean the Institute is gathering data more quickly than ever before. This requires more power availability, greater storage, faster connectivity and higher levels of local compute.
Edge data centres provide computing resource close to the point at which data is generated or used. In the case of the Wellcome Sanger Institute, the data generated by its DNA sequencers is of such a scale and generated at such a fast rate that the subsequent computational analysis must be performed as close to the site as possible.

This requirement for processing to take place physically close to the sequencing equipment, where the data is generated, makes the facility an archetypal example of an ‘edge computing’ deployment. Proximity to the sequencing equipment is a primary consideration for the data centre: the bandwidth and latency requirements imposed by the high volume and velocity of genomic data make cloud services unsuitable. No other edge data centre is as important to discoveries about human life as the one at Sanger.

Additionally, the DNA sequencing equipment on which the scientific effort depends is protected by individual APC by Schneider Electric Smart-UPS™ uninterruptible power supply systems. Downtime within this distributed IT environment would require the chemicals used in the research process to be replaced at significant cost, in addition to lost time and data. Ongoing monitoring of UPS battery health is therefore essential to ensure that runtime is available, and it makes a major contribution towards the Wellcome Sanger Institute avoiding outages in both its sequencing and research efforts.

As well as providing high-performance, low-latency computational support to the sequencing machines at the heart of Wellcome Sanger’s efforts, the edge data centre has to be deployed and operated as efficiently as possible, given the Institute’s desire to direct as much of its financial resources as possible to medical research rather than computing infrastructure. Simply put, any money saved on IT and physical infrastructure, whether in capital expenditure or operating costs, means greater funding is available for key scientific research.
Key to achieving efficient operation is the deployment of EcoStruxure IT, a cloud-based Data Centre Infrastructure Management (DCIM) system from Schneider Electric. This software provides vendor-neutral oversight, through a ‘single pane of glass’, of all key infrastructure assets in the data centre, including racks, power distribution units (PDUs), uninterruptible power supplies (UPS) and cooling equipment.

Wellcome Sanger had not previously deployed a single unified DCIM platform. Instead it used several independent management systems, which monitored parts of the infrastructure but could not provide a single overall view of all assets.

The EcoStruxure IT software is a cloud-based system that is highly scalable and quick to deploy. It enabled data centre management to discover and connect thousands of devices on its network in less than 30 minutes. The platform on which data from assets is stored, pooled and analysed is highly secure and GDPR compliant, and users can view all their assets at any time. Status updates can be delivered to any device, including smartphones and tablets, so that key service personnel can be alerted to any issues immediately, wherever they may be.
As an open-standards vendor-neutral technology, EcoStruxure IT’s ability to find so many network-enabled devices, regardless of manufacturer, greatly simplifies inventory management and makes the inevitable tasks of upgrading and scaling more efficient and predictable.

For the first time, data centre operations management can monitor both IT equipment and the power train from the same system. Now that those two important elements are combined, much greater visibility over the entire operation is possible. EcoStruxure IT also includes a Data Analytics module to enable smarter real-time decision-making and ensure that any unexpected issues in the data centre are identified and quickly resolved.

In time, the management software will also monitor the operations of the DNA sequencers themselves through the same single management console. This feature, and the fact that the sequencers are also supported by Schneider Electric Smart-UPS uninterruptible power supplies, helps to safeguard against downtime. Downtime can be an expensive business, given the valuable chemicals on which the operation of the sequencers depends and which must be discarded if the equipment is inoperative for any significant time.

The key objective of the new data hall is to provide computational support to the vital genome sequencing work performed by the Wellcome Sanger Institute, and to do so in a reliable and efficient manner that maximises the return on the investment the Institute must make in its IT support equipment.

Not only do the UPS systems minimise the risk of downtime, with its attendant high costs, but the EcoStruxure IT DCIM software ensures the reliable operation of vital infrastructure, the timely warning of any outages or service issues, the efficient management of all inventory on the Institute’s network and the cost-effective management of operations.

One of the most significant costs associated with operating any data centre is the power required for the cooling function. Continuous reliable operation depends on the equipment being kept at a safe operating temperature; cost-effective use of power requires the cooling operation to be as efficient as possible. This in turn demands constant monitoring of temperature to prevent hotspots developing and the continuous adjustment of cooling equipment, including chillers and fans, to ensure that they are operating only as needed.
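The hotspot monitoring described above can be illustrated with a minimal sketch. It is not how EcoStruxure IT works internally; the function name, rack labels, readings and the 3°C margin are all hypothetical values chosen for illustration. It simply flags any rack whose inlet temperature sits well above the average for the hall.

```python
from statistics import mean

# Assumption: a rack counts as a hotspot if its inlet temperature exceeds
# the hall average by more than this margin. The value is illustrative only.
HOTSPOT_MARGIN_C = 3.0


def find_hotspots(inlet_temps_c: dict[str, float],
                  margin_c: float = HOTSPOT_MARGIN_C) -> list[str]:
    """Return the racks whose inlet temperature exceeds the average by margin_c."""
    average = mean(inlet_temps_c.values())
    return sorted(rack for rack, temp in inlet_temps_c.items()
                  if temp - average > margin_c)


# Hypothetical readings in degrees Celsius; the average here is 22.25.
readings = {"rack-01": 21.0, "rack-02": 21.5, "rack-03": 26.0, "rack-04": 20.5}
print(find_hotspots(readings))  # ['rack-03']
```

In practice a monitoring platform would feed such a check continuously and drive fan or chiller adjustments from it, so that cooling runs only as hard as the warmest racks require.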

The Power Usage Effectiveness (PUE) of a data centre, defined as the ratio of total power consumption to the power required by the core IT equipment alone, is a popular metric for overall electrical efficiency. The Wellcome Sanger Institute expects to save between 5% and 10% of the energy costs of the data centre itself in the first two years. This will be achieved by raising the room temperature in the data halls from 19°C to 21°C, which reduces the cooling effort needed from equipment such as chillers. Doing so safely depends on continuous monitoring of temperature throughout the racks, with fan speeds adjusted as and when necessary.
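Written as a formula, the PUE definition above looks as follows. The symbols P_total and P_IT are introduced here for illustration only, and the 1,000 kW IT load in the example is a hypothetical figure rather than a measurement from the Sanger facility.

```latex
\[
\mathrm{PUE} = \frac{P_{\text{total}}}{P_{\text{IT}}},
\qquad
\text{e.g.}\quad \frac{1600\ \text{kW}}{1000\ \text{kW}} = 1.6
\]
```

A PUE of 1.0 would mean that every watt drawn by the facility reaches the IT equipment. Anything above 1.0 represents overhead such as cooling and power distribution losses.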

The Wellcome Sanger Institute has set itself the ambitious target of lowering its PUE from its current level, which fluctuates between 1.6 and 1.8, to around 1.4. This will produce further cost savings, providing an additional return on the investment in the new data hall and its monitoring software.
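To give a rough sense of what this target means, the sketch below estimates the reduction in total facility power when PUE falls to 1.4. It assumes the IT load stays constant, which the growth described elsewhere in this entry suggests will not hold in practice, so the figures are indicative only. The function name is hypothetical.

```python
def relative_power_saving(pue_before: float, pue_after: float) -> float:
    """Fractional drop in total facility power at a fixed IT load.

    Total power is PUE multiplied by IT power, so with IT power held
    constant the saving is (pue_before - pue_after) / pue_before.
    """
    if pue_before < 1.0 or pue_after < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return (pue_before - pue_after) / pue_before


# The current range (1.6 to 1.8) and the 1.4 target are taken from the text.
for current_pue in (1.6, 1.8):
    saving = relative_power_saving(current_pue, 1.4)
    print(f"PUE {current_pue} -> 1.4: {saving:.1%} less total power")
```

Under that constant-load assumption, the saving works out at roughly 12.5% from a starting PUE of 1.6 and roughly 22% from 1.8.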

Reduced costs on IT equipment free up investment for genomic research, a key and ongoing objective for the Wellcome Sanger Institute.
No other edge data centre is as important to discoveries about human life. Scalability, proximity to the data sources, and efficient management, both to enable rapid allocation of resources to meet increasing demands and to control costs, are all essential elements in supporting Wellcome Sanger’s ground-breaking scientific research.