Pure Storage
Data Analytics Innovation of the Year
Entry Description
Data has never been more crucial for businesses, and data volume growth over the last couple of years has been staggering: 90 percent of all data in existence today was created in the past two years, and 80 percent of data is set to be unstructured by 2025 (estimated to be 175 zettabytes in total). Data analytics represents an opportunity to turn this unstructured data into insight, promising the ability to innovate faster and extend competitive advantage.

The benefits of analytics are beyond compelling, and analytics investments power success. Recent research from ESG concludes that companies taking full advantage of analytics are 4.8x more likely to increase revenue, 3.2x more likely to improve customer satisfaction, and 2.4x more likely to enhance operational excellence.

Roadblocks to analytics

While the benefits of investing in a mature analytic platform are obvious, there are several challenges that prevent companies from being able to achieve their analytic goals and dreams.

As log analytic architectures scale, their performance becomes unpredictable, causing slowdowns in search queries and their subsequent processes. Because a log analytic platform is a distributed system that manages large volumes of data ingest, much of its search performance relies on the ability of the administrator to predict which data will be queried. But as companies mature their pipelines and utilize more and more data to glean insights, it is becoming harder for administrators to accurately forecast which data should live where and for how long. The result is a substantial slowdown for ad-hoc and forensic searches as well as complex queries.

Additionally, as the analytic platform matures and more data is ingested, infrastructure can be overwhelmed and search capabilities across the board are impacted. This can lead to overprovisioning of infrastructure and increased costs.

In addition to unpredictable performance, the tight coupling of compute and storage in traditional log analytic deployments leads to disruption and complexity as these environments scale. As capacity needs grow, customers are forced to deploy unnecessary compute resources as well, and to endure lengthy, disruptive rebalancing processes. Likewise, if customers need to grow their compute resources, they are forced to grow capacity too, whether they need it or not.

Another issue is that the teams that run and manage log analytic applications are often not the same teams that manage infrastructure. Because of this, data pipelines frequently suffer from performance issues, strained resources or outages. The application owners struggle to meet the demands on their systems because of the struggling infrastructure, and the infrastructure teams don’t understand the application requirements and dynamics well enough to adapt quickly to ever-changing demands.

To address this, many customers are looking to deploy their pipelines in the cloud, giving up ownership of their data for the promise of a carefree, persistent infrastructure so they can focus on driving results. But the public cloud isn’t free from its own unique challenges, such as security risks, unpredictable and exorbitant costs, and the loss of data autonomy. The promise of cloud-like agility quickly crumbles as the solution grows and customers find themselves trapped in a rigid solution they can no longer afford to support.

Businesses want to overcome the challenges that are preventing them from modernizing their analytic workflows and realizing the full potential of their data. Enter FlashBlade(R), the industry’s first unified fast file and object storage platform.

FlashBlade - unique in solving modern analytics challenges

When it comes to analytics, fast matters. The power of all-flash storage, coupled with the ability to scale in multiple dimensions, enables our customers to experience the speed of distributed systems with the simplicity of a consolidated platform. FlashBlade’s ability to scale capacity, performance and concurrency allows data architects to use the same system for a multitude of analytic applications, providing a single accelerated tier of storage for the most demanding data pipelines.

A modern data architecture needs to be simple to run, simple to manage and simple to scale. FlashBlade is the first data platform in the industry to offer both native file and object support as well as a highly adaptive architecture that enables even competing and disparate workloads to effectively utilize the same accelerated tier of storage. These capabilities, along with FlashBlade’s ability to scale, enable data scientists to focus on their data pipelines instead of battling the infrastructure needed to run them.

In addition, a modern data architecture needs to protect a customer's investment, ensuring that they can innovate now and well into the future. Like any other business-critical app, an analytics pipeline cannot afford an outage. Any outage, planned or unplanned, will have a detrimental impact on analytic pipelines and business insights. FlashBlade ensures that performance and capacity are always available for your most demanding workloads. This enables a true cloud-like experience whether off-prem, on-prem or both. Its simplicity of scale for performance and capacity ensures that data scientists have the resources they need to accelerate their data insights.

Accelerating time-to-market - Man AHL

One company already benefiting from the power of FlashBlade is Man AHL. Based in the UK, Man AHL is a pioneer in the field of systematic quantitative investing with more than $24 billion in assets under its management. Its entire business is based on creating and executing computer models to make investment decisions. The company adopted FlashBlade to deliver the massive storage throughput and scalability required to meet its most demanding simulation applications.

The impact of FlashBlade was noticed immediately. Some of the company’s researchers found that the introduction of FlashBlade made it easier to use Spark for performing multiple simulations. One of them experienced a 10x-to-20x improvement in throughput for Spark workloads compared to the previous storage system.

The company’s CTO, Gary Collier, stated: “The greatest benefit for Man AHL from Pure FlashBlade is significantly improved productivity for the team and accelerated time-to-market.” This has helped extend Man AHL’s competitive advantage in the market even further.

Trusted by leading companies around the world - Liquidnet

Liquidnet provides a global institutional investment network that many of the world’s largest asset managers use to seek liquidity and investment opportunities, and a key to helping them with their execution decision-making is superior analytics. Liquidnet utilises FlashBlade to perform timely trades and generate real-time analytics that its institutional trading clients rely on to produce outstanding results.

Liquidnet’s application development and analytics teams are employing modern tools like Elasticsearch, Spark, and Kafka and needed infrastructure to keep up with their workload demands. The ability to process large volumes of streaming data in real time, and the ability to make a data set available to multiple workloads simultaneously have proven key to staying at the leading edge of the market, thanks to FlashBlade.

Liquidnet’s Global Head of Product Support, Mani Venkateswaran, went as far as to say: “FlashBlade provides state-of-the-art technology that significantly reduces our time-to-market with capabilities of high value to our clients. FlashBlade has been a huge enabler toward our goal of real-time analytics.”

Delivering real-time analytics was previously not feasible because Liquidnet’s legacy storage systems lacked the I/O and parallelism needed to move very large data sets fast enough. Liquidnet now joins the many other companies relying on FlashBlade to make a difference through analytics every single day.
Supporting Documents