What is distributed storage?
Distributed storage is a software-defined storage system that gives the right users access to data when and where they need it. It acts as a logical volume management system designed to handle scale and data access in a high-availability (HA) environment, with the intelligence to detect and respond to failures and cyber attacks. Replacing the traditional three-tier architecture with a distributed file system, it stores data on clusters of storage nodes that can be geographically dispersed. The storage system includes features that synchronize and coordinate data across the cluster nodes.
Distributed cloud storage: The next generation of cloud storage?
Distributed cloud storage is related to traditional cloud storage in some ways, especially in the techniques and hardware it uses. But there’s one important difference. Instead of data stored on a collection of storage devices in one data center, distributed cloud storage is made up of data stored on clusters of storage nodes that are geographically dispersed. The storage system includes features that synchronize and coordinate data across the cluster nodes, which greatly simplifies storage rollouts and management. Since the data is distributed, you can deploy cloud-based data monitoring tools to detect, prevent, recover from and analyze cyber attacks. Shared storage is a big target for ransomware attacks. The data governance features of distributed cloud storage help to detect attack signatures, block user sessions and endpoints, perform forensic analysis, and support recovery efforts after an attack.
The “distributed” nature of this type of cloud storage is important because it allows cloud data to be stored closer to an organization’s physical locations, such as remote offices and branch offices (ROBO). It opens up new possibilities for location-dependent cloud use cases and can result in faster data transfers, reduced network congestion, and lower risk of data loss.
Based on edge computing and storage, distributed cloud storage represents the next step in cloud storage, one that puts data closer to where it’s needed. Public cloud providers such as AWS have long acknowledged the value in keeping data close to where it will be used, as evidenced by their multiple zones and region-based offerings.
How distributed storage works
Public cloud providers distribute their storage services across a variety of physical locations. The objective is to achieve very low latency by storing data physically near the location where it will be used.
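The idea of serving data from the nearest location can be illustrated with a minimal sketch. It assumes a hypothetical set of three storage sites with approximate coordinates and uses great-circle distance as a rough stand-in for latency. Real systems typically measure network latency directly, so the site names, coordinates, and the `nearest_site` helper here are illustrative assumptions only.

```python
import math

# Hypothetical storage sites: name -> (latitude, longitude), approximate values.
SITES = {
    "eu-central": (50.11, 8.68),     # near Frankfurt
    "us-east": (39.04, -77.49),      # near Ashburn, Virginia
    "ap-southeast": (1.35, 103.82),  # near Singapore
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_site(client_lat, client_lon, sites=SITES):
    """Pick the site with the shortest distance to the client.

    Distance is only a proxy for latency in this sketch.
    """
    return min(
        sites,
        key=lambda name: haversine_km(client_lat, client_lon, *sites[name]),
    )
```

With these assumed coordinates, a client in Paris would be routed to the `eu-central` site and a client in Tokyo to `ap-southeast`.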
With distributed cloud storage, the lines between public, private, and hybrid cloud become blurred or disappear completely as an administrator can consistently manage data from across all three storage types from one control plane.
Why is distributed storage important?
- Software-defined - distributed storage replaces the traditional centralized SAN and NAS with a software-defined storage platform that empowers customers to deploy, manage, and scale a single, unified storage platform across datacenters, branch offices, or the cloud. An integrated distributed storage platform enables seamless access to storage by delivering files, objects and volumes across multiple protocols to all workloads and users.
- Access to all protocols - customers have typically procured files, objects and volumes as point solutions, each managed by an independent team. A distributed storage system offers simplicity by consolidating all three access types on a single platform, helping customers deploy storage services at the core or edge, or extend them to the cloud. In addition, all three storage services are managed and monitored centrally.
- Scale-out architecture - unlike traditional storage arrays, distributed storage is by design a scale-out architecture. Storage capacity grows as you add nodes to the cluster, up to the limits of the specific platform.
- Faster provisioning - since the distributed storage system creates a shared pool of storage resources from a number of physical nodes, storage policies can be created and attached to virtual machines, which can then draw on the dynamic storage pools immediately. This makes provisioning faster than with traditional storage, where an admin has to create a volume or file share and attach it to the virtual machine manually.
- Simplified management and monitoring - a distributed storage system offers simple management and monitoring with dashboards, data analytics tools, etc.
Distributed storage features
While features can vary across cloud storage providers, most distributed cloud storage systems include:
- Partitioning – allows users to spread data across cluster nodes and easily access data from those nodes
- Replication – data is copied across a variety of nodes, and the copies are updated consistently whenever that data is modified
- Resiliency – data remains available, even if one or multiple nodes malfunction
- Easy scaling – system operators can scale storage capacity up or down as needed, simply by adding nodes to the cluster or removing them from it
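The features above can be sketched together with a toy consistent-hash ring. This is one common partitioning technique, not necessarily what any particular provider uses. The `HashRing` class, the node names, and the `replicas` and `vnodes` parameters are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring: partitions keys and picks replica nodes."""

    def __init__(self, nodes, replicas=2, vnodes=64):
        self.replicas = replicas  # copies kept of each key
        self.vnodes = vnodes      # ring positions per physical node
        self._ring = []           # sorted list of (position, node_name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        """Scaling out: place the node at several positions on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        """Scaling in, or a node failure: drop all of its positions."""
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def nodes_for(self, key):
        """Return the distinct nodes that should hold copies of `key`."""
        if not self._ring:
            return []
        distinct = {n for _, n in self._ring}
        wanted = min(self.replicas, len(distinct))
        start = bisect.bisect(self._ring, (_hash(key), ""))
        owners = []
        # Walk clockwise from the key's position, collecting distinct nodes.
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == wanted:
                    break
        return owners
```

In this sketch, `HashRing(["node-a", "node-b", "node-c"]).nodes_for("invoice-42")` returns two distinct nodes. If the first of them is removed, the second becomes the first owner of the key, so a surviving copy can still be located. Adding a node moves only the keys that now fall to the new node, which is why the approach suits clusters that grow and shrink.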
Pros and cons of distributed cloud storage
Distributed cloud storage has a number of advantages and benefits:
- Aids regulatory compliance – many regulations limit organizations from moving sensitive data across borders; now they can more easily keep country data in-country, for instance
- More ambiguous attack surface – because there are no “central” servers, there’s no obvious target for attack by bad actors
- Reduced risk of network failure – because data is stored in local or regional clusters, those clusters can sometimes run independently of one another, which increases fault tolerance
- Enhanced privacy – data files are split apart, encrypted, and stored across a network of servers
- Reduced energy costs – there’s no need to build and cool a massive centralized data center
Challenges arise primarily from the distributed nature of this storage model:
- Bandwidth – made up of a variety of cloud storage types and systems, distributed cloud storage might have a range of different connectivity models, which can put strain on edge-located internet connections
- Security – ensuring data is secure across varying cloud storage types spread across the world can be difficult
- Data protection – backup and business continuity can get tricky, especially when it comes to making sure geography-limited data stays where it should
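One way to picture the data-protection challenge is a placement check that refuses to put replicas of geography-limited data in a country where it is not permitted. The sketch below is a simplified assumption of how such a rule might look. The `Site` type, the site names, and the `plan_replicas` helper are hypothetical and not taken from any specific product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    country: str  # two-letter country code, e.g. "DE"


def allowed_sites(sites, permitted_countries):
    """Keep only the sites located in a country the data may reside in."""
    return [s for s in sites if s.country in permitted_countries]


def plan_replicas(sites, permitted_countries, copies):
    """Choose `copies` sites for a dataset without crossing permitted borders.

    Raises ValueError when the residency rule cannot be satisfied, rather
    than silently placing a copy somewhere it should not be.
    """
    candidates = allowed_sites(sites, permitted_countries)
    if len(candidates) < copies:
        raise ValueError(
            f"need {copies} sites in {sorted(permitted_countries)}, "
            f"found {len(candidates)}"
        )
    return candidates[:copies]


# Hypothetical sites for illustration.
SITES = [
    Site("berlin-1", "DE"),
    Site("munich-1", "DE"),
    Site("paris-1", "FR"),
    Site("dallas-1", "US"),
]
```

Here, `plan_replicas(SITES, {"DE"}, copies=2)` selects the two German sites, while asking for three copies under the same rule raises an error instead of spilling a replica across the border.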
Cloud computing vs distributed cloud: which is better?
The centralized, traditional cloud storage systems we have come to know and use are perfectly suitable for most organizations. They’re not going away anytime soon.
What is likely, however, is that distributed cloud storage will become increasingly popular, especially as edge computing and location-specific use cases proliferate.
Where centralized cloud storage needs a data center with a multitude of servers, distributed cloud storage spreads data across a dispersed network of individual devices or computers. The biggest benefit is reliability: storing data on multiple groups of storage servers instead of a single collection builds resiliency and helps protect your data from loss.
Distributed cloud storage also reduces latency, because data is stored near where it will be used. In the traditional cloud model, latency can be significant because data may travel across the country or the globe. Lower latency means improved performance and a better user experience overall. Distributed cloud storage also edges out the centralized model because it is a greener solution and can help organizations save on energy costs. There’s no need for enormous cooling systems, or even for a data center building that requires light and heat.
Distributed cloud storage also enhances data security and data protection. A single instance of data can be split across multiple sites, or multiple instances of data can be replicated across multiple sites. Both approaches offer heightened data protection in the event of a disaster that calls for disaster recovery (DR), a ransomware attack, and similar incidents.
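The "split across multiple sites" case can be sketched with a simple XOR parity scheme, similar in spirit to RAID 5. This is a deliberately basic stand-in: production systems generally use more capable erasure codes, and the function names below are assumptions for illustration. The data is cut into equal chunks, one per site, plus a parity chunk that allows any single lost chunk to be rebuilt.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, sites: int):
    """Split data into `sites` equal chunks and append one XOR parity chunk.

    Returns the list of shards (data chunks followed by parity) and the
    original length, which is needed to strip padding on recovery.
    """
    chunk_len = -(-len(data) // sites)  # ceiling division
    padded = data.ljust(chunk_len * sites, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(sites)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity], len(data)


def recover(shards, original_len):
    """Rebuild the data when at most one shard (marked None) is missing."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only repair one lost shard")
    if missing:
        present = [s for s in shards if s is not None]
        shards = list(shards)
        # XOR of all surviving shards equals the missing one.
        shards[missing[0]] = reduce(xor_bytes, present)
    return b"".join(shards[:-1])[:original_len]
```

For example, splitting a payload across three sites yields four shards. If any one of them is set to `None` to simulate a lost site, `recover` still returns the original bytes. Losing two shards at once raises an error, which shows the limit of single parity compared with full replication.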
Edge computing vs distributed cloud
Edge computing is a distributed IT architecture where data is processed at the edge of the network, as close to the originating source as possible. This ideally puts compute and storage at the same point as the data source. Distributed cloud computing, by contrast, is a software system that is shared among multiple computers and runs as one system to improve efficiency and performance.
Examples of distributed cloud storage
Distributed cloud storage forms the foundation of some popular cloud storage systems, such as Amazon S3 and Microsoft Azure Blob Storage. Another good example of distributed cloud storage is a content delivery network (CDN), such as those used by Netflix or YouTube. These companies store their video content in specific geographic locations around the world, nearer to where that content will be watched (think people watching a show in China versus someone accessing an English-language video in the UK). This helps reduce latency.
Distributed storage and Nutanix
Nutanix Unified Storage is a software-defined storage platform that consolidates File, Object & Block storage into a single platform. By removing the need for dedicated storage systems, it makes the environment simpler to operate, allowing you to focus more on application services and less on infrastructure. Combined with Nutanix Cloud Platform, Unified Storage gives you a platform that is built for scale, performance, and integrated data security. It offers agility, flexibility and simplicity to build modern applications and services no matter where they are deployed - core, cloud, or edge. The platform provides seamless access to structured and unstructured data using S3, SMB, or NFS protocols. A single point of management for all storage resources eliminates the complexity of multiple interfaces, and a consumer-grade design enables non-storage experts to handle most day-to-day storage and data management tasks. Data security and analytics integrated into the solution provide deep insights into how data is being used and help to prevent threats from ransomware and other bad actors. With integrated ransomware protection, Unified Storage helps to detect, prevent, and recover from cyber attacks.