Hyperscale NAS and Global Data Environment Enhancements Simplify Distributed Computing

Hammerspace simplifies distributed computing with Hyperscale NAS and Global Data Environment enhancements for performance, S3 support, and data orchestration.

By Tom Smith · Apr. 23, 24 · Analysis

Hammerspace, a software-defined, multi-cloud data control plane provider, recently announced significant enhancements to its Hyperscale NAS and Global Data Environment offerings, aimed at simplifying data management for distributed computing. The new capabilities include performance optimizations, an S3 interface, and high-performance erasure coding.

For developers, engineers, and architects struggling to efficiently manage and move data across silos to power AI/ML training and other computing workloads, Hammerspace provides a compelling solution. As Molly Presley, SVP of Marketing at Hammerspace, explained during the 54th IT Press Tour, the company's goal is to "radically improve how data is used" by shifting from "data at rest isolated in storage" to "data in motion across a global data environment."

Hyperscale NAS: Performance and Simplicity

One of the key challenges Hammerspace addresses is providing the performance needed for AI/ML model training and other GPU-intensive workloads while maintaining the simplicity of standard NAS interfaces. As Presley noted, "Models require standard NFS data interface" but "existing NAS and object storage were not designed for large compute performance."

Hammerspace's Hyperscale NAS solves this by combining the performance of HPC-class parallel file systems with the simplicity and enterprise features of scale-out NAS. "Hyperscale NAS architectures provide performance and efficiency that a massive web or hyperscale organization [needs], but even at small scale, the efficiencies are the same," said Presley. 

This enables linear scalability as the system grows. "With parallel file systems, this is true of Lustre, probably OneFS, GPFS for sure, that you can scale very linearly, so you get the benefits of being able to connect to the enterprise with the simplicity of NAS but with all the benefits of an HPC file system."

Real-world deployments have shown the power of Hyperscale NAS. One web-scale customer is using it to feed 34,000 GPUs for AI training, with plans to scale to 1 million GPUs. The deployment delivers an aggregate 12.5 TB/sec, on a path to 100 TB/sec, all on standards-based, plug-and-play infrastructure.
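
A quick back-of-the-envelope calculation helps put those figures in context. The per-GPU numbers below are derived here for illustration only and were not stated by Hammerspace.

# Rough arithmetic on the deployment figures quoted above (illustrative only).
aggregate_tb_per_sec = 12.5      # reported aggregate throughput
gpus = 34_000                    # GPUs currently being fed

per_gpu_mb_per_sec = aggregate_tb_per_sec * 1e6 / gpus
print(f"~{per_gpu_mb_per_sec:.0f} MB/s available per GPU")        # ~368 MB/s

# At the projected scale, the ratio stays in the same ballpark:
print(f"~{100 * 1e6 / 1_000_000:.0f} MB/s per GPU at 100 TB/sec across 1M GPUs")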

Global Data Environment Unifies Siloed Data

The other major problem area Hammerspace targets is data silos that make it difficult to efficiently share data across sites and clouds. Their Global Data Environment creates a unified namespace across all storage, enabling transparent data movement and access.

"Hammerspace virtualizes the underlying storage infrastructure. All authorized users and applications can access the same data locally from anywhere," explained Presley. This eliminates the need for complex data migrations and copying.

On the back end, Hammerspace supports any storage type, from NAS to object storage to cloud. The front end exposes standard access protocols such as NFS, SMB, and now S3. This allows applications to access data without modification while gaining the benefits of the global data environment.
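
To make the "no application changes" point concrete, here is a minimal sketch of reaching the same dataset through two of the standard front-end protocols mentioned above. The endpoint, credentials, bucket, and mount path are hypothetical placeholders, not Hammerspace-specific values.

# Sketch: one dataset, two standard access paths. All names below are
# hypothetical placeholders, not actual Hammerspace endpoints or paths.
import boto3

# Path 1: an S3-native application talks to the S3 front end.
s3 = boto3.client(
    "s3",
    endpoint_url="https://data.example.internal",   # hypothetical S3 endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)
obj = s3.get_object(Bucket="training-data", Key="images/batch-0001.tar")
payload = obj["Body"].read()

# Path 2: a legacy application reads the same file over an NFS mount,
# e.g. after: mount -t nfs filer.example.internal:/training-data /mnt/data
with open("/mnt/data/images/batch-0001.tar", "rb") as f:
    same_payload = f.read()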

"[Customers] bias for a specific need. But then they add more workloads over time and getting all of their data into a single global data environment," said Presley. "Now all those S3 applications can interact with us without any limitations on the data patterns that they'll have."

Built-in high-performance erasure coding on Hammerspace-provided storage nodes offers efficient data protection without sacrificing performance. "The speed really was [the] peak that caught our attention, but there is a lot about the resiliency," noted Presley. "Having the ability to suffer a lot of failures and self-heal and continue to provide performance and not lose performance through the process is what made this so attractive to us."
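
Erasure coding itself is a general technique rather than anything Hammerspace-specific. The toy sketch below (single XOR parity, standard library only) simply illustrates the property the quote describes: a lost fragment can be rebuilt from the surviving ones. Production systems use stronger codes, such as Reed-Solomon, that tolerate multiple simultaneous failures.

# Toy illustration of the erasure-coding idea: split data into fragments,
# add a parity fragment, and rebuild any single lost fragment from the rest.
# Generic single-parity (XOR) coding, not Hammerspace's implementation.

def make_parity(fragments: list[bytes]) -> bytes:
    parity = bytearray(len(fragments[0]))
    for frag in fragments:
        for i, b in enumerate(frag):
            parity[i] ^= b
    return bytes(parity)

def rebuild(surviving: list[bytes], parity: bytes) -> bytes:
    # XOR of the parity with all surviving fragments yields the missing one.
    return make_parity(surviving + [parity])

data = [b"AAAA", b"BBBB", b"CCCC"]     # three equal-size data fragments
parity = make_parity(data)

lost = data[1]                         # pretend fragment 1 sat on a failed node
recovered = rebuild([data[0], data[2]], parity)
assert recovered == lost               # the lost fragment is reconstructed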

Simplifying Data Management Across the Lifecycle

For engineers and architects, this all adds up to vastly simplified data lifecycle management. Data generated on any storage system or cloud can be easily ingested into the global data environment and made available wherever it is needed without migration or copying. Automated orchestration and placement policies ensure data is always in the right place at the right time.

This streamlines common workflows, such as using AI to process data arriving from edge sites or IoT devices, and it gives remote users real-time access to large datasets. Data is automatically placed close to GPU resources when needed for training, tiered to cost-effective storage when dormant, and protected via erasure coding.
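
The article does not show Hammerspace's actual policy syntax, so the sketch below is a purely hypothetical rule set illustrating the placement pattern just described: hot training data stays on fast storage near the GPUs, dormant data is tiered to cheaper capacity. The tier names, thresholds, and rule format are invented for this illustration.

# Hypothetical placement-policy sketch. Tier names, thresholds, and the rule
# format are invented for illustration; this is not Hammerspace's API.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class FileMeta:
    path: str
    last_access: datetime
    tagged_for_training: bool

def choose_tier(meta: FileMeta) -> str:
    now = datetime.now(timezone.utc)
    if meta.tagged_for_training:
        return "nvme-near-gpu"         # keep hot training data next to the GPUs
    if now - meta.last_access > timedelta(days=90):
        return "object-archive"        # tier dormant data to cheap capacity
    return "general-nas"               # everything else stays on standard NAS

meta = FileMeta("/datasets/imagenet/shard-17.tar",
                last_access=datetime.now(timezone.utc) - timedelta(days=200),
                tagged_for_training=False)
print(choose_tier(meta))               # -> object-archive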

The impact of this model was summed up well by Presley: "When you imagine you have a lot of capacity in HPC in particular, they historically haven't done data backup because they just can't afford to have two copies of 200 petabytes or whatever it is. So they will have a single copy. And what we do is instead of if you want to make a file available on another node, we will actually move it to the other node. We don't keep copies of data. There's actually a placement of the data somewhere new, and then you still only have a single gold dataset."

Looking Forward

Hammerspace's Hyperscale NAS and Global Data Environment updates mark a major step forward in simplifying distributed data environments. As GPU-accelerated computing and multi-cloud deployments become the norm, the ability to efficiently manage data across these landscapes is critical.

Developers can look forward to focusing on their applications and models while leaving the intricacies of distributed data management to Hammerspace's data orchestration. Infrastructure engineers and architects gain a powerful, flexible platform for unifying data silos and putting data where it needs to be for each workload, all without sacrificing performance or enterprise reliability. These capabilities will only become more essential as the size and distribution of datasets continue to grow.

Topics: Data (computing), Distributed Computing

Opinions expressed by DZone contributors are their own.
