Data Storage for AI
Storing and activating mass data is critical to the next wave of AI innovation.
App developers are racing to train and deploy AI models. So far, the focus has been on finding the right data and ramping up computing power. As AI models and applications proliferate, figuring out how to store the exabytes of data they will generate becomes an urgent challenge. Supporting AI workloads requires a mix of memory and storage technologies across the AI data workflow. But ultimately, AI at scale requires hard drives.
Feeding AI the data it needs to learn, create, and improve requires a broad range of storage technologies. From high-throughput memory to high-capacity hard drives, determining the right storage mix for any AI workload is about balancing the need for performance, cost, and scalability.
AI compute clusters train, run, and optimize language models. GPUs, CPUs, NPUs, and TPUs are closely coupled with high-performance memory devices, offering terabyte- and even petabyte-per-second throughput for extreme computation. The input and output data they use and create flows into networked storage clusters where it’s preserved long-term, mostly on hard drives, to support future re-training, quality control, and compliance.
Today, the most advanced AI innovators also operate the world’s largest hyperscale and cloud data centers. These companies choose to store 90% of their online exabytes¹ on hard drives because they understand the unique price-to-performance value that hard drives offer for mass-capacity storage. Though SSDs are also a critical technology, hard drives will continue to store the majority of data as more AI-optimized architectures are deployed.
By supporting the entire AI data workflow, hard drives play a crucial role in validating AI models.
Realizing the full potential of AI requires data—and the storage that upholds it.
As AI proliferates, people and machines will create content in more ways and at a faster pace than ever, producing massive volumes of data.
AI improves in a virtuous feedback loop of consuming data, generating new content, and learning from its performance.
Seagate is optimizing storage for AI, making unprecedented leaps forward in capacity to support efficient data center architecture and buildout.
¹ Seagate’s analysis of IDC’s Multi-Client Study, Cloud Infrastructure Index 2023: Compute and Storage Consumption by 100 Service Providers, November 2023.
Seagate’s analysis based on Forward Insights Q323 SSD Insights, Aug. 2023; IDC Worldwide Hard Disk Drive Forecast 2022-2027, Apr. 2023, Doc. #US50568323; TRENDFOCUS SDAS Long-Term Forecast, Aug. 2023.
Using total embodied carbon with a 5-year lifecycle.
Sara McAllister et al., “A Call for Research on Storage Emissions,” Hotcarbon.org, 2024.