How should enterprises prepare to get the most out of their AI investments?

There is no AI success without data—heaps of it.

And there are no massive data sets without ample, efficient data storage.

Data upholds AI, and mass-capacity hard drives uphold data.

These insights are brought into sharp relief by a 2025 survey from the research firm Recon Analytics. 

The Seagate-commissioned global survey queried 1,062 respondents. They are IT storage buyers and decision-makers who work in storage infrastructure roles for companies that report over $10 million in annual revenue, have over 50 terabytes of current storage usage, have adopted AI or plan to adopt AI within the next three years, and are located in the United States, China, United Kingdom, South Korea, Singapore, France, India, Japan, Taiwan, and Germany.

The survey focused on the effects of AI adoption on infrastructure priorities, data retention, and data management. Results shed light on how AI will impact infrastructure needs over the next three years.

Survey highlights. 

First and foremost, the survey demonstrated that AI adoption is driving exponential growth in data storage demand through 2028.  

  • As many as 61% of respondents from companies that predominantly use cloud storage said their companies’ cloud-based storage would have to increase by 100% or more (that is, it would have to at least double) over the next three years.

Figure 1. Sixty-one percent of respondents whose companies primarily use cloud storage for their AI data management expect to increase their storage requirements by 100% or more.
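To put the doubling figure in annual terms, the following is a minimal sketch of the arithmetic. It assumes steady compound growth, which the survey does not state, and the function names and the 50 TB starting point (the survey's minimum respondent threshold) are illustrative choices only. Under that assumption, doubling in three years works out to roughly 26% growth per year.

```python
from typing import List


def annual_growth_rate(total_increase_pct: float, years: int) -> float:
    """Compound annual growth rate (as a fraction) implied by a total
    percentage increase spread evenly over `years`."""
    multiplier = 1 + total_increase_pct / 100  # 100% increase -> 2x
    return multiplier ** (1 / years) - 1


def projected_capacity(start_tb: float, total_increase_pct: float,
                       years: int) -> List[float]:
    """Year-by-year capacity (in TB), starting at year 0, under the
    assumed steady compound growth."""
    rate = annual_growth_rate(total_increase_pct, years)
    return [start_tb * (1 + rate) ** year for year in range(years + 1)]


if __name__ == "__main__":
    # Hypothetical example: 50 TB today, doubling over three years.
    rate = annual_growth_rate(100, 3)
    print(f"Implied annual growth: {rate:.1%}")
    for year, tb in enumerate(projected_capacity(50, 100, 3)):
        print(f"Year {year}: {tb:.1f} TB")
```

Real capacity plans are rarely this smooth, so the yearly figures are best read as a planning baseline rather than a forecast.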

As AI applications drive unprecedented data creation, the more data organizations save, the more they can validate that AI is acting as expected. With access to behavioral data—like training datasets, model checkpoints, prompts, and answers—companies can scrutinize algorithms, and better understand and refine AI decision-making. Without the scale and efficiency of data centers, AI’s potential would be limited, as the ability to store and retrieve massive datasets is central to AI’s success. 

It is not just the amount of storage that drives AI success. Duration of data storage matters too.

  • Of the respondents employed by businesses that have adopted AI technology, 90% believe longer data retention improves the quality of AI outcomes.

Figure 2. Ninety percent of companies that use AI today believe that retaining more historical data improves model accuracy.

This finding points to a correlation between preserving data for longer periods and more reliable AI insights. This may be underpinned by several factors. First, constant iterative processing is intrinsic to how AI algorithms work. Content outputs feed back into the model, improving its accuracy and enabling new models. Raw datasets and outcomes become sources for further development and new workflows.

But holding onto datasets for longer serves other business-critical functions, too. It protects a company’s intellectual property. It keeps “receipts” of the model’s original datasets and processes, providing an explanation of results when required (say, as part of a legal process). These “receipts” establish data lineage, ensuring a clear record of the journey data takes from input to output. Data lineage allows organizations to verify the origin and usage of datasets, helping ensure that AI models rely on accurate data. It enables AI systems to be fully auditable and supports both regulatory compliance and internal accountability.
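One way to picture such “receipts” is as a chain of records, each holding a fingerprint of a step's input and output plus a link to the previous record. The sketch below is an illustration under that assumption, not a description of any particular product or of how survey respondents track lineage; `LineageRecord`, `append_record`, and `verify_chain` are hypothetical names, and only Python's standard `hashlib`, `json`, and `dataclasses` modules are used.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import List


def sha256_of(data: bytes) -> str:
    """Content fingerprint used to identify a dataset or artifact."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class LineageRecord:
    step: str           # e.g. "ingest", "train", "inference"
    input_sha256: str   # fingerprint of what went into the step
    output_sha256: str  # fingerprint of what came out
    parent: str         # digest of the previous record ("" for the first)

    def digest(self) -> str:
        """Hash of the record itself, used to link the next record."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_record(chain: List[LineageRecord], step: str,
                  input_data: bytes, output_data: bytes) -> List[LineageRecord]:
    """Return a new chain with one more record linked to the last one."""
    parent = chain[-1].digest() if chain else ""
    record = LineageRecord(step, sha256_of(input_data),
                           sha256_of(output_data), parent)
    return chain + [record]


def verify_chain(chain: List[LineageRecord]) -> bool:
    """Check that every record points at the digest of its predecessor."""
    expected_parent = ""
    for record in chain:
        if record.parent != expected_parent:
            return False
        expected_parent = record.digest()
    return True
```

Because each record embeds the digest of the one before it, altering an earlier record breaks the link to every later one. The sketch stores only fingerprints, so auditing a result still requires that the underlying datasets be retained.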

Additionally, companies may choose to store more data for longer because they realize that they cannot know today what new, valuable insights the algorithms of tomorrow might uncover from yesterday’s data. Longer data retention enables the processing of old data by yet-undeveloped AI models. For these reasons, longer data retention boosts the business value AI can provide. 

In a related finding, infrastructure decision-makers view extended data retention as essential for building trust, a critical foundation without which AI insights hold little value.

  • 88% of respondents whose companies use AI today believe adoption of trustworthy AI increases the need to store more data for longer periods of time.

Seagate defines trustworthy AI as AI data workflows and models that use dependable inputs and generate reliable insights. Trustworthy AI is built on data that meets the following criteria: 

  • high quality and accuracy 
  • clear legality, ownership, and provenance 
  • secure storage and protection 
  • explainable and traceable transformations by the algorithm 
  • consistent and reliable outputs from the data processing 

Figure 3. Eighty-eight percent of respondents whose companies use AI today said that adoption of trustworthy AI increases the need to store more data for longer periods of time.

Scalable storage infrastructure supports trustworthy AI because it enables the vast amounts of data used by AI systems to be properly managed, stored, and secured.

  • As part of building trustworthy AI, 80% of respondents stressed the importance of checkpointing.

Checkpointing is the process of saving the state of an AI model at specific, short intervals during its training. AI models are trained on large datasets through iterative processes that can take anywhere from minutes to months. The duration of a model’s training depends on the complexity of the model, the size of the dataset, and the computational power available. During this time, models are fed data, parameters are adjusted, and the system learns how to predict outcomes based on the information it processes. 

Checkpoints act like snapshots of the model’s current state—its data, parameters, and settings—at many points during training. Saved to storage devices every minute to every few minutes, the snapshots allow developers to retain a record of the model’s progression, and to avoid losing valuable work due to unexpected interruptions. 
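The save-and-resume pattern described above can be sketched in a few lines. This is a simplified illustration, not a real training framework: the "model" is a single number, checkpoints are written every fixed number of steps rather than on a timer, and the names `save_checkpoint`, `load_latest_checkpoint`, and `train` are hypothetical. Each snapshot is written to a temporary file and then renamed, so an interruption mid-write does not leave a half-written checkpoint under the final name.

```python
import json
import os
import tempfile


def save_checkpoint(directory, step, state):
    """Write a snapshot of the training state, atomically."""
    os.makedirs(directory, exist_ok=True)
    final_path = os.path.join(directory, f"ckpt_{step:08d}.json")
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp_path, final_path)  # rename only after a complete write
    return final_path


def load_latest_checkpoint(directory):
    """Return the most recent snapshot, or None if there is none."""
    if not os.path.isdir(directory):
        return None
    names = sorted(n for n in os.listdir(directory)
                   if n.startswith("ckpt_") and n.endswith(".json"))
    if not names:
        return None
    with open(os.path.join(directory, names[-1])) as f:
        return json.load(f)


def train(directory, total_steps, checkpoint_every):
    """Toy training loop that resumes from the latest snapshot."""
    ckpt = load_latest_checkpoint(directory)
    step = ckpt["step"] if ckpt else 0
    state = ckpt["state"] if ckpt else {"weight": 0.0}
    while step < total_steps:
        state["weight"] += 0.5  # stand-in for a real parameter update
        step += 1
        if step % checkpoint_every == 0:
            save_checkpoint(directory, step, state)
    return step, state
```

If a run stops at step 7 with snapshots taken every 5 steps, the next call to `train` restarts from step 5 rather than from zero. Shorter intervals lose less work after an interruption but write more data to storage, which is the trade-off behind the storage demand discussed here.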

According to the survey, companies using 100+ PB of storage are saving and backing up checkpoints on a daily-to-weekly basis, with 87% of them storing these checkpoints in the cloud or in a mix of HDD and SSD.

Storage: The secret driver of AI success. 

Compute and energy are popular themes in discussions of AI adoption. But the Recon Analytics survey highlights storage as a critical driver.

  • From the perspective of infrastructure buyers, data storage ranked as the second most important part of AI infrastructure, following only security. Security and storage were followed by data management, network capacity, compute, regulations, LLM viability, and energy, in order of importance. 
  • Two-thirds (66%) of respondents ranked storage as the second most important among their top four AI enablers and as the fourth most important barrier to adoption.

Figure 4. Sixty-six percent of infrastructure decision-makers ranked storage as the second most important component among their top four AI enablers. They also ranked storage as the fourth most important barrier to AI deployment.

Recon founder and lead analyst Roger Entner describes the takeaway:  

“The survey results generally point to a coming surge in demand for data storage, with hard drives emerging as the clear winner. When you consider that the business leaders we surveyed intend to store more and more of this AI-driven data in the cloud, it appears that cloud services are well-positioned to ride a second growth wave.” 


To get the most value from AI, enterprises must prepare with scalable, efficient data storage. Whether used directly or through cloud services, hard drives offer unmatched capacity, cost efficiency, and sustainability, making them the backbone of trustworthy AI.