
#28. Govern Your Storage Cost — Part 2: Setting a Strategy
The second part in our series on setting a comprehensive storage strategy.
In the last post, we reviewed storage services. Now we'll explore how to govern their use and understand the role of automation. Let's begin by examining how to connect business demands to storage services.
Step 1. Classify Data
Understand business demands: We begin by gathering requirements for compliance, security, and cost optimization, which is particularly important for object storage (e.g., S3 files or stored logs). Consider an insurance business scenario: a large amount of data must be retained for several years due to regulatory requirements (e.g., personal transactional data and personal information). Additional data must be kept for security purposes (e.g., logs). While these objects cannot be deleted, their storage class can be optimized using storage lifecycle management techniques.
The first step is to examine data relevance from a business perspective and classify it using tags. A simple tag like `business_value:{value}`, where `value` could be `compliance`, `security`, or another business purpose, serves as a good starting point. This tagging mechanism becomes the foundation for cloud storage management.
Another useful tag is `expiry_date`, which specifies how long an object or bucket should remain available. Once this date passes, the object can be safely deleted.
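To make this concrete, here's a minimal sketch of applying these tags to an existing S3 object, assuming AWS and the boto3 SDK (the bucket and key names are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Attach the classification tags described above. Note that
# put_object_tagging replaces the object's full tag set, so include
# every tag the object should keep. Bucket and key are hypothetical.
s3.put_object_tagging(
    Bucket="insurance-records",
    Key="transactions/2021/customer-42.json",
    Tagging={
        "TagSet": [
            {"Key": "business_value", "Value": "compliance"},
            {"Key": "expiry_date", "Value": "2031-01-01"},
        ]
    },
)
```

In practice, you would apply these tags at upload time or in bulk, but the idea is the same: every object carries its business classification.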
Step 2. Regulate Storage Lifecycle Management
Once data is classified using tags, you can implement automated lifecycle management policies. For example, in AWS S3, you can create lifecycle rules that automatically transition objects between storage classes based on their age and tags, or even delete the data entirely to save costs.
Object storage lifecycle management rules.
Create rules for storage lifecycle management. For example, in AWS S3, Azure Blob Storage, or Google Cloud Storage, you can automate moving objects to cheaper storage tiers after a specified period. For instance, after 30 days, you could move all objects in a given bucket from S3 Standard to Glacier (archival storage). These rules can apply to any classified data that carries a specific tag.
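As a sketch of such a rule, again assuming AWS S3 and boto3 (the bucket name is hypothetical), the following lifecycle configuration archives compliance-tagged objects to Glacier after 30 days:

```python
import boto3

s3 = boto3.client("s3")

# Transition objects tagged business_value=compliance to Glacier
# 30 days after creation. Note that this call replaces the bucket's
# entire lifecycle configuration, so include all rules you need.
s3.put_bucket_lifecycle_configuration(
    Bucket="insurance-records",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-compliance-data",
                "Status": "Enabled",
                "Filter": {
                    "Tag": {"Key": "business_value", "Value": "compliance"}
                },
                "Transitions": [
                    {"Days": 30, "StorageClass": "GLACIER"}
                ],
            }
        ]
    },
)
```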
Regulating Backups and Snapshots
Every company should set policies that govern when to use backups versus snapshots. As shown in the last article, even though backups are conducted less frequently, they can be far more expensive than snapshots. So how should we govern their use?
Backups serve long-term data retention and archival needs, protecting against accidental deletion or corruption. They're essential for compliance requirements that call for historical records and for recovery from major system failures or disasters. Snapshots, on the other hand, provide quick point-in-time recovery for rapid restoration. Here are some guidelines to help you choose your strategy:
- For development environments: You need neither backups nor snapshots. Use a data generator or mock data for testing. However, in certain cases, you might need a one-time backup of real data.
- For production environments: Start by examining your business requirements. For databases, high availability constraints are typically covered by live data replication. However, some industries have regulations requiring complete disaster recovery data. In such cases, one backup with daily snapshots might suffice. Otherwise, focus on taking complete backups less frequently while maintaining regular snapshots—for example, one backup every quarter with weekly snapshots.
If not required for regulatory purposes, remember to delete both backups and snapshots regularly.
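One way to automate that cleanup, sketched here for AWS EBS snapshots with boto3 (the 90-day window is just an example; your retention policy may differ):

```python
from datetime import datetime, timedelta, timezone

import boto3
from botocore.exceptions import ClientError

RETENTION_DAYS = 90  # example window; adjust to your retention policy
cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)

ec2 = boto3.client("ec2")

# Iterate over snapshots owned by this account and delete any that
# have aged past the retention window.
paginator = ec2.get_paginator("describe_snapshots")
for page in paginator.paginate(OwnerIds=["self"]):
    for snapshot in page["Snapshots"]:
        if snapshot["StartTime"] < cutoff:
            try:
                ec2.delete_snapshot(SnapshotId=snapshot["SnapshotId"])
            except ClientError:
                # Skip snapshots that are still in use (e.g., by an AMI).
                pass
```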
Step 3. Automate Storage Management
Automation plays a massive role in governing storage costs. Here are some examples:
Automating Object Storage Management
AWS offers S3 Intelligent-Tiering (GCP's Cloud Storage has a similar capability called Autoclass), which automatically optimizes storage costs by moving data between access tiers based on usage patterns. In S3, the service monitors access patterns and moves objects that haven't been accessed for 30 consecutive days to the infrequent access tier. When these objects are accessed again, they automatically return to the frequent access tier. This automation eliminates the need for manual lifecycle management rules while ensuring cost-effective storage.
Does this mean tagging for storage lifecycle management is unnecessary? Not at all. Services like Intelligent Tiering complement manual management rather than replace it. You can use lifecycle management for business-classified objects while enabling Intelligent Tiering for everything else.
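On AWS, enabling Intelligent-Tiering can be as simple as choosing the storage class at upload time. A minimal sketch with boto3 (bucket, key, and file name are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Upload an object directly into the Intelligent-Tiering storage class;
# S3 then shifts it between access tiers automatically based on usage.
with open("usage-2024.csv", "rb") as body:
    s3.put_object(
        Bucket="general-data",
        Key="reports/usage-2024.csv",
        Body=body,
        StorageClass="INTELLIGENT_TIERING",
    )
```

For objects that already exist, a lifecycle rule can transition them to the `INTELLIGENT_TIERING` storage class instead.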
Automating Block Storage
When provisioning a block storage volume, you must specify its size up front. This often leads to either over-provisioning (which wastes resources) or under-provisioning (which can affect performance). This challenge has driven platforms like lucidity.cloud to develop solutions that automatically rightsize the provisioning process based on actual needs. Such automation tools should be seriously considered, especially if your organization heavily uses storage services.
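Full rightsizing requires utilization metrics from inside the instance, but even a simple sweep for unattached volumes, sketched below for AWS EBS with boto3, can surface obvious waste:

```python
import boto3

ec2 = boto3.client("ec2")

# Find EBS volumes that are not attached to any instance; these are
# often forgotten and keep accruing cost until deleted.
paginator = ec2.get_paginator("describe_volumes")
for page in paginator.paginate(
    Filters=[{"Name": "status", "Values": ["available"]}]
):
    for volume in page["Volumes"]:
        print(f"Unattached: {volume['VolumeId']} ({volume['Size']} GiB)")
```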
Final Notes on Storage Governance
Understand the technical demands: Cache, Block, Blob, File, or Network. Part of storage governance is understanding the technical requirements for each storage type. Organizations often use file storage when object storage would be more appropriate for their needs. Such poor architectural decisions can be costly. Always consider the right purpose for each storage service before deployment.
Summary
This article outlined a straightforward strategy for governing cloud storage services through three simple steps: Classify → Regulate → Automate.
By following these steps diligently, you can maintain effective control over your storage costs.
Do you like our blog posts? It means a lot to us when you rate (👍, ❤️, 👏) and share them. Thank you so much for your support.