Day 10 of 100 Days of AWS: Storage Basics Part - II

Hi folks! Welcome to Day 10 of 100 Days of AWS🎯, where we cover the complete AWS cloud from beginner to professional. Today, we will expand our horizons in the AWS cloud by understanding the object storage service provided by AWS, the different S3 storage classes, and the factors to consider before choosing the S3 class that meets your specific requirements. Let’s get started🚀!

Day 9 Overview;

On Day 9, we covered the broad portfolio of storage services provided by AWS. We discussed the services AWS offers in the data storage category, followed by the types of storage in general computing. Finally, we discussed the services under each storage type, along with their features, downsides, and use cases. If you want to dive deeper into this topic or revise it, please click here.

Object Storage overview;

Object storage is a technology that stores and manages data in an unstructured format called an object. Modern organizations create and analyze large volumes of unstructured data such as photos, videos, email, web pages, sensor data, and audio files. Object storage solutions are ideal for building cloud-native applications that require scale and flexibility, and can also be used to import existing data sources for analytics, backup, or archive.

Metadata is critical for object storage technology. With object storage, objects are kept in a single bucket rather than as files inside folders. This creates a flat storage structure, as opposed to hierarchical or tiered storage.

Object storage is ideal for data lakes because it delivers an architecture built for large amounts of data: each piece of data is stored as an object, and the object’s metadata provides a unique identifier for easier access. This architecture removes the scaling limitations of traditional storage, which is why object storage is the storage of the cloud.
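
The flat, metadata-driven model described above can be sketched in a few lines. This is a toy illustration (a plain Python dict standing in for a bucket, with made-up keys and metadata), not how S3 is implemented: keys like `photos/2024/cat.jpg` are single flat identifiers, and "folders" are only a naming convention over key prefixes.

```python
# A minimal sketch of a flat object store, using a plain dict as the "bucket".
class ObjectStore:
    def __init__(self):
        self._bucket = {}  # flat namespace: key -> (data, metadata)

    def put(self, key, data, **metadata):
        # The full key is the object's unique identifier; "photos/2024/cat.jpg"
        # is one flat key, not nested folders.
        self._bucket[key] = (data, metadata)

    def get(self, key):
        return self._bucket[key]

    def list_keys(self, prefix=""):
        # "Folders" are just a prefix convention over flat keys.
        return sorted(k for k in self._bucket if k.startswith(prefix))
```

For example, after `store.put("photos/2024/cat.jpg", b"...", content_type="image/jpeg")`, calling `store.list_keys("photos/")` returns that single flat key; nothing hierarchical is ever created.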

Object Storage Importance;

As businesses expand, they often face challenges managing isolated pools of unstructured data from various sources, which complicates analysis and innovation. Object storage addresses this by providing scalable, cost-effective solutions that store data in its native format, eliminating complexity and capacity constraints of traditional systems. It allows for centralized management of unstructured data through a user-friendly interface and enables the use of policies to optimize storage costs. While it can be implemented on-premises, cloud object storage offers virtually unlimited scalability, high durability, and global accessibility, facilitating faster decision-making and insights.

Use cases for object storage;

Customers use object storage for various purposes:

  • Analytics: Collect and analyze large volumes of data for insights.

  • Data Lake: Supports seamless scaling from gigabytes to petabytes with high durability.

  • Cloud-Native Application Data: Facilitates fast deployments and easy data access for microservices-based applications.

  • Data Archiving: Offers long-term retention with enhanced durability and security for rich media and regulatory compliance.

  • Rich Media: Efficiently store and deliver media files using globally replicated architectures.

  • Backup and Recovery: Ensures uninterrupted operations through data replication across data centers.

  • Machine Learning: Enables efficient training and real-time predictions from large datasets.

S3 (Simple Storage Service);

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. Customers of all sizes and industries can use Amazon S3 to store and protect any amount of data for a range of use cases, such as data lakes, websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics. Amazon S3 provides management features so that you can optimize, organize, and configure access to your data to meet your specific business, organizational, and compliance requirements.
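
To make the object model concrete, here is the shape of the parameters one would pass to boto3’s `s3.put_object` call. The bucket name, key, and metadata below are hypothetical placeholders, and no AWS call is made in this sketch:

```python
# Hypothetical upload parameters for boto3's s3.put_object (no AWS call here);
# the bucket name, key, and metadata values are illustrative placeholders.
put_kwargs = {
    "Bucket": "my-example-bucket",          # the bucket: a flat container of objects
    "Key": "reports/2024/q1.csv",           # the full key is the object's unique ID
    "Body": b"id,revenue\n1,100\n",         # the object's data
    "Metadata": {"department": "finance"},  # user-defined metadata stored with it
}

# With boto3 installed and AWS credentials configured, the upload would be:
#   import boto3
#   boto3.client("s3").put_object(**put_kwargs)
```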

Factors to consider before choosing an S3 storage class;

Choosing the right Amazon S3 storage class depends on several factors related to your application's requirements and data usage patterns. Here are the key factors to consider:

Data Access Frequency

  • Frequently Accessed Data: Use S3 Standard or S3 Intelligent-Tiering.

  • Infrequently Accessed Data: Choose S3 Standard-IA, S3 One Zone-IA, or S3 Intelligent-Tiering.

  • Rarely Accessed/Archived Data: Consider S3 Glacier Flexible Retrieval, S3 Glacier Instant Retrieval, or S3 Glacier Deep Archive.

Data Retrieval Latency

  • Low Latency (Milliseconds):

    • S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, S3 One Zone-IA, and S3 Glacier Instant Retrieval.

  • Medium to High Latency (Minutes to Hours):

    • S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive.

Data Durability and Availability

  • High Durability and Availability: All S3 storage classes provide 99.999999999% durability (11 9s), but availability varies:

    • S3 Standard: 99.99% availability.

    • S3 Standard-IA and S3 Intelligent-Tiering: 99.9% availability.

    • S3 One Zone-IA: 99.5% availability (single AZ).
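
A back-of-the-envelope calculation shows what those availability percentages mean in practice, as expected hours of unavailability per year (treating the SLA figure as an annual average):

```python
# Rough expected annual downtime implied by each availability figure.
HOURS_PER_YEAR = 24 * 365  # 8760 hours

def annual_downtime_hours(availability_pct: float) -> float:
    """Hours per year a service at the given availability % may be unavailable."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR

# S3 Standard (99.99%):                        ~0.9 hours/year
# Standard-IA / Intelligent-Tiering (99.9%):   ~8.8 hours/year
# One Zone-IA (99.5%):                         ~43.8 hours/year
```

The jump from 99.99% to 99.5% is roughly a factor of fifty in potential downtime, which is why One Zone-IA suits only data you can afford to wait for or re-create.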

Cost (Storage, Retrieval, and Transfer Fees)

  • Storage Cost: Lower for infrequent and archive classes (e.g., Glacier, IA classes).

  • Retrieval Cost: Higher for infrequent and archive classes.

    • Frequent retrieval from Glacier or IA classes can negate storage savings.

  • Monitoring Cost: S3 Intelligent-Tiering includes monitoring fees.
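
The factors above can be condensed into a rule-of-thumb helper. The class names returned are real S3 `StorageClass` values, but the decision rules themselves are illustrative assumptions for this series, not official AWS guidance:

```python
# Hedged rule-of-thumb chooser mirroring the factors above; the branching logic
# is an illustrative simplification, not AWS's official decision tree.
def suggest_storage_class(access_frequency: str, needs_millisecond_latency: bool) -> str:
    """access_frequency: one of 'frequent', 'infrequent', 'archive', 'unknown'."""
    if access_frequency == "unknown":
        return "INTELLIGENT_TIERING"   # let S3 move data between tiers for you
    if access_frequency == "frequent":
        return "STANDARD"              # highest storage cost, no retrieval fees
    if access_frequency == "infrequent":
        return "STANDARD_IA"           # cheaper storage, retrieval fees apply
    # Archive: pick by how fast you need the data back.
    return "GLACIER_IR" if needs_millisecond_latency else "DEEP_ARCHIVE"
```

For example, `suggest_storage_class("archive", needs_millisecond_latency=True)` returns `"GLACIER_IR"` (Glacier Instant Retrieval), while the same data without the latency requirement lands in `"DEEP_ARCHIVE"`.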

S3 storage classes;

Amazon S3 provides a range of storage classes to optimize storage costs based on data access patterns, retrieval speed, and durability. Here’s a breakdown of S3 storage classes:

1. S3 Standard

  • Purpose: Frequently accessed data.

  • Key Features:

    • Low latency and high throughput.

    • 99.999999999% (11 9s) durability.

    • Suitable for dynamic websites, content distribution, and big data analytics.

  • Cost: Highest storage cost but no retrieval fees.

2. S3 Intelligent-Tiering

  • Purpose: Data with unknown or changing access patterns.

  • Key Features:

    • Automatically moves data between access tiers (frequent, infrequent, and archive) based on usage.

    • 11 9s durability.

    • Low latency for frequent and infrequent access.

  • Cost: Monitoring and automation fees apply.

3. S3 Standard-IA (Infrequent Access)

  • Purpose: Infrequently accessed but instantly retrievable data.

  • Key Features:

    • Lower storage cost than S3 Standard.

    • Higher retrieval costs.

    • Suitable for backups and disaster recovery.

  • Cost: Lower storage cost, higher retrieval cost.

4. S3 One Zone-IA

  • Purpose: Infrequent access data stored in a single availability zone.

  • Key Features:

    • 11 9s durability.

    • Lower-cost alternative to S3 Standard-IA.

    • Suitable for secondary backups or data that can be easily re-created.

  • Cost: Lower storage cost than S3 Standard-IA; no multi-AZ redundancy and lower availability (99.5%).

5. S3 Glacier Instant Retrieval

  • Purpose: Archive data with fast retrieval needs.

  • Key Features:

    • Millisecond access for archive data.

    • Lower cost than S3 Standard-IA.

    • Suitable for medical records, compliance, and media assets.

  • Cost: Low storage cost, slightly higher retrieval cost.

6. S3 Glacier Flexible Retrieval (formerly Glacier)

  • Purpose: Archive data with occasional access.

  • Key Features:

    • Retrieval times: Minutes to hours.

    • Cheaper than Glacier Instant Retrieval.

    • Suitable for archives that are rarely accessed.

  • Cost: Very low storage and retrieval cost.

7. S3 Glacier Deep Archive

  • Purpose: Long-term archive with very rare access.

  • Key Features:

    • Lowest-cost storage class.

    • Retrieval time: Hours (12+ hours standard).

    • Ideal for compliance archives or cold storage.

  • Cost: Extremely low storage cost; retrieval takes hours and carries a higher retrieval cost.

8. S3 on Outposts

  • Purpose: Data stored in on-premises environments using S3 APIs.

  • Key Features:

    • Designed for on-premises workloads.

    • Provides the same S3 API and features as AWS.

    • Useful for local data processing and low-latency workloads.
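
You don’t have to pick one class forever: S3 lifecycle rules can move objects down these tiers automatically as they age. The dict below follows the shape boto3’s `put_bucket_lifecycle_configuration` expects, but the prefix, rule ID, and day counts are example values:

```python
# Hypothetical lifecycle rule: tier down "logs/" objects as they age, then expire
# them. The dict shape matches boto3's put_bucket_lifecycle_configuration input;
# the prefix, ID, and day counts are illustrative examples.
lifecycle_config = {
    "Rules": [
        {
            "ID": "tier-down-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # infrequent after 30 days
                {"Days": 90, "StorageClass": "GLACIER"},      # flexible archive after 90
            ],
            "Expiration": {"Days": 365},  # delete after one year
        }
    ]
}

# With boto3 and credentials configured, this would be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-example-bucket", LifecycleConfiguration=lifecycle_config)
```

This pairs naturally with the classes above: objects enter as S3 Standard, drift to cheaper tiers as access drops, and are cleaned up when they expire.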

Day 10 Wrap Up;

In conclusion, understanding the various storage options provided by AWS, particularly Amazon S3, is crucial for optimizing data management and cost efficiency in the cloud. Object storage offers scalable and flexible solutions for handling unstructured data, making it ideal for modern applications and analytics. By carefully considering factors such as data access frequency, retrieval latency, durability, availability, and cost, businesses can select the most appropriate S3 storage class to meet their specific needs. As we continue our journey through AWS, the knowledge of these storage fundamentals will serve as a solid foundation for exploring more advanced cloud services. Stay tuned for our next discussion on AWS compute offerings.

Summary & Key Points;

  • Objects are just files in a flat structure (no real folders), though key prefixes make them look like folders.

  • Great for storing media files, logs, audit reports, or basically any file you want.

  • S3 is accessed over an API, so it cannot (and should not) be mounted as a filesystem or used as a boot volume.

  • Storage classes impact accessibility, resiliency, and cost

Up Next on Day 11;

  • We will discuss the compute services offered by AWS.

  • Compute Types.