AWS Cloud Developer Associate Certification

12. Amazon RDS and ElastiCache
Index

  1. Database types – Relational vs Non-Relational
  2. Database types – Operational vs Analytical
  3. Databases – Architecture
  4. Amazon RDS
  5. Amazon ElastiCache
    1. Caching strategies – Lazy Loading, Write Through, TTL
    2. ElastiCache engines
      1. Memcached
      2. Redis (cluster mode disabled)
      3. Redis (cluster mode enabled)

Database Types – Relational vs Non-Relational

Key differences are how data is managed and how data is stored

Relational | Non-Relational
Organized by tables, rows and columns | Varied data storage models
Rigid schema (SQL) | Flexible schema (NoSQL) – data stored in key-value pairs, columns, documents or graphs
Rules enforced within database | Rules can be defined in application code (outside database)
Typically scaled vertically | Scales horizontally
Supports complex queries and joins | Unstructured, simple language that supports any kind of schema
ACID (Atomicity, Consistency, Isolation, Durability) compliance typically enforced | Performance is typically prioritized, can use ACID transactions in some cases
Amazon RDS, Oracle, MySQL, IBM DB2, PostgreSQL | Amazon DynamoDB, MongoDB, Redis, Neo4j

Database types – Operational vs Analytical

Key differences are use cases and how the database is optimized

Operational / Transactional | Analytical
Online Transaction Processing (OLTP) | Online Analytical Processing (OLAP) – the source data comes from OLTP DBs
Production DBs that process transactions, e.g. adding customer records, checking stock availability (INSERT, UPDATE, DELETE) | Data warehouse. Typically separated from the customer-facing DBs. Data is extracted for decision making
Short transactions and simple queries | Long transactions and complex queries
Relational examples: Amazon RDS, Oracle, IBM DB2, MySQL | Relational examples: Amazon Redshift, Teradata, HP Vertica
Non-relational examples: Amazon DynamoDB, MongoDB, Cassandra | Non-relational examples: Amazon EMR, MapReduce
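
As a rough illustration of the two query shapes, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for both kinds of database. The orders table and its columns are made up for the example; in practice the first kind of statement would run against an OLTP database such as RDS and the second against a warehouse such as Redshift.

```python
import sqlite3

# In-memory SQLite stands in for a real database; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# OLTP style: short transactions that each touch a handful of rows.
with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 30.0))
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("bob", 12.5))
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 7.5))
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (15.0, 2))

# OLAP style: one query that scans and aggregates the whole table.
report = conn.execute(
    "SELECT customer, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(report)
```

The inserts and the update are cheap, row-level operations, while the report query has to read every row, which is why analytical workloads are usually moved to a separate system.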

Databases – Architecture Discussion

Data Store | When to Use
Database on EC2 | Full control over instance and database; preferred DB not available under RDS
Amazon RDS | Need traditional relational database for OLTP; your data is well-formed and structured
Amazon DynamoDB | Name/value pair data; unpredictable data structure; in-memory performance with persistence; high I/O needs; require dynamic scaling
Amazon Redshift | Data warehouse for large volumes of aggregated data; primarily OLAP workloads
Amazon ElastiCache | Fast temporary storage for small amounts of data; highly volatile data (non-persistent)

Amazon Relational Database Service (RDS)

  1. Amazon Relational Database Service (Amazon RDS) is a managed service that makes it easy to set up, operate, and scale a relational database in the cloud.
  2. Automated backups and patching applied in customer-defined maintenance windows
  3. Push-button scaling, replication and redundancy
  4. Amazon RDS supports the following database engines:
    1. Amazon Aurora (proprietary AWS database engine)
    2. MySQL
    3. MariaDB
    4. Oracle
    5. SQL Server
    6. PostgreSQL
  5. RDS is a managed service and you do not have access to the underlying EC2 instance (no root access).

Amazon RDS – Scalability

  1. RDS scales vertically: you scale up by changing the instance class (compute) and increasing the allocated storage.
  2. You cannot decrease the allocated storage for an RDS instance
  3. You can scale storage and change the storage type for all DB engines except MS SQL.
  4. For MS SQL the workaround is to create a new instance from a snapshot with the new configuration.
  5. Storage can be scaled while the RDS instance is running, without an outage; however, performance may degrade while the change is applied.
  6. Scaling compute will cause downtime.
  7. You can choose to have changes take effect immediately; by default they are applied during the next maintenance window.
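
The rules above can be sketched as a small helper that prepares a scaling request. This is only an illustration: build_modify_request and the instance name "mydb" are hypothetical, while the dictionary keys are the parameter names accepted by the RDS ModifyDBInstance API (boto3's modify_db_instance). The sketch does not call AWS itself.

```python
def build_modify_request(instance_id, current_storage_gib, new_storage_gib=None,
                         new_instance_class=None, apply_immediately=False):
    """Build keyword arguments for an RDS ModifyDBInstance call (hypothetical helper)."""
    params = {
        "DBInstanceIdentifier": instance_id,
        # False defers the change to the next maintenance window (the default).
        "ApplyImmediately": apply_immediately,
    }
    if new_storage_gib is not None:
        # Allocated storage can only grow, so reject a decrease up front.
        if new_storage_gib < current_storage_gib:
            raise ValueError("allocated storage cannot be decreased")
        params["AllocatedStorage"] = new_storage_gib
    if new_instance_class is not None:
        # Changing the instance class (compute) causes downtime.
        params["DBInstanceClass"] = new_instance_class
    return params


request = build_modify_request("mydb", current_storage_gib=100, new_storage_gib=200)
print(request)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("rds").modify_db_instance(**request)
```

Leaving apply_immediately at False mirrors the default behaviour of waiting for the maintenance window.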

Amazon RDS – Multi-AZ and Read Replicas

Multi-AZ Deployments | Read Replicas
Synchronous replication – highly durable | Asynchronous replication – highly scalable
Only database engine on primary instance is active | All read replicas are accessible and can be used for read scaling
Automatic failover to standby when a problem is detected | Can be manually promoted to a standalone database instance
Always span two Availability Zones within a single Region | Can be within an Availability Zone, Cross-AZ, or Cross-Region
Automated backups are taken from standby | No backups configured by default
Database engine version upgrades happen on primary | Database engine version upgrade is independent from source instance
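
To show how an application might take advantage of read replicas for read scaling, here is a minimal sketch of a router that sends writes to the primary and rotates reads across replicas. The EndpointRouter class and the hostnames are hypothetical, and the check for SELECT is deliberately naive.

```python
import itertools


class EndpointRouter:
    """Send writes to the primary and spread reads across read replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Naive classification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = EndpointRouter(
    primary="primary.example.internal",  # hypothetical hostnames
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
print(router.endpoint_for("SELECT * FROM orders"))
```

Because replication to read replicas is asynchronous, a read routed this way can briefly return older data than the primary holds.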

Amazon RDS Aurora Key Features

Aurora Feature | Benefit
High performance and scalability | Offers high performance, self-healing storage that scales up to 64 TB, point-in-time recovery and continuous backup to S3
DB compatibility | Compatible with existing MySQL and PostgreSQL open source databases
Aurora Replicas | In-region read scaling and failover target – up to 15 (can use Auto Scaling)
MySQL Read Replicas | Cross-region cluster with read scaling and failover target – up to 5 (each can have up to 15 Aurora Replicas)
Global Database | Cross-region cluster with read scaling (fast replication / low latency reads). Can remove secondary and promote
Multi-Master | Scales out writes within a region. In preview currently and will not appear on the exam
Serverless | On-demand, autoscaling configuration for Amazon Aurora – does not support read replicas or public IPs (can only access through VPC or Direct Connect – not VPN)

Amazon RDS Aurora Replicas

Feature | Aurora Replica | MySQL Replica
Number of replicas | Up to 15 | Up to 5
Replication type | Asynchronous (milliseconds) | Asynchronous (seconds)
Performance impact on primary | Low | High
Replica location | In-region | Cross-region
Act as failover target | Yes (no data loss) | Yes (potentially minutes of data loss)
Automated failover | Yes | No
Support for user-defined replication delay | No | Yes
Support for different data or schema vs. primary | No | Yes

Amazon ElastiCache

  1. Fully managed implementations of two popular in-memory data stores – Redis and Memcached.
  2. ElastiCache is a web service that makes it easy to deploy and run Memcached or Redis protocol-compliant server nodes in the cloud.
  3. Can be put in front of databases such as RDS and DynamoDB – sits between the application and the database
  4. Good if your database is particularly read-heavy and the data does not change frequently.
  5. Billed by node size and hours of use
  6. ElastiCache nodes cannot be accessed from the Internet and, by default, cannot be accessed by EC2 instances in other VPCs.

Amazon ElastiCache – Caching Strategies

  1. Lazy Loading
  2. Write Through
  3. Dealing with stale data – Time to Live (TTL)

Lazy Loading

  1. Loads the data into the cache only when necessary (if a cache miss occurs).
  2. Lazy loading avoids filling up the cache with data that won’t be requested
  3. If requested data is in the cache, ElastiCache returns the data to the application
  4. If the data is not in the cache or has expired, ElastiCache returns a null.
  5. The application then fetches the data from the database and writes the data received into the cache so that it is available for next time.
  6. Data in the cache can become stale if Lazy Loading is implemented without other strategies (such as TTL).
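
The flow above can be sketched in a few lines. This is a minimal illustration, not ElastiCache client code: plain dictionaries stand in for the cache and the database, and the LazyLoadingReader class and key names are hypothetical.

```python
class LazyLoadingReader:
    """Check the cache first and fall back to the database on a miss."""

    def __init__(self, cache, database):
        self.cache = cache        # dict standing in for ElastiCache
        self.database = database  # dict standing in for the backing database
        self.db_reads = 0         # counts how often the database is queried

    def get(self, key):
        value = self.cache.get(key)  # None signals a cache miss
        if value is None:
            self.db_reads += 1
            value = self.database.get(key)
            if value is not None:
                # Populate the cache so the next read is a hit.
                self.cache[key] = value
        return value


reader = LazyLoadingReader(cache={}, database={"user:1": "Alice"})
first = reader.get("user:1")   # miss: read from the database, then cached
second = reader.get("user:1")  # hit: served from the cache
print(first, second, reader.db_reads)
```

Only keys that are actually requested end up in the cache, but nothing here refreshes an entry after the database changes, which is where the staleness comes from.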

Write Through

  1. When using a write through strategy, the cache is updated whenever a new write or update is made to the underlying database.
  2. Allows cache data to remain up-to-date.
  3. Without a Time To Live (TTL) you can end up with a lot of cached data that is never read
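
A write-through update path might look like the sketch below. As before, dictionaries stand in for the database and ElastiCache, and the WriteThroughStore class is a hypothetical name used only for illustration.

```python
class WriteThroughStore:
    """Update the cache on every write to the database."""

    def __init__(self, cache, database):
        self.cache = cache        # dict standing in for ElastiCache
        self.database = database  # dict standing in for the backing database

    def put(self, key, value):
        self.database[key] = value  # write to the database first
        self.cache[key] = value     # then refresh the cached copy

    def get(self, key):
        return self.cache.get(key)


store = WriteThroughStore(cache={}, database={})
store.put("user:1", "Alice")
store.put("user:1", "Alicia")  # the update reaches both stores
print(store.get("user:1"))
```

Every write lands in the cache whether or not the key is ever read again, which is why unused entries pile up without a TTL.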

Dealing with stale data – Time to Live (TTL)

  1. The drawbacks of lazy loading and write through techniques can be mitigated by a TTL.
  2. The TTL specifies the number of seconds until the key (data) expires to avoid keeping stale data in the cache.
  3. When the application reads an expired key, the key is treated as not found, so the application checks the value in the underlying database.
  4. Lazy Loading treats an expired key as a cache miss and causes the application to retrieve the data from the database and subsequently write the data into the cache with a new TTL
  5. Depending on the frequency with which data changes this strategy may not eliminate stale data – but helps to avoid it.
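
Here is a minimal sketch of lazy loading combined with a TTL. TTLCache and get_with_ttl are hypothetical names, and the cache is a small in-process class with an injectable clock so that expiry can be simulated; with Redis the expiry would normally be set on the key itself (for example through the EX option of SET).

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock   # injectable so expiry can be simulated
        self._entries = {}   # key -> (value, expires_at)

    def set(self, key, value):
        self._entries[key] = (value, self.clock() + self.ttl_seconds)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: behave like a cache miss
            return None
        return value


def get_with_ttl(cache, database, key):
    """Lazy loading: on a miss or expiry, reload from the database with a new TTL."""
    value = cache.get(key)
    if value is None:
        value = database.get(key)
        if value is not None:
            cache.set(key, value)
    return value


now = [0.0]  # fake clock, in seconds
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
database = {"price:42": 10}

fresh = get_with_ttl(cache, database, "price:42")     # loaded and cached
database["price:42"] = 12                             # the database changes
stale = get_with_ttl(cache, database, "price:42")     # still the cached value
now[0] = 61.0                                         # the TTL has passed
reloaded = get_with_ttl(cache, database, "price:42")  # reloaded from the database
print(fresh, stale, reloaded)
```

The middle read shows the limitation noted above: until the TTL runs out, the cache can still serve a value the database no longer holds.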

Exam tip: the key use cases for ElastiCache are offloading reads from a Database, and storing the results of computations and session state. Also, remember that ElastiCache is an in-memory database and it’s a managed service (so you can’t run it on EC2).

Amazon ElastiCache – Engines

Feature | Memcached | Redis (cluster mode disabled) | Redis (cluster mode enabled)
Data persistence | No | Yes | Yes
Data types | Simple | Complex | Complex
Data partitioning | Yes | No | Yes
Encryption | No | Yes | Yes
High availability (replication) | No | Yes | Yes
Multi-AZ | Yes, place nodes in multiple AZs. No failover or replication | Yes, with auto-failover. Uses read replicas (0-5 per shard) | Yes, with auto-failover. Uses read replicas (0-5 per shard)
Scaling | Up (node type); out (add nodes) | Single shard (can add replicas) | Add shards
Multithreaded | Yes | No | No
Backup and restore | No (and no snapshots) | Yes, automatic and manual snapshots | Yes, automatic and manual snapshots

Amazon ElastiCache – Memcached

  1. Simplest model and you can run large nodes
  2. Memcached can be scaled in and out
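
Memcached nodes do not coordinate with each other, so it is the client that decides which node holds a given key. The sketch below shows one simple way a client could do that, by hashing the key and taking the result modulo the number of nodes. The node_for_key function and node addresses are hypothetical, and real client libraries often use consistent hashing instead so that fewer keys move when nodes are added or removed.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick the node that owns a key using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical node addresses for a three-node cluster.
nodes = ["cache-node-1:11211", "cache-node-2:11211", "cache-node-3:11211"]
placement = {key: node_for_key(key, nodes) for key in ("user:1", "user:2", "cart:9")}
print(placement)
```

The same key always maps to the same node as long as the node list is unchanged; scaling out or in changes the list and therefore moves some keys, which then show up as cache misses.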

Amazon ElastiCache – Redis

  1. Open-source in-memory key-value store
  2. Supports more complex data structures: sorted sets and lists
  3. Supports master / slave replication and multi-AZ for cross-AZ redundancy

Published by

sevanand yadav

software engineer working as web developer having specialization in spring MVC with mysql,hibernate
