Amazon Elastic Compute Cloud (EC2)

What is Amazon EC2?

Amazon Elastic Compute Cloud (EC2) provides secure, resizable compute capacity in the AWS Cloud through virtual servers called EC2 instances.

Why Use EC2?

Compared to traditional on-premises infrastructure (which requires upfront hardware costs, delivery delays, and complex setup), EC2 allows you to:

Provision and launch virtual servers within minutes
Stop or terminate instances when you’re done to stop paying for compute you no longer need
Pay only for the compute time you use

How Does EC2 Work?

Launch

Choose a template with pre-configured settings (OS, app server, etc.), known as an Amazon Machine Image (AMI)
Select an instance type (hardware specs)
Set up security rules to control network traffic
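
The three launch choices above map roughly onto the parameters of a launch request. The following is a minimal sketch, assuming the launch is made through the boto3 SDK's run_instances call. The helper build_launch_request is hypothetical, and the AMI ID, instance type, and security group ID are placeholder values, not real resources. The code only builds the request and does not contact AWS.

```python
from typing import Dict, List


def build_launch_request(image_id: str, instance_type: str,
                         security_group_ids: List[str]) -> Dict[str, object]:
    """Collect the three launch choices into keyword arguments for a launch call."""
    return {
        "ImageId": image_id,                           # template with OS and software
        "InstanceType": instance_type,                 # hardware specs
        "SecurityGroupIds": list(security_group_ids),  # rules controlling network traffic
        "MinCount": 1,                                 # launch exactly one instance
        "MaxCount": 1,
    }


# Placeholder identifiers, for illustration only.
request = build_launch_request(
    "ami-0123456789abcdef0", "t3.micro", ["sg-0123456789abcdef0"]
)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("ec2").run_instances(**request)
```

Keeping the request as plain data makes it easy to review the choices before anything is launched.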

Connect

Connect to the instance via multiple methods:

Through programs/applications
Direct user access (e.g., logging into the desktop)

Use

Once connected, you can:

Run commands
Install software
Manage files and storage

Amazon EC2 Instance Types

Amazon EC2 offers different instance types tailored for specific workloads. When selecting an instance type, it’s important to consider your application’s needs in compute, memory, storage, and networking.

General Purpose Instances

These provide a balanced mix of compute, memory, and networking resources. Suitable for application servers, gaming servers, backend servers for enterprise applications, and small to medium databases. Use general purpose instances when your application needs roughly equal amounts of each resource.

Compute Optimized Instances

Best for compute-intensive workloads that benefit from high-performance CPUs. Suitable for high-performance web servers, compute-intensive application servers, dedicated gaming servers, and batch processing workloads.

You should use compute optimized instances when processing power is the top priority.

Memory Optimized Instances

Designed for workloads that process large datasets in memory. Suitable for:

High-performance databases
Real-time processing of large unstructured data
Workloads requiring fast memory access

Memory optimized instances provide high memory capacity and performance.

Accelerated Computing Instances

Use specialized hardware accelerators to enhance performance for specific tasks. Suitable for:

Floating-point calculations
Graphics processing
Data pattern matching
Game and application streaming

These instances offload compute tasks from the CPU to dedicated hardware for efficiency.

Storage Optimized Instances

Designed for workloads that need high, sequential read and write access to large datasets. Suitable for:

Distributed file systems
Data warehousing
High-frequency online transaction processing (OLTP)
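
The selection guidance above can be summarized as a lookup from a workload's dominant need to an instance category. This is only an illustrative sketch: the need labels ("balanced", "cpu", and so on) and the function suggest_family are made up for this example and are not AWS terminology or an AWS API.

```python
from enum import Enum


class Family(Enum):
    GENERAL_PURPOSE = "general purpose"
    COMPUTE_OPTIMIZED = "compute optimized"
    MEMORY_OPTIMIZED = "memory optimized"
    ACCELERATED_COMPUTING = "accelerated computing"
    STORAGE_OPTIMIZED = "storage optimized"


# Hypothetical labels for the dominant resource need of a workload.
_FAMILY_BY_NEED = {
    "balanced": Family.GENERAL_PURPOSE,           # evenly distributed resource needs
    "cpu": Family.COMPUTE_OPTIMIZED,              # processing power is the priority
    "memory": Family.MEMORY_OPTIMIZED,            # large datasets held in memory
    "accelerator": Family.ACCELERATED_COMPUTING,  # work offloaded to dedicated hardware
    "local_storage": Family.STORAGE_OPTIMIZED,    # heavy sequential reads and writes
}


def suggest_family(dominant_need: str) -> Family:
    """Return the instance category that matches the workload's dominant need."""
    try:
        return _FAMILY_BY_NEED[dominant_need]
    except KeyError:
        raise ValueError(f"unknown need: {dominant_need!r}") from None


database_choice = suggest_family("memory")
```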

Pricing

With Amazon EC2, you pay only for the compute time you use. AWS offers multiple pricing models to suit various use cases and budget strategies.

On-Demand Instances

Ideal for short-term, unpredictable workloads
No upfront costs or long-term commitments
Instances run until manually stopped
Use cases: development, testing, or apps with irregular usage
Not cost-effective for year-long or ongoing workloads

Amazon EC2 Savings Plans

Commit to a consistent amount of compute usage, measured in dollars per hour, for a 1-year or 3-year term
Save up to 72% over On-Demand pricing
Usage within commitment billed at a discounted rate
Excess usage billed at On-Demand rates
AWS Cost Explorer can help analyze usage and suggest savings
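
The billing rule above (discounted rate within the commitment, On-Demand rate beyond it) can be worked through with a small calculation. This is a simplified model, assuming a single flat discount and a commitment that is charged every hour whether or not it is fully used. The function name hourly_charge and the 50% discount in the usage lines are illustrative and are not real AWS rates.

```python
def hourly_charge(on_demand_equivalent: float, commitment: float,
                  discount: float) -> float:
    """Return one hour's charge under a simplified Savings Plan model.

    on_demand_equivalent: what the hour's usage would cost at On-Demand rates.
    commitment: committed spend per hour, measured at the discounted rate.
    discount: fraction saved versus On-Demand, with 0 <= discount < 1.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    # Amount of usage the commitment covers, expressed at On-Demand prices.
    covered = commitment / (1 - discount)
    # Usage beyond the covered amount is billed at On-Demand rates.
    excess = max(0.0, on_demand_equivalent - covered)
    # The commitment itself is charged in full in this model.
    return commitment + excess


# Illustrative numbers: a $10/hour commitment at an assumed 50% discount
# covers $20/hour of usage measured at On-Demand prices.
busy_hour = hourly_charge(25.0, 10.0, 0.5)   # $5 of usage exceeds the coverage
quiet_hour = hourly_charge(12.0, 10.0, 0.5)  # usage fits inside the commitment
```

In the quiet hour the full commitment is still charged, which is why the commitment should reflect a baseline you expect to use consistently.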

Reserved Instances

Provide a billing discount that applies to matching On-Demand Instance usage in your account
Options:
- Standard and Convertible (1 or 3 years)
- Scheduled (1 year; AWS no longer offers these for new purchases)
Greater discounts for longer terms
After the term ends, instances keep running but are billed at On-Demand rates until you terminate them or purchase a new Reserved Instance

Spot Instances

Use unused EC2 capacity at up to 90% discount
Ideal for flexible or interruptible workloads
Use cases: background processing, data analytics
Not suitable for critical tasks requiring consistent availability
Instances may be interrupted when demand increases or capacity drops
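
A workload suits Spot Instances when it can stop at any moment and resume later without losing completed work. The sketch below simulates that pattern with a checkpoint kept in a plain dictionary. The interruption is simulated by an argument, not by any real AWS signal, and all names here are hypothetical.

```python
def process_items(items, checkpoint, interrupted_at=None):
    """Process items from the checkpoint onward.

    Returns True when every item is done, or False if the simulated
    interruption hits first. Progress is recorded after each item.
    """
    for index in range(checkpoint["next"], len(items)):
        if interrupted_at is not None and index == interrupted_at:
            return False  # capacity reclaimed; earlier progress is already saved
        checkpoint["results"].append(items[index] * 2)  # stand-in for real work
        checkpoint["next"] = index + 1
    return True


items = [1, 2, 3, 4, 5]
checkpoint = {"next": 0, "results": []}

# First run is interrupted before the item at index 3 is processed.
finished_first_run = process_items(items, checkpoint, interrupted_at=3)

# A later run, for example on a replacement instance, resumes from the checkpoint.
finished_second_run = process_items(items, checkpoint)
```

In practice the checkpoint would live in durable storage outside the instance so that a replacement instance can read it.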

Dedicated Hosts

Physical servers fully dedicated to your use
Helps with license compliance (per-socket, per-core, per-VM)
Available as On-Demand or Reserved
Most expensive EC2 pricing option
Suitable for workloads with strict compliance or licensing needs

Scaling

Scalability Overview

Scalability means starting with only the resources you need and adjusting automatically based on demand. This ensures:

You only pay for what you use
You have enough capacity to meet customer needs without over-provisioning

Amazon EC2 Auto Scaling

Amazon EC2 Auto Scaling allows you to automatically add or remove EC2 instances in response to changes in demand. This helps maintain application availability and reduces manual intervention.

Use cases include handling traffic spikes, preventing slow response times, and optimizing cost.

Two Types of Scaling

Dynamic scaling: reacts in real time to changes in demand
Predictive scaling: forecasts demand and schedules scaling actions in advance
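
The difference between the two approaches can be sketched with two small functions. Both are simplified models written for illustration and do not reproduce how the AWS service computes capacity. The metric (for example, average CPU utilization), the per-instance capacity, and the load history are assumed inputs.

```python
import math


def dynamic_capacity(current_instances: int, observed_metric: float,
                     target_metric: float) -> int:
    """React to the current reading: resize so the metric returns to its target."""
    if current_instances < 1 or target_metric <= 0:
        raise ValueError("need at least one instance and a positive target")
    return max(1, math.ceil(current_instances * observed_metric / target_metric))


def predictive_capacity(past_loads, capacity_per_instance: float) -> int:
    """Plan ahead: forecast load as the mean of past loads for the same time slot."""
    if not past_loads or capacity_per_instance <= 0:
        raise ValueError("need history and a positive per-instance capacity")
    forecast = sum(past_loads) / len(past_loads)
    return max(1, math.ceil(forecast / capacity_per_instance))


# Dynamic: 4 instances at 90% utilization with a 60% target.
scale_out = dynamic_capacity(4, 90.0, 60.0)

# Predictive: requests per minute seen at this hour on three earlier days,
# with each instance assumed to handle 50 requests per minute.
planned = predictive_capacity([100, 140, 120], 50)
```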

Auto Scaling Group Configuration

When creating an Auto Scaling group, you can define:

Minimum capacity: the number of instances that should always be running.
Desired capacity: the target number of instances to maintain. If unspecified, it defaults to the minimum.
Maximum capacity: the upper limit of instances to allow scaling.
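
The relationship between the three settings can be expressed as a small data structure. This is a sketch of the rules stated above (desired defaults to minimum, and scaling stays within the bounds), not the AWS API. The class name GroupConfig and its method are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroupConfig:
    minimum: int
    maximum: int
    desired: Optional[int] = None  # falls back to minimum when not given

    def __post_init__(self) -> None:
        if self.minimum < 0 or self.maximum < self.minimum:
            raise ValueError("require 0 <= minimum <= maximum")
        if self.desired is None:
            self.desired = self.minimum
        elif not self.minimum <= self.desired <= self.maximum:
            raise ValueError("desired must lie between minimum and maximum")

    def clamp(self, requested: int) -> int:
        """Keep a requested instance count inside the configured bounds."""
        return max(self.minimum, min(self.maximum, requested))


group = GroupConfig(minimum=2, maximum=10)
capped = group.clamp(15)   # a spike cannot push the group past its maximum
floored = group.clamp(0)   # low demand cannot drop it below its minimum
```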

Elastic Load Balancing

Elastic Load Balancing is an AWS service that automatically distributes incoming application traffic across multiple resources, such as Amazon EC2 instances.

Functions

Acts as a single point of contact for all incoming web traffic
Automatically routes requests to available EC2 instances
Distributes workload evenly, preventing any single instance from being overwhelmed

Integration with Auto Scaling

Elastic Load Balancing works closely with Amazon EC2 Auto Scaling. As the number of EC2 instances increases or decreases based on traffic demand, the load balancer ensures that incoming requests are evenly distributed among all active instances.
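
The interplay described above can be modeled with a toy load balancer whose set of targets changes as instances are added or removed. Round-robin is used here as one simple distribution policy. The class, its methods, and the instance IDs are hypothetical and are not the Elastic Load Balancing API.

```python
class RoundRobinBalancer:
    """Single point of contact that hands each request to the next target in turn."""

    def __init__(self) -> None:
        self._targets = []
        self._next = 0

    def register(self, instance_id: str) -> None:
        """Called when scaling out adds an instance."""
        if instance_id not in self._targets:
            self._targets.append(instance_id)

    def deregister(self, instance_id: str) -> None:
        """Called when scaling in removes an instance."""
        if instance_id in self._targets:
            self._targets.remove(instance_id)

    def route(self, request) -> str:
        """Return the instance that should handle this request."""
        if not self._targets:
            raise RuntimeError("no instances available")
        target = self._targets[self._next % len(self._targets)]
        self._next += 1
        return target


balancer = RoundRobinBalancer()
balancer.register("i-a")
balancer.register("i-b")
before_scale_out = [balancer.route(n) for n in range(4)]

balancer.register("i-c")  # demand rose, so a third instance was added
after_scale_out = [balancer.route(n) for n in range(3)]
```

Clients only ever talk to the balancer, so instances can come and go without callers needing to know their addresses.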

Real-World Analogy

Low demand: A few customers are at a coffee shop. Only a few registers (EC2 instances) are open, which is efficient for the demand.
High demand: More customers arrive, so more registers (instances) are opened by Auto Scaling. An employee (the load balancer) directs customers to the next available register, distributing the load evenly.