Amazon Elastic Compute Cloud (EC2)

What is Amazon EC2?

Amazon Elastic Compute Cloud (EC2) provides secure, resizable compute capacity in the AWS Cloud through virtual servers called EC2 instances.

Why Use EC2?

Compared to traditional on-premises infrastructure (which requires upfront hardware costs, delivery delays, and complex setup), EC2 allows you to:

  • Provision and launch virtual servers within minutes
  • Stop or terminate instances when you no longer need them, so you stop paying for compute
  • Pay only for the compute time you use

How Does EC2 Work?

Launch

  • Choose a template with pre-configured settings (OS, app server, etc.)
  • Select an instance type (hardware specs)
  • Set up security rules to control network traffic
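
The three launch choices above can be collected into a single launch request. The following is a minimal sketch in Python, assuming the parameter names used by the boto3 `run_instances` call (`ImageId`, `InstanceType`, `SecurityGroupIds`, `MinCount`, `MaxCount`). The helper `build_launch_request` and the IDs are hypothetical placeholders, and nothing here contacts AWS.

```python
def build_launch_request(image_id, instance_type, security_group_ids, count=1):
    """Collect the launch choices into one request dictionary.

    Hypothetical helper: it only builds the parameters and does not call AWS.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": image_id,                           # template: OS, app server, etc.
        "InstanceType": instance_type,                 # hardware specs
        "SecurityGroupIds": list(security_group_ids),  # network traffic rules
        "MinCount": count,
        "MaxCount": count,
    }


# Placeholder values for illustration only.
request = build_launch_request("ami-EXAMPLE", "t3.micro", ["sg-EXAMPLE"])
print(request)
```

With boto3 installed and credentials configured, a dictionary like this could be passed as keyword arguments to an EC2 client's `run_instances` method, using real IDs in place of the placeholders.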

Connect

Connect to the instance via multiple methods:

  • Through programs/applications
  • Direct user access (e.g., logging into the desktop)

Use

Once connected, you can:

  • Run commands
  • Install software
  • Manage files and storage

Amazon EC2 Instance Types

Amazon EC2 offers different instance types tailored for specific workloads. When selecting an instance type, it’s important to consider your application’s needs in compute, memory, storage, and networking.

General Purpose Instances

These provide a balanced mix of compute, memory, and networking. Suitable for application servers, gaming servers, backend servers for enterprise applications, and small to medium databases. You should use general purpose instances when your application has evenly distributed resource needs.

Compute Optimized Instances

Best for compute-intensive workloads that benefit from high-performance CPUs. Suitable for high-performance web servers, compute-intensive application servers, dedicated gaming servers, and batch processing workloads.

You should use compute optimized instances when processing power is the top priority.

Memory Optimized Instances

Designed for workloads that process large datasets in memory. Suitable for:

  • High-performance databases
  • Real-time processing of large unstructured data
  • Workloads requiring fast memory access

Memory optimized instances provide high memory capacity and performance.

Accelerated Computing Instances

Use specialized hardware accelerators to enhance performance for specific tasks. Suitable for:

  • Floating-point calculations
  • Graphics processing
  • Data pattern matching
  • Game and application streaming

These instances offload compute tasks from the CPU to dedicated hardware for efficiency.

Storage Optimized Instances

Designed for workloads that need high, sequential read and write access to large datasets. Suitable for:

  • Distributed file systems
  • Data warehousing
  • High-frequency online transaction processing (OLTP)
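
The selection guidance across the five families can be sketched as a simple lookup. This is an illustrative Python sketch only: the function `suggest_family`, the resource names, and the weight scale are hypothetical, and a real choice would also weigh cost, networking, and benchmark results.

```python
# Illustrative mapping from the dominant resource need to an instance family.
FAMILY_BY_PRIORITY = {
    "balanced": "General Purpose",
    "compute": "Compute Optimized",
    "memory": "Memory Optimized",
    "accelerator": "Accelerated Computing",
    "storage": "Storage Optimized",
}


def suggest_family(needs):
    """Return the family matching the highest-weighted need.

    `needs` maps a resource name to a relative weight (hypothetical scale).
    If the top weights tie, the needs are treated as evenly distributed
    and General Purpose is suggested.
    """
    top = max(needs.values())
    leaders = [name for name, weight in needs.items() if weight == top]
    if len(leaders) > 1:
        return FAMILY_BY_PRIORITY["balanced"]
    return FAMILY_BY_PRIORITY[leaders[0]]


in_memory_db = suggest_family({"compute": 3, "memory": 9, "storage": 2})
web_app = suggest_family({"compute": 5, "memory": 5, "storage": 5})
print(in_memory_db, "|", web_app)
```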

Pricing

With Amazon EC2, you pay only for the compute time you use. AWS offers multiple pricing models to suit various use cases and budget strategies.

On-Demand Instances

  • Ideal for short-term, unpredictable workloads
  • No upfront costs or long-term commitments
  • Instances run until manually stopped
  • Use cases: development, testing, or apps with irregular usage
  • Not cost-effective for year-long or ongoing workloads

Amazon EC2 Savings Plans

  • Commit to a consistent amount of compute usage for a 1-year or 3-year term
  • Save up to 72% over On-Demand pricing
  • Usage within commitment billed at a discounted rate
  • Excess usage billed at On-Demand rates
  • AWS Cost Explorer can help analyze usage and suggest savings
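
The split between discounted and On-Demand billing can be worked through with a small calculation. The sketch below uses a deliberately simplified model, not AWS's exact billing logic: it assumes identical instances, prices in whole cents, and a commitment expressed as spend per hour at the discounted rate. The function name and all prices are hypothetical.

```python
def hourly_bill_cents(instances, on_demand_cents, discounted_cents, commitment_cents):
    """Bill for one hour under a simplified Savings Plan model (values in cents)."""
    # Number of instance-hours the commitment pays for at the discounted rate.
    covered = min(instances, commitment_cents / discounted_cents)
    excess = instances - covered
    # The commitment is charged in full even when usage falls below it.
    return commitment_cents + excess * on_demand_cents


# Hypothetical prices: 10 cents On-Demand, 7 cents discounted, 70 cents committed.
busy_hour = hourly_bill_cents(12, 10, 7, 70)   # 10 instances covered, 2 at On-Demand
quiet_hour = hourly_bill_cents(5, 10, 7, 70)   # usage below the commitment
print(busy_hour, quiet_hour)
```

In the busy hour the two uncovered instances are billed at the On-Demand rate on top of the commitment. In the quiet hour the full commitment is still charged, which is why a commitment sized well above steady usage can cost more than On-Demand.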

Reserved Instances

  • Apply billing discounts to On-Demand Instances
  • Options:
    • Standard and Convertible (1 or 3 years)
    • Scheduled (1 year; AWS no longer offers these for new purchases)
  • Greater discounts for longer terms
  • After the term ends, instances keep running but are billed at On-Demand rates unless you renew

Spot Instances

  • Use unused EC2 capacity at up to 90% discount
  • Ideal for flexible or interruptible workloads
  • Use cases: background processing, data analytics
  • Not suitable for critical tasks requiring consistent availability
  • Instances may be interrupted when demand increases or capacity drops
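
A workload is a good fit for Spot when it can resume after an interruption instead of starting over. The following Python sketch illustrates that idea with a checkpoint. The function, the `interrupted_at` parameter, and the in-memory checkpoint are hypothetical stand-ins: a real job would react to an actual interruption notice and would keep its checkpoint in durable storage outside the instance.

```python
def process_with_checkpoints(items, checkpoint, interrupted_at=None):
    """Process items starting from the saved position.

    `checkpoint` holds the index of the next unprocessed item and the results
    so far. `interrupted_at` simulates an interruption just before that index.
    Returns True when every item is done, False when interrupted.
    """
    while checkpoint["next"] < len(items):
        i = checkpoint["next"]
        if interrupted_at is not None and i == interrupted_at:
            return False  # instance reclaimed; progress stays in the checkpoint
        checkpoint["results"].append(items[i] * 2)  # stand-in for real work
        checkpoint["next"] = i + 1
    return True


items = [1, 2, 3, 4, 5]
checkpoint = {"next": 0, "results": []}
first_run = process_with_checkpoints(items, checkpoint, interrupted_at=3)
second_run = process_with_checkpoints(items, checkpoint)  # replacement resumes
print(first_run, second_run, checkpoint["results"])
```

The second call picks up at the fourth item, so no completed work is repeated after the simulated interruption.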

Dedicated Hosts

  • Physical servers fully dedicated to your use
  • Helps with license compliance (per-socket, per-core, per-VM)
  • Available as On-Demand or Reserved
  • Most expensive EC2 pricing option
  • Suitable for workloads with strict compliance or licensing needs

Scaling

Scalability Overview

Scalability means starting with only the resources you need and adjusting automatically based on demand. This ensures:

  • You only pay for what you use
  • You have enough capacity to meet customer needs without over-provisioning

Amazon EC2 Auto Scaling

Amazon EC2 Auto Scaling allows you to automatically add or remove EC2 instances in response to changes in demand. This helps maintain application availability and reduces manual intervention.

Use cases include handling traffic spikes, preventing slow response times, and optimizing cost.

Two Types of Scaling

  • Dynamic scaling: reacts in real time to changes in demand
  • Predictive scaling: forecasts demand and schedules scaling actions in advance
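
The difference between the two approaches can be sketched in Python. Both functions are illustrative assumptions rather than the algorithms AWS uses: the dynamic rule resizes the fleet so that per-instance load returns to a target, and the predictive rule uses a plain average of past demand as its forecast.

```python
import math


def dynamic_desired(current_instances, observed_load, target_load):
    """Reactive: resize so that per-instance load returns to the target."""
    return max(1, math.ceil(current_instances * observed_load / target_load))


def predictive_schedule(history_by_hour, capacity_per_instance):
    """Proactive: plan instance counts per hour from past demand.

    The forecast is a plain average of past samples (deliberately naive).
    """
    schedule = {}
    for hour, samples in history_by_hour.items():
        forecast = sum(samples) / len(samples)
        schedule[hour] = max(1, math.ceil(forecast / capacity_per_instance))
    return schedule


# 4 instances running at 90% CPU against a 60% target.
reactive = dynamic_desired(4, 90, 60)
# Hypothetical request counts seen at 09:00 and 03:00 on three past days.
planned = predictive_schedule({9: [800, 1000, 900], 3: [50, 70, 60]}, 100)
print(reactive, planned)
```

The dynamic rule only acts once load has already changed, while the predictive schedule can add capacity before the 09:00 peak arrives.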

Auto Scaling Group Configuration

When creating an Auto Scaling group, you can define:

  • Minimum capacity: the number of instances that should always be running.
  • Desired capacity: the target number of instances to maintain. If unspecified, it defaults to the minimum.
  • Maximum capacity: the upper limit of instances to allow scaling.
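
How the three settings interact can be sketched as a small Python class. The class `AutoScalingGroupConfig` and its `clamp` method are hypothetical names for illustration and are not part of any AWS SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AutoScalingGroupConfig:
    minimum: int
    maximum: int
    desired: Optional[int] = None

    def __post_init__(self):
        if not 0 <= self.minimum <= self.maximum:
            raise ValueError("require 0 <= minimum <= maximum")
        if self.desired is None:
            self.desired = self.minimum  # unspecified desired defaults to minimum
        if not self.minimum <= self.desired <= self.maximum:
            raise ValueError("desired must lie between minimum and maximum")

    def clamp(self, requested: int) -> int:
        """Keep a requested instance count within the group's bounds."""
        return max(self.minimum, min(self.maximum, requested))


group = AutoScalingGroupConfig(minimum=2, maximum=10)
print(group.desired, group.clamp(25), group.clamp(0))
```

However large a traffic spike is, the group in this sketch never grows past the maximum, and it never shrinks below the minimum when demand disappears.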

Elastic Load Balancing

Elastic Load Balancing is an AWS service that automatically distributes incoming application traffic across multiple resources, such as Amazon EC2 instances.

Functions

  • Acts as a single point of contact for all incoming web traffic
  • Automatically routes requests to available EC2 instances
  • Distributes workload evenly, preventing any single instance from being overwhelmed
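
Even distribution can be illustrated with round-robin assignment, sketched below in Python. Round robin is one possible routing strategy chosen here for simplicity, not necessarily the one a given load balancer uses. The function name and instance IDs are hypothetical.

```python
from collections import Counter
from itertools import cycle


def distribute(request_ids, instances):
    """Assign each request to the next instance in turn (round robin)."""
    if not instances:
        raise ValueError("no available instances")
    turn = cycle(instances)
    return {request_id: next(turn) for request_id in request_ids}


assignments = distribute(range(6), ["i-a", "i-b", "i-c"])
load = Counter(assignments.values())
print(load)
```

Each of the three instances receives two of the six requests, so no single instance carries the whole load.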

Integration with Auto Scaling

Elastic Load Balancing works closely with Amazon EC2 Auto Scaling. As the number of EC2 instances increases or decreases based on traffic demand, the load balancer ensures that incoming requests are evenly distributed among all active instances.
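
The interaction can be sketched as a load balancer whose set of targets changes while it keeps routing. This is a hypothetical Python model, not the Elastic Load Balancing API: the class, its method names, and the instance IDs are assumptions made for illustration.

```python
from collections import Counter


class LoadBalancer:
    """Routes requests in turn across whichever instances are registered."""

    def __init__(self):
        self._instances = []
        self._next = 0

    def register(self, instance_id):
        # Called when scaling out adds an instance.
        if instance_id not in self._instances:
            self._instances.append(instance_id)

    def deregister(self, instance_id):
        # Called when scaling in removes an instance.
        if instance_id in self._instances:
            self._instances.remove(instance_id)

    def route(self):
        if not self._instances:
            raise RuntimeError("no registered instances")
        instance = self._instances[self._next % len(self._instances)]
        self._next += 1
        return instance


lb = LoadBalancer()
lb.register("i-1")
before = [lb.route() for _ in range(2)]   # low demand: one instance serves all
lb.register("i-2")                        # scale out
lb.register("i-3")
after = Counter(lb.route() for _ in range(6))
lb.deregister("i-3")                      # scale in
print(before, after, lb.route())
```

Callers keep using the same `route` entry point throughout, while the set of instances behind it grows and shrinks.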

Real-World Analogy

  • Low demand: A few customers are at a coffee shop. Only a few registers (EC2 instances) are open, which is efficient for the demand.
  • High demand: More customers arrive, so Auto Scaling opens more registers (instances). An employee (the load balancer) directs customers to the next available register, distributing the load evenly.