Amazon Elastic Compute Cloud (EC2)
What is Amazon EC2?
Amazon Elastic Compute Cloud (EC2) provides secure, resizable compute capacity in the AWS Cloud through virtual servers called EC2 instances.
Why Use EC2?
Compared to traditional on-premises infrastructure (which requires upfront hardware costs, delivery delays, and complex setup), EC2 allows you to:
- Provision and launch virtual servers within minutes
- Stop or terminate instances when you’re done to save costs
- Pay only for the compute time you use
How Does EC2 Work?
Launch
- Choose a template with pre-configured settings (OS, app server, etc.)
- Select an instance type (hardware specs)
- Set up security rules to control network traffic
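The three launch choices above can be sketched as a small request builder. This is a minimal illustration, not a definitive implementation: `build_launch_request` is a hypothetical helper, and the AMI and security group IDs are placeholders that do not refer to real resources. The key names follow the parameters of the boto3 `run_instances` call.

```python
def build_launch_request(image_id, instance_type, security_group_ids, count=1):
    """Assemble keyword arguments for an EC2 launch call (hypothetical helper)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": image_id,                           # template: OS, app server, etc.
        "InstanceType": instance_type,                 # hardware specs
        "SecurityGroupIds": list(security_group_ids),  # network traffic rules
        "MinCount": count,
        "MaxCount": count,
    }


# Placeholder values for illustration only.
request = build_launch_request(
    image_id="ami-EXAMPLE",
    instance_type="t3.micro",
    security_group_ids=["sg-EXAMPLE"],
)
print(request)
```

Assuming boto3 is installed and AWS credentials are configured, a dictionary like this could be passed on as `boto3.client("ec2").run_instances(**request)`.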
Connect
Connect to the instance via multiple methods:
- Through programs/applications
- Direct user access (e.g., logging into the desktop)
Use
Once connected, you can:
- Run commands
- Install software
- Manage files and storage
Amazon EC2 Instance Types
Amazon EC2 offers different instance types tailored for specific workloads. When selecting an instance type, it’s important to consider your application’s needs in compute, memory, storage, and networking.
General Purpose Instances
These provide a balanced mix of compute, memory, and networking resources. Suitable for application servers, gaming servers, backend servers for enterprise applications, and small to medium databases. Use general purpose instances when your application’s resource needs are roughly even across compute, memory, and networking.
Compute Optimized Instances
Best for compute-intensive workloads that benefit from high-performance processors. Suitable for high-performance web servers, compute-intensive application servers, dedicated gaming servers, and batch processing workloads.
You should use compute optimized instances when processing power is the top priority.
Memory Optimized Instances
Designed for workloads that process large datasets in memory. Suitable for:
- High-performance databases
- Real-time processing of large unstructured data
- Workloads requiring fast memory access
Memory optimized instances provide high memory capacity and performance.
Accelerated Computing Instances
Use specialized hardware accelerators to enhance performance for specific tasks. Suitable for:
- Floating-point calculations
- Graphics processing
- Data pattern matching
- Game and application streaming
These instances offload compute tasks from the CPU to dedicated hardware for efficiency.
Storage Optimized Instances
Designed for workloads that need high, sequential read and write access to large datasets on local storage. Suitable for:
- Distributed file systems
- Data warehousing
- High-frequency online transaction processing (OLTP)
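One way to picture the selection guidance above is a small lookup that picks a family from the dominant resource need and falls back to General Purpose when needs are roughly even. This is only a sketch: the 1-10 scoring scale, the `margin` threshold, and the `suggest_family` function are assumptions made for illustration, not an AWS tool.

```python
# Dominant resource need -> instance family, following the notes above.
FAMILY_FOR_DOMINANT_NEED = {
    "compute": "Compute Optimized",
    "memory": "Memory Optimized",
    "storage": "Storage Optimized",
    "accelerator": "Accelerated Computing",
}


def suggest_family(needs, margin=2):
    """Suggest a family from relative demand scores (hypothetical 1-10 scale)."""
    if not needs:
        return "General Purpose"
    ranked = sorted(needs.items(), key=lambda item: item[1], reverse=True)
    top_name, top_score = ranked[0]
    runner_up_score = ranked[1][1] if len(ranked) > 1 else 0
    # Only specialize when one need clearly stands out from the rest.
    if top_score - runner_up_score >= margin and top_name in FAMILY_FOR_DOMINANT_NEED:
        return FAMILY_FOR_DOMINANT_NEED[top_name]
    return "General Purpose"


print(suggest_family({"compute": 9, "memory": 4, "storage": 3}))
print(suggest_family({"compute": 5, "memory": 5, "storage": 4}))
```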
Pricing
With Amazon EC2, you pay only for the compute time you use. AWS offers multiple pricing models to suit various use cases and budget strategies.
On-Demand Instances
- Ideal for short-term, unpredictable workloads
- No upfront costs or long-term commitments
- Instances run until manually stopped
- Use cases: development, testing, or apps with irregular usage
- Not cost-effective for year-long or ongoing workloads
Amazon EC2 Savings Plans
- Commit to a consistent amount of compute usage, measured in dollars per hour, for a 1-year or 3-year term
- Save up to 72% over On-Demand pricing
- Usage within commitment billed at a discounted rate
- Excess usage billed at On-Demand rates
- AWS Cost Explorer can help analyze usage and suggest savings
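The billing split described above (discounted rate within the commitment, On-Demand rate beyond it) can be worked through with a little arithmetic. The sketch below assumes a single flat discount and treats the commitment as charged every hour whether or not it is fully used; `hourly_bill` and the numbers are illustrative, not actual AWS rates.

```python
def hourly_bill(on_demand_cost, commitment, discount):
    """Charge for one hour under a simplified Savings Plan model.

    on_demand_cost: what the hour's usage would cost at On-Demand rates
    commitment:     committed spend per hour, at discounted rates
    discount:       fraction off On-Demand pricing (assumed flat for this sketch)
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    # On-Demand-equivalent usage that the commitment can absorb.
    covered_capacity = commitment / (1 - discount)
    # Anything beyond that is billed at the normal On-Demand rate.
    excess = max(0.0, on_demand_cost - covered_capacity)
    return commitment + excess


# A $5/hour commitment at a 50% discount covers $10/hour of On-Demand usage.
print(hourly_bill(12.0, 5.0, 0.5))  # 7.0: $5 commitment + $2 excess
print(hourly_bill(8.0, 5.0, 0.5))   # 5.0: usage fits within the commitment
```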
Reserved Instances
- Apply billing discounts to On-Demand Instances
- Options:
  - Standard and Convertible (1 or 3 years)
  - Scheduled (1 year)
- Greater discounts for longer terms
- After the term ends, instances keep running but are billed at On-Demand rates unless the reservation is renewed
Spot Instances
- Use unused EC2 capacity at up to 90% discount
- Ideal for flexible or interruptible workloads
- Use cases: background processing, data analytics
- Not suitable for critical tasks requiring consistent availability
- Instances may be interrupted when demand increases or capacity drops
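Because Spot Instances can be interrupted, workloads that run on them are usually written so they can stop and resume. The following is a minimal sketch of that idea under stated assumptions: the `checkpoint` dictionary stands in for durable storage, `should_stop` stands in for whatever interruption signal the job watches, and doubling each item is placeholder work.

```python
def process_items(items, checkpoint, should_stop):
    """Process items from the saved position; return True once all are done."""
    while checkpoint["next_index"] < len(items):
        if should_stop():
            return False  # interrupted; progress so far is kept in the checkpoint
        index = checkpoint["next_index"]
        checkpoint["results"].append(items[index] * 2)  # placeholder work
        checkpoint["next_index"] = index + 1
    return True


checkpoint = {"next_index": 0, "results": []}
items = [1, 2, 3, 4, 5]

# First run: simulate an interruption after three items.
signals = iter([False, False, False, True])
finished_first = process_items(items, checkpoint, lambda: next(signals))

# Second run (for example, on a replacement instance): resume from the checkpoint.
finished_second = process_items(items, checkpoint, lambda: False)

print(finished_first, finished_second, checkpoint["results"])
```

The second run picks up at the fourth item instead of starting over, which is what makes this kind of background processing a reasonable fit for interruptible capacity.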
Dedicated Hosts
- Physical servers fully dedicated to your use
- Helps with license compliance (per-socket, per-core, per-VM)
- Available as On-Demand or Reserved
- Most expensive EC2 pricing option
- Suitable for workloads with strict compliance or licensing needs
Scaling
Scalability Overview
Scalability means starting with only the resources you need and adjusting automatically based on demand. This ensures:
- You only pay for what you use
- You have enough capacity to meet customer needs without over-provisioning
Amazon EC2 Auto Scaling
Amazon EC2 Auto Scaling allows you to automatically add or remove EC2 instances in response to changes in demand. This helps maintain application availability and reduces manual intervention.
Use cases include handling traffic spikes, preventing slow response times, and optimizing cost.
Two Types of Scaling
- Dynamic scaling: reacts in real time to changes in demand
- Predictive scaling: forecasts demand and schedules scaling actions in advance
Auto Scaling Group Configuration
When creating an Auto Scaling group, you can define:
- Minimum capacity: the number of instances that should always be running.
- Desired capacity: the target number of instances to maintain. If unspecified, it defaults to the minimum.
- Maximum capacity: the upper limit of instances to allow scaling.
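The relationship between the three settings can be sketched as a small model. This is an illustration of the rules stated above (desired defaults to the minimum, and scaling never leaves the minimum-to-maximum range), not the Auto Scaling API; the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AutoScalingGroupConfig:
    """Hypothetical model of an Auto Scaling group's capacity settings."""
    min_capacity: int
    max_capacity: int
    desired_capacity: Optional[int] = None

    def __post_init__(self):
        if self.min_capacity < 0 or self.min_capacity > self.max_capacity:
            raise ValueError("require 0 <= min_capacity <= max_capacity")
        if self.desired_capacity is None:
            # If unspecified, desired capacity defaults to the minimum.
            self.desired_capacity = self.min_capacity
        if not self.min_capacity <= self.desired_capacity <= self.max_capacity:
            raise ValueError("desired_capacity must lie between min and max")

    def scale_to(self, requested):
        """Clamp a requested instance count into the allowed range."""
        self.desired_capacity = max(
            self.min_capacity, min(requested, self.max_capacity)
        )
        return self.desired_capacity


group = AutoScalingGroupConfig(min_capacity=1, max_capacity=4)
print(group.desired_capacity)  # 1 (defaults to the minimum)
print(group.scale_to(10))      # 4 (capped at the maximum)
print(group.scale_to(0))       # 1 (never below the minimum)
```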
Elastic Load Balancing
Elastic Load Balancing is an AWS service that automatically distributes incoming application traffic across multiple resources, such as Amazon EC2 instances.
Functions
- Acts as a single point of contact for all incoming web traffic
- Automatically routes requests to available EC2 instances
- Distributes workload evenly, preventing any single instance from being overwhelmed
Integration with Auto Scaling
Elastic Load Balancing works closely with Amazon EC2 Auto Scaling. As the number of EC2 instances increases or decreases based on traffic demand, the load balancer ensures that incoming requests are evenly distributed among all active instances.
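To make the interaction concrete, here is a small sketch of a balancer that keeps spreading requests evenly as instances are added. It assumes simple round-robin routing, which is only one possible strategy and is not claimed to be how Elastic Load Balancing is implemented; the class and the instance names are hypothetical.

```python
from collections import Counter


class RoundRobinBalancer:
    """Hypothetical balancer that cycles through registered instances."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._next = 0

    def register(self, instance):
        """Add an instance, e.g. after a scale-out event."""
        self.instances.append(instance)

    def deregister(self, instance):
        """Remove an instance, e.g. after a scale-in event."""
        self.instances.remove(instance)

    def route(self):
        """Return the instance that should handle the next request."""
        if not self.instances:
            raise RuntimeError("no instances available")
        instance = self.instances[self._next % len(self.instances)]
        self._next += 1
        return instance


balancer = RoundRobinBalancer(["instance-a", "instance-b"])
before = Counter(balancer.route() for _ in range(4))

# Demand rises and a third instance is added; new requests include it.
balancer.register("instance-c")
after = Counter(balancer.route() for _ in range(6))

print(before)
print(after)
```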
Real-World Analogy
- Low demand: A few customers are at a coffee shop. Only a few registers (EC2 instances) are open, which is efficient for the demand.
- High demand: More customers arrive, so more registers (instances) are opened by Auto Scaling. An employee (the load balancer) directs customers to the next available register, distributing the load evenly.