Compute Services

Compute services provide dynamically scalable compute capacity in the cloud.
Compute resources can be provisioned on-demand in the form of virtual machines.
Virtual machines can be created from standard images provided by the cloud service provider (e.g., an Ubuntu image or a Windows Server image) or from custom images created by users.
A machine image is a template that contains a software configuration (operating system, application server, and applications).

Amazon EC2

Amazon EC2 is an Infrastructure-as-a-Service (IaaS) offering.
Instance type families include: General Purpose, Compute Optimized, Memory Optimized, Accelerated Computing, Storage Optimized

Amazon Machine Image

Amazon Machine Image (AMI) is an instance template that contains the software configuration (including operating system and applications) required to launch an instance.

AMIs are based on Linux or Windows operating systems.
AMIs can come from different sources such as:
- AMIs published by AWS
- AWS marketplace
- Community AMIs
- Your own AMIs created from existing instances

EC2 Security Groups

Within a security group, you can define rules to allow traffic based on port, protocol, and source or destination.
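
The rule matching described above can be sketched in a few lines of Python. This is a toy model, not the EC2 API: the `IngressRule` class, the `is_allowed` function, and the sample addresses are all hypothetical names chosen for illustration. It assumes inbound rules only, and relies on the fact that security group rules are allow rules, so traffic that matches no rule is denied.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IngressRule:
    """One allow rule: protocol, inclusive port range, and source CIDR."""
    protocol: str      # e.g. "tcp" or "udp"
    from_port: int
    to_port: int
    source_cidr: str   # e.g. "0.0.0.0/0" for any IPv4 source


def is_allowed(rules, protocol, port, source_ip):
    """Return True if any rule allows the traffic; otherwise it is denied."""
    addr = ip_address(source_ip)
    for rule in rules:
        if (rule.protocol == protocol
                and rule.from_port <= port <= rule.to_port
                and addr in ip_network(rule.source_cidr)):
            return True
    return False  # no matching allow rule


# Hypothetical rule set: HTTPS from anywhere, SSH from one office network.
web_rules = [
    IngressRule("tcp", 443, 443, "0.0.0.0/0"),
    IngressRule("tcp", 22, 22, "203.0.113.0/24"),
]
```

With these rules, `is_allowed(web_rules, "tcp", 22, "198.51.100.7")` is rejected because the source address falls outside the SSH rule's CIDR block.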

EC2 Tenancy Options

Amazon EC2 supports the following tenancy options for the instances:

Shared Tenancy: Shared tenancy is the default tenancy model for all EC2 instances. In this model, instances run on shared hardware, so a single physical host can run instances from different customers.
Dedicated Instances: Dedicated Instances run on single-tenant hardware, which means that the hardware is dedicated to a single customer.
Dedicated Hosts: In the dedicated hosts model, the instance runs on a Dedicated Host, which is an isolated physical server solely dedicated to a single customer.

EC2 Dedicated Host vs. Dedicated Instance

Dedicated Hosts:
- Additional Visibility and Control: With a Dedicated Host, you have control over how your instances are placed on the physical server. This includes knowing which instances are running on which specific server.
- Consistent Deployment: Dedicated Hosts allow you to consistently deploy instances on the same physical server over time, which is useful for specific licensing or compliance needs.
- Software Licenses: If you have software licenses that are tied to a specific physical machine (like certain database or enterprise software), Dedicated Hosts make it possible to comply with those licensing terms.
- More control + license compliance + consistent deployment.
Dedicated Instances:
- These instances also run on dedicated physical hardware, but you don’t get visibility into or control over how the instances are placed on the physical server. The focus is simply on having dedicated hardware for security or compliance.
- Dedicated hardware for isolation, without detailed control.

EC2 Pricing Options

On-Demand Instances
- On-Demand Instances do not have any upfront costs or long-term commitments.
- Users are charged for running instances by the hour or by the second (with a 60-second minimum), depending on the instance and operating system.
Reserved Instances:
- Reserved Instances are recommended for long term use or predictable workloads.
Spot Instances:
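- With Spot Instances, you use unused Amazon EC2 capacity at a discounted Spot price. You can optionally specify the maximum price that you are willing to pay for a certain instance type. In the original Spot model, this took the form of a bid for the unused capacity.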
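
To see why Reserved Instances suit long-running or predictable workloads, a rough break-even calculation helps. The sketch below is a simplified model with hypothetical function names and made-up rates: it assumes On-Demand is billed only for the hours an instance runs, while a reservation is paid for every hour of its term at a lower effective rate. Real prices and payment options vary.

```python
def on_demand_cost(hours_running, on_demand_rate):
    """Cost when you pay only for the hours the instance actually runs."""
    return hours_running * on_demand_rate


def reserved_cost(term_hours, reserved_rate):
    """Cost of a reservation, paid for the whole term regardless of usage."""
    return term_hours * reserved_rate


def break_even_utilization(on_demand_rate, reserved_rate):
    """Fraction of the term the instance must run for both costs to match."""
    return reserved_rate / on_demand_rate


# Hypothetical rates in dollars per hour, for illustration only.
utilization = break_even_utilization(on_demand_rate=0.50, reserved_rate=0.25)
```

Under these assumed rates, the reservation pays off only if the instance runs for more than half of the term. Below that, On-Demand is cheaper.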

EC2 Placement Groups

Placement Groups allow you to define how your instances are placed on the underlying hardware. A placement group is a logical grouping of instances.

Cluster:
- A cluster placement group clusters instances into a low-latency group in a single availability zone.
- Definition: A cluster placement group is a logical grouping of instances within a single Availability Zone. Instances in the same cluster placement group are placed in close proximity to each other, providing low-latency networking between them.
- Use Case: This is beneficial for applications that require extremely low-latency communication between instances, such as high-performance computing (HPC) applications.
Partition:
- A partition placement group spreads instances across logical partitions, ensuring that instances in one partition do not share underlying hardware with instances in other partitions.
- Definition: A partition placement group is another logical grouping, but it spreads instances across many racks. Each partition within the group represents a logical segment of the overall capacity. Partitions do not share the same rack, providing a degree of fault tolerance.
- Use Case: This type of placement group is suitable for distributed and replicated workloads that benefit from spreading instances across different physical hardware to reduce the risk of simultaneous hardware failures.
Spread:
- A spread placement group spreads instances across underlying hardware.
- Definition: A spread placement group is designed to spread instances across underlying hardware to reduce the risk of correlated failures. Instances in a spread placement group are placed on distinct racks, ensuring that they are physically separate from each other.
- Use Case: Ideal for applications that require high availability by minimizing the impact of hardware failures. It's a good choice for critical applications that need redundancy and fault tolerance.
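
The difference between the three strategies can be illustrated with a toy placement function. This is only a sketch under strong assumptions: the `place` function and rack names are hypothetical, a cluster is modeled as a single rack, and each partition is modeled as owning exactly one rack. The real service decides placement internally and does not expose it this way.

```python
def place(strategy, instance_ids, racks, partitions=3):
    """Return {instance_id: rack} under a toy model of each strategy."""
    if strategy == "cluster":
        # Pack every instance close together (here: onto one rack).
        return {inst: racks[0] for inst in instance_ids}
    if strategy == "spread":
        # Every instance gets its own distinct rack.
        if len(instance_ids) > len(racks):
            raise ValueError("spread needs one distinct rack per instance")
        return dict(zip(instance_ids, racks))
    if strategy == "partition":
        # Partition p owns rack p; instances are assigned round-robin,
        # so instances in different partitions never share a rack.
        if partitions > len(racks):
            raise ValueError("each partition needs its own rack")
        return {inst: racks[n % partitions]
                for n, inst in enumerate(instance_ids)}
    raise ValueError(f"unknown strategy: {strategy}")


ids = ["i-1", "i-2", "i-3", "i-4"]       # hypothetical instance IDs
racks = ["r1", "r2", "r3", "r4"]         # hypothetical rack names
```

In this model a single rack failure takes out every instance in a cluster group, one partition's worth of instances in a partition group, and exactly one instance in a spread group.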

EC2 Auto Scaling

Amazon EC2 Auto Scaling allows automatic scaling of Amazon EC2 capacity up or down according to user-defined conditions. It supports the following types of scaling policies:

Target tracking scaling: With this policy, you can increase or decrease the current capacity of the group based on a target value for a specific metric (such as CPU utilization or network traffic).
- Suppose you want the CPU utilization of the fleet to stay at around 50 percent as the load on the application changes.
- You can create a target tracking scaling policy that targets an average CPU utilization of 50 percent.
- Amazon EC2 Auto Scaling then scales out (increases capacity) when CPU utilization exceeds 50 percent to handle the increased load, and scales in (decreases capacity) when it drops below 50 percent to reduce costs during periods of low utilization.
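
The idea behind target tracking can be approximated with a proportional rule: if the metric is above the target, capacity grows by the same ratio, and vice versa. The sketch below is an assumption for illustration, not the service's actual algorithm. The function name and the `min_size` and `max_size` parameters are hypothetical stand-ins for the group's size limits.

```python
import math


def target_tracking_capacity(current_capacity, metric_value, target_value,
                             min_size, max_size):
    """Estimate the capacity that would bring the metric back to the target."""
    # Proportional estimate, rounded up so the fleet is never undersized.
    desired = math.ceil(current_capacity * metric_value / target_value)
    # Keep the result within the group's size limits.
    return max(min_size, min(max_size, desired))


# 4 instances averaging 75 percent CPU against a 50 percent target.
scaled_out = target_tracking_capacity(4, 75, 50, min_size=1, max_size=10)
```

Rounding up rather than down is a deliberate choice in this sketch: it favors availability over cost when the estimate is fractional.
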
Step scaling: With this policy, you can increase or decrease the current capacity of the group based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
- You can define different step adjustments based on the size of the alarm breach. For example:
  - Scale out by 10 capacity units if the alarm metric reaches 50 percent
  - Scale out by 30 capacity units if the alarm metric reaches 75 percent
  - Scale out by 40 capacity units if the alarm metric reaches 85 percent
- When the alarm threshold is breached for the specified number of evaluation periods, Amazon EC2 Auto Scaling applies the step adjustments defined in the policy. The adjustments can continue for additional alarm breaches until the alarm state returns to OK.
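
The step adjustments in the example above can be written as a small lookup. This is a minimal sketch: the `STEP_ADJUSTMENTS` table and `step_adjustment` function are hypothetical names, and each step is assumed to apply from its lower bound up to the next step's lower bound.

```python
# (lower bound of the metric in percent, capacity units to add)
STEP_ADJUSTMENTS = [(50, 10), (75, 30), (85, 40)]


def step_adjustment(metric_value, steps=STEP_ADJUSTMENTS):
    """Return the capacity units to add for the given metric value."""
    adjustment = 0  # below the first lower bound, nothing is added
    for lower_bound, units in sorted(steps):
        if metric_value >= lower_bound:
            adjustment = units  # the highest step reached wins
    return adjustment
```

A metric value of 80 percent falls in the 75 to 85 percent step, so `step_adjustment(80)` returns 30.
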
Simple scaling: With this policy, you can increase or decrease the current capacity of the group based on a single scaling adjustment.

AWS Elastic Load Balancing

AWS Elastic Load Balancing (ELB) is a managed service that allows you to create load balancers for distributing traffic across a group of EC2 instances. With ELB, you can load balance HTTP, HTTPS, TCP, and SSL traffic to EC2 instances.
ELB supports the following types of load balancers:
- Application Load Balancer: Application Load Balancers are meant for web applications with HTTP and HTTPS traffic. Application Load Balancers operate at the request level and provide advanced routing and visibility features.
- Network Load Balancer: Network Load Balancers are meant for TCP and TLS connections. Network Load Balancers operate at the connection level and can handle millions of requests per second securely while maintaining ultra-low latencies.
- Classic Load Balancer: Classic Load Balancer is the previous generation of the load balancer that supports HTTP, HTTPS, and TCP traffic.
ELB uses health checks to detect unhealthy targets, stop sending traffic to them, and spread the load across the remaining healthy targets.
An elastic load balancer performs the health check for registered instances using the protocol and path specified for the health check.
- You can set the interval between health checks, a timeout period, the number of consecutive failed health checks after which an instance is marked unhealthy (unhealthy threshold), and the number of consecutive successful health checks after which an instance is marked healthy (healthy threshold).
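
The two thresholds act as a small state machine: a target changes state only after enough consecutive results contradict its current state. The sketch below models that behavior with a hypothetical `TargetHealth` class. It assumes the probe itself (protocol, path, interval, timeout) happens elsewhere and only the pass or fail result is recorded.

```python
class TargetHealth:
    """Track one target's health from consecutive health check results."""

    def __init__(self, healthy_threshold, unhealthy_threshold, healthy=True):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = healthy
        self._streak = 0  # consecutive results contradicting current state

    def record(self, check_passed):
        """Record one check result and return the resulting health state."""
        if check_passed == self.healthy:
            self._streak = 0  # result agrees with current state
        else:
            self._streak += 1
            threshold = (self.healthy_threshold if check_passed
                         else self.unhealthy_threshold)
            if self._streak >= threshold:
                self.healthy = check_passed  # flip state
                self._streak = 0
        return self.healthy
```

With `TargetHealth(healthy_threshold=3, unhealthy_threshold=2)`, two consecutive failures mark the target unhealthy, and it then takes three consecutive successes to bring it back.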

ELB - Sticky Sessions

For session-based applications, you can use the sticky sessions feature to route requests from the same user session to the same instance.
- A sticky session ensures that all requests coming from a user in a session are routed to the same instance.
- If your application doesn't use its own session cookie, you can have the load balancer create one by specifying a stickiness duration.
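
Duration-based stickiness can be sketched as a balancer that reads a cookie on each request and sets one when it is missing or expired. Everything here is a toy model: the `StickyBalancer` class and the `LB_STICKY` cookie name are hypothetical, the cookie value is a plain tuple rather than an encoded string, and round-robin stands in for whatever routing algorithm the load balancer actually uses.

```python
import itertools


class StickyBalancer:
    COOKIE = "LB_STICKY"  # hypothetical cookie name

    def __init__(self, instances, stickiness_seconds):
        self.instances = list(instances)
        self.stickiness_seconds = stickiness_seconds
        self._round_robin = itertools.cycle(self.instances)

    def route(self, cookies, now):
        """Return (instance, cookies the client should keep)."""
        sticky = cookies.get(self.COOKIE)
        if sticky:
            instance, expires = sticky
            # Honor the cookie while it is valid and the instance exists.
            if now < expires and instance in self.instances:
                return instance, cookies
        # No usable cookie: pick an instance and issue a fresh cookie.
        instance = next(self._round_robin)
        new_cookies = dict(cookies)
        new_cookies[self.COOKIE] = (instance, now + self.stickiness_seconds)
        return instance, new_cookies
```

A client that sends back the cookie it received keeps reaching the same instance until the stickiness duration elapses, after which it is routed afresh and given a new cookie.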

ELB - Connection Draining

You can enable the Connection Draining feature to stop sending requests to instances that are de-registering or unhealthy while keeping the existing connections open.
This enables the load balancer to complete in-flight requests made to instances that are de-registering or unhealthy.
While enabling connection draining, you have to specify the maximum timeout value between 1 and 3600 seconds.
When the maximum timeout is reached, the load balancer forcibly closes connections to the de-registering instance.
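
The effect of the timeout can be shown with a small model. This is a sketch, not ELB behavior in detail: the `drain` function is a hypothetical name, and each in-flight request is represented only by the number of seconds it still needs to finish.

```python
def drain(in_flight_seconds, timeout):
    """Split in-flight requests into those that finish and those cut off."""
    if not 1 <= timeout <= 3600:
        raise ValueError("timeout must be between 1 and 3600 seconds")
    # Requests that finish within the timeout complete normally.
    completed = [r for r in in_flight_seconds if r <= timeout]
    # The rest are still open when the timeout is reached.
    force_closed = [r for r in in_flight_seconds if r > timeout]
    return completed, force_closed


# Three requests needing 5, 120, and 400 more seconds; timeout of 300 seconds.
completed, force_closed = drain([5, 120, 400], timeout=300)
```

Here the first two requests complete, while the third is still open at 300 seconds and its connection is forcibly closed.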