S3 provides developers and IT teams with secure, durable, highly scalable object storage. S3 is easy to use, with a simple web services interface to store and retrieve any amount of data from anywhere on the web.
It is object-based storage (flat files).
It is spread across multiple devices and facilities.
Files can be from 0 bytes to 5 TB.
There is unlimited storage.
Files are stored in buckets.
S3 uses a universal namespace, i.e., bucket names must be unique throughout the world.
The S3 URL looks something like:
https://s3-region-name.amazonaws.com/name_of_the_bucket
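As a sketch, a small helper can build both the path-style URL shown above and the virtual-hosted-style variant (the bucket, key, and region names below are made-up examples):

```python
def path_style_url(bucket: str, key: str, region: str = "us-east-1") -> str:
    """Path-style: https://s3-<region>.amazonaws.com/<bucket>/<key>"""
    return f"https://s3-{region}.amazonaws.com/{bucket}/{key}"

def virtual_hosted_url(bucket: str, key: str, region: str = "us-east-1") -> str:
    """Virtual-hosted-style: https://<bucket>.s3-<region>.amazonaws.com/<key>"""
    return f"https://{bucket}.s3-{region}.amazonaws.com/{key}"

print(path_style_url("my-unique-bucket", "photo.jpg", "eu-west-1"))
# https://s3-eu-west-1.amazonaws.com/my-unique-bucket/photo.jpg
```

Because the namespace is universal, the bucket name alone identifies the bucket worldwide in either URL style.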
When you upload a file to S3, you receive an HTTP 200 status code if the upload was successful.
Data Consistency Model for S3:
If we write a new object to S3, it is available to read immediately; but if we overwrite an existing object or delete it, the change can take some time to propagate.
The reason behind this is that AWS doesn't let you see any corrupt or partial data.
- Read-after-write consistency for PUTs of new objects -- can read immediately after writing.
- Eventual consistency for overwrite PUTs and DELETEs -- updates and deletes can take some time to propagate.
S3 is a simple key-value store.
S3 is object-based. Objects consist of the following:
- Key (the name of the object)
- Value (the data itself, made up of a sequence of bytes)
- Version ID (important for versioning)
- Metadata (data about the data being stored)
- Subresources, such as Access Control Lists and Torrent.
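A minimal sketch of these components as a Python data structure (the class and field names are illustrative, not an AWS API):

```python
from dataclasses import dataclass, field

@dataclass
class S3Object:
    key: str                        # Key: the name of the object
    value: bytes                    # Value: the data, a sequence of bytes
    version_id: str = "null"        # Version ID: "null" until versioning is enabled
    metadata: dict = field(default_factory=dict)  # Metadata: data about the data

obj = S3Object(
    key="reports/2019.pdf",
    value=b"%PDF-1.4 ...",
    metadata={"Content-Type": "application/pdf"},
)
print(obj.key, len(obj.value))
```

The key acts like a file path, but the store itself is flat: there are no real directories, only key names that may contain slashes.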
- Built for 99.99% availability for the S3 platform.
- Amazon guarantees 99.9% availability in its SLA.
- Amazon guarantees 99.999999999% (11 nines) durability for S3 information.
- Tiered storage available.
- Lifecycle Management.
- Versioning.
- Encryption.
- Secure your data in a couple of different ways, using Access Control Lists and Bucket Policies.
S3 Storage Classes/Tiers:
- S3 Standard: 99.99% availability, 11 nines durability, stored redundantly across multiple devices in multiple facilities, designed to sustain the loss of two facilities concurrently.
- S3-IA (Infrequent Access): for data that is accessed less frequently but requires rapid access when needed. Lower storage fee than S3 Standard, but you are charged a retrieval fee.
- Reduced Redundancy Storage: 99.99% durability and 99.99% availability of objects over a given year.
- Glacier: very cheap, but used for data archival only.
  - Extremely low cost: around $0.01 per GB per month.
  - Retrieval times of 3-5 hours.
S3 Charges:
- Storage
- Requests
- Storage Management Pricing
- Data Transfer Pricing
- Transfer Acceleration (fast, easy, and secure transfer of files over long distances between end users and an S3 bucket. It takes advantage of CloudFront's globally distributed edge locations; as data arrives at an edge location, it is routed to Amazon S3 over an optimized network path).
S3 FAQs IMP
S3 is object-based storage and can be used to store flat files (documents, multimedia, audio files, video files, etc.), but not for software installations.
As soon as we log in to the console, we can see S3 under Storage.
The first step in S3 is to create a bucket.
For the created bucket, more options are available under Properties:
Versioning:
To see the different versions of the same object in the bucket.
Static Web Hosting:
Allows hosting a static web site without any server-side technologies (plain HTML). This has tremendous advantages:
No worries about load balancing, auto scaling, or virtual machines.
It scales automatically and is extremely low cost.
Logging:
We can set up log reports.
Under the Advanced Settings:
Tags:
Use tags for cost control.
Cross-Region Replication:
Used for replicating objects to a bucket in a different region.
Events:
Under events we can receive notifications when specific events occur in the bucket. For example: someone uploads a file and we want to invoke a Lambda function that converts it into a thumbnail and saves the output in another bucket.
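The upload-triggers-Lambda example can be expressed as a bucket notification configuration. This is a sketch; the function ARN, filter suffix, and account ID are made-up placeholders:

```json
{
  "LambdaFunctionConfigurations": [
    {
      "Id": "thumbnail-on-upload",
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:make-thumbnail",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [{"Name": "suffix", "Value": ".jpg"}]
        }
      }
    }
  ]
}
```

The `Events` list selects which bucket events fire the notification; the suffix filter here restricts it to JPEG uploads so the Lambda function only runs for images.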
Under Lifecycle:
We can configure rules such as: if an object has not been used in a certain number of days, move it to the Infrequent Access tier; or if it is more than 120 days old, move it to Glacier.
This is where the different storage tiers are specified.
Under Permissions:
ACL (Access Control Lists):
We can specify access control lists. By default, all buckets created are private/inaccessible.
Under Management:
Analytics:
We can do analytics for the different storage classes.
Metrics and Inventory.
Uploading some data in S3:
The first time we upload data to S3, no permissions are set on it (it is in private mode, so no one can access it). The format for the object URL will be something like:
https://s3.amazonaws.com/bucketName/objectName
As soon as you upload an object and try to access it, there will be an error message saying access is denied.
If we want the object to be accessible by someone, we need to grant access to a specific account (by email ID), to everyone, or to any authenticated AWS user.
Under the object permissions there are two things:
1. Giving the user permission to read/write the object.
2. Giving the user permission to read/write the object's permissions (the authority to authorize other users).
Once versioning is enabled on a bucket, it cannot be removed; it can only be suspended.
To check the versions of an object, we can view that on the drop down beside the file.
From an architecture point of view, we should not version large files without planning archival after a certain number of versions, because each version is stored at its full original size.
When we delete a file in the bucket it disappears, but versioning is essentially hiding it rather than actually deleting it: a 'Delete Marker' is placed on top of the object. To restore the file after a delete, go to 'Show versions' and delete the Delete Marker, which removes the hidden state and makes the object visible again.
Versioning is a great backup tool.
An additional layer of security can be added to S3 buckets by enabling MFA Delete, which helps avoid accidental deletion of S3 objects.
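The delete-marker behavior described above can be sketched as a toy model. This is a simulation for intuition only, not the AWS API:

```python
# Toy model of S3 versioning: each key maps to a stack of versions, newest last.
DELETE_MARKER = "DELETE_MARKER"

class VersionedBucket:
    def __init__(self):
        self.versions = {}  # key -> list of versions

    def put(self, key, data):
        self.versions.setdefault(key, []).append(data)

    def delete(self, key):
        # A plain DELETE only adds a delete marker on top; older versions remain.
        self.versions.setdefault(key, []).append(DELETE_MARKER)

    def get(self, key):
        stack = self.versions.get(key, [])
        if not stack or stack[-1] == DELETE_MARKER:
            return None  # looks deleted: latest version is a delete marker
        return stack[-1]

    def remove_delete_marker(self, key):
        # Removing the delete marker makes the previous version visible again.
        stack = self.versions.get(key, [])
        if stack and stack[-1] == DELETE_MARKER:
            stack.pop()

b = VersionedBucket()
b.put("notes.txt", "v1")
b.put("notes.txt", "v2")
b.delete("notes.txt")
print(b.get("notes.txt"))           # None: the object appears deleted
b.remove_delete_marker("notes.txt")
print(b.get("notes.txt"))           # v2: the object is restored
```

The point of the model: a delete never destroys data while versioning is on, so "restore" is just removing the marker that hides the latest real version.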
Creating a static website using amazon S3:
- Create a bucket in S3, upload objects into the bucket.
- Under the website hosting section of the bucket we can enable website with an index.html and error.html as the landing and error pages for our static website.
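For the website's objects to be publicly readable, a bucket policy like the following is typically attached (the bucket name in the ARN is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-static-site-bucket/*"
    }
  ]
}
```

The `/*` at the end of the Resource ARN applies the policy to every object in the bucket, while `Principal: "*"` grants the read to everyone.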
CORS (Cross-Origin Resource Sharing):
- This is a way of referencing code in one S3 bucket using JavaScript in another S3 bucket.
- It allows buckets to talk to each other.
- Under the CORS section, simply specify the URL of the bucket that should be allowed access.
- Polly is a text-to-speech service.
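A sketch of what that CORS configuration can look like in the JSON form the console accepts (the origin URL is a made-up website-endpoint placeholder):

```json
[
  {
    "AllowedOrigins": ["http://my-other-bucket.s3-website-us-east-1.amazonaws.com"],
    "AllowedMethods": ["GET"],
    "AllowedHeaders": ["*"]
  }
]
```

`AllowedOrigins` lists the sites whose JavaScript may fetch objects from this bucket; restricting `AllowedMethods` to GET keeps the cross-origin access read-only.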
- Create a bucket and while creating it enable versioning.
- Versioning stores all objects/versions separately.
- Once Versioning is enabled it cannot be removed but only be disabled.
- If versioning has to be removed, we have to create a new bucket and transfer the objects to that bucket.
- If we delete a version we cannot restore it, but if we delete an object itself we can restore it.
- Versioning must be enabled on both the source and the destination buckets.
- The source and destination buckets must be in different (unique) regions.
- Files in an existing bucket are not replicated automatically. All the subsequent updated files will be replicated automatically.
- You cannot replicate to multiple buckets or use daisy chaining.
- Delete markers are replicated.
- Deleting individual versions or delete markers will not be replicated.
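Cross-region replication is configured on the source bucket with a replication configuration like the sketch below. The IAM role ARN and destination bucket ARN are made-up placeholders; the role must allow S3 to read from the source and write to the destination:

```json
{
  "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
  "Rules": [
    {
      "ID": "replicate-everything",
      "Prefix": "",
      "Status": "Enabled",
      "Destination": {
        "Bucket": "arn:aws:s3:::my-destination-bucket"
      }
    }
  ]
}
```

The empty `Prefix` replicates every object; a non-empty prefix such as `"logs/"` would replicate only that subset, and versioning must already be enabled on both buckets for S3 to accept this configuration.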
Lifecycle Management, S3-IA and Glacier - Hands On
Lifecycle Management:
This helps in maintaining the lifecycle of objects: under the Management section, add a lifecycle rule for the bucket.
Lifecycle rules: these rules help manage lifecycle cost by transitioning objects from S3 Standard, after a certain time, to S3-IA and then to Glacier, archiving the least frequently used files and thereby reducing storage cost drastically.
Under the life cycle transition rules we can configure either on the current version or on the previous versions.
Object created -- 30 days later --> transitioned to S3-IA -- 30 days later --> transitioned to Glacier -- 425 days later --> expires
Glacier is designed to store an object for a minimum of 90 days. If an object has been in Glacier for fewer than 90 days and we want to expire it early, we are required to give extra authorization confirming that we want to delete the object even though we are still charged for the full 90 days (which logically doesn't make sense otherwise).
Minimum of 30 days after creation before transitioning to S3-IA, 60 days before transitioning to Glacier, and 61 days before expiration.
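The timeline above can be written as a lifecycle configuration. This is a sketch matching the 30/60/425-day example; the rule ID is a made-up name, and `Days` values are counted from object creation:

```json
{
  "Rules": [
    {
      "ID": "archive-then-expire",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 60, "StorageClass": "GLACIER"}
      ],
      "Expiration": {"Days": 425}
    }
  ]
}
```

Each `Days` value is cumulative from creation (not from the previous transition), which is why the Glacier transition reads 60 rather than 30.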