More realistic containerized python 3 application with MSSQL and Kafka running on AWS — Part 2 Deploy and Run The Sample App on AWS

Norman Fung
Jun 27, 2021 · 10 min read

If you haven’t read Part 1, go back to it: https://norman-lm-fung.medium.com/more-realistic-containerized-python-3-application-with-mssql-and-kafka-running-on-aws-part-1-f68bf211cee3

Push Docker Image to ECR

STEP 1. Push image to AWS ECR

Go to AWS ECR and create a repository, say "casino".

STEP 2. Login

aws ecr get-login-password --region ap-east-1 | docker login --username AWS --password-stdin xxxxxxxxxxxx.dkr.ecr.ap-east-1.amazonaws.com

If you try to log in to ap-east-1, you may run into an error that you would not get pushing to us-east-1:

Error saving credentials: error storing credentials - err: exit status 1, out: `error storing credentials - err: exit status 1, out: `The stub received bad data.``

Solution? Simply remove "credsStore": "wincred" from C:\Users\norman\.docker\config.json

A successful login also rewrites config.json to store the registry credentials.
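With credsStore removed, the credentials are stored inline in config.json, roughly like this (the account ID and token below are placeholders):

```json
{
  "auths": {
    "xxxxxxxxxxxx.dkr.ecr.ap-east-1.amazonaws.com": {
      "auth": "<base64-encoded-token>"
    }
  }
}
```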

STEP 3. Tag your image and push it to AWS ECR

docker tag normanfung/casino-mds:1.0.0 xxxxxxxxxxxx.dkr.ecr.ap-east-1.amazonaws.com/casino:mds-1.0.0

docker push xxxxxxxxxxxx.dkr.ecr.ap-east-1.amazonaws.com/casino:mds-1.0.0

docker tag normanfung/casino-strategies:1.0.0 xxxxxxxxxxxx.dkr.ecr.ap-east-1.amazonaws.com/casino:strategies-1.0.0

docker push xxxxxxxxxxxx.dkr.ecr.ap-east-1.amazonaws.com/casino:strategies-1.0.0

Now, there are two options for running containers on AWS.

Option 1: From AWS ECS \ Task Definitions \ Run Tasks \ Advanced Options \ Container Overrides

--env,prod

This will be equivalent to:

docker run normanfung/casino:version1 python /src/mds/hsi_hedge.py --env prod
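For the override above to work, hsi_hedge.py has to parse --env (and, later in this article, --os and --hsioverride). The script's actual parser isn't shown here; a minimal argparse sketch, where the flag names mirror the commands in this article but the defaults are assumptions:

```python
import argparse

def parse_args(argv):
    # Flag names match the commands quoted in the article; defaults are guesses.
    parser = argparse.ArgumentParser(prog="hsi_hedge")
    parser.add_argument("--env", default="dev", help="target environment")
    parser.add_argument("--os", default="linux", help="host OS hint")
    parser.add_argument("--hsioverride", type=float, default=None,
                        help="use this HSI spot instead of fetching it")
    return parser.parse_args(argv)
```

Passing `["--env", "prod"]` to `parse_args` then yields `args.env == "prod"`.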

Option 2: AWS Lambda https://docs.aws.amazon.com/lambda/latest/dg/lambda-python.html

This will be covered in the next section.

AWS Batch

Under AWS Batch \ Compute environments, click "Create compute environment"

Pick “Fargate”.

Networking defaults to the VPC's default security group and its associated subnets.

Under AWS Batch \ Job queues, click "Create"

Associate the Job Queue with the Compute Environment "casino-batch" you just created.

Under AWS Batch \ Job definitions, click "Create"

For Platform, select Fargate.

Copy docker image url from ECR.

The URL format omits the scheme (no "https://"). For example,

xxxxxxxxxxxx.dkr.ecr.ap-east-1.amazonaws.com/casino:mds-1.0.0

Paste that under Container properties. We’d leave “Command” blank under “Job Definitions” and specify them under “Jobs” instead. This way, we have one Job Definition, but each “Jobs” instance can be configured to run with different parameters.

Note execution role defaults to “ecsTaskExecutionRole”. Also enable “Assign public IP”.

"ecsTaskExecutionRole" needs ECR pull permissions; any one of these managed policies will do (ReadOnly is sufficient):

  • AmazonEC2ContainerRegistryReadOnly
  • AmazonEC2ContainerRegistryFullAccess
  • AmazonEC2ContainerRegistryPowerUser

Reference https://towardsdatascience.com/deploy-your-python-app-with-aws-fargate-tutorial-7a48535da586

Verify this from Services \ Security, Identity, and Compliance \ IAM \ Roles.

Now click "Submit new job". Under Job configuration, select the Job Queue you just created.

Repeat these steps to create “hsi-hedge-strategies” referencing container: xxxxxxxxxxxx.dkr.ecr.ap-east-1.amazonaws.com/casino:strategies-1.0.0

Under Container Properties, specify the command (space-separated; no commas needed here):

--env prod --os linux --hsioverride 28000

hsioverride overrides the HSI spot: instead of fetching it from https://query1.finance.yahoo.com/v7/finance/download/%5EHSI?events=history&includeAdjustedClose=true, hsi_hedge.py simply uses the override value.
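The override logic itself is simple; a sketch of how it plausibly works inside hsi_hedge.py (the function name and structure are assumptions, not the actual code):

```python
def get_hsi_spot(override=None):
    """Return the HSI spot level to hedge against.

    With an override (e.g. --hsioverride 28000) the value is used as-is;
    otherwise the real script downloads the latest close from the Yahoo
    Finance CSV endpoint quoted above (network path omitted in this sketch).
    """
    if override is not None:
        return float(override)
    raise NotImplementedError("fetch ^HSI close from Yahoo Finance")
```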

The command is equivalent to:

docker run normanfung/casino:mds-1.0.0 python /src/mds/hsi_hedge.py --env prod --os linux --hsioverride 28000

Once created, monitor progress under Compute \ AWS Batch \ Jobs. You need to pick the correct Job Queue and status filter before the job shows up.

Drill into details, click Refresh.

Detailed logs are under Services \ Management & Governance \ CloudWatch \ Log groups (/aws/batch/job). Anything your Python code prints will show up there.

AWS Lambda

When creating the function, select "Container image". Under CMD, put:

--env,prod,--os,linux,--hsioverride,28000

As with AWS Batch, hsioverride makes hsi_hedge.py use the override value instead of fetching the HSI spot from Yahoo Finance.

Make sure Configuration \ Timeout is updated to something more suitable, for example 5 minutes (the default is 3 seconds).

And for Python to be able to talk to MSK Kafka or MSSQL, make sure Configuration \ VPC is updated so that Lambda, MSSQL and MSK Kafka live in the same subnets and security groups.

For this, however, your Lambda execution role needs these permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeNetworkInterfaces",
        "ec2:CreateNetworkInterface",
        "ec2:DeleteNetworkInterface",
        "ec2:DescribeInstances",
        "ec2:AttachNetworkInterface"
      ],
      "Resource": "*"
    }
  ]
}

Without them, when you try to save the network settings above, you'd run into the error: "The provided execution role does not have permissions to call CreateNetworkInterface on EC2"

From IAM, edit the execution role.

Now, click on Policy at the bottom to edit it.

Then click JSON, and add the permissions from the policy shown above.

When done click “Review Policy”, then “Save Changes”.

Reference https://stackoverflow.com/questions/41177965/aws-lambdathe-provided-execution-role-does-not-have-permissions-to-call-describ

Also, when you associate your Lambda with a VPC, it loses direct access to the Internet, so web scraping won't work.

Following the steps below (from the references at the end of this section) will fix this.

STEP 1. Create a public subnet.

Call it public-subnet0. Click "Edit route table association" and make sure public-subnet0 uses the route table that forwards traffic to the IGW. There should be a route with Destination 0.0.0.0/0 (i.e. wildcard) and Target = IGW.

You'll find the IGW ID (here "igw-89dd37e0") under VPC \ Internet Gateways.

STEP 2. Create a NAT Gateway

Associate it with the public-subnet0 you just created. Yes, the NAT gateway resides in a public subnet, not a private one.

STEP 3. Create a route table "nat-route-table". Add a route forwarding all traffic to the NAT gateway you just created:

  • Destination = 0.0.0.0/0
  • Target = NAT

STEP 4. Go back to VPC \ Subnets and click "Edit route table association" for the three private subnets. Associate them with nat-route-table.

Reference

https://www.youtube.com/watch?v=ujXr0i5EoHE

https://www.youtube.com/watch?v=yMzb48BL7qQ

https://www.digitalocean.com/community/tutorials/understanding-ip-addresses-subnets-and-cidr-notation-for-networking

Now just test the function from the "Test" tab.

At this point, the job should fail, as expected: we haven't provisioned MSK Kafka or RDS MSSQL yet.

Go see the logs under CloudWatch \ Log groups \ /aws/lambda/hsi-hedge-mds.

Scheduling

This is done via CloudWatch \ Rules \ Create rule.

The target can be a Lambda function, or an ECS task.
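In both cases the rule needs a schedule, written as a CloudWatch/EventBridge schedule expression; for example (the exact cadence is up to you):

```
rate(15 minutes)          # every 15 minutes
cron(0 9 ? * MON-FRI *)   # 09:00 UTC on weekdays
```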

RDS (Create MSSQL)

Creating MSSQL is very straightforward. There’s not much to explain here. It’s under Services \ Database \ RDS.

Here we choose the smallest SQL Server Express Edition on "db.t3.small". For subnets and security groups, we choose the defaults, as we did for Lambda/AWS Batch and will do again for MSK Kafka; this way everything resides in the same security group and subnets.

To be able to access your new MSSQL instance from SSMS running on your laptop, go to VPC \ Security Groups, pick the default security group, and edit the Inbound Rules:

  • Type MSSQL
  • Port 1433
  • Protocol TCP
  • Source 0.0.0.0/0 (i.e. anywhere)

Once done, you should be able to connect from SSMS via:

casino.chayhhyfhbvi.ap-east-1.rds.amazonaws.com,1433

Check your python settings, update connection strings where needed.

There's only one place where our sample application connects to MSSQL: \src\strategies\hsi_hedge.py
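For reference, a hypothetical ODBC-style connection-string builder for that endpoint; the driver name is an assumption (the article doesn't show which SQL Server library the script uses), and user/password are placeholders:

```python
# Hypothetical helper that builds an ODBC connection string for the RDS
# SQL Server endpoint. Driver name, user and password are placeholders;
# check which driver/library hsi_hedge.py actually uses.
def mssql_conn_str(host, port=1433, db="casino", user="<user>", password="<password>"):
    return (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        f"SERVER={host},{port};"
        f"DATABASE={db};UID={user};PWD={password}"
    )

print(mssql_conn_str("casino.chayhhyfhbvi.ap-east-1.rds.amazonaws.com"))
```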

Create MSK (Kafka on AWS)

Under Services \ Analytics \ MSK, click Create. I choose “Custom Create” over “Quick Create” so I can make sure

  • MSK resides in the same subnet and security groups as Lambda/AWS Batch and MSSQL. Three Kafka brokers, one in each of three subnets.
  • Plaintext communication enabled (not by default)
  • In terms of sizing, kafka.t3.small is selected.

For Cluster configuration, we want Custom configuration: auto.create.topics.enable is false by default, and when publishing to a new topic you'd run into this error:

Error hsi_warrants_mds prod <class 'kafka.errors.KafkaTimeoutError'> KafkaTimeoutError: Failed to update metadata after 60.0 secs. Traceback (most recent call last)
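The Custom configuration lets you flip that broker default. A minimal configuration fragment (only this line deviates from the MSK defaults; everything else can stay as-is):

```
# allow producers to create topics on first publish
auto.create.topics.enable=true
```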

To keep things simple,

  • Access control method = None
  • Encryption \ Plaintext is enabled (not by default)

Lastly, Deliver to Amazon CloudWatch Logs is enabled, with MSK logs forwarded to a pre-created CloudWatch log group "MSK".

It will take 15–30 minutes to create. Once done, click View client information for MSK connection strings.

Copy the connection strings from here and make sure they match your Python settings. If not, you'd need to:

STEP 1. Update the MSK connection strings in your settings and rebuild the local Docker images

STEP 2. Push docker images from local to AWS ECR

STEP 3. Redeploy AWS Lambda.

Running both MDS and Strategies

With the above preparations done, it's time to run both:

  • MDS (\src\mds\hsi_hedge.py), which pulls warrant prices from hangseng.com and publishes them to Kafka topic MSK1
  • Strategies (\src\strategies\hsi_hedge.py), which listens on MSK1 and generates orders. Orders are saved in MSSQL.

From under AWS Lambda or AWS Batch, whichever you prefer:

STEP 1. Fire off the Strategies Lambda function first. The consumer will sit and wait for market data on MSK1 until MDS fires in STEP 2.

STEP 2. Fire off MDS Lambda function

Before you start, make sure the "casino" database has been created.

You should be able to see

  • Logs from both lambda functions in Cloud Watch

The MDS side first needs to publish HSI warrant prices to Kafka.

On the Strategies side, when it first starts, you'd see the hedge parameters.

After the MDS side has published to Kafka, look for _generate_order.

Then check whether orders are saved in the MSSQL "casino" database. The table "hsi_hedge_orders" is created automatically (via pandas DataFrame.to_sql):

select * from casino.dbo.hsi_hedge_orders
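That auto-creation behaviour can be reproduced locally; a sketch using an in-memory SQLite database as a stand-in for MSSQL (the column names here are made up for illustration, not the sample app's actual schema):

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for MSSQL. DataFrame.to_sql creates the
# table when it does not exist yet, which is why "hsi_hedge_orders"
# appears in the database without any CREATE TABLE in the app.
conn = sqlite3.connect(":memory:")
orders = pd.DataFrame(
    {"symbol": ["HSI-W1", "HSI-W2"], "qty": [100, -50], "price": [0.85, 1.02]}
)
orders.to_sql("hsi_hedge_orders", conn, if_exists="append", index=False)

rows = conn.execute("SELECT COUNT(*) FROM hsi_hedge_orders").fetchone()[0]
print(rows)  # 2
```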

Summarizing Problems

So, for a developer with no prior AWS experience, deploying a containerized application with a fan-out architecture is not a straightforward process.

  • MSSQL provisioning

If you want to be able to access MSSQL from SSMS running on your laptop (i.e. outside AWS), you need to add an Inbound Rule to the default Security Group (Source 0.0.0.0/0, Type MSSQL).

  • MSK provisioning

a. MSK is not accessible from your laptop.

https://aws.amazon.com/msk/faqs/

https://repetitive.it/aws-msk-how-to-expose-the-cluster-on-the-public-network/?lang=en

b. Pretty expensive even for the smallest kafka.t3.small. Remember to remove it when done.

c. Plaintext communication enabled (not by default)

d. auto.create.topics.enable is false by default; when publishing to a new topic, you'd run into the error: KafkaTimeoutError: Failed to update metadata after 60.0 secs.

  • Lambda

a. By default, there's no VPC configuration, so Lambda cannot talk to Kafka or MSSQL in your default VPC. But as soon as you associate your Lambda with a VPC, it loses its Internet connection. To fix this, you'd need to configure a NAT gateway and adjust route tables.

b. Fix the default timeout (3 seconds is too short).

c. Execution roles need to be granted the relevant permissions.

d. Passing arguments to function:

--env prod --os linux (AWS Batch \ Jobs takes this format)

vs.

--env,prod,--os,linux (AWS Lambda takes this format)
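The two formats should reduce to the same argv inside the container; a small sketch of that equivalence, assuming plain whitespace/comma splitting (a simplification of what the AWS consoles actually do):

```python
# AWS Batch job commands are space-separated; the Lambda console's CMD
# override is comma-separated. Both reduce to the same token list handed
# to the container entrypoint.
def batch_tokens(cmd):
    return cmd.split()

def lambda_tokens(cmd):
    return [t.strip() for t in cmd.split(",") if t.strip()]

print(batch_tokens("--env prod --os linux"))
print(lambda_tokens("--env,prod,--os,linux"))
# both print ['--env', 'prod', '--os', 'linux']
```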

  • AWS ECR

If you try to log in to ap-east-1, you may run into an error that you would not get pushing to us-east-1:

Error saving credentials: error storing credentials - err: exit status 1, out: `error storing credentials - err: exit status 1, out: `The stub received bad data.``

Solution? Simply remove "credsStore": "wincred" from C:\Users\norman\.docker\config.json

That’s it! Happy coding!


#python #web3 #crypto #trading #nft #blockchain #smartcontract #journey2makemoneyscientifically