AWS Serverless Application Model + Python.
In my previous blog post Using AWS Serverless Application Model (SAM) First Impressions I wrote about AWS Serverless Application Model (SAM). I have been using SAM in my current project, and SAM is a good, simple tool for creating cloud infrastructure on AWS. However, one developer complained in a meeting today that it is rather slow to develop AWS Lambdas with SAM: for each change, you need to deploy the Lambda using SAM and test it on the AWS cloud infrastructure. In this blog post, I consider some AWS Lambda development practices.
The AWS Lambda development strategy I explain in this blog post can be summarized like this: try to minimize the development that you have to do in the AWS/SAM context. If you develop every piece of code in the AWS/SAM context, you have to build the Lambda for each change and either test it locally with SAM or deploy the code to AWS. There is a SAM sync command that you can use to sync your local changes with the AWS Lambda, but let’s not talk about that command in this blog post; let’s consider other alternatives instead.
Using the strategy mentioned above, my AWS Lambda development practices can be listed using these steps - the first step providing the fastest development feedback, and the last one providing the slowest development feedback.
The idea is that you minimize the code in your Lambda handler Python file. In that file, just read the event that triggered the Lambda request. Here is an example from the handler.py file, which is the main entry point to my Lambda function:
from my_package_a import handler_logic as hl
...
@logger.inject_lambda_context(log_event=True)
def lambda_handler(event, context):
    logger.debug(f"Context: {context}")
    if event:
        logger.debug(f"event: {event}")
        for record in event["Records"]:
            try:
                entity = hl.parse_entity(record)
                logger.info(f"Got entity: {entity}")
                hl.process_entity(entity)
            except Exception as e:
                ...
So, the lambda_handler just receives the event and then delegates everything else to handler_logic.py: parsing the entity from the record, and processing the entity. This way, I can skip the AWS/SAM context and develop the application logic just using the Python interpreter with the handler_logic.py file. Another advantage is that you can then implement various unit and integration tests that do not need the AWS/SAM context.
The project uses a certain Lambda package hierarchy, which forced me to do a simple trick to import the database module depending on whether we are running handler_logic.py directly as a script, or it is imported from handler.py:
if __name__ == '__main__':
    # Run this file as a main script for development purposes.
    import database as db
else:
    # This file was imported as a module by the Lambda handler (AWS/SAM context).
    from my_package_a import database as db
Then we have the parse_entity and process_entity functions:
def parse_entity(record: dict) -> str:
    """Parses the entity from the event record."""
    ...

def process_entity(entity: str) -> None:
    """Main entry point to the application logic."""
    ...
And finally, add the main function at the end of the file. This way you can call your application logic from the command line as well as from the Lambda handler. Example:
# For development purposes without the AWS / SAM context.
import sys

def main(_):
    entity = "MY_CRYPTIC_ENTITY_XXX_YYY_20220608T043249"
    process_entity(entity)
    return 0

if __name__ == '__main__':
    main(sys.argv)
So, we have skipped the parse_entity part, and call process_entity, which is under development, directly.
Our application logic calls the database. Let’s use the real AWS RDS development database. The database is in a private subnet, so we need to create an SSH tunnel to the development database:
aws-vault exec my-aws-dev-profile --no-session -- ssh -i ~/.ssh/ssm-dev-instance-key.pem ec2-user@DB_JUMP_SERVER_ID -L MY_LOCAL_IP:54321:dev-db.XXXXXXXXXXXX.eu-west-1.rds.amazonaws.com:5432
Ok. Now we have an SSH tunnel to the RDS development database.
I usually keep some personal development files under a personal folder, and the personal directory is listed in the .gitignore file so that it is not stored in the git repo. Let’s see the file:
λ> cat personal/set_env_vars.sh
#!/bin/bash
export ENV=dev
export DB_HOST="MY_LOCAL_IP"
export DB_PORT=54321
export DB_NAME="my_database"
...
Just source the file in the terminal where you call handler_logic.py, and then you can call it:
source ../../personal/set_env_vars.sh
aws-vault exec my-aws-dev-profile --no-session -- python my_package_a/handler_logic.py
Running the Python file in the Python interpreter is fast (it starts immediately). The application logic can connect to the RDS development database via the SSH tunnel, and you can call all AWS services using the boto3 library with the AWS profile that you provide using aws-vault.
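As a quick sanity check that the tunnel and the environment variables work, you can also open a database connection directly from Python. Here is a minimal sketch, assuming a PostgreSQL database and the psycopg2 library (the DB_USER and DB_PASSWORD variables are my assumptions; they are not shown in the script above):
import os
import psycopg2  # assuming a PostgreSQL RDS instance

# Connect through the SSH tunnel using the sourced environment variables.
conn = psycopg2.connect(
    host=os.environ["DB_HOST"],          # MY_LOCAL_IP from set_env_vars.sh
    port=os.environ["DB_PORT"],          # 54321, the local end of the SSH tunnel
    dbname=os.environ["DB_NAME"],
    user=os.environ["DB_USER"],          # assumed to be exported as well
    password=os.environ["DB_PASSWORD"],  # assumed to be exported as well
)
with conn.cursor() as cur:
    cur.execute("SELECT 1")
    print(cur.fetchone())
conn.close()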
In the previous chapter, I created the SSH tunnel to the RDS development server using my Linux host machine IP. The reason for not using localhost is that the SAM Docker container is a black box and the port forwarding is a bit tricky. I also needed to open the firewall for port 54321 on my Linux host machine, so that the Lambda application running in the SAM Docker container can connect to that port on the Linux host machine:
sudo ufw allow 54321
Create a JSON file in which you define the environment variables your Lambda needs, e.g.:
λ> cat personal/.entity_lambda_env_local.json
{
  "EntityFunction": {
    "ENV": "dev",
    "DB_HOST": "MY_LOCAL_IP",
    "DB_PORT": 54321,
...
Now you can build and invoke the AWS Lambda to run it in the local SAM Docker container on your machine:
sam build && aws-vault exec my-aws-dev-profile --no-session -- sam local invoke EntityFunction --env-vars personal/.entity_lambda_env_local.json --event events/my_test_entity_a1.json
Your Lambda running locally in the SAM Docker container can access the RDS development database and call AWS services using the AWS profile that you provide using aws-vault.
This development cycle is a lot slower than running your application logic directly in the Python interpreter: sam build takes some 10 seconds, and sam local invoke takes another 10 seconds on my machine. Therefore, I do most of the development in the Python interpreter, and just occasionally test the application logic in the SAM context.
Finally, the slowest method is to build the Lambda using SAM (just 10 seconds), deploy the Lambda package to AWS using SAM (about 1 minute), and test it. The testing phase in my case takes some 6 minutes: use the AWS CLI to send an SNS message, which triggers an SQS event, which finally triggers the Lambda under development. So, after 6 minutes I can go to the CloudWatch logs to see if the test was successful. The development cycle is some 7 minutes altogether. Since the development cycle is several minutes, I just occasionally test the Lambda in the real AWS context to verify that it is still working.
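The trigger itself can also be scripted; here is a minimal boto3 sketch (the topic ARN and the message body are hypothetical placeholders):
import json
import boto3

sns = boto3.client("sns")

# Publish a test message; it eventually triggers the Lambda via SQS.
sns.publish(
    TopicArn="arn:aws:sns:eu-west-1:999999999999:my-entity-topic",  # hypothetical
    Message=json.dumps({"entity": "MY_CRYPTIC_ENTITY_XXX_YYY_20220608T043249"}),
)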
This chapter is not related to the content of this blog post, but whenever I do anything other than Clojure development, I miss the Clojure REPL. Doing Python development like this, running the file from the terminal, is ok, but it is nothing like interacting with the Clojure REPL in the editor, evaluating S-expressions. I have tried various Python REPLs, but they are nothing compared to a real Clojure REPL, so with Python, I mostly just run my Python code from the terminal and occasionally start the Python terminal REPL, e.g. to try some standard library function.
When developing AWS Lambda code, you have several options to speed up the development cycle - if you are using Python, you should develop most of the application logic using just the Python interpreter.
The writer is working at a major international IT corporation building cloud infrastructures and implementing applications on top of those infrastructures.
Kari Marttila
Airflow Standalone UI.
In my previous blog post AWS Step Functions - First Impressions I wrote about AWS Step Functions - an AWS native orchestration service that lets you combine various AWS services as part of your process. In this new blog post I compare AWS Step Functions and Amazon Managed Workflows for Apache Airflow.
Amazon Managed Workflows for Apache Airflow is an orchestration service like AWS Step Functions that lets you define complex processes comprising individual tasks (like steps in Step Functions). You can define standard Airflow tasks, or use various operators (like the Python operator, Bash operator, AWS Lambda operator, AWS ECS Task operator, etc.). On AWS, you can define the same process using either Step Functions or AWS Managed Airflow.
In my current project, I have used both AWS Step Functions and AWS Managed Airflow to define complex processes. In the following chapters I explain some differences between the tools, to help you decide which one to use on AWS.
I implemented both the AWS Managed Airflow and the AWS Step Functions infrastructure using AWS Serverless Application Model (which is basically AWS CloudFormation; you can read more about using SAM in my previous blog post Using AWS Serverless Application Model (SAM) First Impressions).
Both services were a bit complex to set up using SAM/CloudFormation, but not too difficult. There are good tutorials and examples for both services; you should utilize them.
When I now compare the two SAM configurations, I see that there are some 40 lines of code in the Step Functions template.yaml file (relating directly to setting up the Step Functions service), and some 160 lines of code in the AWS Managed Airflow template.yaml file (relating directly to setting up the AWS Managed Airflow service). Here is a short list of the resources you need to set up on both sides:
AWS Step Functions:
- AWS::Serverless::StateMachine, and its DefinitionSubstitutions, Logging, and Policies.

AWS Managed Airflow:
- AWS::S3::Bucket for AWS Managed Airflow.
- AWS::IAM::Role for Airflow Execution.
- AWS::IAM::ManagedPolicy for Airflow Execution.
- AWS::EC2::SecurityGroup and AWS::EC2::SecurityGroupIngress for Airflow access.
- AWS::MWAA::Environment, and its PolicyDocument (allow access to the bucket, accessing CloudWatch logs, some SQS management, some KMS management, etc.): the actual AWS Managed Airflow resource.

So, there is a bit more to set up on the Airflow side, but that is a one-time task, and not too difficult using the tutorials and examples.
The main difference between AWS Managed Airflow and AWS Step Functions is that with AWS Step Functions you define your process as a JSON document. There is a nice GUI editor as well, and you should use it when designing the first crude version of your process.
The main advantage of using AWS Managed Airflow is that you can use Python to define your process and tasks. The overall view of the process is a lot easier to grasp in one view on the Airflow side compared to reading the whole long AWS Step Functions JSON file. This is an example of the directed acyclic graph (DAG) code that Airflow uses to define the process:
start_process >> prepare_lists
prepare_lists >> start_parallel_processing
start_parallel_processing >> [branch_a_ids, branch_b_ids]
branch_a_ids >> [skip_branch_a_ids, process_branch_a_ids]
branch_b_ids >> [skip_branch_b_ids, process_branch_b_ids]
skip_branch_a_ids >> join_branch_a_ids
skip_branch_b_ids >> join_branch_b_ids
process_branch_a_ids >> join_branch_a_ids
process_branch_b_ids >> join_branch_b_ids
join_branch_a_ids >> join_start_parallel_processing
join_branch_b_ids >> join_start_parallel_processing
join_start_parallel_processing >> end_process
If you look at this code snippet and the picture at the beginning of this blog post, you immediately see what is happening in the Python DAG code.
Personally, I think it was a lot easier to define the process using Airflow than using Step Functions.
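For reference, the task names in the snippet above are just Python objects. A minimal runnable skeleton could look like this (a sketch assuming a recent Airflow 2.x; the DAG id and the EmptyOperator placeholders are mine):
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

with DAG(dag_id="demo-dag", start_date=datetime(2023, 1, 1), schedule=None) as dag:
    start_process = EmptyOperator(task_id="start-process")
    prepare_lists = EmptyOperator(task_id="prepare-lists")
    end_process = EmptyOperator(task_id="end-process")

    # Dependencies are declared with the >> operator, as in the snippet above.
    start_process >> prepare_lists >> end_process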
When using Step Functions you define the steps as part of the JSON process definition. Here is an example defining a step that calls an AWS Lambda:
"my-lambda": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"Parameters": {
"Payload.$": "$",
"FunctionName": "${MyLambda}"
},
"Next": "end-my-lambda-pass",
"ResultPath": "$.fetch-lambda"
}
And here is the same definition using AWS Managed Airflow:
my_lambda = LambdaInvokeFunctionOperator(
    task_id='my-lambda',
    function_name=AIRFLOW_VAR_MY_LAMBDA_NAME,
    payload="\{\{ti.xcom_pull(task_ids='start-process', key='payload_my_lambda')}}",
    dag=dag,
)
I think it is more natural to define the tasks on the Airflow side since you can use a real programming language. (NOTE: I had to write \{\{ instead of just two curly braces since the markdown file did not show the line without the backslashes, i.e. the backslashes are not actually needed in the Python code. Airflow uses Jinja templates in the operators.)
This is where Airflow really shines compared to Step Functions. When using Step Functions you are restricted to the JSON operators and the so-called intrinsic functions provided by the Step Functions environment. If you need to do some data manipulation between the steps and the Step Functions environment does not provide a method for that kind of manipulation, you have to create a custom AWS Lambda to do it, which is really awkward. For example, I just couldn’t find a way to merge two lists using the Step Functions data manipulation methods. Therefore I had to set up a whole AWS Lambda infrastructure to pass two lists to the Lambda, let the Lambda do a one-line merge operation, and return the merged list.
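The body of that merge Lambda was essentially a one-liner, something like this sketch (the event key names here are hypothetical):
# A sketch of the list-merging Lambda; the event key names are hypothetical.
def lambda_handler(event, context):
    return {"merged_ids": event["branch_a_ids"] + event["branch_b_ids"]}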
Using Airflow you can just use Python to do simple things like merging two lists:
@task(task_id='prepare-lists')
def prepare_lists_task(dag_run=None, ti=None):
    print('Starting task: prepare-lists')
    branch_a_ids = dag_run.conf.get('branch_a_ids')
    branch_b_ids = dag_run.conf.get('branch_b_ids')
    merged_ids = branch_a_ids + branch_b_ids
    ti.xcom_push(key='merged_ids', value=merged_ids)
    return merged_ids
Using Airflow, you pass data between tasks using XComs. The code snippet above shows an example of how to publish data from a task: ti.xcom_push(key='merged_ids', value=merged_ids). You can pull this data from other tasks like this: merged_list = ti.xcom_pull(task_ids='prepare-lists', key='merged_ids').
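For example, a downstream task could pull the merged list like this (a sketch in the same TaskFlow style as the snippet above; the task id is hypothetical):
from airflow.decorators import task

@task(task_id='process-merged-list')
def process_merged_list_task(ti=None):
    # Pull the list that the prepare-lists task published above.
    merged_ids = ti.xcom_pull(task_ids='prepare-lists', key='merged_ids')
    print(f'Processing {len(merged_ids)} ids')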
AWS provides a good user interface for both services. The AWS Managed Airflow GUI is pretty much the same as the Airflow Standalone GUI you can see at the beginning of this blog post.
Personally, I prefer the Airflow developer experience over Step Functions. The process definition is nicer to implement using Python, and you can use Python in the tasks as well.
Another major advantage is Airflow Standalone, which you can use to implement the overall version of your process locally, and then continue implementing the AWS related operators on the AWS Managed Airflow side.
You can easily send events to the Airflow Standalone, example:
airflow dags trigger -c "{\"flag\": \"kari-debug\", \"branch_a_ids\": [111, 222], \"branch_b_ids\":[]}" kari-demo-dag
When you edit the DAG files in Airflow Standalone, the server immediately updates the DAGs. This is a major pro for the Airflow developer experience. Updating the definitions for both AWS Managed Airflow and AWS Step Functions requires a considerably longer feedback cycle.
There are other considerations than just the developer experience, however.
Costs. AWS Step Functions is a serverless service and you don’t pay infrastructure costs - you just pay for running the processes on the service. For AWS Managed Airflow, you have to create the Airflow infrastructure, which incurs development costs. Running the infrastructure, even if you don’t run any processes on it, also has considerable costs. And most probably you want to have exact copies of your infrastructure for various environments: development, developer testing, CI testing, performance testing, customer testing, and production - you need to set up a rather expensive service infrastructure for all those environments. I recommend consulting the AWS Pricing Calculator for AWS Managed Airflow costs - the costs depend on your needs (a more performant environment costs more).
Security. This is a major disadvantage of AWS Managed Airflow compared to AWS Step Functions. With Step Functions, every process is its own entity with its own process executor, i.e., you can assign a separate IAM Role for running each process in AWS Step Functions. Not so with AWS Managed Airflow. With AWS Managed Airflow, you create one Airflow Executor IAM Role, and you run all your DAGs (Airflow processes) with this IAM Role. For some Airflow DAGs this is not a problem. E.g., if you compose your process using only ECS operators, your ECS Task Definitions have dedicated IAM Roles (least privilege: rights to access only those resources the ECS task needs to access). But with the Python operator and most of the other Airflow operators, you run everything with the same common Airflow Executor IAM Role. This is a bit of a nuisance and it also violates the principle of least privilege. E.g., if you have separate teams and those teams create DAGs of their own, all the teams use the same Airflow Executor IAM Role. The consequence is that the teams can e.g. see each others’ S3 buckets. You can create more AWS Managed Airflow environments - but that solution generates more expenses since the service has considerable costs.
If the costs or the security issues prevent you from using AWS Managed Airflow, you can always use AWS Step Functions - both services are good for complex workflows.
If you need to create complex processes on AWS you have two excellent services to choose from: Amazon Managed Workflows for Apache Airflow, and AWS Step Functions.
The writer is working at a major international IT corporation building cloud infrastructures and implementing applications on top of those infrastructures.
Kari Marttila
My Dygma Raise Keyboard.
This is a continuation blog post to my earlier keyboard related blog post Dygma Raise Keyboard Reflections Part 1 in which I explained how I configured my Linux keyboard using xmodmap and Dygma Bazecor.
In this new blog post I explain how I solved my keyboard issue when migrating from Xorg to Wayland.
A week ago after office hours, I thought I’d update my Ubuntu 22 Linux packages. There were quite a few package updates and one firmware update. I updated them all, without booting the machine before the firmware update. A big mistake. Or not. Anyway. I saw that e.g. the NVIDIA graphics card driver got updated from version 525 to 535. I booted my machine, and my Linux first lagged considerably with my three external monitors, and then completely froze. I rebooted Linux; the same thing. And again. Ok, I thought that the new NVIDIA drivers had something to do with it, so I downgraded back to 525. Didn’t help. My machine also has an integrated Intel graphics card; I tried it, didn’t help. I spent one day figuring out the issue, but nothing helped. Finally, I completely re-installed Ubuntu 22. Didn’t help. I was pretty pissed off. Then I tried Wayland, and everything worked just fine. Perhaps it was some issue with Xorg and/or the NVIDIA drivers?
Ok, now I knew a perfectly good workaround: just ignore the Xorg(/NVIDIA) issue and use the Intel graphics card with Wayland.
But. There was one major issue migrating to Wayland: all my Linux keyboard configurations were done using Xorg.
I am a programmer and I have heavily modified my Finnish keyboard layout so that programming with it is as smooth as possible (read more about the specific modifications in my previous blog post: Dygma Raise Keyboard Reflections Part 1).
I googled what kind of solutions there are for configuring a Linux keyboard with Wayland, and I found some good candidates:
The strongest candidates seemed to be keyd and kmonad.
I kind of liked kmonad, but the installation seemed a bit tedious on Ubuntu, and the configuration seemed more complex than with keyd. So, I chose keyd.
The installation of keyd was really easy (just a copy-paste from the keyd web site):
git clone https://github.com/rvaiya/keyd
cd keyd
make && sudo make install
sudo systemctl enable keyd && sudo systemctl start keyd
NOTE: I later realized this piece of instruction: “Note: master serves as the development branch, things may occasionally break between releases. Releases are tagged, and should be considered stable.” I installed from the master branch; you might want to check out the latest release tag first.
I had this kind of configuration process:
1. sudo emacs /etc/keyd/default.conf - i.e., use your editor of choice to write the keyd configuration file.
2. sudo journalctl -eu keyd -f - keep the journalctl keyd messages flowing in one terminal so that you see if there are errors during reload (see terminal #3).
3. sudo keyd reload - reload the configuration after changes.
4. sudo keyd monitor - use the monitor to check the keys you need to refer to in your keyd configuration.

This way it is pretty easy to experiment with various keyboard layout configurations and reload the new version of your keyboard layout (and immediately see all errors and warnings keyd emits).
You might also want to initialize the /etc/keyd directory as a git repo - this way you can keep it in version control and use the change history to resolve issues later on.
Let’s see my current keyd configuration:
# NOTE: The panic sequence *<backspace>+<escape>+<enter>* will force keyd to terminate.
### NOTE:
# This file needs to be in:
# /etc/keyd/default.conf
# If you are reading the ~/info/default.conf file, it is a backup file.
# NOTE: Remember to add changes in /etc/keyd to git!
### HELP SECTION BEGIN ###
# Web site: https://github.com/rvaiya/keyd
# See also Finnish keyboard layout:
# https://www.vero.fi/en/About-us/contact-us/efil/information-on-mytax/less-commonly-used-latin-characters-on-a-finnish-pc-keyboard/
##### KEYD #################
# When you edit /etc/keyd/default.conf file:
# cd /etc/keyd
# cn # Open VSCode
# In another terminal:
# sudo journalctl -eu keyd -f
# ... this way you immediately see if there is some error,
# when you load new configuration (after saving the file):
# sudo keyd reload
# Finally when you are ready, in the /etc/keyd: add changes to git.
# Find keys with tool:
# sudo keyd monitor
# List keys that you can use with keyd:
# sudo keyd list-keys
# Modifiers:
# C - Control
# M - Meta/Super
# A - Alt
# S - Shift
# G - AltGr
### HELP SECTION END ###
### TODO BEGIN ###
# No issues at the moment.
### TODO END ###
###### KEYBOARD IDENTIFICATORS #####
# NOTE: Later on if you find out that some configuration does not work on Dygma vs Laptop keyboard,
# you can use separate ids sections.
[ids]
*
###### MAIN #####
[main]
# Special characters.
# Using sudo keyd monitor: tilde key in Finnish layout is: ]
# Tilde (~) (diaeresis key (diaeresis is next to Å))
# Thanks tkna!
] = macro(G-] space)
# Tick (forward tick) (´)
# Tick is next to backspace button.
= = macro(= =)
###### CAPSLOCK AS ALTGR #####
capslock = layer(capslock)
[capslock:G]
# Navigation.
j = left
k = down
l = right
i = up
h = home
# ; is ö key in the Finnish keyboard.
; = end
u = pageup
o = pagedown
# Functionalities.
d = delete
f = backspace
# Not needed.
# You get @ { [ ] } etc, by capslock+2, capslock+7, etc.
#2 = @
###### SHIFT #####
# NOTE: Shift needs to be here after my navigation configurations!
# (or it breaks navigation configuration above.)
[shift]
# Caret (^)
# So, you get caret (^) by clicking shift+diaeresis (diaeresis is next to Å)
] = macro(S-] space)
# Backtick
# So, you get backtick (`) by clicking shift+tick (next to backspace)
= = macro(S-= space)
# TESTING AREA
# {[]}\ @£$89[]}\|||
# ~~~¯¯¯^^^
# ´
#
As you can see, most of the stuff is just my personal notes to help me continue the configuration later on (it is nice to have the instructions in the configuration file itself). If you remove all the comments, you get:
[ids]
*
[main]
] = macro(G-] space)
= = macro(= =)
capslock = layer(capslock)
[capslock:G]
j = left
k = down
l = right
i = up
h = home
; = end
u = pageup
o = pagedown
d = delete
f = backspace
[shift]
] = macro(S-] space)
= = macro(S-= space)
… just some 20 lines!
Compared to my previous xmodmap configuration this is really concise!
I listed some TODOs for myself in the configuration file to figure out later. I managed to fix all those issues, so there are no issues at the moment. :-)
Except. The VSCode integrated terminal. I just couldn’t get it to use the keyd configuration. The workaround is to use the Emacs binding hotkeys in the VSCode integrated terminal (e.g. ctrl+p for up arrow in bash history, etc.). Luckily, the VSCode editor area works with the keyd configuration just fine. I’m pretty sure this is not a keyd issue per se, but possibly some weird XWayland / VSCode issue. If you know a solution to this issue, send me email.
I already explained in my previous blog post that with the Finnish layout you get { [ ] } and some other special characters needed in programming by clicking the so-called AltGr key and some other key at the same time. But the really weird design decision is that the AltGr key is located next to the spacebar key (right side), and those other keys are usually in the upper row on the right side of the keyboard. This makes it pretty unergonomic to emit those keys: you have to press AltGr with the thumb of your right hand and then reach with e.g. your right hand middle finger to the upper row of the keyboard. If you are a Finnish programmer, you definitely want to change the CapsLock key (which is totally useless in programming) to work as the AltGr key. I did this in Xorg using xmodmap (read my previous blog post about that: Dygma Raise Keyboard Reflections Part 1).
But. When I was experimenting with keyd, I had some issues figuring out how to switch CapsLock to AltGr. I checked the keyd web site to see if there was a keyd community to ask for help. And there is: the IRC channel #keyd on OFTC.
I explained my issue there, and a very nice Japanese guy with the nickname tkna helped me a lot. I can’t believe how helpful he was. We spent about an hour together, and he practically solved all my hard issues, or found a good workaround for the issues we couldn’t solve. I was really happy about his help and donated some money to his tkna91 ko-fi account. So, thanks a lot once more, tkna!
At the same time I also donated some money to the keyd author rvaiya, to his rvaiya ko-fi account.
If you are using Linux and migrating from Xorg to Wayland, and you want to ditch your old Xorg xmodmap configuration, I truly recommend trying keyd. The keyd tool is really easy to install and really easy to configure. Actually, you can use keyd with Xorg as well, so you don’t need to migrate to Wayland to ditch xmodmap.
But, remember to donate some money to the keyd author rvaiya to help the author to continue working with this great tool. And if you receive help from other keyd users (like tkna91), remember to donate some money to them as well - it is the right thing to do.
The writer is working at a major international IT corporation building cloud infrastructures and implementing applications on top of those infrastructures.
Kari Marttila
The players: AWS EventBridge Scheduler, IAM Role and ECS Task.
In my previous blog post Cloud Infrastructure Golden Rules I gave two cloud infrastructure golden rules:
I also wrote later that you are allowed to use the AWS Console in certain situations. In this blog post I describe one of those occasions, in which you can use the AWS Console to make your life as a cloud infrastructure programmer a bit easier.
Sometimes you can make your life a bit easier by using the AWS Console and its wizards to create AWS resources. But you don’t use these resources in production. You just create the resources to examine what kind of resources the AWS wizard creates, and then replicate them using your cloud infrastructure tool of choice. I.e.: in production, all your cloud infrastructure must reflect your infrastructure code.
Let’s use an example. The other day I needed to create an Amazon EventBridge Scheduler schedule to trigger an ECS Task. I read the documentation and understood that I needed to create an AWS IAM Role for the scheduler.
To make my life easier, I opened the AWS Console and navigated to the Amazon EventBridge / Scheduler page. Then I clicked the “Create schedule” button. The wizard started and asked various questions. At the end, there was the permissions page, on which I chose “Create new role for this schedule”. Finally, the wizard showed the selections and provided the “Create schedule” button - I clicked it. The wizard created the schedule, but more importantly it created the role for it - I was particularly interested in the role.
I went to the AWS IAM Roles view and found the new role. In the “Trust relationships” tab it said:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "scheduler.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "99999999999"
        }
      }
    }
  ]
}
(NOTE: the account is anonymized to “99999999999”, also in the next code examples.)
In the “Permissions” tab it provided an inline policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:RunTask"
      ],
      "Resource": [
        "arn:aws:ecs:eu-west-1:99999999999:task-definition/my-test-api-ecs-task-def:*",
        "arn:aws:ecs:eu-west-1:99999999999:task-definition/my-test-api-ecs-task-def"
      ],
      "Condition": {
        "ArnLike": {
          "ecs:cluster": "arn:aws:ecs:eu-west-1:99999999999:cluster/my-test-api-ecs-cluster-fargate"
        }
      }
    },
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": [
        "*"
      ],
      "Condition": {
        "StringLike": {
          "iam:PassedToService": "ecs-tasks.amazonaws.com"
        }
      }
    }
  ]
}
This inline policy contains quite a bit of information, and it could have been a bit tedious to find the right configuration just by reading the documentation.
Next, I wrote the SAM template using the example the wizard had created for me:
...
ApiSchedulerRole:
  Type: AWS::IAM::Role
  Properties:
    AssumeRolePolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Principal:
            Service:
              - scheduler.amazonaws.com
          Action:
            - sts:AssumeRole
    Policies:
      - PolicyName: ApiSchedulerRolePolicy
        PolicyDocument:
          Version: 2012-10-17
          Statement:
            - Effect: Allow
              Action:
                - "ecs:RunTask"
              Resource:
                - !Ref ApiECSTaskDefinition
            - Effect: Allow
              Action:
                - "sqs:SendMessage"
              Resource:
                - !GetAtt ApiDeadLetterQueue.Arn
            - Effect: Allow
              Action:
                - "iam:PassRole"
              Resource:
                - "*"
              Condition:
                StringLike:
                  iam:PassedToService:
                    - ecs-tasks.amazonaws.com
...
As you can see, it is quite straightforward to create the infrastructure code from the working example.
And once you have created the actual resource using your IaC tool, you can compare the two resources: the one created with the AWS wizard, and the one created by your infrastructure code - they should be pretty identical.
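You can even script a rough comparison. Here is a small boto3 sketch (both role names are hypothetical placeholders) that prints the trust policies of the two roles side by side:
import json
import boto3

iam = boto3.client("iam")

# Hypothetical role names: the wizard-created role and the SAM-created role.
for role_name in ["Amazon_EventBridge_Scheduler_ECS_WIZARD", "my-stack-ApiSchedulerRole"]:
    role = iam.get_role(RoleName=role_name)["Role"]
    print(role_name)
    print(json.dumps(role["AssumeRolePolicyDocument"], indent=2))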
Sometimes using the AWS Console wizards is the way to go to make your life a bit easier as a cloud developer.
The writer is working at a major international IT corporation building cloud infrastructures and implementing applications on top of those infrastructures.
Kari Marttila
AWS Step Functions Workflow Studio.
As I mentioned in my previous blog post, I joined a new interesting project a couple of months ago. In this new project, I was asked to help the domain specialists implement a long-lasting process that comprises a couple of ECS tasks and one Lambda, a bit like the demonstration process in the graph above that I created for this blog post.
The domain specialists gave me the SQL query for the Lambda and I implemented the rest of the Lambda. We agreed with the domain specialists on the command line parameters of the ECS tasks. While I waited for them to implement the Docker images for the ECS tasks, I implemented dummy versions of the Docker images with the agreed contract - the dummy containers just echo the arguments passed to the container. This way I could develop the overall process without having to wait for the actual Docker images to be ready.
AWS Step Functions is an orchestration service that lets you combine various AWS services as part of your process. The process can comprise Lambdas, Elastic Container Service, and some 200 other AWS services to choose from.
If you have a process which has various tasks, and some of the tasks can be executed in parallel (as the ECS RunTask1 and Lambda in the graph above), some of the tasks are dependent on the results of the previous tasks, and some tasks need to be mapped to some result list values (as the Map step and its ECS RunTask2 in the graph above), then AWS Step Functions is a good choice to implement the process.
With AWS Step Functions you can develop the process using a nice visual workflow modeller (as depicted in the picture above). And once your process is ready, you can export the process definition as a JSON file and use that JSON file to create the process in AWS Step Functions, and automate the infrastructure using e.g. AWS SAM (see my previous blog post: Using AWS Serverless Application Model (SAM) First Impressions).
You create your process using the off-the-shelf components (pass, parallel, map, choice), and integrate AWS services into the process. The main idea is that you pass JSON object from one step to another, and each step modifies the JSON object in some way (e.g. enriching it with new data, or filtering some data out of it).
Since the JSON object plays such a central role in the process, it is important to understand how to manipulate it. There is one important hard limit on the JSON object, and Step Functions provides various methods to manipulate it.
On paper, this all sounds nice. But in practice, it is far from easy. There are certain peculiarities that you need to take into consideration.
One of the main limits is that you can only pass a JSON object with a maximum size of 256 KB - and that is not much. If you have a process that handles a lot of data, a best practice is to store the JSON object in an S3 bucket and let the next step fetch it from there.
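As a sketch of that pattern (the bucket name and the event key are hypothetical), a step implemented as a Lambda could write its large result to S3 and pass only a pointer onward:
import json
import uuid

import boto3

s3 = boto3.client("s3")
BUCKET = "my-stepfunctions-intermediate-results"  # hypothetical bucket name

def lambda_handler(event, context):
    # Store the large result in S3 and pass only a pointer to the next step.
    key = f"results/{uuid.uuid4()}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(event["large_result"]))
    return {"result_bucket": BUCKET, "result_key": key}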
Another solution is to provide in the JSON object only the data that is needed in the next step. If you have a list of objects, e.g.
{"mylist": [{"id": 1, "name": "Kari", "occupation": "IT specialist", "address": ...},
{"id": 2, "name": "Matti" ...}
...
]
}
… you might be able to fetch the name, occupation, and address fields in the actual step, and just pass a list of ids:
{"mylist": [1, 2, 3, 4, 5, ...]}
However, you may want to use this method only if you are 100% sure that the list of ids will never exceed the 256 KB limit. Walk on the safe side of the street and just store the intermediate JSON results in S3.
For more information regarding how to use the Map state, see Map state processing modes - use Distributed mode if you have a lot of data.
Step Functions also provides Input and Output Processing. You can filter the input JSON object and pass only the fields that are needed in the next step. You can enrich the JSON object with new fields, and do other kinds of manipulation with it. Check out the documentation for more details.
You can inject values from your AWS SAM template into the Step functions process definition using substitutions.
Example: you have a SAM IaC automation. In that automation you have a samconfig.toml file which comprises the parameters for each environment you use, e.g. the subnets for your ECS tasks:
"SubnetIds=subnet-xxxxxxxxxxxx,subnet-yyyyyyyyyyy",
Then in your SAM template.yaml file you have the parameter declaration, and a block for the Step Functions parameter substitutions:
SubnetIds:
  Type: List<AWS::EC2::Subnet::Id>
  Description: List of subnet ids
...
MyStepFunctionsProcess:
  Type: AWS::Serverless::StateMachine
  Properties:
    DefinitionSubstitutions:
      ECSClusterArn: !Ref ECSClusterArn
      ECSSubnetIds: !Join [ ",", !Ref SubnetIds ]
...
Here comes the tricky part. You have first converted the comma-separated list of subnets into an array (since you need it as an array elsewhere in the template.yaml file), and now you convert the array back into a comma-separated string and inject it, using a Step Functions substitution, into your process definition JSON file.
But in your Step Functions process definition JSON file you need it as an array once again. You cannot do it with an intrinsic function right away (see the next chapter) since you need the substitution to be done first. So, do the substitution in one step, and then convert the string to an array in the next, actual ECS step:
"substitute-ecs-parameters-pass": {
"Type": "Pass",
"Next": "ecs-task",
"Parameters": {
"subnets": "${ECSSubnetIds}",
...
"ecs-task": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"LaunchType": "FARGATE",
"Cluster": "${ECSClusterArn}",
"TaskDefinition": "${MyECSTaskDefinitionArn}",
"NetworkConfiguration": {
"AwsvpcConfiguration": {
"Subnets.$": "States.StringSplit($.subnets, ',')",
"SecurityGroups.$": "States.StringSplit($.securityGroups, ',')",
"AssignPublicIp": "ENABLED"
}
},
...
A bit of conversions, but it works.
You can also use intrinsic functions to manipulate the JSON object. For example, you can use States.Format to format the JSON object as a string, and States.Array to create an array of objects. Once again, check out the documentation for more details. See the example of using "Subnets.$": "States.StringSplit($.subnets, ',')" in the previous chapter.
Since it is a bit awkward to use these rudimentary JSON manipulation methods, AWS provides a Data Flow Simulator for the JSON conversions, so you don’t have to test the manipulation results by running your Step Functions process. I strongly recommend developing your process using the Step Functions Workflow Studio and the Data Flow Simulator, since it is really time-consuming to test your process by running it in AWS for every little JSON manipulation change.
The context tells you e.g. the name of the process, the execution id, the timestamp, and the JSON object that triggered the process. So, you don’t need to keep carrying the initial data in the JSON object that is passed from one step to another - you can always fetch those values from the context in any step.
I already told you the most important trick: use the Step Functions Workflow Studio and the Data Flow Simulator and save a lot of time. Try your JSON manipulations in the Data Flow Simulator, and only when you are sure that they work as expected, add the manipulation to your process definition and test it by running your process in AWS.
Use the AWS CLI to start execution of your Step Functions process. For example:
aws-vault exec MY-AWS-PROFILE --no-session -- aws stepfunctions start-execution --state-machine-arn arn:aws:states:eu-west-1:999999999999:stateMachine:kari-testing-step-functions --input "{\"value1\" : \"this-is-value-1\", \"my_ids\": [1, 2, 3]}"
… and save time by not opening the AWS Console and copy-pasting the JSON input into it every time.
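If you prefer scripting your test runs in Python, the boto3 equivalent is a few lines (a sketch using the same state machine ARN and input as the CLI example above):
import json
import boto3

sfn = boto3.client("stepfunctions")

# Start an execution with the same test input as the CLI example above.
response = sfn.start_execution(
    stateMachineArn="arn:aws:states:eu-west-1:999999999999:stateMachine:kari-testing-step-functions",
    input=json.dumps({"value1": "this-is-value-1", "my_ids": [1, 2, 3]}),
)
print(response["executionArn"])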
Use temporary development steps e.g. to examine the results or the context, or to truncate the long list of ids from the previous step to a more manageable list for development purposes (if your next step spins up an ECS task for each id). Name your temporary steps so that you can easily find and remove them later (see the graph above: TEMPORARY-truncate-list-to-3-items).
Create commands in the previous step. The diagram above has a “Create command” step before the ECS steps. When implementing the Step Functions process I realized that it was really painful to manipulate the ContainerOverrides parameter in the ECS step. It was a lot easier to create the command that is injected into the container in the previous step, like:
"command.$": "States.Array(States.Format('{}', $), $$.Execution.Input.someValue, '--some_list', States.Format('{}', $$.Execution.Input.some_list))"
and then inject this command in the ECS step:
"ContainerOverrides": [
{
"Name": "kari-test-container",
"Command.$": "$.command"
}
Step Functions is an excellent tool for describing and implementing a long-lasting complex process. The other side of the coin is that Step Functions has a steep learning curve, and manipulating the JSON object between steps can at times be quite awkward. If you have simpler processes, there are other options. For example:
AWS Step Functions is an excellent tool. But remember to allocate enough time to learn it and develop your first process, since the learning curve is rather steep. Save Step Functions for complex processes and use lighter methods for simpler processes.
The writer is working at a major international IT corporation building cloud infrastructures and implementing applications on top of those infrastructures.
Kari Marttila
Python logo.
I joined an interesting new project about a month ago. The project uses Python as the main programming language. I have been using Python for more than 20 years, but mostly for various utility-type scripting (read more in my blog posts Python Rocks! Java Man Converts to Python, and Writing Machine Learning Solutions — First Impressions). Recently, I have been using Clojure, and especially with Babashka, Clojure has mostly replaced Python as my scripting language of choice. Clojure has an excellent editor/REPL integration for practically every mainstream editor. Nowadays I mostly use VSCode, and for Clojure I use the excellent Calva VSCode extension to provide the editor/REPL integration.
Now that I’m forced to use Python, I really miss Clojure’s excellent REPL. You may say that there has always been a REPL in Python: you just launch the Python interpreter and start writing code. But it’s not the same thing. Clojure provides excellent editor/REPL integration - you can launch a Clojure REPL process, connect to that process from your editor, and from the editor send S-expressions to be evaluated in the REPL and get the results back into the editor. That’s what I’m missing in Python.
So, I decided to spend some time searching for a good editor/REPL integration for Python. I found a couple of interesting solutions, and I decided to try them out.
If you plan to use the standard Python interpreter, then I recommend using bpython instead. It’s not an actual editor/REPL integration solution, but at least it provides a bit better syntax highlighting and autocompletion than the standard Python interpreter.
Astrapios Python REPL is a VSCode extension which provides an editor/REPL integration for Python. It uses IPython as the Python REPL, so you need to install IPython in your Python environment: pip install ipython.
It’s not a perfect solution, but it’s the best I have found so far. Astrapios Python REPL starts quickly. I have configured the same hotkeys for sending the selected code / current line (alt+l), and for sending the whole file (alt+m) to be evaluated in the REPL, as I have in Clojure - less cognitive burden when your muscle memory handles semantically similar things the same way in the editor.
// Clojure
{
  "key": "alt+l",
  "command": "calva.evaluateSelection",
  "when": "calva:connected && calva:keybindingsEnabled"
},
{
  "key": "alt+m",
  "command": "calva.loadFile",
  "when": "calva:connected && calva:keybindingsEnabled"
},
...
// Python
{ // Send the selected code or current line to REPL
  "key": "alt+l",
  "command": "pythonREPL.sendSelected",
  "when": "editorLangId == 'python'"
},
{ // Send all file contents to REPL
  "key": "alt+m",
  "command": "pythonREPL.sendFileContents",
  "when": "editorLangId == 'python'"
}
For heavier stuff, you might want to try Jupyter Notebooks in VSCode. First install Jupyter Notebook, see Installing Jupyter. Basically, you just install it using pip: pip3 install notebook. Then install the Jupyter extension for VSCode.
Start the Jupyter Notebook server from the command line:
jupyter notebook
In VSCode, create a new Jupyter notebook file, e.g. myjupyter.ipynb. There is a Select Kernel button in the top right corner of the editor; click it. Choose Existing Jupyter Server..., and enter the URL that you got in the terminal when you started the Jupyter Notebook server. The extension shows you Python3 (ipykernel) as the kernel; choose it. Then you can start writing Python code in the notebook cells and evaluate the cells using the play icons.
When you launched the Jupyter Notebook server, it also opened a browser window with the Jupyter Notebook UI. You can use that UI as well to create new notebooks and open existing ones. The VSCode Jupyter extension provides a better editor experience, though.
Here are my recommendations.
The writer is working at a major international IT corporation building cloud infrastructures and implementing applications on top of those infrastructures.
Kari Marttila
AWS Serverless Application Model.
In my earlier blog post Using Serverless Framework - First Impressions, I shared my experiences with the Serverless Framework - an excellent multi-cloud serverless IaC tool. Besides the Serverless Framework, I also have extensive experience using Terraform and Pulumi - you might want to read those articles as well.
I started working in a new project, and I have been helping domain specialists with certain cloud implementations using the AWS cloud. The team uses AWS Serverless Application Model (SAM) as its IaC tool. In this blog post, I describe my first impressions using SAM - a serverless IaC tool provided by AWS. SAM is not a multi-cloud tool like the Serverless Framework, but if you are working in the AWS cloud, then SAM is an excellent choice.
AWS SAM is a serverless IaC tool provided by AWS. SAM is a domain specific language (DSL) that uses YAML for defining serverless infrastructures. SAM uses AWS CloudFormation under the hood for deployments - CloudFormation is a general purpose IaC tool provided by AWS. AWS SAM provides a command line tool, the SAM CLI, for deploying serverless applications to the AWS cloud, and also for developing those applications locally.
I had to maintain an application which had a Javascript/React frontend and a Python/Flask backend that provided an API for the frontend application. The backend application was deployed to the AWS cloud using SAM.
Working with Python and SAM was pretty nice. You can develop a Python application locally, and once you are ready, you can test the application locally using SAM CLI. I had two terminals open, one for building the SAM backend, and one for running the SAM backend locally.
Terminal 1:
sam build
Terminal 2:
aws-vault exec YOUR_AWS_VAULT_PROFILE --no-session -- sam local start-api --env-vars .env_local.json
(I’m using aws-vault to manage my AWS credentials.)
SAM local development uses the Docker image public.ecr.aws/sam/build-python3.9 behind the curtains. You can pass an environment variable file (.env_local.json) in which you define environment variables for your local development.
Then, when you give the command sam build in terminal #1, SAM automatically reflects your backend application changes in the development server.
Using SAM provides a nice developer experience when implementing the frontend and backend applications at the same time.
SAM Template is a YAML file which defines your serverless infrastructure. If you have been working with AWS CloudFormation then you are familiar with the SAM template.
For deploying your serverless infrastructure to various environments, you can use a samconfig.toml configuration file. In this file you can provide various settings for each of your environments.
Deploying into an environment defined in your samconfig.toml file is as simple as:
aws-vault exec YOUR_AWS_VAULT_PROFILE --no-session -- sam deploy --config-env prod
The deployment creates a CloudFormation Stack into your AWS account. You can use AWS Console to check which resources got created by your SAM deployment.
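You can also list the created resources programmatically; here is a small boto3 sketch (the stack name is a hypothetical placeholder):
import boto3

cfn = boto3.client("cloudformation")

# List the resources of the CloudFormation stack that SAM deployed.
for resource in cfn.describe_stack_resources(StackName="my-sam-app")["StackResources"]:
    print(resource["ResourceType"], resource["LogicalResourceId"], resource["ResourceStatus"])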
SAM is a serverless IaC tool provided by AWS. It is not a multi-cloud tool like the Serverless Framework. If you are working in the AWS cloud, then SAM is an excellent choice. If you are working in a multi-cloud environment, then you might want to use the Serverless Framework.
If you are working in the AWS cloud and you want to quickly create rather independent serverless deployment units, AWS SAM is an excellent choice. If you want to create a larger IaC solution, then you might want to use Terraform or Pulumi.
AWS SAM is an excellent IaC tool if you are working in the AWS cloud and you want to have a tool for quickly creating independent serverless deployments to AWS.
The writer is working at a major international IT corporation building cloud infrastructures and implementing applications on top of those infrastructures.
Kari Marttila
Kubernetes.
In my previous blog post AWS With Federated Identity Using OpenID Connector, I explained how I used OpenID with AWS IAM Role.
In this new blog post, I describe how I linked the previously created IAM Role with a Kubernetes service account.
A Kubernetes Service Account is an entity in a Kubernetes cluster which provides a distinct identity.
To be able to interact with the cloud environment, you need to link the Kubernetes service account to some cloud role resource, e.g. AWS IAM Role.
I needed to do a Kubernetes rolling update to provide a zero-downtime update for the Kubernetes pods to use a new Docker image. The use case is this: we have a CI/CD pipeline (Github Actions) that does the following steps with each commit:
By default, EKS allows Kubernetes API calls only for the IAM user that created the EKS cluster. For another IAM user or role, you need to use a Kubernetes service account and link it with an IAM Role.
Create the service account:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: github-service-account
  namespace: demo1
Then create a Kubernetes role:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: github-role
  namespace: demo1
rules:
  - apiGroups:
...
And a role binding between the service account and the role:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: github-rolebinding
  namespace: demo1
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: github-role
subjects:
  - namespace: demo1
    kind: ServiceAccount
    name: github-service-account
Apply these entities to your kubernetes cluster.
Finally, you need to link the service account with the AWS IAM role that your CI/CD uses, example:
eksctl create iamidentitymapping --cluster $EKS_CLUSTER_NAME --arn $AWS_GITHUB_IAM_ROLE_ARN --username github-service-account --group system:masters
To make things easier for this demo, I used the group system:masters - you might want to create a more restricted group for deployments in a real production environment.
Now you are able to do the rolling update in your CI/CD pipeline:
kubectl rollout restart deployment deployment-demo1 -n demo1
In this blog post I described how to link a Kubernetes service account with an AWS IAM Role, so that you can use the service account to do e.g. rolling updates in your Kubernetes deployments.
The writer is working at a major international IT corporation building cloud infrastructures and implementing applications on top of those infrastructures.
Kari Marttila
OpenID.
I did an exercise in which I implemented AWS EKS using Terraform. In my previous post AWS Load Balancer Controller, I explained how I created a Kubernetes ingress for the solution. In this new blog post, I describe how I used OpenID federated identity to let GitHub Actions assume temporary credentials for an AWS IAM Role.
For a CI/CD pipeline you need to give the pipeline access to your cloud resources. Often you see a solution in which a developer creates a dedicated CI/CD AWS IAM user and then configures that user’s credentials permanently into the CI/CD pipeline. Even though most CI/CD systems provide various secret stores, storing permanent credentials in a CI/CD machine is not considered a best practice.
What if you could provide some kind of federated identity solution, in which both parties agree to trust each other? This is exactly how you can use OpenID with AWS IAM Role.
The high-level flow goes like this:
Let’s see this in real life.
There are good instructions on the web for how to do this. First you need to create the OpenID connector:
resource "aws_iam_openid_connect_provider" "github_openid_connect_provider" {
url = "https://token.actions.githubusercontent.com"
client_id_list = ["sts.amazonaws.com"]
thumbprint_list = [
"6938fd4d98bab03faadb97b34396831e3780aea1"
]
}
The thumbprint relates to Github well-known openid-configuration.
Then provide the IAM policy document:
data "aws_iam_policy_document" "github_actions_assume_iam_role_policy_document" {
statement {
effect = "Allow"
actions = ["sts:AssumeRoleWithWebIdentity"]
principals {
type = "Federated"
identifiers = [aws_iam_openid_connect_provider.github_openid_connect_provider.arn]
}
condition {
test = "StringEquals"
variable = "token.actions.githubusercontent.com:aud"
values = ["sts.amazonaws.com"]
}
condition {
test = "StringLike"
variable = "token.actions.githubusercontent.com:sub"
values = var.github-repositories
}
}
}
Note this line: values = var.github-repositories. You can give the right to assume this IAM role only to specific Github repositories - no other party can assume the IAM role.
Then the last step is to create the actual IAM role, give it the permissions you need (e.g. to push images to ECR), and bind the above mentioned policy to the IAM role.
It is quite straightforward to use the above mentioned OpenID federated identity in Github Actions. You just need to store the AWS IAM role ARN as a secret in your Github Actions configuration. Storing the ARN itself does not pose much of a security threat, since the IAM role ARN is not a credential, and you have already configured on the IAM role side that this federated identity can only be assumed from your Github repository.
In your Github Actions pipeline you can use the aws-actions/configure-aws-credentials action to do this:
- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v2
  with:
    role-to-assume: ${{ secrets.AWS_GITHUB_IAM_ROLE_ARN }}
    aws-region: ${{ secrets.AWS_DEFAULT_REGION }}
Note this line: role-to-assume: ${{ secrets.AWS_GITHUB_IAM_ROLE_ARN }} - we tell Github Actions to assume our AWS IAM role. (Note also that the workflow job needs the id-token: write permission so that Github can issue the OIDC token.)
So, now your CI/CD pipeline can use the AWS IAM role to do the AWS related steps in your pipeline. Example:
...
- name: Login to Amazon ECR
  id: login-ecr
  uses: aws-actions/amazon-ecr-login@v1
...
docker push ${ECR_REGISTRY}/${ECR_REPO_NAME}:${IMAGE_TAG}
...
There is no reason to create a dedicated CI/CD IAM user and store the permanent AWS credentials in your CI/CD machine, since it is quite easy to provide a secure federated identity between your CI/CD pipeline and AWS.
The writer is working at a major international IT corporation building cloud infrastructures and implementing applications on top of those infrastructures.
Kari Marttila
AWS Load Balancer Controller.
I did an exercise in which I implemented AWS EKS using Terraform. Nothing special about that. But then I was pondering how to implement the ingress for the solution, and that turned out to be a bit challenging. I finally decided to use AWS Load Balancer Controller instead of building everything from scratch.
I’m sorry, but I cannot provide a repo for this blog post - I did this exercise for my corporation to provide junior developers an example of an IaC solution.
AWS Load Balancer Controller is a Kubernetes controller which creates an AWS Application Load Balancer that is then used as an ingress for your Kubernetes application. See the more detailed explanation in How AWS Load Balancer controller works.
The installation would have been a bit easier if I had created the cluster with the eksctl tool. Since I had already created the EKS cluster using Terraform, I had to do some extra steps to make the AWS Load Balancer Controller work with my solution.
I went with option A: IAM Roles for Service Accounts (IRSA). I.e., I created an IAM role with all necessary rights. Then I created an IAM OIDC provider:
eksctl utils associate-iam-oidc-provider \
--region $AWS_DEFAULT_REGION \
--cluster $EKS_CLUSTER_NAME \
--approve
I curled the iam-policy.json as instructed in the AWS Load Balancer Controller documentation, and created an IAM service account:
eksctl create iamserviceaccount \
--cluster=$EKS_CLUSTER_NAME \
--namespace=kube-system \
--name=aws-load-balancer-controller \
--attach-policy-arn=$IAM_POLICY_ARN \
--override-existing-serviceaccounts \
--region $AWS_DEFAULT_REGION \
--approve
And installed the AWS Load Balancer Controller using a helm chart:
helm install aws-load-balancer-controller eks/aws-load-balancer-controller -n kube-system --set clusterName=$EKS_CLUSTER_NAME --set serviceAccount.create=false --set serviceAccount.name=aws-load-balancer-controller
And then I deployed a demo app (game 2048) to the cluster to verify that the AWS Application Load Balancer gets created:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.4.0/docs/examples/2048/2048_full.yaml
It took a couple of minutes for the AWS Application Load Balancer to get created. I checked the status using command:
kubectl get ingress/ingress-2048 -n game-2048
Once I saw the address, I tested the address in my browser and saw the game - everything went smoothly.
Adding the AWS Load Balancer Controller can be done using the above process, but it is a bit ugly: all other infrastructure is created by the Terraform IaC, but for the ingress part I had to use the eksctl tool, which creates a CloudFormation stack behind the curtains.
It might be possible to reverse-engineer what the eksctl tool did above, and convert all of it into part of the Terraform solution - at least those parts that are directly part of the AWS infrastructure. Maybe I will do this in the second phase of this exercise.
Another solution might be to try the terraform-aws-eks module instead of creating the EKS module myself. I need to examine whether that module creates the ingress as well.
If you are looking for an easy way to provide a Kubernetes ingress in AWS EKS, the AWS Load Balancer Controller is a good option.
The writer is working at a major international IT corporation building cloud infrastructures and implementing applications on top of those infrastructures.
Kari Marttila