Customers use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to run Apache Airflow at scale in the cloud. They want to use their existing login solutions, built on OpenID Connect (OIDC) providers, with Amazon MWAA; this lets them provide a uniform authentication and single sign-on (SSO) experience using their adopted identity providers (IdPs) across AWS services. For ease of use for end-users of Amazon MWAA, organizations configure a custom domain endpoint for their Apache Airflow UI endpoint. For teams operating and managing multiple Amazon MWAA environments, securing and customizing each environment is a repetitive but necessary task. Automation through infrastructure as code (IaC) can alleviate this heavy lifting and achieve consistency at scale.
This post describes how you can integrate your organization’s existing OIDC-based IdPs with Amazon MWAA to grant secure access to your existing Amazon MWAA environments. Furthermore, you can use the solution to provision new Amazon MWAA environments with the OIDC-based IdP integration built in. This approach lets you securely provide access to your new or existing Amazon MWAA environments without requiring AWS credentials for end-users.
Overview of Amazon MWAA environments
Managing multiple user names and passwords can be difficult; this is where SSO authentication and authorization come in. OIDC is a widely used standard for SSO, and it’s possible to use OIDC SSO authentication and authorization to access the Apache Airflow UI across multiple Amazon MWAA environments.
When you provision an Amazon MWAA environment, you can choose public or private Apache Airflow UI access mode. Private access mode is typically used by customers that need to restrict access to within their virtual private cloud (VPC). When you use public access mode, the Apache Airflow UI is accessible from the internet, in the same way as an AWS Management Console page. Internet access is needed when access is required from outside a corporate network.
Regardless of the access mode, authorization to the Apache Airflow UI in Amazon MWAA is integrated with AWS Identity and Access Management (IAM). All requests made to the Apache Airflow UI need valid AWS session credentials with an assumed IAM role that has permissions to access the corresponding Apache Airflow environment. For more details on the permissions policies needed to access the Apache Airflow UI, refer to Apache Airflow UI access policy: AmazonMWAAWebServerAccess.
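To make the IAM requirement concrete, the following minimal sketch builds a policy document of the kind that grants access to the Apache Airflow UI for one environment and one Apache Airflow role. The helper name web_login_policy, the Region, the account ID, and the environment name are illustrative assumptions; check the AmazonMWAAWebServerAccess documentation for the authoritative policy.

```python
import json


def web_login_policy(region: str, account_id: str, env_name: str, airflow_role: str) -> dict:
    """Build an IAM policy document that allows creating an Airflow web login token.

    The helper itself is hypothetical; the action and ARN shape follow the
    documented Apache Airflow UI access policy.
    """
    resource = f"arn:aws:airflow:{region}:{account_id}:role/{env_name}/{airflow_role}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "airflow:CreateWebLoginToken",
                "Resource": [resource],
            }
        ],
    }


# Placeholder values for illustration only.
policy = web_login_policy("us-east-1", "111122223333", "Env1", "Admin")
print(json.dumps(policy, indent=2))
```

Because the resource ARN ends with the Apache Airflow role, a policy like this scopes a principal to a single role in a single environment.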
Different user personas such as developers, data scientists, system operators, or architects in your organization may need access to the Apache Airflow UI. In some organizations, not all employees have access to the AWS console. It’s fairly common that employees who don’t have AWS credentials also need access to the Apache Airflow UI that Amazon MWAA exposes.
In addition, many organizations have multiple Amazon MWAA environments. It’s common to have an Amazon MWAA environment set up per application or team. Each of these Amazon MWAA environments can be run in different deployment environments such as development, staging, and production. For large organizations, you can easily envision a scenario where there is a need to manage many Amazon MWAA environments. Organizations need to provide secure access to all of their Amazon MWAA environments using their existing OIDC provider.
Solution overview
The solution architecture integrates an existing OIDC provider to provide authentication for accessing the Amazon MWAA Apache Airflow UI. This lets users log in to the Apache Airflow UI using their OIDC credentials. From a system perspective, this means that Amazon MWAA can integrate with an existing OIDC provider rather than having to create and manage isolated user authentication and authorization through IAM internally.
The solution architecture relies on an Application Load Balancer (ALB) set up with a fully qualified domain name (FQDN) with public (internet) or private access. This ALB provides SSO access to multiple Amazon MWAA environments. The user-agent (web browser) call flow for accessing an Apache Airflow UI console in the target Amazon MWAA environment includes the following steps:
- The user-agent resolves the ALB domain name from the Domain Name System (DNS) resolver.
- The user-agent sends a login request to the ALB path /aws_mwaa/aws-console-sso with a set of query parameters populated. The request uses the required parameters mwaa_env and rbac_role as placeholders for the target Amazon MWAA environment and the Apache Airflow role-based access control (RBAC) role, respectively.
- Once it receives the request, the ALB redirects the user-agent to the OIDC IdP authentication endpoint. The user-agent authenticates with the OIDC IdP using the existing user name and password.
- If user authentication is successful, the OIDC IdP redirects the user-agent back to the configured ALB with a redirect_url with the authorization code included in the URL.
- The ALB uses the authorization code received to obtain the access_token and OpenID JWT token with openid email scope from the OIDC IdP. It then forwards the login request to the Amazon MWAA authenticator AWS Lambda function with the JWT token included in the request header in the x-amzn-oidc-data parameter.
- The Lambda function verifies the JWT token found in the request header using ALB public keys. The function subsequently authorizes the authenticated user for the requested mwaa_env and rbac_role stored in an Amazon DynamoDB table. The use of DynamoDB for authorization here is optional; the Lambda code function is_allowed can be customized to use other authorization mechanisms.
- The Amazon MWAA authenticator Lambda function redirects the user-agent to the Apache Airflow UI console in the requested Amazon MWAA environment with the login token in the redirect URL. Additionally, the function provides the logout functionality.
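The first part of the JWT verification step in the call flow above can be sketched as follows. This is a minimal illustration, not the solution’s Lambda code: it only decodes the header of the token carried in x-amzn-oidc-data and derives the Regional endpoint from which the ALB public key would be fetched. The function names and the sample key ID are assumptions, and actual signature verification (ES256) would still need a JWT library.

```python
import base64
import json


def decode_jwt_header(token: str) -> dict:
    """Decode, without verifying, the header segment of a JWT."""
    header_b64 = token.split(".")[0]
    # Add base64 padding only if it is missing; already-padded input is unchanged.
    header_b64 += "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(header_b64))


def alb_public_key_url(token: str, region: str) -> str:
    """Return the Regional ALB endpoint that serves the public key for this token."""
    header = decode_jwt_header(token)
    if header.get("alg") != "ES256":
        raise ValueError("unexpected signing algorithm in JWT header")
    return f"https://public-keys.auth.elb.{region}.amazonaws.com/{header['kid']}"


# Build a fake, unsigned token purely for illustration (hypothetical key ID).
fake_header = base64.urlsafe_b64encode(
    json.dumps({"alg": "ES256", "kid": "example-key-id"}).encode()
).decode()
fake_token = f"{fake_header}.payload.signature"

print(alb_public_key_url(fake_token, "us-east-1"))
```

Rejecting any algorithm other than ES256 before fetching a key is a deliberate choice here; it avoids trusting whatever algorithm an untrusted header declares.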
Amazon MWAA public network access mode
For Amazon MWAA environments configured with public access mode, the user agent uses public routing over the internet to connect to the ALB hosted in a public subnet.
The following diagram illustrates the solution architecture with a numbered call flow sequence for internet network reachability.
Amazon MWAA private network access mode
For Amazon MWAA environments configured with private access mode, the user agent uses private routing over a dedicated AWS Direct Connect or AWS Client VPN to connect to the ALB hosted in a private subnet.
The following diagram shows the solution architecture for Client VPN network reachability.
Automation through infrastructure as code
To make setting up this solution easier, we have released a pre-built solution that automates the tasks involved. The solution is built with the AWS Cloud Development Kit (AWS CDK) using the Python programming language. The solution is available in our GitHub repository and helps you achieve the following:
- Set up a secure ALB to provide OIDC-based SSO to your existing Amazon MWAA environment with default Apache Airflow Admin role-based access.
- Create new Amazon MWAA environments along with an ALB and an authenticator Lambda function that provides OIDC-based SSO support. With the customization provided, you can define the number of Amazon MWAA environments to create. Additionally, you can customize the type of Amazon MWAA environments created, including defining the hosting VPC configuration, environment name, Apache Airflow UI access mode, environment class, auto scaling, and logging configurations.
The solution offers a number of customization options, which can be specified in the cdk.context.json file. Follow the setup instructions to complete the integration with your existing Amazon MWAA environments or create new Amazon MWAA environments with SSO enabled. The setup process creates an ALB with an HTTPS listener that provides the user access endpoint. You can choose the type of ALB you need: public facing (internet accessible) or private facing (only accessible within the VPC). We recommend using a private ALB with your new or existing Amazon MWAA environments configured with private UI access mode.
The following sections describe the specific implementation steps and customization options for each use case.
Prerequisites
Before you proceed with the installation steps, make sure you have completed all prerequisites and run the setup-venv script as outlined in the README.md file of the GitHub repository.
Integrate with a single existing Amazon MWAA environment
If you’re integrating with a single existing Amazon MWAA environment, follow the guides in the Quick start section. You must specify the same ALB VPC as that of your existing Amazon MWAA VPC. You can specify the default Apache Airflow RBAC role that all users will assume. The ALB with an HTTPS listener is configured within your existing Amazon MWAA VPC.
Integrate with multiple existing Amazon MWAA environments
To connect to multiple existing Amazon MWAA environments, specify only the Amazon MWAA environment name in the JSON file. The setup process creates a new VPC with subnets hosting the ALB and the listener. You must define the CIDR range for this ALB VPC such that it doesn’t overlap with the VPC CIDR range of your existing Amazon MWAA VPCs.
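Before choosing the ALB VPC CIDR range, it can help to check it against the CIDR ranges of the existing Amazon MWAA VPCs. The following is a minimal sketch using the Python standard library; the function name overlapping_vpcs and all CIDR values are illustrative assumptions and not part of the solution.

```python
import ipaddress


def overlapping_vpcs(alb_cidr: str, mwaa_cidrs: dict[str, str]) -> list[str]:
    """Return the names of MWAA VPCs whose CIDR range overlaps the ALB VPC range."""
    alb_net = ipaddress.ip_network(alb_cidr)
    return [
        name
        for name, cidr in mwaa_cidrs.items()
        if alb_net.overlaps(ipaddress.ip_network(cidr))
    ]


# Hypothetical CIDR ranges for two existing Amazon MWAA VPCs.
existing = {"Env1": "10.1.0.0/16", "Env2": "10.0.128.0/20"}

conflicts = overlapping_vpcs("10.0.0.0/16", existing)
print(conflicts)  # ['Env2']
```

An empty list means the candidate ALB VPC range is safe to use with respect to the listed VPCs.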
When the setup steps are complete, implement the post-deployment configuration steps. This includes adding the ALB CNAME record to the Amazon Route 53 DNS domain.
For integrating with Amazon MWAA environments configured using private access mode, there are additional steps that need to be configured. These include configuring VPC peering and subnet routes between the new ALB VPC and the existing Amazon MWAA VPC. Additionally, you need to configure network connectivity from your user-agent to the private ALB endpoint resolved by your DNS domain.
Create new Amazon MWAA environments
You can configure the new Amazon MWAA environments you want to provision through this solution. The cdk.context.json file defines a dictionary entry in the MwaaEnvironments array. Configure the details that you need for each of the Amazon MWAA environments. The setup process creates an ALB VPC, ALB with an HTTPS listener, Lambda authorizer function, DynamoDB table, and respective Amazon MWAA VPCs and Amazon MWAA environments in them. Additionally, it creates the VPC peering connection between the ALB VPC and the Amazon MWAA VPC.
If you want to create Amazon MWAA environments with private access mode, the ALB VPC CIDR range specified must not overlap with the Amazon MWAA VPC CIDR range. This is required for the automatic peering connection to succeed. It can take between 20 and 30 minutes for each Amazon MWAA environment to finish creating.
When the environment creation processes are complete, run the post-deployment configuration steps. One of the steps here is to add authorization records to the created DynamoDB table for your users. You need to define the Apache Airflow rbac_role for each of your end-users, which the Lambda authorizer function matches to provide the requisite access.
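The matching that the Lambda authorizer performs against these authorization records can be pictured with the following minimal sketch. It uses an in-memory dictionary as a stand-in for the DynamoDB table; the record layout, the user email addresses, and the is_allowed signature shown here are assumptions for illustration and may differ from the code in the GitHub repository.

```python
# Hypothetical in-memory stand-in for the DynamoDB authorization table:
# (user email, MWAA environment name) -> set of permitted Airflow RBAC roles.
AUTHORIZATION_RECORDS = {
    ("alice@example.com", "Env1"): {"Admin"},
    ("bob@example.com", "Env1"): {"Viewer", "User"},
}


def is_allowed(email: str, mwaa_env: str, rbac_role: str,
               records: dict = AUTHORIZATION_RECORDS) -> bool:
    """Return True only if the user has a record granting the role in the environment."""
    return rbac_role in records.get((email, mwaa_env), set())


print(is_allowed("alice@example.com", "Env1", "Admin"))  # True
print(is_allowed("bob@example.com", "Env1", "Admin"))    # False
```

The sketch denies by default: a user with no record, or a request for an environment or role that is not listed, is rejected.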
Verify access
Once you’ve completed the post-deployment steps, you can log in to the URL using your ALB FQDN. For example, if your ALB FQDN is alb-sso-mwaa.example.com, you can log in to your target Amazon MWAA environment, named Env1, assuming a specific Apache Airflow RBAC role (such as Admin), using the following URL: https://alb-sso-mwaa.example.com/aws_mwaa/aws-console-sso?mwaa_env=Env1&rbac_role=Admin. For the Amazon MWAA environments that this solution created, you need to have appropriate Apache Airflow rbac_role entries in your DynamoDB table.
The solution also provides a logout feature. To log out from an Apache Airflow console, use the normal Apache Airflow console logout. To log out from the ALB, you can, for example, use the URL https://alb-sso-mwaa.example.com/logout.
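If you need to hand out login links for several environments and roles, the login URL format described above can be generated rather than typed by hand. The following minimal sketch assumes only the path and the two query parameters named in this post; the helper name sso_login_url and the sample host are illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def sso_login_url(alb_fqdn: str, mwaa_env: str, rbac_role: str) -> str:
    """Build the SSO login URL for a target MWAA environment and Airflow RBAC role."""
    query = urlencode({"mwaa_env": mwaa_env, "rbac_role": rbac_role})
    return urlunsplit(("https", alb_fqdn, "/aws_mwaa/aws-console-sso", query, ""))


url = sso_login_url("alb-sso-mwaa.example.com", "Env1", "Admin")
print(url)

# Round-trip check: the parameters can be recovered from the generated URL.
params = parse_qs(urlsplit(url).query)
print(params["mwaa_env"][0], params["rbac_role"][0])
```

Using urlencode means environment names or roles containing special characters are escaped correctly in the query string.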
Clean up
Follow the steps documented in the Destroy CDK stacks section of the README in the GitHub repo, which shows how to clean up the artifacts created by the AWS CDK deployments. Remember to revert any manual configurations, like VPC peering connections, that you might have made after the deployments.
Conclusion
This post provided a solution to integrate your organization’s OIDC-based IdPs with Amazon MWAA to grant secure access to multiple Amazon MWAA environments. We walked through the solution that solves this problem using infrastructure as code. This solution allows different end-user personas in your organization to access the Amazon MWAA Apache Airflow UI using OIDC SSO.
To use the solution in your own environments, refer to Application load balancer single-sign-on for Amazon MWAA. For more code examples on Amazon MWAA, refer to Amazon MWAA code examples.
About the Authors
Ajay Vohra is a Principal Prototyping Architect specializing in perception machine learning for autonomous vehicle development. Prior to Amazon, Ajay worked in the area of massively parallel grid-computing for financial risk modeling.
Jaswanth Kumar is a customer-obsessed Cloud Application Architect at AWS in NY. Jaswanth excels in application refactoring and migration, with expertise in containers and serverless solutions, coupled with a Master’s degree in Applied Computer Science.
Aneel Murari is a Sr. Serverless Specialist Solution Architect at AWS based in the Washington, D.C. area. He has over 18 years of software development and architecture experience and holds a graduate degree in Computer Science. Aneel helps AWS customers orchestrate their workflows on Amazon Managed Apache Airflow (MWAA) in a secure, cost effective and performance optimized manner.
Parnab Basak is a Solutions Architect and a Serverless Specialist at AWS. He specializes in creating new solutions that are cloud native using modern software development practices like serverless, DevOps, and analytics. Parnab works closely in the analytics and integration services space helping customers adopt AWS services for their workflow orchestration needs.