Multi-account model deployment with Amazon SageMaker Pipelines

Amazon SageMaker Pipelines is the first purpose-built CI/CD service for machine learning (ML). It helps you build, automate, and manage end-to-end ML workflows and apply DevOps best practices to ML (also known as MLOps).

Creating multiple accounts to organize all the resources of your organization is a good DevOps practice. A multi-account strategy is important not only for improving governance but also for increasing the security and control of the resources that support your organization’s business. This strategy allows many different teams inside your organization to experiment, innovate, and integrate faster while keeping the production environment safe and available for your customers.

Pipelines makes it easy to apply the same strategy to deploying ML models. Imagine a use case in which you have three different AWS accounts, one for each environment: data science, staging, and production. The data scientist has the freedom to run experiments and to train and optimize different models in their own account at any time. When a model is good enough to be deployed in production, the data scientist only needs to set the model approval status to Approved. After that, an automated process deploys the model to the staging account. Here you can automate the testing of the model with unit tests or integration tests, or test the model manually. Following a manual or automated approval, the model is deployed to the production account, which is a more tightly controlled environment used to serve inferences on real-world data. With Pipelines, you can implement a ready-to-use multi-account environment.
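The approval step described above can also be done programmatically. The following is a minimal sketch, assuming a boto3 SageMaker client is passed in and that the model version is already registered; the helper name `approve_model_package` and the example ARN are hypothetical, and only the `update_model_package` call itself belongs to the SageMaker API.

```python
from typing import Any

# Approval states accepted by the SageMaker model registry.
VALID_STATUSES = {"Approved", "Rejected", "PendingManualApproval"}


def approve_model_package(sm_client: Any, model_package_arn: str,
                          status: str = "Approved",
                          description: str = "") -> dict:
    """Set the approval status of one registered model version.

    sm_client is expected to be boto3.client("sagemaker") created with
    credentials for the data science account.
    """
    if status not in VALID_STATUSES:
        raise ValueError(f"Unsupported approval status: {status}")
    kwargs = {
        "ModelPackageArn": model_package_arn,
        "ModelApprovalStatus": status,
    }
    if description:
        kwargs["ApprovalDescription"] = description
    return sm_client.update_model_package(**kwargs)


# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
# import boto3
# approve_model_package(
#     boto3.client("sagemaker"),
#     "arn:aws:sagemaker:us-east-1:111111111111:model-package/my-group/1",
#     description="Validated by the data science team",
# )
```

Passing the client in as an argument keeps the helper easy to test with a stub, and makes it explicit which account's credentials are used.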

In this post, you learn how to use Pipelines to implement your own multi-account ML pipeline. First, you learn how to configure your environment and prepare it to use a predefined template as a SageMaker project for training and deploying a model in two different accounts: staging and production. Then, you go into detail about how this custom template was created and how to create and customize templates for your own SageMaker projects.

Preparing the environment

In this section, you configure three different AWS accounts and use SageMaker Studio to create a project that integrates a CI/CD pipeline with the ML pipeline created by the data scientist. The following diagram illustrates the reference architecture of the environment created by the SageMaker custom project and how AWS Organizations integrates the different accounts.

The diagram contains three different accounts managed by Organizations. Also, three different user roles (which may be the same person) operate this environment:

  • ML engineer – Responsible for provisioning the SageMaker Studio project that creates the CI/CD pipeline, model registry, and other resources
  • Data scientist – Responsible for creating the ML pipeline that ends with a trained model registered to the model group (also referred to as the model package group)
  • Approver – Responsible for testing the model deployed to the staging account and approving the production deployment

If you wish, it is possible to run a similar solution without Organizations (although this is not recommended). In that case, you need to manually create the permissions and trust relationships between your accounts and modify the template to remove the Organizations dependency. If you are an enterprise with multiple AWS accounts and teams, however, it is highly recommended that you use AWS Control Tower to provision the accounts and Organizations. AWS Control Tower provides the easiest way to set up and govern a new and secure multi-account AWS environment. For this post, we only discuss implementing the solution with Organizations.

Before you proceed, you need to complete the following steps, which are detailed in the next sections:

  1. Create an AWS account used by data scientists (data science account).
  2. Create and configure a SageMaker Studio domain in the data science account.
  3. Create two additional accounts for production and staging.
  4. Create an organizational structure using Organizations, then invite and integrate the additional accounts.
  5. Configure the required permissions to run pipelines and deploy models to external accounts.
  6. Import the SageMaker project template for deploying models to multiple accounts and make it available in SageMaker Studio.

Configure SageMaker Studio in your account

Pipelines provides built-in support for MLOps templates to make it easier to use CI/CD for your ML projects. These MLOps templates are defined as AWS CloudFormation templates and published through AWS Service Catalog. They are made available to data scientists through SageMaker Studio, an IDE for ML. To configure Studio in your account, complete the following steps:

  1. Create your SageMaker Studio domain.
  2. Enable SageMaker project templates and SageMaker JumpStart for this account and its Studio users.

If you have an existing domain, you can simply edit the settings for the domain or for individual users to enable this option. Enabling this option creates two different AWS Identity and Access Management (IAM) roles in your AWS account:

  • AmazonSageMakerServiceCatalogProductsLaunchRole – Used by SageMaker to run the project templates and create the necessary infrastructure resources
  • AmazonSageMakerServiceCatalogProductsUseRole – Used by the CI/CD pipeline to run jobs and deploy models to the target accounts

If you created your SageMaker Studio domain before re:Invent 2020, it is recommended that you refresh your environment after saving your work in progress. On the File menu, choose Shut Down, and confirm your choice.

  1. If you don’t have them yet, create and prepare two other AWS accounts for staging and production.

Configure Organizations

You need to add the data science account and the two additional accounts to a structure in Organizations. Organizations helps you centrally manage and govern your environment as you grow and scale your AWS resources. It is free and benefits your governance strategy.

Each account must be added to a different organizational unit (OU).

  1. On the Organizations console, structure the OUs as follows:
  • Root
    • multi-account-deployment (OU)
      • 111111111111 (data science account – SageMaker Studio)
      • production (OU)
        • 222222222222 (AWS account)
      • staging (OU)
        • 333333333333 (AWS account)
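The same structure could be created with the Organizations API instead of the console. This is a minimal sketch under several assumptions: it runs with credentials for the organization's management account, the three accounts have already joined the organization and still sit directly under the root, and the function name `build_ou_structure` is hypothetical. The `list_roots`, `create_organizational_unit`, and `move_account` calls are part of the boto3 Organizations client.

```python
from typing import Any, Dict


def build_ou_structure(org_client: Any,
                       data_science_account: str,
                       staging_account: str,
                       production_account: str,
                       top_level_name: str = "multi-account-deployment") -> Dict[str, str]:
    """Create the OU tree shown above and move each account into place.

    org_client is expected to be boto3.client("organizations").
    Returns a mapping of OU name to OU ID.
    """
    root_id = org_client.list_roots()["Roots"][0]["Id"]

    def create_ou(parent_id: str, name: str) -> str:
        response = org_client.create_organizational_unit(ParentId=parent_id, Name=name)
        return response["OrganizationalUnit"]["Id"]

    top_id = create_ou(root_id, top_level_name)
    staging_id = create_ou(top_id, "staging")
    production_id = create_ou(top_id, "production")

    # Assumption: every account is currently a direct child of the root.
    placements = (
        (data_science_account, top_id),
        (staging_account, staging_id),
        (production_account, production_id),
    )
    for account_id, destination_id in placements:
        org_client.move_account(
            AccountId=account_id,
            SourceParentId=root_id,
            DestinationParentId=destination_id,
        )

    return {top_level_name: top_id, "staging": staging_id, "production": production_id}


# Example usage (requires boto3 and management-account credentials):
# import boto3
# build_ou_structure(boto3.client("organizations"),
#                    "111111111111", "333333333333", "222222222222")
```

The returned OU IDs are the same identifiers that the console shows for each OU.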

After the organization is configured, each account owner receives an invitation. The owners need to accept the invitations; otherwise, the accounts aren’t included in the organization.

  1. Now you need to enable trusted access with AWS Organizations (“Enable all features” and “Enable trusted access in StackSets”).

This process allows your data science account to provision resources in the target accounts. If you don’t do this, the deployment process fails. Also, this feature set is the preferred way to work with Organizations, and it includes the consolidated billing features.
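Because a missing setting only surfaces later as a failed deployment, it can help to check both settings up front. The following read-only sketch assumes a boto3 Organizations client created with credentials for the management account; the function name `organization_ready` is hypothetical, and the service principal shown is the one CloudFormation StackSets uses for trusted access.

```python
from typing import Any, Dict

# Service principal that CloudFormation StackSets registers with Organizations.
STACKSETS_PRINCIPAL = "member.org.stacksets.cloudformation.amazonaws.com"


def organization_ready(org_client: Any) -> Dict[str, bool]:
    """Report whether all features and StackSets trusted access are enabled.

    org_client is expected to be boto3.client("organizations").
    """
    feature_set = org_client.describe_organization()["Organization"]["FeatureSet"]

    principals = []
    kwargs: Dict[str, str] = {}
    while True:
        page = org_client.list_aws_service_access_for_organization(**kwargs)
        principals.extend(
            entry["ServicePrincipal"]
            for entry in page.get("EnabledServicePrincipals", [])
        )
        token = page.get("NextToken")
        if not token:
            break
        kwargs = {"NextToken": token}

    return {
        "all_features": feature_set == "ALL",
        "stacksets_trusted_access": STACKSETS_PRINCIPAL in principals,
    }


# Example usage (requires boto3 and management-account credentials):
# import boto3
# print(organization_ready(boto3.client("organizations")))
```

If either value comes back as False, revisit the corresponding setting before continuing.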

  1. Next, on the Organizations console, choose Organize accounts.
  2. Choose staging.
  3. Note the OU ID.
  4. Repeat this process for the production OU.
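The OU IDs can also be looked up by name through the API. This is a minimal sketch, assuming a boto3 Organizations client and the ID of the parent OU that contains the staging and production OUs; the function name `find_ou_ids` is hypothetical.

```python
from typing import Any, Dict, Iterable


def find_ou_ids(org_client: Any, parent_id: str, names: Iterable[str]) -> Dict[str, str]:
    """Return {OU name: OU ID} for the named OUs directly under parent_id.

    org_client is expected to be boto3.client("organizations").
    Raises LookupError if any requested OU is not found.
    """
    wanted = set(names)
    found: Dict[str, str] = {}
    kwargs = {"ParentId": parent_id}
    while True:
        page = org_client.list_organizational_units_for_parent(**kwargs)
        for ou in page.get("OrganizationalUnits", []):
            if ou["Name"] in wanted:
                found[ou["Name"]] = ou["Id"]
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token

    missing = wanted - set(found)
    if missing:
        raise LookupError(f"OUs not found under {parent_id}: {sorted(missing)}")
    return found


# Example usage (requires boto3; the parent OU ID is a placeholder):
# import boto3
# find_ou_ids(boto3.client("organizations"), "<parent OU ID>", ["staging", "production"])
```

Failing loudly on a missing OU avoids carrying an empty ID into later steps.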

Configure permissions

You need to create a SageMaker execution role in each additional account. These roles are assumed by AmazonSageMakerServiceCatalogProductsUseRole in the data science account to deploy the endpoints in the target accounts and test them.

  1. Sign in to the AWS Management Console with the staging account.
  2. Launch the following CloudFormation template.

This template creates a new SageMaker role for you.

  1. Provide the following parameters:
    1. SageMakerRoleSuffix – A short string (up to 10 lowercase characters, with no spaces or non-alphanumeric characters) that is added to the role name after the following prefix: sagemaker-role-. The final role name is sagemaker-role-<<sagemaker_role_suffix>>.
    2. PipelineExecutionRoleArn – The ARN of the role from the data science account that assumes the SageMaker role you are creating. To find the ARN, sign in to the console with the data science account. On the IAM console, choose Roles and search for AmazonSageMakerServiceCatalogProductsUseRole. Choose this role and copy the ARN (arn:aws:iam::<<data_science_account_id>>:role/service-role/AmazonSageMakerServiceCatalogProductsUseRole).
  2. After creating this role in the staging account, repeat this process for the production account.
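To make the relationship between the two parameters concrete, the following sketch builds the role name and a trust policy that lets the data science account's role assume the new role. It is an illustration only: the contents of the actual CloudFormation template are not shown in this post and may differ (for example, it may trust additional principals), the helper names are hypothetical, and the suffix check is one interpretation of the constraint described above.

```python
import json
import re

ROLE_PREFIX = "sagemaker-role-"


def role_name(suffix: str) -> str:
    """Build the final role name from the suffix.

    Assumption: a valid suffix is 1 to 10 lowercase letters or digits.
    """
    if not re.fullmatch(r"[a-z0-9]{1,10}", suffix):
        raise ValueError("suffix must be 1-10 lowercase alphanumeric characters")
    return ROLE_PREFIX + suffix


def trust_policy(use_role_arn: str) -> dict:
    """Trust policy allowing the given role (from the data science account)
    to assume the role created in the target account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": use_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    arn = ("arn:aws:iam::111111111111:role/service-role/"
           "AmazonSageMakerServiceCatalogProductsUseRole")
    print(role_name("staging"))
    print(json.dumps(trust_policy(arn), indent=4))
```

Here 111111111111 stands in for the data science account ID, matching the example account numbers used earlier.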

In the data science account, you now configure the policy of the Amazon Simple Storage Service (Amazon S3) bucket used to store the trained model. For this post, we use the default SageMaker bucket of the current Region. It has the following name format: sagemaker-<<region>>-<<aws_account_id>>.

  1. On the Amazon S3 console, search for this bucket, providing the Region you are using and the ID of the data science account.

If you don’t find it, create a new bucket following this name format.

  1. On the Permissions tab, add the following policy:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": [
                        "arn:aws:iam::<<staging_account_id>>:root",
                        "arn:aws:iam::<<production_account_id>>:root"
                    ]
                },
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket"
                ],
                "Resource": [
                    "arn:aws:s3:::sagemaker-<<region>>-<<aws_account_id>>",
                    "arn:aws:s3:::sagemaker-<<region>>-<<aws_account_id>>/*"
                ]
            }
        ]
    }
  1. Save your settings.

Target accounts now have permission to read trained models during deployment.
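Filling in the placeholders by hand is error-prone, so the policy can also be generated. The following sketch produces a policy of the same shape as the one above from a Region and account IDs; the function names and the example Region are assumptions for illustration, and the account numbers are the example IDs used earlier in this post.

```python
import json
import re
from typing import List


def default_bucket_name(region: str, account_id: str) -> str:
    """Name of the default SageMaker bucket for a Region and account."""
    return f"sagemaker-{region}-{account_id}"


def cross_account_read_policy(region: str,
                              data_science_account_id: str,
                              target_account_ids: List[str]) -> dict:
    """Bucket policy letting the target accounts read trained models."""
    for account_id in [data_science_account_id, *target_account_ids]:
        if not re.fullmatch(r"\d{12}", account_id):
            raise ValueError(f"Not a 12-digit AWS account ID: {account_id!r}")

    bucket_arn = "arn:aws:s3:::" + default_bucket_name(region, data_science_account_id)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": [f"arn:aws:iam::{a}:root" for a in target_account_ids]
                },
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [bucket_arn, bucket_arn + "/*"],
            }
        ],
    }


if __name__ == "__main__":
    # Example values only: us-east-1 is an assumed Region.
    policy = cross_account_read_policy(
        "us-east-1", "111111111111", ["333333333333", "222222222222"]
    )
    print(json.dumps(policy, indent=4))
    # To apply it with boto3 (requires credentials for the data science account):
    # import boto3
    # boto3.client("s3").put_bucket_policy(
    #     Bucket=default_bucket_name("us-east-1", "111111111111"),
    #     Policy=json.dumps(policy),
    # )
```

Note that applying a bucket policy this way replaces any policy already attached to the bucket, so merge statements first if the bucket already has one.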

The next step is to add new permissions to the roles AmazonSageMakerServiceCatalogProductsUseRole and AmazonSageMakerServiceCatalogProductsLaunchRole.

  1. In the data science account, on the IAM console, choose Roles.
  2. Find the AmazonSageMakerServiceCatalogProductsUseRole role and choose it.
  3. Add a new policy and enter …
