Amazon Redshift

📘

This article is for data engineers and DevOps

This guide will provide you with the steps you need to perform in order to give your Amazon Redshift access to read data from S3 bucket used for Data Export.

Before you continue with this guide please make sure that you have completed the Data Export for Amazon S3 since it is requirement for ETL for Amazon Redshift.

IAM Policy for Redshift

1. Redshift → S3 Policy

To load data from S3 bucket into Redshift, you have to do the following steps. This will add permissions for your Redshift cluster to read data from a single S3 bucket.

  1. Login as Administrator into your AWS account

  2. Go to Policies page https://us-east-1.console.aws.amazon.com/iamv2/home#/policies

  3. Click Create policy

  4. Click JSON tab item

  5. Enter this manifest:

  6. Replace YourAttributionS3Bucket with actual name of your bucket used for Attribution Data Export

    {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Sid": "VisualEditor0",
              "Effect": "Allow",
              "Action": [
                  "s3:Get*",
                  "s3:List*"
              ],
              "Resource": [
                  "arn:aws:s3:::YourAttributionS3Bucket/*",
                  "arn:aws:s3:::YourAttributionS3Bucket"
              ]
          }
      ]
    }
    
  7. Click Next: Tags

  8. Click Next: Review

  9. Add name: AttributionS3BucketReadOnly

  10. Click Create policy

2. AWS Glue policy:

  1. Go to "Policies" page: https://us-east-1.console.aws.amazon.com/iamv2/home#/policies
  2. Create a new policy named AttributionGlue with this JSON content:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ATBGluePolicy",
      "Effect": "Allow",
      "Action": [
        "glue:CreateTable",
        "glue:CreateDatabase",
        "glue:DeleteTable",
        "glue:GetTable"
      ],
      "Resource": "*"
    }
  ]
}

3. Create a role and attach it to Redshift

  1. Go to Roles page: https://us-east-1.console.aws.amazon.com/iamv2/home#/roles
    1. Click Create role
    2. Leave AWS Service selected
    3. Select Redshift in Other Services
    4. Click Redshift - Customizable
    5. Click Next
  2. On the Add permissions page:
    1. Select AttributionS3BucketReadOnly and AttributionGlue policies
    2. Click Next
  3. Enter role name: AttributionS3DataLoader
    1. Click Create role
    2. Copy ARN of the created role which should look like arn:aws:iam::0000000000:role/AttributionS3DataLoader (you will need to let us know this string).
  4. Go to your Redshift cluster settings
    1. Click Properties tab
    2. Click Associate IAM role
    3. Select AttributionS3DataLoader role (and additional roles if you need them)
    4. Click Associate IAM roles

4. Create Redshift user

  1. Go to redshift query editor: https://us-east-1.console.aws.amazon.com/sqlworkbench/home?region=us-east-1#/client
  2. Create a user and DB for Attribution data loader:
CREATE USER attribution WITH PASSWORD 'PRE_ENCRYPTED_PASSWORD_GOES_HERE';
CREATE DATABASE attribution WITH OWNER attribution;

We will generate a strong password and send it to you in pre-encrypted form.

If you want to generate password yourself, please use this SQL code:

SELECT 'md5' || md5('PLAIN_PASSWORD' || 'attribution');

More info: https://docs.aws.amazon.com/redshift/latest/mgmt/authorizing-redshift-service.html

Once described steps above are completed contact Attribution support to enable ETL service.