
Databricks Serverless Workspace on AWS

This example demonstrates a multi-provider stack that provisions a complete Databricks serverless workspace on AWS using stackql-deploy. The stack spans three providers: awscc for AWS Cloud Control resources, databricks_account for Databricks account-level operations, and databricks_workspace for workspace-level configuration.

Stack Overview

The stack provisions the following resources in order:

| # | Resource | Provider | Description |
|---|----------|----------|-------------|
| 1 | aws_cross_account_role | awscc.iam.roles | IAM role for Databricks cross-account access |
| 2 | databricks_account_credentials | databricks_account.provisioning.credentials | Registers the IAM role as a credential configuration |
| 3 | aws_s3_workspace_bucket | awscc.s3.buckets | Root S3 bucket for workspace storage |
| 4 | aws_s3_workspace_bucket_policy | awscc.s3.bucket_policies | Grants Databricks access to the workspace bucket |
| 5 | databricks_storage_configuration | databricks_account.provisioning.storage | Registers the S3 bucket as a storage configuration |
| 6 | aws_s3_metastore_bucket | awscc.s3.buckets | S3 bucket for Unity Catalog metastore |
| 7 | aws_metastore_access_role | awscc.iam.roles | IAM role for Unity Catalog metastore S3 access |
| 8 | databricks_workspace | databricks_account.provisioning.workspaces | The Databricks workspace itself |
| 9 | workspace_admins_group | databricks_account.iam.account_groups | Admin group for the workspace |
| 10 | get_databricks_users | databricks_account.iam.users | Looks up user IDs for group membership (query) |
| 11 | databricks_account/update_group_membership | databricks_account.iam.account_groups | Adds users to the admin group (command) |
| 12 | databricks_account/workspace_assignment | databricks_account.iam.workspace_assignment | Assigns the admin group to the workspace (command) |
| 13 | databricks_workspace/storage_credentials | databricks_workspace.catalog.storage_credentials | Unity Catalog storage credential |
| 14 | aws/iam/update_metastore_access_role | awscc.iam.roles | Updates the metastore role trust policy with the external ID (command) |
| 15 | databricks_credential_grants | databricks_workspace.catalog.grants | Grants privileges on the storage credential (command) |
| 16 | external_location | databricks_workspace.catalog.external_locations | Unity Catalog external location |
| 17 | databricks_workspace/unitycatalog/location_grants | databricks_workspace.catalog.grants | Grants privileges on the external location (command) |

Prerequisites

  • stackql-deploy installed (see the project's releases page)

  • Environment variables:

    export AWS_ACCESS_KEY_ID=your_aws_access_key
    export AWS_SECRET_ACCESS_KEY=your_aws_secret_key
    export AWS_REGION=us-east-1
    export AWS_ACCOUNT_ID=your_aws_account_id
    export DATABRICKS_ACCOUNT_ID=your_databricks_account_id
    export DATABRICKS_AWS_ACCOUNT_ID=414351767826
    export DATABRICKS_CLIENT_ID=your_databricks_client_id
    export DATABRICKS_CLIENT_SECRET=your_databricks_client_secret

Deploying the Stack

stackql-deploy build examples/databricks/serverless dev

Dry run:

stackql-deploy build examples/databricks/serverless dev --dry-run --show-queries

Testing the stack:

stackql-deploy test examples/databricks/serverless dev

Tearing down:

stackql-deploy teardown examples/databricks/serverless dev

stackql_manifest.yml

The manifest demonstrates several advanced features including multi-provider stacks, version-pinned providers, file() directives for externalized policy documents, the return_vals construct for capturing identifiers from RETURNING clauses, and command and query resource types alongside standard resource types.

version: 1
name: "stackql-serverless"
description: creates a serverless databricks workspace
providers:
  - awscc::v26.03.00379
  - databricks_account::v26.03.00381
  - databricks_workspace::v26.03.00381
globals:
  - name: databricks_account_id
    description: databricks account id
    value: "{{ DATABRICKS_ACCOUNT_ID }}"
  - name: databricks_aws_account_id
    description: databricks AWS account id
    value: "{{ DATABRICKS_AWS_ACCOUNT_ID }}"
  - name: aws_account
    description: aws account id
    value: "{{ AWS_ACCOUNT_ID }}"
  - name: region
    description: aws region
    value: "{{ AWS_REGION }}"
  - name: global_tags
    value:
      - Key: 'stackql:stack-name'
        Value: "{{ stack_name }}"
      - Key: 'stackql:stack-env'
        Value: "{{ stack_env }}"
      - Key: 'stackql:resource-name'
        Value: "{{ resource_name }}"

# ... resources defined in order of dependencies
# see full manifest in the examples/databricks/serverless directory

exports:
  - workspace_name
  - workspace_id
  - deployment_name
  - workspace_status
  - workspace_url
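
The excerpt above omits the resource definitions themselves. As a rough sketch of how the query and command resource types mentioned earlier appear in a manifest (names are taken from the table above, but the field values here are illustrative, not the exact ones used in the example):

resources:
  # query resource: a read-only lookup whose results are exported for later resources
  - name: get_databricks_users
    type: query
    exports:
      - databricks_user_ids   # illustrative export name
  # command resource: performs an action (adding users to the admin group)
  # rather than managing a long-lived resource
  - name: databricks_account/update_group_membership
    type: command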

Resource Query Files

The query file below, for the aws_cross_account_role resource, demonstrates the create/update/statecheck/exports pattern with Cloud Control, using generate_patch_document for updates and AWS_POLICY_EQUAL for policy comparison in the statecheck.

/*+ exists */
SELECT count(*) as count
FROM awscc.iam.roles
WHERE region = 'us-east-1'
AND Identifier = '{{ role_name }}';

/*+ create */
INSERT INTO awscc.iam.roles (
  AssumeRolePolicyDocument, Description, ManagedPolicyArns,
  MaxSessionDuration, Path, PermissionsBoundary,
  Policies, RoleName, Tags, region
)
SELECT
  '{{ assume_role_policy_document }}', '{{ description }}',
  '{{ managed_policy_arns }}', '{{ max_session_duration }}',
  '{{ path }}', '{{ permissions_boundary }}',
  '{{ policies }}', '{{ role_name }}', '{{ tags }}', 'us-east-1';

/*+ update */
UPDATE awscc.iam.roles
SET PatchDocument = string('{{ {
  "AssumeRolePolicyDocument": assume_role_policy_document,
  "Description": description,
  "ManagedPolicyArns": managed_policy_arns,
  "MaxSessionDuration": max_session_duration,
  "PermissionsBoundary": permissions_boundary,
  "Path": path,
  "Policies": policies,
  "Tags": tags
  } | generate_patch_document }}')
WHERE region = 'us-east-1'
AND Identifier = '{{ role_name }}';

/*+ statecheck, retries=5, retry_delay=10 */
SELECT COUNT(*) as count FROM (
  SELECT
    max_session_duration, path,
    AWS_POLICY_EQUAL(assume_role_policy_document,
      '{{ assume_role_policy_document }}') as test_assume_role_policy_doc,
    AWS_POLICY_EQUAL(policies, '{{ policies }}') as test_policies
  FROM awscc.iam.roles
  WHERE Identifier = '{{ role_name }}' AND region = 'us-east-1'
) t
WHERE test_assume_role_policy_doc = 1
AND test_policies = 1
AND path = '{{ path }}';

/*+ exports */
SELECT arn, role_name
FROM awscc.iam.roles
WHERE region = 'us-east-1'
AND Identifier = '{{ role_name }}';

/*+ delete */
DELETE FROM awscc.iam.roles
WHERE Identifier = '{{ role_name }}'
AND region = 'us-east-1';
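
The {{ }} placeholders in these queries are resolved from the stack globals and the resource's props in the manifest. A minimal sketch of what the corresponding resource entry might look like, with illustrative property values rather than the ones used in the example:

- name: aws_cross_account_role
  props:
    - name: role_name
      value: "{{ stack_name }}-{{ stack_env }}-cross-account-role"   # illustrative value
    - name: max_session_duration
      value: 3600
    # assume_role_policy_document, policies, tags, etc. follow the same pattern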

Key Patterns

Multi-Provider Stacks

This stack uses three providers with version pinning:

providers:
- awscc::v26.03.00379
- databricks_account::v26.03.00381
- databricks_workspace::v26.03.00381

Externalized Policy Documents

Complex IAM policy statements are stored as JSON files and loaded using the file() directive:

policies:
  - PolicyDocument:
      Statement:
        - file(aws/iam/policy_statements/cross_account_role/ec2_permissions.json)
        - file(aws/iam/policy_statements/cross_account_role/iam_service_linked_role.json)
      Version: '2012-10-17'
    PolicyName: "{{ stack_name }}-{{ stack_env }}-policy"
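
Each file() reference is expanded inline as one entry in the Statement array. The statements shipped with the example are more involved, but a hypothetical statement file (not the actual contents of ec2_permissions.json) has this shape:

{
  "Sid": "HypotheticalEc2Example",
  "Effect": "Allow",
  "Action": [
    "ec2:DescribeAvailabilityZones"
  ],
  "Resource": "*"
}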

return_vals for Identifier Capture

When a provider returns an identifier during creation that can't be predicted (e.g. auto-generated IDs), use return_vals to capture it:

return_vals:
  create:
    - id: databricks_group_id # maps provider field 'id' to 'databricks_group_id'
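
The capture pairs with a RETURNING clause on the resource's create query. A minimal sketch, using illustrative column and variable names rather than the exact ones from the example:

/*+ create */
INSERT INTO databricks_account.iam.account_groups (
  account_id,
  data__displayName          -- illustrative column; see the workspace_admins_group queries for the real ones
)
SELECT
  '{{ databricks_account_id }}',
  '{{ group_display_name }}'
RETURNING id;                -- the returned id is captured as databricks_group_id

Later resources can then reference {{ databricks_group_id }} like any other variable.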

Stack-Level Exports

The manifest defines stack-level exports that are displayed after a successful build or test and written to .stackql-deploy-exports for sourcing into the shell:

exports:
- workspace_name
- workspace_id
- deployment_name
- workspace_status
- workspace_url
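
Because the file is written in a shell-sourceable form, the values can be pulled into the current session after a build, assuming the variables keep their export names:

source .stackql-deploy-exports
echo "workspace ${workspace_id} is available at ${workspace_url}"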

More Information

The complete code for this example stack is available in the examples/databricks/serverless directory of the stackql-deploy repository.