DataQualityJobDefinition
The DataQualityJobDefinition resource lets you create and manage AWS SageMaker DataQualityJobDefinitions that define the inputs, outputs, and settings for running data quality monitoring jobs on your datasets.
Minimal Example
Section titled “Minimal Example”This example demonstrates how to create a basic DataQualityJobDefinition with required properties and some common optional settings.
import AWS from "alchemy/aws/control";
const dataQualityJobDefinition = await AWS.SageMaker.DataQualityJobDefinition("basicDataQualityJob", { DataQualityJobInput: { S3Input: { S3Uri: "s3://my-bucket/data", LocalPath: "/data/input", ContentType: "application/json" } }, DataQualityAppSpecification: { ImageUri: "123456789012.dkr.ecr.us-west-2.amazonaws.com/my-data-quality-image:latest", ContainerEntrypoint: ["python3", "data_quality_script.py"], ContainerArguments: ["--input", "/data/input"] }, JobResources: { ClusterConfig: { InstanceType: "ml.m5.large", InstanceCount: 1, VolumeSizeInGB: 20 } }, RoleArn: "arn:aws:iam::123456789012:role/my-sagemaker-role", DataQualityJobOutputConfig: { S3Output: { S3Uri: "s3://my-bucket/output", LocalPath: "/data/output" } }, EndpointName: "my-endpoint"});
Advanced Configuration
Section titled “Advanced Configuration”This example illustrates how to configure a DataQualityJobDefinition with additional advanced settings like a stopping condition and network configuration.
const advancedDataQualityJobDefinition = await AWS.SageMaker.DataQualityJobDefinition("advancedDataQualityJob", { DataQualityJobInput: { S3Input: { S3Uri: "s3://my-bucket/data", LocalPath: "/data/input", ContentType: "application/json" } }, DataQualityAppSpecification: { ImageUri: "123456789012.dkr.ecr.us-west-2.amazonaws.com/my-data-quality-image:latest", ContainerEntrypoint: ["python3", "data_quality_script.py"], ContainerArguments: ["--input", "/data/input"] }, JobResources: { ClusterConfig: { InstanceType: "ml.m5.large", InstanceCount: 1, VolumeSizeInGB: 20 } }, RoleArn: "arn:aws:iam::123456789012:role/my-sagemaker-role", DataQualityJobOutputConfig: { S3Output: { S3Uri: "s3://my-bucket/output", LocalPath: "/data/output" } }, StoppingCondition: { MaxRuntimeInSeconds: 3600 }, NetworkConfig: { EnableNetworkIsolation: true, VpcConfig: { SecurityGroupIds: ["sg-0abcd1234efgh5678"], Subnets: ["subnet-0abcd1234efgh5678"] } }});
Using Tags
Section titled “Using Tags”This example demonstrates how to create a DataQualityJobDefinition with tags for better resource management and tracking.
const taggedDataQualityJobDefinition = await AWS.SageMaker.DataQualityJobDefinition("taggedDataQualityJob", { DataQualityJobInput: { S3Input: { S3Uri: "s3://my-bucket/data", LocalPath: "/data/input", ContentType: "application/json" } }, DataQualityAppSpecification: { ImageUri: "123456789012.dkr.ecr.us-west-2.amazonaws.com/my-data-quality-image:latest", ContainerEntrypoint: ["python3", "data_quality_script.py"], ContainerArguments: ["--input", "/data/input"] }, JobResources: { ClusterConfig: { InstanceType: "ml.m5.large", InstanceCount: 1, VolumeSizeInGB: 20 } }, RoleArn: "arn:aws:iam::123456789012:role/my-sagemaker-role", DataQualityJobOutputConfig: { S3Output: { S3Uri: "s3://my-bucket/output", LocalPath: "/data/output" } }, Tags: [ { Key: "Environment", Value: "Production" }, { Key: "Team", Value: "DataScience" } ]});