We recommend new projects start with resources from the AWS provider.

AWS Cloud Control v1.9.0 published on Monday, Nov 18, 2024 by Pulumi

aws-native.glue.Crawler

    Resource Type definition for AWS::Glue::Crawler

    Create Crawler Resource

    Resources are created with functions called constructors. To learn more about declaring and configuring resources, see Resources.

    Constructor syntax

    TypeScript:
    new Crawler(name: string, args: CrawlerArgs, opts?: CustomResourceOptions);

    Python:
    @overload
    def Crawler(resource_name: str,
                args: CrawlerArgs,
                opts: Optional[ResourceOptions] = None)

    @overload
    def Crawler(resource_name: str,
                opts: Optional[ResourceOptions] = None,
                role: Optional[str] = None,
                targets: Optional[CrawlerTargetsArgs] = None,
                name: Optional[str] = None,
                database_name: Optional[str] = None,
                description: Optional[str] = None,
                lake_formation_configuration: Optional[CrawlerLakeFormationConfigurationArgs] = None,
                classifiers: Optional[Sequence[str]] = None,
                recrawl_policy: Optional[CrawlerRecrawlPolicyArgs] = None,
                crawler_security_configuration: Optional[str] = None,
                schedule: Optional[CrawlerScheduleArgs] = None,
                schema_change_policy: Optional[CrawlerSchemaChangePolicyArgs] = None,
                table_prefix: Optional[str] = None,
                tags: Optional[Any] = None,
                configuration: Optional[str] = None)

    Go:
    func NewCrawler(ctx *Context, name string, args CrawlerArgs, opts ...ResourceOption) (*Crawler, error)

    C#:
    public Crawler(string name, CrawlerArgs args, CustomResourceOptions? opts = null)

    Java:
    public Crawler(String name, CrawlerArgs args)
    public Crawler(String name, CrawlerArgs args, CustomResourceOptions options)

    YAML:
    type: aws-native:glue:Crawler
    properties: # The arguments to resource properties.
    options: # Bag of options to control resource's behavior.

    Parameters

    TypeScript:
    name string
    The unique name of the resource.
    args CrawlerArgs
    The arguments to resource properties.
    opts CustomResourceOptions
    Bag of options to control resource's behavior.

    Python:
    resource_name str
    The unique name of the resource.
    args CrawlerArgs
    The arguments to resource properties.
    opts ResourceOptions
    Bag of options to control resource's behavior.

    Go:
    ctx Context
    Context object for the current deployment.
    name string
    The unique name of the resource.
    args CrawlerArgs
    The arguments to resource properties.
    opts ResourceOption
    Bag of options to control resource's behavior.

    C#:
    name string
    The unique name of the resource.
    args CrawlerArgs
    The arguments to resource properties.
    opts CustomResourceOptions
    Bag of options to control resource's behavior.

    Java:
    name String
    The unique name of the resource.
    args CrawlerArgs
    The arguments to resource properties.
    options CustomResourceOptions
    Bag of options to control resource's behavior.

    Crawler Resource Properties

    To learn more about resource properties and how to use them, see Inputs and Outputs in the Architecture and Concepts docs.

    Inputs

    In Python, inputs that are objects can be passed either as argument classes or as dictionary literals.
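
As a rough illustration of the dictionary-literal form, the sketch below assembles Crawler inputs as plain Python dictionaries using only the standard library. The helper `build_crawler_inputs`, the role ARN, the database name, and the table name are hypothetical, and the `dynamo_db_targets` key on the targets object is an assumption, since the fields of the targets type are not listed in this section.

```python
from typing import Any, Dict


def build_crawler_inputs(role_arn: str, table_name: str) -> Dict[str, Any]:
    """Assemble Crawler inputs as nested dictionary literals (hypothetical helper)."""
    return {
        "role": role_arn,
        "database_name": "analytics",  # placeholder database name
        "table_prefix": "raw_",        # placeholder prefix
        "targets": {
            # Assumed key name for DynamoDB targets; each entry carries a `path`.
            "dynamo_db_targets": [{"path": table_name}],
        },
    }


inputs = build_crawler_inputs(
    "arn:aws:iam::123456789012:role/example-glue-role",  # placeholder ARN
    "orders",
)
print(sorted(inputs))
```

With the Pulumi SDK installed, a dictionary like this would typically be unpacked into the constructor as keyword arguments; that call is left out here so the sketch runs without the SDK.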

    The Crawler resource accepts the following input properties:

    Role string
    The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
    Targets Pulumi.AwsNative.Glue.Inputs.CrawlerTargets
    A collection of targets to crawl.
    Classifiers List<string>
    A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
    Configuration string
    Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
    CrawlerSecurityConfiguration string
    The name of the SecurityConfiguration structure to be used by this crawler.
    DatabaseName string
    The name of the database in which the crawler's output is stored.
    Description string
    A description of the crawler.
    LakeFormationConfiguration Pulumi.AwsNative.Glue.Inputs.CrawlerLakeFormationConfiguration
    Specifies whether the crawler should use AWS Lake Formation credentials instead of the IAM role credentials.
    Name string
    The name of the crawler.
    RecrawlPolicy Pulumi.AwsNative.Glue.Inputs.CrawlerRecrawlPolicy
    A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
    Schedule Pulumi.AwsNative.Glue.Inputs.CrawlerSchedule
    For scheduled crawlers, the schedule when the crawler runs.
    SchemaChangePolicy Pulumi.AwsNative.Glue.Inputs.CrawlerSchemaChangePolicy

    The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler.

    The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.

    TablePrefix string
    The prefix added to the names of tables that are created.
    Tags object

    The tags to use with this crawler.

    Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.

    Role string
    The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
    Targets CrawlerTargetsArgs
    A collection of targets to crawl.
    Classifiers []string
    A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
    Configuration string
    Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
    CrawlerSecurityConfiguration string
    The name of the SecurityConfiguration structure to be used by this crawler.
    DatabaseName string
    The name of the database in which the crawler's output is stored.
    Description string
    A description of the crawler.
    LakeFormationConfiguration CrawlerLakeFormationConfigurationArgs
    Specifies whether the crawler should use AWS Lake Formation credentials instead of the IAM role credentials.
    Name string
    The name of the crawler.
    RecrawlPolicy CrawlerRecrawlPolicyArgs
    A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
    Schedule CrawlerScheduleArgs
    For scheduled crawlers, the schedule when the crawler runs.
    SchemaChangePolicy CrawlerSchemaChangePolicyArgs

    The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler.

    The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.

    TablePrefix string
    The prefix added to the names of tables that are created.
    Tags interface{}

    The tags to use with this crawler.

    Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.

    role String
    The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
    targets CrawlerTargets
    A collection of targets to crawl.
    classifiers List<String>
    A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
    configuration String
    Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
    crawlerSecurityConfiguration String
    The name of the SecurityConfiguration structure to be used by this crawler.
    databaseName String
    The name of the database in which the crawler's output is stored.
    description String
    A description of the crawler.
    lakeFormationConfiguration CrawlerLakeFormationConfiguration
    Specifies whether the crawler should use AWS Lake Formation credentials instead of the IAM role credentials.
    name String
    The name of the crawler.
    recrawlPolicy CrawlerRecrawlPolicy
    A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
    schedule CrawlerSchedule
    For scheduled crawlers, the schedule when the crawler runs.
    schemaChangePolicy CrawlerSchemaChangePolicy

    The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler.

    The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.

    tablePrefix String
    The prefix added to the names of tables that are created.
    tags Object

    The tags to use with this crawler.

    Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.

    role string
    The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
    targets CrawlerTargets
    A collection of targets to crawl.
    classifiers string[]
    A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
    configuration string
    Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
    crawlerSecurityConfiguration string
    The name of the SecurityConfiguration structure to be used by this crawler.
    databaseName string
    The name of the database in which the crawler's output is stored.
    description string
    A description of the crawler.
    lakeFormationConfiguration CrawlerLakeFormationConfiguration
    Specifies whether the crawler should use AWS Lake Formation credentials instead of the IAM role credentials.
    name string
    The name of the crawler.
    recrawlPolicy CrawlerRecrawlPolicy
    A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
    schedule CrawlerSchedule
    For scheduled crawlers, the schedule when the crawler runs.
    schemaChangePolicy CrawlerSchemaChangePolicy

    The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler.

    The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.

    tablePrefix string
    The prefix added to the names of tables that are created.
    tags any

    The tags to use with this crawler.

    Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.

    role str
    The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
    targets CrawlerTargetsArgs
    A collection of targets to crawl.
    classifiers Sequence[str]
    A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
    configuration str
    Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
    crawler_security_configuration str
    The name of the SecurityConfiguration structure to be used by this crawler.
    database_name str
    The name of the database in which the crawler's output is stored.
    description str
    A description of the crawler.
    lake_formation_configuration CrawlerLakeFormationConfigurationArgs
    Specifies whether the crawler should use AWS Lake Formation credentials instead of the IAM role credentials.
    name str
    The name of the crawler.
    recrawl_policy CrawlerRecrawlPolicyArgs
    A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
    schedule CrawlerScheduleArgs
    For scheduled crawlers, the schedule when the crawler runs.
    schema_change_policy CrawlerSchemaChangePolicyArgs

    The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler.

    The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.

    table_prefix str
    The prefix added to the names of tables that are created.
    tags Any

    The tags to use with this crawler.

    Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.
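
Because `configuration` is described above as a versioned JSON string, one way to produce it is to build a dictionary and serialize it. The following is a minimal sketch: the helper `to_configuration` is hypothetical, and the `Version` and `CrawlerOutput` keys are assumptions drawn from the AWS Glue crawler configuration documentation rather than from this page.

```python
import json

# Assumed example settings; these key names are not defined on this page.
settings = {
    "Version": 1.0,
    "CrawlerOutput": {
        "Partitions": {"AddOrUpdateBehavior": "InheritFromTable"},
    },
}


def to_configuration(options: dict) -> str:
    """Serialize crawler options into the JSON string expected by `configuration`."""
    if "Version" not in options:
        raise ValueError("configuration is a versioned JSON string; 'Version' is missing")
    return json.dumps(options, sort_keys=True)


configuration = to_configuration(settings)
print(configuration)
```

Serializing from a dictionary avoids hand-written quoting mistakes and keeps the value a plain string, which is the type the `configuration` input expects.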

    role String
    The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
    targets Property Map
    A collection of targets to crawl.
    classifiers List<String>
    A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
    configuration String
    Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
    crawlerSecurityConfiguration String
    The name of the SecurityConfiguration structure to be used by this crawler.
    databaseName String
    The name of the database in which the crawler's output is stored.
    description String
    A description of the crawler.
    lakeFormationConfiguration Property Map
    Specifies whether the crawler should use AWS Lake Formation credentials instead of the IAM role credentials.
    name String
    The name of the crawler.
    recrawlPolicy Property Map
    A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
    schedule Property Map
    For scheduled crawlers, the schedule when the crawler runs.
    schemaChangePolicy Property Map

    The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler.

    The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.

    tablePrefix String
    The prefix added to the names of tables that are created.
    tags Any

    The tags to use with this crawler.

    Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.
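
The two components of the SchemaChangePolicy can be sketched as a small validated dictionary. This is only an illustration: the helper name is hypothetical, the snake_case keys `update_behavior` and `delete_behavior` are assumed from the Python naming used elsewhere on this page, and the allowed values are assumptions taken from the AWS Glue API reference, since this section does not list them.

```python
from typing import Dict

# Assumed allowed values (from the AWS Glue API reference, not from this page).
UPDATE_BEHAVIORS = {"LOG", "UPDATE_IN_DATABASE"}
DELETE_BEHAVIORS = {"LOG", "DELETE_FROM_DATABASE", "DEPRECATE_IN_DATABASE"}


def schema_change_policy(update_behavior: str, delete_behavior: str) -> Dict[str, str]:
    """Return a dict holding the two components of a SchemaChangePolicy."""
    if update_behavior not in UPDATE_BEHAVIORS:
        raise ValueError(f"unknown update behavior: {update_behavior}")
    if delete_behavior not in DELETE_BEHAVIORS:
        raise ValueError(f"unknown delete behavior: {delete_behavior}")
    return {"update_behavior": update_behavior, "delete_behavior": delete_behavior}


policy = schema_change_policy("UPDATE_IN_DATABASE", "DEPRECATE_IN_DATABASE")
print(policy)
```

Note that, as stated above, this policy only governs changes to tables that already exist; it has no effect on whether new tables and partitions are created.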

    Outputs

    All input properties are implicitly available as output properties. Additionally, the Crawler resource produces the following output properties:

    Id string
    The provider-assigned unique ID for this managed resource.
    Id string
    The provider-assigned unique ID for this managed resource.
    id String
    The provider-assigned unique ID for this managed resource.
    id string
    The provider-assigned unique ID for this managed resource.
    id str
    The provider-assigned unique ID for this managed resource.
    id String
    The provider-assigned unique ID for this managed resource.

    Supporting Types

    CrawlerCatalogTarget, CrawlerCatalogTargetArgs

    ConnectionName string
    The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
    DatabaseName string
    The name of the database to be synchronized.
    DlqEventQueueArn string
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    EventQueueArn string
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    Tables List<string>
    A list of the tables to be synchronized.
    ConnectionName string
    The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
    DatabaseName string
    The name of the database to be synchronized.
    DlqEventQueueArn string
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    EventQueueArn string
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    Tables []string
    A list of the tables to be synchronized.
    connectionName String
    The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
    databaseName String
    The name of the database to be synchronized.
    dlqEventQueueArn String
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    eventQueueArn String
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    tables List<String>
    A list of the tables to be synchronized.
    connectionName string
    The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
    databaseName string
    The name of the database to be synchronized.
    dlqEventQueueArn string
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    eventQueueArn string
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    tables string[]
    A list of the tables to be synchronized.
    connection_name str
    The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
    database_name str
    The name of the database to be synchronized.
    dlq_event_queue_arn str
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    event_queue_arn str
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    tables Sequence[str]
    A list of the tables to be synchronized.
    connectionName String
    The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
    databaseName String
    The name of the database to be synchronized.
    dlqEventQueueArn String
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    eventQueueArn String
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    tables List<String>
    A list of the tables to be synchronized.
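
The queue properties above expect ARNs of the form `arn:aws:sqs:region:account:queue`. A minimal sketch of checking that shape before passing values to a catalog target follows. The helper `parse_sqs_arn`, the sample ARNs, and the database and table names are hypothetical; the 12-digit account ID, the queue-name character set, and the fixed `aws` partition are assumptions.

```python
import re

# Matches arn:aws:sqs:<region>:<account>:<queue>. The digit count and the
# allowed queue-name characters are assumptions, not taken from this page.
SQS_ARN = re.compile(
    r"^arn:aws:sqs:(?P<region>[a-z0-9-]+):(?P<account>\d{12}):(?P<queue>[A-Za-z0-9_.-]+)$"
)


def parse_sqs_arn(arn: str) -> dict:
    """Split an SQS ARN into region, account, and queue, or raise ValueError."""
    match = SQS_ARN.match(arn)
    if match is None:
        raise ValueError(f"not an SQS ARN: {arn}")
    return match.groupdict()


catalog_target = {
    "database_name": "sales",
    "tables": ["orders", "customers"],
    "event_queue_arn": "arn:aws:sqs:us-east-1:123456789012:crawler-events",
    "dlq_event_queue_arn": "arn:aws:sqs:us-east-1:123456789012:crawler-events-dlq",
}

parts = parse_sqs_arn(catalog_target["event_queue_arn"])
print(parts["queue"])
```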

    CrawlerDeltaTarget, CrawlerDeltaTargetArgs

    ConnectionName string
    The name of the connection to use to connect to the Delta table target.
    CreateNativeDeltaTable bool
    Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
    DeltaTables List<string>
    A list of the Amazon S3 paths to the Delta tables.
    WriteManifest bool
    Specifies whether to write the manifest files to the Delta table path.
    ConnectionName string
    The name of the connection to use to connect to the Delta table target.
    CreateNativeDeltaTable bool
    Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
    DeltaTables []string
    A list of the Amazon S3 paths to the Delta tables.
    WriteManifest bool
    Specifies whether to write the manifest files to the Delta table path.
    connectionName String
    The name of the connection to use to connect to the Delta table target.
    createNativeDeltaTable Boolean
    Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
    deltaTables List<String>
    A list of the Amazon S3 paths to the Delta tables.
    writeManifest Boolean
    Specifies whether to write the manifest files to the Delta table path.
    connectionName string
    The name of the connection to use to connect to the Delta table target.
    createNativeDeltaTable boolean
    Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
    deltaTables string[]
    A list of the Amazon S3 paths to the Delta tables.
    writeManifest boolean
    Specifies whether to write the manifest files to the Delta table path.
    connection_name str
    The name of the connection to use to connect to the Delta table target.
    create_native_delta_table bool
    Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
    delta_tables Sequence[str]
    A list of the Amazon S3 paths to the Delta tables.
    write_manifest bool
    Specifies whether to write the manifest files to the Delta table path.
    connectionName String
    The name of the connection to use to connect to the Delta table target.
    createNativeDeltaTable Boolean
    Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
    deltaTables List<String>
    A list of the Amazon S3 paths to the Delta tables.
    writeManifest Boolean
    Specifies whether to write the manifest files to the Delta table path.
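
A Delta target can be sketched as a dictionary carrying the four properties above. The helper `delta_target` and the bucket path are hypothetical; the check that each entry starts with `s3://` is an assumption based on the description of `delta_tables` as Amazon S3 paths.

```python
from typing import Any, Dict, List


def delta_target(
    delta_tables: List[str],
    *,
    create_native_delta_table: bool,
    write_manifest: bool,
    connection_name: str = "",
) -> Dict[str, Any]:
    """Build a Delta target dict, rejecting entries that are not s3:// paths."""
    bad = [p for p in delta_tables if not p.startswith("s3://")]
    if bad:
        raise ValueError(f"not Amazon S3 paths: {bad}")
    target: Dict[str, Any] = {
        "delta_tables": list(delta_tables),
        "create_native_delta_table": create_native_delta_table,
        "write_manifest": write_manifest,
    }
    # Only include the connection name when one is supplied.
    if connection_name:
        target["connection_name"] = connection_name
    return target


target = delta_target(
    ["s3://example-bucket/delta/orders/"],
    create_native_delta_table=True,
    write_manifest=False,
)
print(target)
```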

    CrawlerDynamoDbTarget, CrawlerDynamoDbTargetArgs

    Path string
    The name of the DynamoDB table to crawl.
    Path string
    The name of the DynamoDB table to crawl.
    path String
    The name of the DynamoDB table to crawl.
    path string
    The name of the DynamoDB table to crawl.
    path str
    The name of the DynamoDB table to crawl.
    path String
    The name of the DynamoDB table to crawl.

    CrawlerIcebergTarget, CrawlerIcebergTargetArgs

    ConnectionName string
    The name of the connection to use to connect to the Iceberg target.
    Exclusions List<string>
    A list of glob patterns used to exclude from the crawl.
    MaximumTraversalDepth int
    The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
    Paths List<string>
    One or more Amazon S3 paths that contain Iceberg metadata folders, in the form s3://bucket/prefix.
    ConnectionName string
    The name of the connection to use to connect to the Iceberg target.
    Exclusions []string
    A list of glob patterns used to exclude from the crawl.
    MaximumTraversalDepth int
    The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
    Paths []string
    One or more Amazon S3 paths that contain Iceberg metadata folders, in the form s3://bucket/prefix.
    connectionName String
    The name of the connection to use to connect to the Iceberg target.
    exclusions List<String>
    A list of glob patterns used to exclude from the crawl.
    maximumTraversalDepth Integer
    The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
    paths List<String>
    One or more Amazon S3 paths that contain Iceberg metadata folders, in the form s3://bucket/prefix.
    connectionName string
    The name of the connection to use to connect to the Iceberg target.
    exclusions string[]
    A list of glob patterns used to exclude from the crawl.
    maximumTraversalDepth number
    The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
    paths string[]
    One or more Amazon S3 paths that contain Iceberg metadata folders, in the form s3://bucket/prefix.
    connection_name str
    The name of the connection to use to connect to the Iceberg target.
    exclusions Sequence[str]
    A list of glob patterns used to exclude from the crawl.
    maximum_traversal_depth int
    The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
    paths Sequence[str]
    One or more Amazon S3 paths that contain Iceberg metadata folders, in the form s3://bucket/prefix.
    connectionName String
    The name of the connection to use to connect to the Iceberg target.
    exclusions List<String>
    A list of glob patterns used to exclude from the crawl.
    maximumTraversalDepth Number
    The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
    paths List<String>
    One or more Amazon S3 paths that contain Iceberg metadata folders, in the form s3://bucket/prefix.
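
To give a feel for how a maximum traversal depth limits discovery, the sketch below counts how many path segments separate a configured path from a candidate metadata folder and keeps only the candidates within the limit. This models the idea only: the helper `depth_below`, the sample paths, and the way depth is counted (including the metadata folder itself) are assumptions, and AWS Glue may count depth differently.

```python
from typing import List


def depth_below(root: str, metadata_dir: str) -> int:
    """Count the path segments between `root` and `metadata_dir` (both s3:// paths)."""
    root = root.rstrip("/")
    if not metadata_dir.startswith(root + "/"):
        raise ValueError("metadata_dir is not under root")
    relative = metadata_dir[len(root) + 1:].strip("/")
    return len(relative.split("/")) if relative else 0


iceberg_target = {
    "paths": ["s3://bucket/prefix"],
    "maximum_traversal_depth": 3,
}

# Hypothetical locations of Iceberg metadata folders under the configured path.
candidates: List[str] = [
    "s3://bucket/prefix/sales/metadata",
    "s3://bucket/prefix/archive/2023/q1/sales/metadata",
]

reachable = [
    c
    for c in candidates
    if depth_below(iceberg_target["paths"][0], c) <= iceberg_target["maximum_traversal_depth"]
]
print(reachable)
```

Under these assumptions the first folder sits two segments below the configured path and is kept, while the second sits five segments below and is skipped.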

    CrawlerJdbcTarget, CrawlerJdbcTargetArgs

    ConnectionName string
    The name of the connection to use to connect to the JDBC target.
    EnableAdditionalMetadata List<string>

    Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database.

    If you do not need additional metadata, keep the field empty.

    Exclusions List<string>
    A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
    Path string
    The path of the JDBC target.
    ConnectionName string
    The name of the connection to use to connect to the JDBC target.
    EnableAdditionalMetadata []string

    Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database.

    If you do not need additional metadata, keep the field empty.

    Exclusions []string
    A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
    Path string
    The path of the JDBC target.
    connectionName String
    The name of the connection to use to connect to the JDBC target.
    enableAdditionalMetadata List<String>

    Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database.

    If you do not need additional metadata, keep the field empty.

    exclusions List<String>
    A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
    path String
    The path of the JDBC target.
    connectionName string
    The name of the connection to use to connect to the JDBC target.
    enableAdditionalMetadata string[]

    Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database.

    If you do not need additional metadata, keep the field empty.

    exclusions string[]
    A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
    path string
    The path of the JDBC target.
    connection_name str
    The name of the connection to use to connect to the JDBC target.
    enable_additional_metadata Sequence[str]

    Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database.

    If you do not need additional metadata, keep the field empty.

    exclusions Sequence[str]
    A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
    path str
    The path of the JDBC target.
    connectionName String
    The name of the connection to use to connect to the JDBC target.
    enableAdditionalMetadata List<String>

    Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database.

    If you do not need additional metadata, keep the field empty.

    exclusions List<String>
    A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
    path String
    The path of the JDBC target.
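
The `enable_additional_metadata` rule above (only RAWTYPES or COMMENTS, and left empty when not needed) can be sketched as a small builder. The helper `jdbc_target`, the connection name, and the sample path are hypothetical; omitting the key entirely when no metadata is requested is one interpretation of "keep the field empty".

```python
from typing import Any, Dict, Sequence

ALLOWED_METADATA = ("RAWTYPES", "COMMENTS")


def jdbc_target(
    connection_name: str,
    path: str,
    additional_metadata: Sequence[str] = (),
) -> Dict[str, Any]:
    """Build a JDBC target dict, validating the additional-metadata values."""
    unknown = [v for v in additional_metadata if v not in ALLOWED_METADATA]
    if unknown:
        raise ValueError(f"unsupported metadata values: {unknown}")
    target: Dict[str, Any] = {"connection_name": connection_name, "path": path}
    # Leave the field out when no additional metadata is requested.
    if additional_metadata:
        target["enable_additional_metadata"] = list(additional_metadata)
    return target


plain = jdbc_target("example-connection", "exampledb/public/orders")
detailed = jdbc_target("example-connection", "exampledb/public/orders", ["RAWTYPES", "COMMENTS"])
print(plain)
print(detailed)
```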

    CrawlerLakeFormationConfiguration, CrawlerLakeFormationConfigurationArgs

    AccountId string
    Required for cross-account crawls. For crawls in the same account as the target data, this can be left as null.
    UseLakeFormationCredentials bool
    Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
    AccountId string
    Required for cross-account crawls. For crawls in the same account as the target data, this can be left as null.
    UseLakeFormationCredentials bool
    Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
    accountId String
    Required for cross-account crawls. For crawls in the same account as the target data, this can be left as null.
    useLakeFormationCredentials Boolean
    Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
    accountId string
    Required for cross-account crawls. For crawls in the same account as the target data, this can be left as null.
    useLakeFormationCredentials boolean
    Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
    account_id str
    Required for cross-account crawls. For crawls in the same account as the target data, this can be left as null.
    use_lake_formation_credentials bool
    Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
    accountId String
    Required for cross-account crawls. For crawls in the same account as the target data, this can be left as null.
    useLakeFormationCredentials Boolean
    Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.

    CrawlerMongoDbTarget, CrawlerMongoDbTargetArgs

    ConnectionName string
    The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
    Path string
    The path of the Amazon DocumentDB or MongoDB target (database/collection).
    ConnectionName string
    The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
    Path string
    The path of the Amazon DocumentDB or MongoDB target (database/collection).
    connectionName String
    The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
    path String
    The path of the Amazon DocumentDB or MongoDB target (database/collection).
    connectionName string
    The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
    path string
    The path of the Amazon DocumentDB or MongoDB target (database/collection).
    connection_name str
    The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
    path str
    The path of the Amazon DocumentDB or MongoDB target (database/collection).
    connectionName String
    The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
    path String
    The path of the Amazon DocumentDB or MongoDB target (database/collection).

    CrawlerRecrawlPolicy, CrawlerRecrawlPolicyArgs

    RecrawlBehavior string
    Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
    RecrawlBehavior string
    Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
    recrawlBehavior String
    Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
    recrawlBehavior string
    Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
    recrawl_behavior str
    Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
    recrawlBehavior String
    Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
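
The three recrawl behaviors described above can be checked before a deployment is attempted. The following is a sketch only: `recrawl_policy` and `RECRAWL_BEHAVIORS` are hypothetical names, and the returned dict simply mirrors the `recrawl_behavior` Python property.

```python
# Hypothetical helper, not part of the Pulumi SDK.
# Maps each documented value to a short summary of what it does.
RECRAWL_BEHAVIORS = {
    "CRAWL_EVERYTHING": "crawl the entire dataset again",
    "CRAWL_NEW_FOLDERS_ONLY": "crawl only folders added since the last crawler run",
    "CRAWL_EVENT_MODE": "crawl only the changes identified by Amazon S3 events",
}


def recrawl_policy(recrawl_behavior):
    """Return a dict keyed like CrawlerRecrawlPolicy, or raise on an unknown value."""
    if recrawl_behavior not in RECRAWL_BEHAVIORS:
        allowed = ", ".join(sorted(RECRAWL_BEHAVIORS))
        raise ValueError(f"recrawl_behavior must be one of: {allowed}")
    return {"recrawl_behavior": recrawl_behavior}


example_recrawl = recrawl_policy("CRAWL_NEW_FOLDERS_ONLY")
```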

    CrawlerS3Target, CrawlerS3TargetArgs

    ConnectionName string
    The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
    DlqEventQueueArn string
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    EventQueueArn string
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    Exclusions List<string>
    A list of glob patterns used to exclude from the crawl.
    Path string
    The path to the Amazon S3 target.
    SampleSize int
    Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
    ConnectionName string
    The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
    DlqEventQueueArn string
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    EventQueueArn string
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    Exclusions []string
    A list of glob patterns used to exclude from the crawl.
    Path string
    The path to the Amazon S3 target.
    SampleSize int
    Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
    connectionName String
    The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
    dlqEventQueueArn String
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    eventQueueArn String
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    exclusions List<String>
    A list of glob patterns used to exclude from the crawl.
    path String
    The path to the Amazon S3 target.
    sampleSize Integer
    Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
    connectionName string
    The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
    dlqEventQueueArn string
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    eventQueueArn string
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    exclusions string[]
    A list of glob patterns used to exclude from the crawl.
    path string
    The path to the Amazon S3 target.
    sampleSize number
    Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
    connection_name str
    The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
    dlq_event_queue_arn str
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    event_queue_arn str
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    exclusions Sequence[str]
    A list of glob patterns used to exclude from the crawl.
    path str
    The path to the Amazon S3 target.
    sample_size int
    Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
    connectionName String
    The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
    dlqEventQueueArn String
    A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
    eventQueueArn String
    A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
    exclusions List<String>
    A list of glob patterns used to exclude from the crawl.
    path String
    The path to the Amazon S3 target.
    sampleSize Number
    Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
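
A small sketch of assembling an S3 target, assuming a hypothetical helper named `s3_target`. It enforces the documented sample size range of 1 to 249 and omits any property that was not supplied; the bucket path and exclusion pattern are placeholders.

```python
# Hypothetical helper, not part of the Pulumi SDK: builds a dict keyed by the
# Python property names of CrawlerS3Target listed above.
def s3_target(path, exclusions=None, sample_size=None, connection_name=None,
              event_queue_arn=None, dlq_event_queue_arn=None):
    # The documented valid range for sample_size is 1 to 249 inclusive.
    if sample_size is not None and not 1 <= sample_size <= 249:
        raise ValueError("sample_size must be an integer between 1 and 249")

    optional = {
        "exclusions": list(exclusions) if exclusions else None,
        "sample_size": sample_size,
        "connection_name": connection_name,
        "event_queue_arn": event_queue_arn,
        "dlq_event_queue_arn": dlq_event_queue_arn,
    }
    target = {"path": path}
    # Only include the properties that were actually provided.
    target.update({key: value for key, value in optional.items() if value is not None})
    return target


# Placeholder bucket path and glob pattern.
example_s3 = s3_target("s3://example-bucket/raw/", exclusions=["**.tmp"], sample_size=10)
```

When `sample_size` is left unset, the key is omitted, which corresponds to the documented default of crawling all files.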

    CrawlerSchedule, CrawlerScheduleArgs

    ScheduleExpression string
    A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
    ScheduleExpression string
    A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
    scheduleExpression String
    A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
    scheduleExpression string
    A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
    schedule_expression str
    A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
    scheduleExpression String
    A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
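
To illustrate the cron format, the sketch below builds a daily schedule expression from an hour and minute in UTC. The helper name `daily_schedule` is an assumption made for this example; only the `cron(15 12 * * ? *)` form comes from the description above.

```python
# Hypothetical helper, not part of the Pulumi SDK.
def daily_schedule(hour, minute):
    """Return a dict keyed like CrawlerSchedule that runs every day at hour:minute UTC."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    # Field order inside cron(...): minutes, hours, day-of-month, month, day-of-week, year.
    return {"schedule_expression": f"cron({minute} {hour} * * ? *)"}


example_schedule = daily_schedule(12, 15)
# example_schedule["schedule_expression"] is "cron(15 12 * * ? *)"
```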

    CrawlerSchemaChangePolicy, CrawlerSchemaChangePolicyArgs

    DeleteBehavior string
    The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it; only log that it no longer exists. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table is found to no longer exist, add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
    UpdateBehavior string
    The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
    DeleteBehavior string
    The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it; only log that it no longer exists. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table is found to no longer exist, add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
    UpdateBehavior string
    The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
    deleteBehavior String
    The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it; only log that it no longer exists. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table is found to no longer exist, add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
    updateBehavior String
    The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
    deleteBehavior string
    The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it; only log that it no longer exists. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table is found to no longer exist, add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
    updateBehavior string
    The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
    delete_behavior str
    The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it; only log that it no longer exists. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table is found to no longer exist, add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
    update_behavior str
    The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
    deleteBehavior String
    The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it; only log that it no longer exists. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table is found to no longer exist, add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
    updateBehavior String
    The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
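
The allowed delete and update behaviors can be validated together, as in the sketch below. The names `schema_change_policy`, `DELETE_BEHAVIORS`, and `UPDATE_BEHAVIORS` are hypothetical; the value sets are the ones named in the descriptions above.

```python
# Hypothetical helper, not part of the Pulumi SDK.
DELETE_BEHAVIORS = {"LOG", "DELETE_FROM_DATABASE", "DEPRECATE_IN_DATABASE"}
UPDATE_BEHAVIORS = {"LOG", "UPDATE_IN_DATABASE"}


def schema_change_policy(delete_behavior=None, update_behavior=None):
    """Return a dict keyed like CrawlerSchemaChangePolicy, validating each value given."""
    policy = {}
    if delete_behavior is not None:
        if delete_behavior not in DELETE_BEHAVIORS:
            raise ValueError(f"unknown delete_behavior: {delete_behavior}")
        policy["delete_behavior"] = delete_behavior
    if update_behavior is not None:
        if update_behavior not in UPDATE_BEHAVIORS:
            raise ValueError(f"unknown update_behavior: {update_behavior}")
        policy["update_behavior"] = update_behavior
    return policy


# Log removals instead of deleting tables, but apply schema updates.
example_policy = schema_change_policy(delete_behavior="LOG", update_behavior="UPDATE_IN_DATABASE")
```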

    CrawlerTargets, CrawlerTargetsArgs

    CatalogTargets []CrawlerCatalogTarget
    Specifies AWS Glue Data Catalog targets.
    DeltaTargets []CrawlerDeltaTarget
    Specifies an array of Delta data store targets.
    DynamoDbTargets []CrawlerDynamoDbTarget
    Specifies Amazon DynamoDB targets.
    IcebergTargets []CrawlerIcebergTarget
    Specifies Apache Iceberg data store targets.
    JdbcTargets []CrawlerJdbcTarget
    Specifies JDBC targets.
    MongoDbTargets []CrawlerMongoDbTarget
    Specifies Amazon DocumentDB or MongoDB targets.
    S3Targets []CrawlerS3Target
    Specifies Amazon Simple Storage Service (Amazon S3) targets.
    catalogTargets List<CrawlerCatalogTarget>
    Specifies AWS Glue Data Catalog targets.
    deltaTargets List<CrawlerDeltaTarget>
    Specifies an array of Delta data store targets.
    dynamoDbTargets List<CrawlerDynamoDbTarget>
    Specifies Amazon DynamoDB targets.
    icebergTargets List<CrawlerIcebergTarget>
    Specifies Apache Iceberg data store targets.
    jdbcTargets List<CrawlerJdbcTarget>
    Specifies JDBC targets.
    mongoDbTargets List<CrawlerMongoDbTarget>
    Specifies Amazon DocumentDB or MongoDB targets.
    s3Targets List<CrawlerS3Target>
    Specifies Amazon Simple Storage Service (Amazon S3) targets.
    catalogTargets CrawlerCatalogTarget[]
    Specifies AWS Glue Data Catalog targets.
    deltaTargets CrawlerDeltaTarget[]
    Specifies an array of Delta data store targets.
    dynamoDbTargets CrawlerDynamoDbTarget[]
    Specifies Amazon DynamoDB targets.
    icebergTargets CrawlerIcebergTarget[]
    Specifies Apache Iceberg data store targets.
    jdbcTargets CrawlerJdbcTarget[]
    Specifies JDBC targets.
    mongoDbTargets CrawlerMongoDbTarget[]
    Specifies Amazon DocumentDB or MongoDB targets.
    s3Targets CrawlerS3Target[]
    Specifies Amazon Simple Storage Service (Amazon S3) targets.
    catalog_targets Sequence[CrawlerCatalogTarget]
    Specifies AWS Glue Data Catalog targets.
    delta_targets Sequence[CrawlerDeltaTarget]
    Specifies an array of Delta data store targets.
    dynamo_db_targets Sequence[CrawlerDynamoDbTarget]
    Specifies Amazon DynamoDB targets.
    iceberg_targets Sequence[CrawlerIcebergTarget]
    Specifies Apache Iceberg data store targets.
    jdbc_targets Sequence[CrawlerJdbcTarget]
    Specifies JDBC targets.
    mongo_db_targets Sequence[CrawlerMongoDbTarget]
    Specifies Amazon DocumentDB or MongoDB targets.
    s3_targets Sequence[CrawlerS3Target]
    Specifies Amazon Simple Storage Service (Amazon S3) targets.
    catalogTargets List<Property Map>
    Specifies AWS Glue Data Catalog targets.
    deltaTargets List<Property Map>
    Specifies an array of Delta data store targets.
    dynamoDbTargets List<Property Map>
    Specifies Amazon DynamoDB targets.
    icebergTargets List<Property Map>
    Specifies Apache Iceberg data store targets.
    jdbcTargets List<Property Map>
    Specifies JDBC targets.
    mongoDbTargets List<Property Map>
    Specifies Amazon DocumentDB or MongoDB targets.
    s3Targets List<Property Map>
    Specifies Amazon Simple Storage Service (Amazon S3) targets.
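
As a rough sketch of how the per-type target lists combine into one targets value, the helper below accepts only the seven Python property names listed above and drops any list that is empty. `crawler_targets` and `TARGET_KEYS` are hypothetical names, and the sample paths and connection name are placeholders.

```python
# Hypothetical helper, not part of the Pulumi SDK.
# Python property names of CrawlerTargets, as listed above.
TARGET_KEYS = (
    "catalog_targets",
    "delta_targets",
    "dynamo_db_targets",
    "iceberg_targets",
    "jdbc_targets",
    "mongo_db_targets",
    "s3_targets",
)


def crawler_targets(**targets):
    """Collect per-type target lists into one dict, skipping empty lists."""
    unknown = sorted(set(targets) - set(TARGET_KEYS))
    if unknown:
        raise ValueError(f"unknown target types: {unknown}")
    return {key: list(value) for key, value in targets.items() if value}


# Placeholder targets: one S3 path and one MongoDB collection.
example_targets = crawler_targets(
    s3_targets=[{"path": "s3://example-bucket/raw/"}],
    mongo_db_targets=[{"connection_name": "my-mongo-connection", "path": "shop/orders"}],
    jdbc_targets=[],
)
```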

    Package Details

    Repository
    AWS Native pulumi/pulumi-aws-native
    License
    Apache-2.0