We recommend new projects start with resources from the AWS provider.
aws-native.glue.Crawler
Resource Type definition for AWS::Glue::Crawler
Create Crawler Resource
Resources are created with functions called constructors. To learn more about declaring and configuring resources, see Resources.
Constructor syntax
new Crawler(name: string, args: CrawlerArgs, opts?: CustomResourceOptions);
@overload
def Crawler(resource_name: str,
            args: CrawlerArgs,
            opts: Optional[ResourceOptions] = None)
@overload
def Crawler(resource_name: str,
            opts: Optional[ResourceOptions] = None,
            role: Optional[str] = None,
            targets: Optional[CrawlerTargetsArgs] = None,
            name: Optional[str] = None,
            database_name: Optional[str] = None,
            description: Optional[str] = None,
            lake_formation_configuration: Optional[CrawlerLakeFormationConfigurationArgs] = None,
            classifiers: Optional[Sequence[str]] = None,
            recrawl_policy: Optional[CrawlerRecrawlPolicyArgs] = None,
            crawler_security_configuration: Optional[str] = None,
            schedule: Optional[CrawlerScheduleArgs] = None,
            schema_change_policy: Optional[CrawlerSchemaChangePolicyArgs] = None,
            table_prefix: Optional[str] = None,
            tags: Optional[Any] = None,
            configuration: Optional[str] = None)
func NewCrawler(ctx *Context, name string, args CrawlerArgs, opts ...ResourceOption) (*Crawler, error)
public Crawler(string name, CrawlerArgs args, CustomResourceOptions? opts = null)
public Crawler(String name, CrawlerArgs args)
public Crawler(String name, CrawlerArgs args, CustomResourceOptions options)
type: aws-native:glue:Crawler
properties: # The arguments to resource properties.
options: # Bag of options to control resource's behavior.
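As a minimal sketch of the constructor in Pulumi YAML form, a crawler with a single S3 target might look like the following. The role ARN, database name, and bucket path are hypothetical placeholders, and the `s3Targets` key is assumed from the CloudFormation `Targets` schema:

```yaml
resources:
  myCrawler:
    type: aws-native:glue:Crawler
    properties:
      role: arn:aws:iam::123456789012:role/my-glue-crawler-role  # hypothetical role ARN
      databaseName: my_catalog_db                                # hypothetical Glue database
      targets:
        s3Targets:
          - path: s3://my-bucket/data/                           # hypothetical S3 path
```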
Parameters
- name string
- The unique name of the resource.
- args CrawlerArgs
- The arguments to resource properties.
- opts CustomResourceOptions
- Bag of options to control resource's behavior.
- resource_name str
- The unique name of the resource.
- args CrawlerArgs
- The arguments to resource properties.
- opts ResourceOptions
- Bag of options to control resource's behavior.
- ctx Context
- Context object for the current deployment.
- name string
- The unique name of the resource.
- args CrawlerArgs
- The arguments to resource properties.
- opts ResourceOption
- Bag of options to control resource's behavior.
- name string
- The unique name of the resource.
- args CrawlerArgs
- The arguments to resource properties.
- opts CustomResourceOptions
- Bag of options to control resource's behavior.
- name String
- The unique name of the resource.
- args CrawlerArgs
- The arguments to resource properties.
- options CustomResourceOptions
- Bag of options to control resource's behavior.
Crawler Resource Properties
To learn more about resource properties and how to use them, see Inputs and Outputs in the Architecture and Concepts docs.
Inputs
In Python, inputs that are objects can be passed either as argument classes or as dictionary literals.
The Crawler resource accepts the following input properties:
- Role string
- The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
- Targets Pulumi.AwsNative.Glue.Inputs.CrawlerTargets
- A collection of targets to crawl.
- Classifiers List<string>
- A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
- Configuration string
- Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
- CrawlerSecurityConfiguration string
- The name of the SecurityConfiguration structure to be used by this crawler.
- DatabaseName string
- The name of the database in which the crawler's output is stored.
- Description string
- A description of the crawler.
- LakeFormationConfiguration Pulumi.AwsNative.Glue.Inputs.CrawlerLakeFormationConfiguration
- Specifies whether the crawler should use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- Name string
- The name of the crawler.
- RecrawlPolicy Pulumi.AwsNative.Glue.Inputs.CrawlerRecrawlPolicy
- A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
- Schedule Pulumi.AwsNative.Glue.Inputs.CrawlerSchedule
- For scheduled crawlers, the schedule when the crawler runs.
- SchemaChangePolicy Pulumi.AwsNative.Glue.Inputs.CrawlerSchemaChangePolicy
- The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler. The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.
- TablePrefix string
- The prefix added to the names of tables that are created.
- Tags object
- The tags to use with this crawler. Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.
- Role string
- The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
- Targets CrawlerTargetsArgs
- A collection of targets to crawl.
- Classifiers []string
- A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
- Configuration string
- Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
- CrawlerSecurityConfiguration string
- The name of the SecurityConfiguration structure to be used by this crawler.
- DatabaseName string
- The name of the database in which the crawler's output is stored.
- Description string
- A description of the crawler.
- LakeFormationConfiguration CrawlerLakeFormationConfigurationArgs
- Specifies whether the crawler should use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- Name string
- The name of the crawler.
- RecrawlPolicy CrawlerRecrawlPolicyArgs
- A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
- Schedule CrawlerScheduleArgs
- For scheduled crawlers, the schedule when the crawler runs.
- SchemaChangePolicy CrawlerSchemaChangePolicyArgs
- The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler. The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.
- TablePrefix string
- The prefix added to the names of tables that are created.
- Tags interface{}
- The tags to use with this crawler. Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.
- role String
- The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
- targets CrawlerTargets
- A collection of targets to crawl.
- classifiers List<String>
- A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
- configuration String
- Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
- crawlerSecurityConfiguration String
- The name of the SecurityConfiguration structure to be used by this crawler.
- databaseName String
- The name of the database in which the crawler's output is stored.
- description String
- A description of the crawler.
- lakeFormationConfiguration CrawlerLakeFormationConfiguration
- Specifies whether the crawler should use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- name String
- The name of the crawler.
- recrawlPolicy CrawlerRecrawlPolicy
- A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
- schedule CrawlerSchedule
- For scheduled crawlers, the schedule when the crawler runs.
- schemaChangePolicy CrawlerSchemaChangePolicy
- The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler. The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.
- tablePrefix String
- The prefix added to the names of tables that are created.
- tags Object
- The tags to use with this crawler. Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.
- role string
- The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
- targets CrawlerTargets
- A collection of targets to crawl.
- classifiers string[]
- A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
- configuration string
- Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
- crawlerSecurityConfiguration string
- The name of the SecurityConfiguration structure to be used by this crawler.
- databaseName string
- The name of the database in which the crawler's output is stored.
- description string
- A description of the crawler.
- lakeFormationConfiguration CrawlerLakeFormationConfiguration
- Specifies whether the crawler should use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- name string
- The name of the crawler.
- recrawlPolicy CrawlerRecrawlPolicy
- A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
- schedule CrawlerSchedule
- For scheduled crawlers, the schedule when the crawler runs.
- schemaChangePolicy CrawlerSchemaChangePolicy
- The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler. The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.
- tablePrefix string
- The prefix added to the names of tables that are created.
- tags any
- The tags to use with this crawler. Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.
- role str
- The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
- targets CrawlerTargetsArgs
- A collection of targets to crawl.
- classifiers Sequence[str]
- A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
- configuration str
- Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
- crawler_security_configuration str
- The name of the SecurityConfiguration structure to be used by this crawler.
- database_name str
- The name of the database in which the crawler's output is stored.
- description str
- A description of the crawler.
- lake_formation_configuration CrawlerLakeFormationConfigurationArgs
- Specifies whether the crawler should use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- name str
- The name of the crawler.
- recrawl_policy CrawlerRecrawlPolicyArgs
- A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
- schedule CrawlerScheduleArgs
- For scheduled crawlers, the schedule when the crawler runs.
- schema_change_policy CrawlerSchemaChangePolicyArgs
- The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler. The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.
- table_prefix str
- The prefix added to the names of tables that are created.
- tags Any
- The tags to use with this crawler. Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.
- role String
- The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
- targets Property Map
- A collection of targets to crawl.
- classifiers List<String>
- A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
- configuration String
- Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior.
- crawlerSecurityConfiguration String
- The name of the SecurityConfiguration structure to be used by this crawler.
- databaseName String
- The name of the database in which the crawler's output is stored.
- description String
- A description of the crawler.
- lakeFormationConfiguration Property Map
- Specifies whether the crawler should use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- name String
- The name of the crawler.
- recrawlPolicy Property Map
- A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
- schedule Property Map
- For scheduled crawlers, the schedule when the crawler runs.
- schemaChangePolicy Property Map
- The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer's database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler. The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.
- tablePrefix String
- The prefix added to the names of tables that are created.
- tags Any
- The tags to use with this crawler. Search the CloudFormation User Guide for AWS::Glue::Crawler for more information about the expected schema for this property.
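The Configuration input above is a plain versioned JSON string rather than a typed object. One readable way to build it is with `json.dumps`; the keys shown (Version, CrawlerOutput) follow the crawler-configuration format described in the AWS Glue documentation, but verify the exact keys for your use case:

```python
import json

# Build the crawler's versioned JSON configuration string.
# Keys follow the AWS Glue "Setting crawler configuration options"
# format; verify them against the current Glue documentation.
crawler_configuration = json.dumps({
    "Version": 1.0,
    "CrawlerOutput": {
        "Partitions": {"AddOrUpdateBehavior": "InheritFromTable"},
    },
})

# The resulting string is what you pass as the `configuration` input.
print(crawler_configuration)
```

Building the string from a dict avoids hand-escaping quotes and guarantees the value parses as valid JSON.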
Outputs
All input properties are implicitly available as output properties. Additionally, the Crawler resource produces the following output properties:
- Id string
- The provider-assigned unique ID for this managed resource.
- Id string
- The provider-assigned unique ID for this managed resource.
- id String
- The provider-assigned unique ID for this managed resource.
- id string
- The provider-assigned unique ID for this managed resource.
- id str
- The provider-assigned unique ID for this managed resource.
- id String
- The provider-assigned unique ID for this managed resource.
Supporting Types
CrawlerCatalogTarget, CrawlerCatalogTargetArgs
- ConnectionName string
- The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
- DatabaseName string
- The name of the database to be synchronized.
- DlqEventQueueArn string
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- EventQueueArn string
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- Tables List<string>
- A list of the tables to be synchronized.
- ConnectionName string
- The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
- DatabaseName string
- The name of the database to be synchronized.
- DlqEventQueueArn string
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- EventQueueArn string
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- Tables []string
- A list of the tables to be synchronized.
- connectionName String
- The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
- databaseName String
- The name of the database to be synchronized.
- dlqEventQueueArn String
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- eventQueueArn String
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- tables List<String>
- A list of the tables to be synchronized.
- connectionName string
- The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
- databaseName string
- The name of the database to be synchronized.
- dlqEventQueueArn string
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- eventQueueArn string
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- tables string[]
- A list of the tables to be synchronized.
- connection_name str
- The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
- database_name str
- The name of the database to be synchronized.
- dlq_event_queue_arn str
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- event_queue_arn str
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- tables Sequence[str]
- A list of the tables to be synchronized.
- connectionName String
- The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.
- databaseName String
- The name of the database to be synchronized.
- dlqEventQueueArn String
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- eventQueueArn String
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- tables List<String>
- A list of the tables to be synchronized.
CrawlerDeltaTarget, CrawlerDeltaTargetArgs
- ConnectionName string
- The name of the connection to use to connect to the Delta table target.
- CreateNativeDeltaTable bool
- Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
- DeltaTables List<string>
- A list of the Amazon S3 paths to the Delta tables.
- WriteManifest bool
- Specifies whether to write the manifest files to the Delta table path.
- ConnectionName string
- The name of the connection to use to connect to the Delta table target.
- CreateNativeDeltaTable bool
- Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
- DeltaTables []string
- A list of the Amazon S3 paths to the Delta tables.
- WriteManifest bool
- Specifies whether to write the manifest files to the Delta table path.
- connectionName String
- The name of the connection to use to connect to the Delta table target.
- createNativeDeltaTable Boolean
- Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
- deltaTables List<String>
- A list of the Amazon S3 paths to the Delta tables.
- writeManifest Boolean
- Specifies whether to write the manifest files to the Delta table path.
- connectionName string
- The name of the connection to use to connect to the Delta table target.
- createNativeDeltaTable boolean
- Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
- deltaTables string[]
- A list of the Amazon S3 paths to the Delta tables.
- writeManifest boolean
- Specifies whether to write the manifest files to the Delta table path.
- connection_name str
- The name of the connection to use to connect to the Delta table target.
- create_native_delta_table bool
- Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
- delta_tables Sequence[str]
- A list of the Amazon S3 paths to the Delta tables.
- write_manifest bool
- Specifies whether to write the manifest files to the Delta table path.
- connectionName String
- The name of the connection to use to connect to the Delta table target.
- createNativeDeltaTable Boolean
- Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.
- deltaTables List<String>
- A list of the Amazon S3 paths to the Delta tables.
- writeManifest Boolean
- Specifies whether to write the manifest files to the Delta table path.
CrawlerDynamoDbTarget, CrawlerDynamoDbTargetArgs
- Path string
- The name of the DynamoDB table to crawl.
- Path string
- The name of the DynamoDB table to crawl.
- path String
- The name of the DynamoDB table to crawl.
- path string
- The name of the DynamoDB table to crawl.
- path str
- The name of the DynamoDB table to crawl.
- path String
- The name of the DynamoDB table to crawl.
CrawlerIcebergTarget, CrawlerIcebergTargetArgs
- ConnectionName string
- The name of the connection to use to connect to the Iceberg target.
- Exclusions List<string>
- A list of glob patterns used to exclude from the crawl.
- MaximumTraversalDepth int
- The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
- Paths List<string>
- One or more Amazon S3 paths that contain Iceberg metadata folders as s3://bucket/prefix.
- ConnectionName string
- The name of the connection to use to connect to the Iceberg target.
- Exclusions []string
- A list of glob patterns used to exclude from the crawl.
- MaximumTraversalDepth int
- The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
- Paths []string
- One or more Amazon S3 paths that contain Iceberg metadata folders as s3://bucket/prefix.
- connectionName String
- The name of the connection to use to connect to the Iceberg target.
- exclusions List<String>
- A list of glob patterns used to exclude from the crawl.
- maximumTraversalDepth Integer
- The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
- paths List<String>
- One or more Amazon S3 paths that contain Iceberg metadata folders as s3://bucket/prefix.
- connectionName string
- The name of the connection to use to connect to the Iceberg target.
- exclusions string[]
- A list of glob patterns used to exclude from the crawl.
- maximumTraversalDepth number
- The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
- paths string[]
- One or more Amazon S3 paths that contain Iceberg metadata folders as s3://bucket/prefix.
- connection_name str
- The name of the connection to use to connect to the Iceberg target.
- exclusions Sequence[str]
- A list of glob patterns used to exclude from the crawl.
- maximum_traversal_depth int
- The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
- paths Sequence[str]
- One or more Amazon S3 paths that contain Iceberg metadata folders as s3://bucket/prefix.
- connectionName String
- The name of the connection to use to connect to the Iceberg target.
- exclusions List<String>
- A list of glob patterns used to exclude from the crawl.
- maximumTraversalDepth Number
- The maximum depth of Amazon S3 paths that the crawler can traverse to discover the Iceberg metadata folder in your Amazon S3 path. Used to limit the crawler run time.
- paths List<String>
- One or more Amazon S3 paths that contain Iceberg metadata folders as s3://bucket/prefix.
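MaximumTraversalDepth bounds how many folder levels below each configured path the crawler will search for Iceberg metadata. As an illustrative sketch only (not the provider's or the crawler's own logic), the depth of a candidate key relative to a configured s3://bucket/prefix path can be computed like this:

```python
from urllib.parse import urlparse

def depth_below_path(s3_path: str, key: str) -> int:
    """Number of folder levels `key` sits below the configured S3 path.

    Illustration only: shows how a traversal-depth limit can be checked,
    not how the Glue crawler implements it internally.
    """
    parsed = urlparse(s3_path)       # e.g. s3://my-bucket/warehouse
    prefix = parsed.path.strip("/")  # "warehouse"
    relative = key.strip("/")
    if prefix:
        if not relative.startswith(prefix + "/"):
            raise ValueError("key is not under the configured path")
        relative = relative[len(prefix) + 1:]
    # Each "/" remaining in the key is one level deeper.
    return relative.count("/")

# A crawler with maximumTraversalDepth of 2 would still reach this folder:
print(depth_below_path("s3://my-bucket/warehouse", "warehouse/db/table/metadata"))  # 2
```

The bucket and key names above are hypothetical; the point is that a lower depth limit skips deeply nested prefixes and shortens the crawler run.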
CrawlerJdbcTarget, CrawlerJdbcTargetArgs
- ConnectionName string
- The name of the connection to use to connect to the JDBC target.
- EnableAdditionalMetadata List<string>
- Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database. If you do not need additional metadata, keep the field empty.
- Exclusions List<string>
- A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
- Path string
- The path of the JDBC target.
- ConnectionName string
- The name of the connection to use to connect to the JDBC target.
- EnableAdditionalMetadata []string
- Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database. If you do not need additional metadata, keep the field empty.
- Exclusions []string
- A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
- Path string
- The path of the JDBC target.
- connectionName String
- The name of the connection to use to connect to the JDBC target.
- enableAdditionalMetadata List<String>
- Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database. If you do not need additional metadata, keep the field empty.
- exclusions List<String>
- A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
- path String
- The path of the JDBC target.
- connectionName string
- The name of the connection to use to connect to the JDBC target.
- enableAdditionalMetadata string[]
- Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database. If you do not need additional metadata, keep the field empty.
- exclusions string[]
- A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
- path string
- The path of the JDBC target.
- connection_name str
- The name of the connection to use to connect to the JDBC target.
- enable_additional_metadata Sequence[str]
- Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database. If you do not need additional metadata, keep the field empty.
- exclusions Sequence[str]
- A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
- path str
- The path of the JDBC target.
- connectionName String
- The name of the connection to use to connect to the JDBC target.
- enableAdditionalMetadata List<String>
- Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database. If you do not need additional metadata, keep the field empty.
- exclusions List<String>
- A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
- path String
- The path of the JDBC target.
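Exclusion patterns are glob expressions matched against paths discovered during the crawl. As a rough sketch of how such filtering behaves (Glue's own glob dialect has additional features such as `**` and brace expansion, so this fnmatch-based version is only an approximation):

```python
from fnmatch import fnmatch

def is_excluded(path: str, exclusions: list[str]) -> bool:
    """Return True if `path` matches any exclusion glob.

    Approximation only: Python's fnmatch lets `*` match across `/`,
    whereas Glue's glob dialect reserves that for `**`. This illustrates
    exclusion filtering, not Glue's actual matcher.
    """
    return any(fnmatch(path, pattern) for pattern in exclusions)

exclusions = ["*.tmp", "staging/*"]              # hypothetical patterns
print(is_excluded("staging/orders.csv", exclusions))  # True
print(is_excluded("prod/orders.csv", exclusions))     # False
```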
CrawlerLakeFormationConfiguration, CrawlerLakeFormationConfigurationArgs
- AccountId string
- Required for cross account crawls. For same account crawls as the target data, this can be left as null.
- UseLakeFormationCredentials bool
- Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- AccountId string
- Required for cross account crawls. For same account crawls as the target data, this can be left as null.
- UseLakeFormationCredentials bool
- Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- accountId String
- Required for cross account crawls. For same account crawls as the target data, this can be left as null.
- useLakeFormationCredentials Boolean
- Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- accountId string
- Required for cross account crawls. For same account crawls as the target data, this can be left as null.
- useLakeFormationCredentials boolean
- Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- account_id str
- Required for cross account crawls. For same account crawls as the target data, this can be left as null.
- use_lake_formation_credentials bool
- Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- accountId String
- Required for cross account crawls. For same account crawls as the target data, this can be left as null.
- useLakeFormationCredentials Boolean
- Specifies whether to use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
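In Python, this configuration can be passed as a plain dictionary literal instead of the args class (per the note under Inputs that object inputs accept dictionaries). The account ID below is a hypothetical placeholder:

```python
# Dictionary-literal form of CrawlerLakeFormationConfigurationArgs.
lake_formation_configuration = {
    "use_lake_formation_credentials": True,
    # Only required for cross-account crawls; omit it (leave null)
    # when crawling data in the same account.
    "account_id": "123456789012",  # hypothetical account ID
}
print(lake_formation_configuration["use_lake_formation_credentials"])  # True
```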
CrawlerMongoDbTarget, CrawlerMongoDbTargetArgs
- ConnectionName string
- The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
- Path string
- The path of the Amazon DocumentDB or MongoDB target (database/collection).
- ConnectionName string
- The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
- Path string
- The path of the Amazon DocumentDB or MongoDB target (database/collection).
- connectionName String
- The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
- path String
- The path of the Amazon DocumentDB or MongoDB target (database/collection).
- connectionName string
- The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
- path string
- The path of the Amazon DocumentDB or MongoDB target (database/collection).
- connection_name str
- The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
- path str
- The path of the Amazon DocumentDB or MongoDB target (database/collection).
- connectionName String
- The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.
- path String
- The path of the Amazon DocumentDB or MongoDB target (database/collection).
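For instance, a MongoDB target entry in the Python keyword shape might look like the following sketch (the connection name stands in for an existing AWS Glue connection; the path follows the database/collection convention described above):

```python
# A DocumentDB/MongoDB target entry; the path encodes "database/collection".
mongo_db_target = {
    "connection_name": "my-docdb-connection",  # hypothetical Glue connection
    "path": "sales/orders",                    # database "sales", collection "orders"
}

# The two path components can be recovered by splitting on "/".
database, collection = mongo_db_target["path"].split("/")
```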
CrawlerRecrawlPolicy, CrawlerRecrawlPolicyArgs
- RecrawlBehavior string
- Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
- RecrawlBehavior string
- Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
- recrawlBehavior String
- Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
- recrawlBehavior string
- Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
- recrawl_behavior str
- Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
- recrawlBehavior String
- Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run. A value of CRAWL_EVERYTHING specifies crawling the entire dataset again. A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run. A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.
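The three behaviors above are plain string values, so a typo only surfaces at deploy time. A small guard (a sketch, not part of the SDK) can catch it earlier:

```python
# The three recrawl behaviors documented above.
VALID_RECRAWL_BEHAVIORS = {
    "CRAWL_EVERYTHING",        # re-crawl the entire dataset
    "CRAWL_NEW_FOLDERS_ONLY",  # only folders added since the last run
    "CRAWL_EVENT_MODE",        # only changes identified by Amazon S3 events
}

def recrawl_policy(behavior: str) -> dict:
    # Fail fast on an unknown value instead of at deploy time.
    if behavior not in VALID_RECRAWL_BEHAVIORS:
        raise ValueError(f"unsupported recrawl behavior: {behavior!r}")
    return {"recrawl_behavior": behavior}
```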
CrawlerS3Target, CrawlerS3TargetArgs
- ConnectionName string
- The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
- DlqEventQueueArn string
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- EventQueueArn string
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- Exclusions List<string>
- A list of glob patterns used to exclude from the crawl.
- Path string
- The path to the Amazon S3 target.
- SampleSize int
- Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
- ConnectionName string
- The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
- DlqEventQueueArn string
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- EventQueueArn string
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- Exclusions []string
- A list of glob patterns used to exclude from the crawl.
- Path string
- The path to the Amazon S3 target.
- SampleSize int
- Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
- connectionName String
- The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
- dlqEventQueueArn String
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- eventQueueArn String
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- exclusions List<String>
- A list of glob patterns used to exclude from the crawl.
- path String
- The path to the Amazon S3 target.
- sampleSize Integer
- Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
- connectionName string
- The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
- dlqEventQueueArn string
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- eventQueueArn string
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- exclusions string[]
- A list of glob patterns used to exclude from the crawl.
- path string
- The path to the Amazon S3 target.
- sampleSize number
- Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
- connection_name str
- The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
- dlq_event_queue_arn str
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- event_queue_arn str
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- exclusions Sequence[str]
- A list of glob patterns used to exclude from the crawl.
- path str
- The path to the Amazon S3 target.
- sample_size int
- Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
- connectionName String
- The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).
- dlqEventQueueArn String
- A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.
- eventQueueArn String
- A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.
- exclusions List<String>
- A list of glob patterns used to exclude from the crawl.
- path String
- The path to the Amazon S3 target.
- sampleSize Number
- Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
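As a sketch, an S3 target entry in the Python keyword shape, with the documented 1-249 bound on sample_size enforced up front (the helper, bucket name, and glob patterns are placeholders):

```python
from typing import Optional, Sequence

def s3_target(path: str,
              exclusions: Optional[Sequence[str]] = None,
              sample_size: Optional[int] = None) -> dict:
    """Build the dict form of an S3 target as documented above."""
    target = {"path": path}
    if exclusions:
        target["exclusions"] = list(exclusions)
    if sample_size is not None:
        # Documented valid range: an integer between 1 and 249.
        if not 1 <= sample_size <= 249:
            raise ValueError("sample_size must be an integer between 1 and 249")
        target["sample_size"] = sample_size
    return target

target = s3_target(
    "s3://my-example-bucket/raw/",          # placeholder bucket
    exclusions=["**/_tmp/**", "**.json"],   # glob patterns to skip
    sample_size=10,
)
```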
CrawlerSchedule, CrawlerScheduleArgs
- ScheduleExpression string
- A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
- ScheduleExpression string
- A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
- scheduleExpression String
- A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
- scheduleExpression string
- A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
- schedule_expression str
- A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
- scheduleExpression String
- A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, specify cron(15 12 * * ? *).
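The cron(15 12 * * ? *) example above uses the AWS six-field cron form (Minutes Hours Day-of-month Month Day-of-week Year). A sketch helper (the function name is made up) that reproduces the documented daily 12:15 UTC schedule:

```python
def daily_schedule(hour: int, minute: int) -> dict:
    # AWS cron fields: Minutes Hours Day-of-month Month Day-of-week Year.
    # "?" is used for day-of-week because day-of-month is already "*".
    return {"schedule_expression": f"cron({minute} {hour} * * ? *)"}

schedule = daily_schedule(12, 15)
```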
CrawlerSchemaChangePolicy, CrawlerSchemaChangePolicyArgs
- DeleteBehavior string
- The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it, only log that it was found to no longer exist. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table has been found to no longer exist, to add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
- UpdateBehavior string
- The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
- DeleteBehavior string
- The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it, only log that it was found to no longer exist. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table has been found to no longer exist, to add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
- UpdateBehavior string
- The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
- deleteBehavior String
- The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it, only log that it was found to no longer exist. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table has been found to no longer exist, to add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
- updateBehavior String
- The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
- deleteBehavior string
- The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it, only log that it was found to no longer exist. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table has been found to no longer exist, to add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
- updateBehavior string
- The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
- delete_behavior str
- The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it, only log that it was found to no longer exist. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table has been found to no longer exist, to add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
- update_behavior str
- The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
- deleteBehavior String
- The deletion behavior when the crawler finds a deleted object. A value of LOG specifies that if a table or partition is found to no longer exist, do not delete it, only log that it was found to no longer exist. A value of DELETE_FROM_DATABASE specifies that if a table or partition is found to have been removed, delete it from the database. A value of DEPRECATE_IN_DATABASE specifies that if a table has been found to no longer exist, to add a property to the table that says 'DEPRECATED' and includes a timestamp with the time of deprecation.
- updateBehavior String
- The update behavior when the crawler finds a changed schema. A value of LOG specifies that if a table or a partition already exists, and a change is detected, do not update it, only log that a change was detected. Add new tables and new partitions (including on existing tables). A value of UPDATE_IN_DATABASE specifies that if a table or partition already exists, and a change is detected, update it. Add new tables and partitions.
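The two behavior fields take independent string values. A sketch guard over the documented combinations (the helper is illustrative, not part of the SDK):

```python
# Documented values for each behavior field above.
DELETE_BEHAVIORS = {"LOG", "DELETE_FROM_DATABASE", "DEPRECATE_IN_DATABASE"}
UPDATE_BEHAVIORS = {"LOG", "UPDATE_IN_DATABASE"}

def schema_change_policy(delete_behavior: str, update_behavior: str) -> dict:
    if delete_behavior not in DELETE_BEHAVIORS:
        raise ValueError(f"unsupported delete behavior: {delete_behavior!r}")
    if update_behavior not in UPDATE_BEHAVIORS:
        raise ValueError(f"unsupported update behavior: {update_behavior!r}")
    return {
        "delete_behavior": delete_behavior,
        "update_behavior": update_behavior,
    }

# A conservative policy: never delete or mutate existing tables, only log.
log_only = schema_change_policy("LOG", "LOG")
```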
CrawlerTargets, CrawlerTargetsArgs
- CatalogTargets List<Pulumi.AwsNative.Glue.Inputs.CrawlerCatalogTarget>
- Specifies AWS Glue Data Catalog targets.
- DeltaTargets List<Pulumi.AwsNative.Glue.Inputs.CrawlerDeltaTarget>
- Specifies an array of Delta data store targets.
- DynamoDbTargets List<Pulumi.AwsNative.Glue.Inputs.CrawlerDynamoDbTarget>
- Specifies Amazon DynamoDB targets.
- IcebergTargets List<Pulumi.AwsNative.Glue.Inputs.CrawlerIcebergTarget>
- Specifies Apache Iceberg data store targets.
- JdbcTargets List<Pulumi.AwsNative.Glue.Inputs.CrawlerJdbcTarget>
- Specifies JDBC targets.
- MongoDbTargets List<Pulumi.AwsNative.Glue.Inputs.CrawlerMongoDbTarget>
- A list of Mongo DB targets.
- S3Targets List<Pulumi.AwsNative.Glue.Inputs.CrawlerS3Target>
- Specifies Amazon Simple Storage Service (Amazon S3) targets.
- CatalogTargets []CrawlerCatalogTarget
- Specifies AWS Glue Data Catalog targets.
- DeltaTargets []CrawlerDeltaTarget
- Specifies an array of Delta data store targets.
- DynamoDbTargets []CrawlerDynamoDbTarget
- Specifies Amazon DynamoDB targets.
- IcebergTargets []CrawlerIcebergTarget
- Specifies Apache Iceberg data store targets.
- JdbcTargets []CrawlerJdbcTarget
- Specifies JDBC targets.
- MongoDbTargets []CrawlerMongoDbTarget
- A list of Mongo DB targets.
- S3Targets []CrawlerS3Target
- Specifies Amazon Simple Storage Service (Amazon S3) targets.
- catalogTargets List<CrawlerCatalogTarget>
- Specifies AWS Glue Data Catalog targets.
- deltaTargets List<CrawlerDeltaTarget>
- Specifies an array of Delta data store targets.
- dynamoDbTargets List<CrawlerDynamoDbTarget>
- Specifies Amazon DynamoDB targets.
- icebergTargets List<CrawlerIcebergTarget>
- Specifies Apache Iceberg data store targets.
- jdbcTargets List<CrawlerJdbcTarget>
- Specifies JDBC targets.
- mongoDbTargets List<CrawlerMongoDbTarget>
- A list of Mongo DB targets.
- s3Targets List<CrawlerS3Target>
- Specifies Amazon Simple Storage Service (Amazon S3) targets.
- catalogTargets CrawlerCatalogTarget[]
- Specifies AWS Glue Data Catalog targets.
- deltaTargets CrawlerDeltaTarget[]
- Specifies an array of Delta data store targets.
- dynamoDbTargets CrawlerDynamoDbTarget[]
- Specifies Amazon DynamoDB targets.
- icebergTargets CrawlerIcebergTarget[]
- Specifies Apache Iceberg data store targets.
- jdbcTargets CrawlerJdbcTarget[]
- Specifies JDBC targets.
- mongoDbTargets CrawlerMongoDbTarget[]
- A list of Mongo DB targets.
- s3Targets CrawlerS3Target[]
- Specifies Amazon Simple Storage Service (Amazon S3) targets.
- catalog_targets Sequence[CrawlerCatalogTarget]
- Specifies AWS Glue Data Catalog targets.
- delta_targets Sequence[CrawlerDeltaTarget]
- Specifies an array of Delta data store targets.
- dynamo_db_targets Sequence[CrawlerDynamoDbTarget]
- Specifies Amazon DynamoDB targets.
- iceberg_targets Sequence[CrawlerIcebergTarget]
- Specifies Apache Iceberg data store targets.
- jdbc_targets Sequence[CrawlerJdbcTarget]
- Specifies JDBC targets.
- mongo_db_targets Sequence[CrawlerMongoDbTarget]
- A list of Mongo DB targets.
- s3_targets Sequence[CrawlerS3Target]
- Specifies Amazon Simple Storage Service (Amazon S3) targets.
- catalogTargets List<Property Map>
- Specifies AWS Glue Data Catalog targets.
- deltaTargets List<Property Map>
- Specifies an array of Delta data store targets.
- dynamoDbTargets List<Property Map>
- Specifies Amazon DynamoDB targets.
- icebergTargets List<Property Map>
- Specifies Apache Iceberg data store targets.
- jdbcTargets List<Property Map>
- Specifies JDBC targets.
- mongoDbTargets List<Property Map>
- A list of Mongo DB targets.
- s3Targets List<Property Map>
- Specifies Amazon Simple Storage Service (Amazon S3) targets.
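Putting the target types together, a sketch of the targets argument in the Python keyword shape, mixing an S3 target and a DynamoDB target (the bucket and table names are placeholders; in a real program this would be passed as the targets argument of the Crawler resource, or built with CrawlerTargetsArgs):

```python
# Only the target types actually used need to be present.
targets = {
    "s3_targets": [
        {"path": "s3://my-example-bucket/raw/"},  # placeholder bucket
    ],
    "dynamo_db_targets": [
        {"path": "orders-table"},                 # placeholder table name
    ],
}
```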
Package Details
- Repository
- AWS Native pulumi/pulumi-aws-native
- License
- Apache-2.0