AWS Cloud Control v1.9.0, Nov 18 24

We recommend new projects start with resources from the AWS provider.

AWS Cloud Control v1.9.0 published on Monday, Nov 18, 2024 by Pulumi

pulumi/pulumi-aws-native

aws-native.sagemaker.InferenceComponent

Create InferenceComponent Resource

Resources are created with functions called constructors. To learn more about declaring and configuring resources, see Resources.

Constructor syntax

new InferenceComponent(name: string, args: InferenceComponentArgs, opts?: CustomResourceOptions);

@overload
def InferenceComponent(resource_name: str,
                       args: InferenceComponentArgs,
                       opts: Optional[ResourceOptions] = None)

@overload
def InferenceComponent(resource_name: str,
                       opts: Optional[ResourceOptions] = None,
                       endpoint_name: Optional[str] = None,
                       runtime_config: Optional[InferenceComponentRuntimeConfigArgs] = None,
                       specification: Optional[InferenceComponentSpecificationArgs] = None,
                       variant_name: Optional[str] = None,
                       endpoint_arn: Optional[str] = None,
                       inference_component_name: Optional[str] = None,
                       tags: Optional[Sequence[_root_inputs.TagArgs]] = None)

func NewInferenceComponent(ctx *Context, name string, args InferenceComponentArgs, opts ...ResourceOption) (*InferenceComponent, error)

public InferenceComponent(string name, InferenceComponentArgs args, CustomResourceOptions? opts = null)

public InferenceComponent(String name, InferenceComponentArgs args)
public InferenceComponent(String name, InferenceComponentArgs args, CustomResourceOptions options)

type: aws-native:sagemaker:InferenceComponent
properties: # The arguments to resource properties.
options: # Bag of options to control resource's behavior.

Parameters

name string: The unique name of the resource.
args InferenceComponentArgs: The arguments to resource properties.
opts CustomResourceOptions: Bag of options to control resource's behavior.

resource_name str: The unique name of the resource.
args InferenceComponentArgs: The arguments to resource properties.
opts ResourceOptions: Bag of options to control resource's behavior.

ctx Context: Context object for the current deployment.
name string: The unique name of the resource.
args InferenceComponentArgs: The arguments to resource properties.
opts ResourceOption: Bag of options to control resource's behavior.

name string: The unique name of the resource.
args InferenceComponentArgs: The arguments to resource properties.
opts CustomResourceOptions: Bag of options to control resource's behavior.

name String: The unique name of the resource.
args InferenceComponentArgs: The arguments to resource properties.
options CustomResourceOptions: Bag of options to control resource's behavior.

InferenceComponent Resource Properties

To learn more about resource properties and how to use them, see Inputs and Outputs in the Architecture and Concepts docs.

Inputs

In Python, inputs that are objects can be passed either as argument classes or as dictionary literals.

The InferenceComponent resource accepts the following input properties:

EndpointName string: The name of the endpoint that hosts the inference component.
RuntimeConfig Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentRuntimeConfig
Specification Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentSpecification
VariantName string: The name of the production variant that hosts the inference component.
EndpointArn string: The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
InferenceComponentName string: The name of the inference component.
Tags List<Pulumi.AwsNative.Inputs.Tag>

EndpointName string: The name of the endpoint that hosts the inference component.
RuntimeConfig InferenceComponentRuntimeConfigArgs
Specification InferenceComponentSpecificationArgs
VariantName string: The name of the production variant that hosts the inference component.
EndpointArn string: The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
InferenceComponentName string: The name of the inference component.
Tags []TagArgs

endpointName String: The name of the endpoint that hosts the inference component.
runtimeConfig InferenceComponentRuntimeConfig
specification InferenceComponentSpecification
variantName String: The name of the production variant that hosts the inference component.
endpointArn String: The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
inferenceComponentName String: The name of the inference component.
tags List<Tag>

endpointName string: The name of the endpoint that hosts the inference component.
runtimeConfig InferenceComponentRuntimeConfig
specification InferenceComponentSpecification
variantName string: The name of the production variant that hosts the inference component.
endpointArn string: The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
inferenceComponentName string: The name of the inference component.
tags Tag[]

endpoint_name str: The name of the endpoint that hosts the inference component.
runtime_config InferenceComponentRuntimeConfigArgs
specification InferenceComponentSpecificationArgs
variant_name str: The name of the production variant that hosts the inference component.
endpoint_arn str: The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
inference_component_name str: The name of the inference component.
tags Sequence[TagArgs]

endpointName String: The name of the endpoint that hosts the inference component.
runtimeConfig Property Map
specification Property Map
variantName String: The name of the production variant that hosts the inference component.
endpointArn String: The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
inferenceComponentName String: The name of the inference component.
tags List<Property Map>
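
To make the input shapes concrete, the following is a minimal Python sketch that assembles the inputs listed above as dictionary literals, which the Python SDK accepts in place of argument classes. The helper build_inference_component_args, the endpoint, variant, and model names, and the numeric values are all hypothetical. The sketch only builds plain dictionaries, so it runs without the Pulumi SDK installed.

```python
from typing import Any, Dict


def build_inference_component_args(
    endpoint_name: str,
    variant_name: str,
    model_name: str,
    copy_count: int = 1,
) -> Dict[str, Any]:
    """Return InferenceComponent keyword arguments as plain dictionaries."""
    return {
        "endpoint_name": endpoint_name,
        "variant_name": variant_name,
        # Shape of InferenceComponentRuntimeConfigArgs.
        "runtime_config": {"copy_count": copy_count},
        # Shape of InferenceComponentSpecificationArgs.
        "specification": {
            "model_name": model_name,
            "compute_resource_requirements": {
                "min_memory_required_in_mb": 1024,    # illustrative value
                "number_of_cpu_cores_required": 1.0,  # illustrative value
            },
        },
    }


# All names below are placeholders, not values from the documentation.
example_args = build_inference_component_args(
    endpoint_name="my-endpoint",
    variant_name="AllTraffic",
    model_name="my-model",
)
```

In a Pulumi program, a dictionary like this would typically be unpacked into the constructor, for example pulumi_aws_native.sagemaker.InferenceComponent("my-component", **example_args).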

Outputs

All input properties are implicitly available as output properties. Additionally, the InferenceComponent resource produces the following output properties:

CreationTime string: The time when the inference component was created.
FailureReason string
Id string: The provider-assigned unique ID for this managed resource.
InferenceComponentArn string: The Amazon Resource Name (ARN) of the inference component.
InferenceComponentStatus Pulumi.AwsNative.SageMaker.InferenceComponentStatus: The status of the inference component.
LastModifiedTime string: The time when the inference component was last updated.

CreationTime string: The time when the inference component was created.
FailureReason string
Id string: The provider-assigned unique ID for this managed resource.
InferenceComponentArn string: The Amazon Resource Name (ARN) of the inference component.
InferenceComponentStatus InferenceComponentStatus: The status of the inference component.
LastModifiedTime string: The time when the inference component was last updated.

creationTime String: The time when the inference component was created.
failureReason String
id String: The provider-assigned unique ID for this managed resource.
inferenceComponentArn String: The Amazon Resource Name (ARN) of the inference component.
inferenceComponentStatus InferenceComponentStatus: The status of the inference component.
lastModifiedTime String: The time when the inference component was last updated.

creationTime string: The time when the inference component was created.
failureReason string
id string: The provider-assigned unique ID for this managed resource.
inferenceComponentArn string: The Amazon Resource Name (ARN) of the inference component.
inferenceComponentStatus InferenceComponentStatus: The status of the inference component.
lastModifiedTime string: The time when the inference component was last updated.

creation_time str: The time when the inference component was created.
failure_reason str
id str: The provider-assigned unique ID for this managed resource.
inference_component_arn str: The Amazon Resource Name (ARN) of the inference component.
inference_component_status InferenceComponentStatus: The status of the inference component.
last_modified_time str: The time when the inference component was last updated.

creationTime String: The time when the inference component was created.
failureReason String
id String: The provider-assigned unique ID for this managed resource.
inferenceComponentArn String: The Amazon Resource Name (ARN) of the inference component.
inferenceComponentStatus "InService" | "Creating" | "Updating" | "Failed" | "Deleting": The status of the inference component.
lastModifiedTime String: The time when the inference component was last updated.

Supporting Types

InferenceComponentComputeResourceRequirements, InferenceComponentComputeResourceRequirementsArgs

MaxMemoryRequiredInMb int: The maximum MB of memory to allocate to run a model that you assign to an inference component.
MinMemoryRequiredInMb int: The minimum MB of memory to allocate to run a model that you assign to an inference component.
NumberOfAcceleratorDevicesRequired double: The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
NumberOfCpuCoresRequired double: The number of CPU cores to allocate to run a model that you assign to an inference component.

MaxMemoryRequiredInMb int: The maximum MB of memory to allocate to run a model that you assign to an inference component.
MinMemoryRequiredInMb int: The minimum MB of memory to allocate to run a model that you assign to an inference component.
NumberOfAcceleratorDevicesRequired float64: The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
NumberOfCpuCoresRequired float64: The number of CPU cores to allocate to run a model that you assign to an inference component.

maxMemoryRequiredInMb Integer: The maximum MB of memory to allocate to run a model that you assign to an inference component.
minMemoryRequiredInMb Integer: The minimum MB of memory to allocate to run a model that you assign to an inference component.
numberOfAcceleratorDevicesRequired Double: The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
numberOfCpuCoresRequired Double: The number of CPU cores to allocate to run a model that you assign to an inference component.

maxMemoryRequiredInMb number: The maximum MB of memory to allocate to run a model that you assign to an inference component.
minMemoryRequiredInMb number: The minimum MB of memory to allocate to run a model that you assign to an inference component.
numberOfAcceleratorDevicesRequired number: The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
numberOfCpuCoresRequired number: The number of CPU cores to allocate to run a model that you assign to an inference component.

max_memory_required_in_mb int: The maximum MB of memory to allocate to run a model that you assign to an inference component.
min_memory_required_in_mb int: The minimum MB of memory to allocate to run a model that you assign to an inference component.
number_of_accelerator_devices_required float: The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
number_of_cpu_cores_required float: The number of CPU cores to allocate to run a model that you assign to an inference component.

maxMemoryRequiredInMb Number: The maximum MB of memory to allocate to run a model that you assign to an inference component.
minMemoryRequiredInMb Number: The minimum MB of memory to allocate to run a model that you assign to an inference component.
numberOfAcceleratorDevicesRequired Number: The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
numberOfCpuCoresRequired Number: The number of CPU cores to allocate to run a model that you assign to an inference component.
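
As a rough illustration of how these four fields fit together, the Python sketch below builds the dictionary form of the compute resource requirements. The helper name and the sanity checks (positive minimum memory, maximum not below minimum) are assumptions of this sketch, not rules stated by the provider.

```python
from typing import Dict, Optional, Union

Number = Union[int, float]


def compute_resource_requirements(
    min_memory_mb: int,
    max_memory_mb: Optional[int] = None,
    cpu_cores: Optional[float] = None,
    accelerators: Optional[float] = None,
) -> Dict[str, Number]:
    """Build the dictionary form of the compute resource requirements."""
    # Assumed sanity checks; the documentation above does not state them.
    if min_memory_mb <= 0:
        raise ValueError("min_memory_mb must be positive")
    if max_memory_mb is not None and max_memory_mb < min_memory_mb:
        raise ValueError("max_memory_mb must be >= min_memory_mb")

    requirements: Dict[str, Number] = {"min_memory_required_in_mb": min_memory_mb}
    if max_memory_mb is not None:
        requirements["max_memory_required_in_mb"] = max_memory_mb
    # Core and accelerator counts are floats in the Python SDK listing.
    if cpu_cores is not None:
        requirements["number_of_cpu_cores_required"] = float(cpu_cores)
    if accelerators is not None:
        requirements["number_of_accelerator_devices_required"] = float(accelerators)
    return requirements
```

Only the fields that are actually supplied appear in the result, so unset values are left out of the dictionary rather than being passed as None.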

InferenceComponentContainerSpecification, InferenceComponentContainerSpecificationArgs

ArtifactUrl string: The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
DeployedImage Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentDeployedImage
Environment Dictionary<string, string>: The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have a length of up to 1024. Up to 16 entries are supported in the map.
Image string: The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.

ArtifactUrl string: The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
DeployedImage InferenceComponentDeployedImage
Environment map[string]string: The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have a length of up to 1024. Up to 16 entries are supported in the map.
Image string: The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.

artifactUrl String: The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
deployedImage InferenceComponentDeployedImage
environment Map<String,String>: The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have a length of up to 1024. Up to 16 entries are supported in the map.
image String: The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.

artifactUrl string: The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
deployedImage InferenceComponentDeployedImage
environment {[key: string]: string}: The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have a length of up to 1024. Up to 16 entries are supported in the map.
image string: The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.

artifact_url str: The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
deployed_image InferenceComponentDeployedImage
environment Mapping[str, str]: The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have a length of up to 1024. Up to 16 entries are supported in the map.
image str: The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.

artifactUrl String: The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
deployedImage Property Map
environment Map<String>: The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have a length of up to 1024. Up to 16 entries are supported in the map.
image String: The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.
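
The limits described above (at most 16 environment entries, keys and values of at most 1024 in length, and an artifact path ending in .tar.gz) can be checked before deployment. The Python sketch below is one way to do that. The helper name is hypothetical, and measuring length with len() in characters is an assumption of this sketch.

```python
from typing import Any, Dict, Mapping, Optional

MAX_ENV_ENTRIES = 16          # limit stated in the Environment description
MAX_ENV_STRING_LENGTH = 1024  # limit stated in the Environment description


def container_specification(
    image: str,
    artifact_url: Optional[str] = None,
    environment: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Validate and build the dictionary form of the container specification."""
    env = dict(environment or {})
    if len(env) > MAX_ENV_ENTRIES:
        raise ValueError(f"environment has {len(env)} entries; at most {MAX_ENV_ENTRIES} allowed")
    for key, value in env.items():
        # Assumption: lengths are counted in characters.
        if len(key) > MAX_ENV_STRING_LENGTH or len(value) > MAX_ENV_STRING_LENGTH:
            raise ValueError(f"environment entry {key[:32]!r} exceeds {MAX_ENV_STRING_LENGTH}")
    if artifact_url is not None and not artifact_url.endswith(".tar.gz"):
        raise ValueError("artifact_url must point to a .tar.gz archive")

    spec: Dict[str, Any] = {"image": image}
    if artifact_url is not None:
        spec["artifact_url"] = artifact_url
    if env:
        spec["environment"] = env
    return spec
```

Failing early in the program gives a clearer error than waiting for the service to reject the request during deployment.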

InferenceComponentDeployedImage, InferenceComponentDeployedImageArgs

ResolutionTime string: The date and time when the image path for the model resolved to the ResolvedImage.
ResolvedImage string: The specific digest path of the image hosted in this ProductionVariant.
SpecifiedImage string: The image path you specified when you created the model.

ResolutionTime string: The date and time when the image path for the model resolved to the ResolvedImage.
ResolvedImage string: The specific digest path of the image hosted in this ProductionVariant.
SpecifiedImage string: The image path you specified when you created the model.

resolutionTime String: The date and time when the image path for the model resolved to the ResolvedImage.
resolvedImage String: The specific digest path of the image hosted in this ProductionVariant.
specifiedImage String: The image path you specified when you created the model.

resolutionTime string: The date and time when the image path for the model resolved to the ResolvedImage.
resolvedImage string: The specific digest path of the image hosted in this ProductionVariant.
specifiedImage string: The image path you specified when you created the model.

resolution_time str: The date and time when the image path for the model resolved to the ResolvedImage.
resolved_image str: The specific digest path of the image hosted in this ProductionVariant.
specified_image str: The image path you specified when you created the model.

resolutionTime String: The date and time when the image path for the model resolved to the ResolvedImage.
resolvedImage String: The specific digest path of the image hosted in this ProductionVariant.
specifiedImage String: The image path you specified when you created the model.

InferenceComponentRuntimeConfig, InferenceComponentRuntimeConfigArgs

CopyCount int: The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
CurrentCopyCount int
DesiredCopyCount int

CopyCount int: The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
CurrentCopyCount int
DesiredCopyCount int

copyCount Integer: The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
currentCopyCount Integer
desiredCopyCount Integer

copyCount number: The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
currentCopyCount number
desiredCopyCount number

copy_count int: The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
current_copy_count int
desired_copy_count int

copyCount Number: The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
currentCopyCount Number
desiredCopyCount Number

InferenceComponentSpecification, InferenceComponentSpecificationArgs

ComputeResourceRequirements Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentComputeResourceRequirements: The compute resources allocated to run the model assigned to the inference component.
Container Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentContainerSpecification: Defines a container that provides the runtime environment for a model that you deploy with an inference component.
ModelName string: The name of an existing SageMaker model object in your account that you want to deploy with the inference component.
StartupParameters Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentStartupParameters: Settings that take effect while the model container starts up.

ComputeResourceRequirements InferenceComponentComputeResourceRequirements: The compute resources allocated to run the model assigned to the inference component.
Container InferenceComponentContainerSpecification: Defines a container that provides the runtime environment for a model that you deploy with an inference component.
ModelName string: The name of an existing SageMaker model object in your account that you want to deploy with the inference component.
StartupParameters InferenceComponentStartupParameters: Settings that take effect while the model container starts up.

computeResourceRequirements InferenceComponentComputeResourceRequirements: The compute resources allocated to run the model assigned to the inference component.
container InferenceComponentContainerSpecification: Defines a container that provides the runtime environment for a model that you deploy with an inference component.
modelName String: The name of an existing SageMaker model object in your account that you want to deploy with the inference component.
startupParameters InferenceComponentStartupParameters: Settings that take effect while the model container starts up.

computeResourceRequirements InferenceComponentComputeResourceRequirements: The compute resources allocated to run the model assigned to the inference component.
container InferenceComponentContainerSpecification: Defines a container that provides the runtime environment for a model that you deploy with an inference component.
modelName string: The name of an existing SageMaker model object in your account that you want to deploy with the inference component.
startupParameters InferenceComponentStartupParameters: Settings that take effect while the model container starts up.

compute_resource_requirements InferenceComponentComputeResourceRequirements: The compute resources allocated to run the model assigned to the inference component.
container InferenceComponentContainerSpecification: Defines a container that provides the runtime environment for a model that you deploy with an inference component.
model_name str: The name of an existing SageMaker model object in your account that you want to deploy with the inference component.
startup_parameters InferenceComponentStartupParameters: Settings that take effect while the model container starts up.

computeResourceRequirements Property Map: The compute resources allocated to run the model assigned to the inference component.
container Property Map: Defines a container that provides the runtime environment for a model that you deploy with an inference component.
modelName String: The name of an existing SageMaker model object in your account that you want to deploy with the inference component.
startupParameters Property Map: Settings that take effect while the model container starts up.

InferenceComponentStartupParameters, InferenceComponentStartupParametersArgs

ContainerStartupHealthCheckTimeoutInSeconds int: The timeout value, in seconds, for your inference container to pass the health check by SageMaker Hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
ModelDataDownloadTimeoutInSeconds int: The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.

ContainerStartupHealthCheckTimeoutInSeconds int: The timeout value, in seconds, for your inference container to pass the health check by SageMaker Hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
ModelDataDownloadTimeoutInSeconds int: The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.

containerStartupHealthCheckTimeoutInSeconds Integer: The timeout value, in seconds, for your inference container to pass the health check by SageMaker Hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
modelDataDownloadTimeoutInSeconds Integer: The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.

containerStartupHealthCheckTimeoutInSeconds number: The timeout value, in seconds, for your inference container to pass the health check by SageMaker Hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
modelDataDownloadTimeoutInSeconds number: The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.

container_startup_health_check_timeout_in_seconds int: The timeout value, in seconds, for your inference container to pass the health check by SageMaker Hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
model_data_download_timeout_in_seconds int: The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.

containerStartupHealthCheckTimeoutInSeconds Number: The timeout value, in seconds, for your inference container to pass the health check by SageMaker Hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
modelDataDownloadTimeoutInSeconds Number: The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.
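
Both startup parameters are integer numbers of seconds. As a small illustration, the Python sketch below converts datetime.timedelta values into that form. The helper name and the requirement that each timeout be a positive whole number of seconds are assumptions of this sketch.

```python
from datetime import timedelta
from typing import Dict


def startup_parameters(
    health_check_timeout: timedelta,
    model_download_timeout: timedelta,
) -> Dict[str, int]:
    """Build the dictionary form of the startup parameters in whole seconds."""

    def whole_seconds(value: timedelta, label: str) -> int:
        seconds = value.total_seconds()
        # Assumed check: reject zero, negative, or fractional-second timeouts.
        if seconds <= 0 or seconds != int(seconds):
            raise ValueError(f"{label} must be a positive whole number of seconds")
        return int(seconds)

    return {
        "container_startup_health_check_timeout_in_seconds": whole_seconds(
            health_check_timeout, "health_check_timeout"
        ),
        "model_data_download_timeout_in_seconds": whole_seconds(
            model_download_timeout, "model_download_timeout"
        ),
    }
```

Expressing the timeouts as timedelta values keeps units explicit at the call site, for example startup_parameters(timedelta(minutes=10), timedelta(minutes=15)).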

InferenceComponentStatus, InferenceComponentStatusArgs

InService: InService
Creating: Creating
Updating: Updating
Failed: Failed
Deleting: Deleting

InferenceComponentStatusInService: InService
InferenceComponentStatusCreating: Creating
InferenceComponentStatusUpdating: Updating
InferenceComponentStatusFailed: Failed
InferenceComponentStatusDeleting: Deleting

InService: InService
Creating: Creating
Updating: Updating
Failed: Failed
Deleting: Deleting

InService: InService
Creating: Creating
Updating: Updating
Failed: Failed
Deleting: Deleting

IN_SERVICE: InService
CREATING: Creating
UPDATING: Updating
FAILED: Failed
DELETING: Deleting
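
The five status values can be mirrored locally when post-processing exported outputs. The Python sketch below defines a stand-alone enum with the same member names and values as the Python listing above. The class name ComponentStatus, the helper is_settled, and the grouping of Creating, Updating, and Deleting as in-progress states are assumptions of this sketch rather than provider behavior.

```python
from enum import Enum


class ComponentStatus(str, Enum):
    """Local mirror of the documented inference component status values."""

    IN_SERVICE = "InService"
    CREATING = "Creating"
    UPDATING = "Updating"
    FAILED = "Failed"
    DELETING = "Deleting"


# Assumed grouping: states that suggest an operation is still running.
IN_PROGRESS = frozenset(
    {ComponentStatus.CREATING, ComponentStatus.UPDATING, ComponentStatus.DELETING}
)


def is_settled(status: str) -> bool:
    """Return True when the status is not one of the in-progress states.

    Raises ValueError for a string that is not a documented status value.
    """
    return ComponentStatus(status) not in IN_PROGRESS
```

Parsing the raw string through the enum means an unexpected value fails loudly instead of being treated as settled.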

"InService": InService
"Creating": Creating
"Updating": Updating
"Failed": Failed
"Deleting": Deleting

Tag, TagArgs

Key string: The key name of the tag
Value string: The value of the tag

Key string: The key name of the tag
Value string: The value of the tag

key String: The key name of the tag
value String: The value of the tag

key string: The key name of the tag
value string: The value of the tag

key str: The key name of the tag
value str: The value of the tag

key String: The key name of the tag
value String: The value of the tag
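
Tags are passed as a list of key and value pairs rather than as a single map. As an illustration, the Python sketch below converts an ordinary dictionary into that list form. The helper name to_tag_list is hypothetical, and sorting by key is a choice of this sketch to keep the output order stable.

```python
from typing import Dict, List


def to_tag_list(tags: Dict[str, str]) -> List[Dict[str, str]]:
    """Convert {"name": "value"} into a list of {"key": ..., "value": ...} entries."""
    # Sorting keeps the list order stable regardless of dictionary insertion order.
    return [{"key": key, "value": value} for key, value in sorted(tags.items())]


# Placeholder tag names and values for illustration.
example_tags = to_tag_list({"team": "ml", "env": "dev"})
```

The resulting list has the shape expected by the tags input in its dictionary-literal form.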

Package Details

Repository: AWS Native pulumi/pulumi-aws-native
License: Apache-2.0
