AWSSageMakerRunner

class AWSSageMakerRunner

Bases: Runner

Runs pipelines remotely using AWS SageMaker.

Requires an Everett configuration of the form:

[SAGEMAKER]
role=
cpu_image=
cpu_instance_type=
gpu_image=
gpu_instance_type=
train_image=
train_instance_type=
train_instance_count=
use_spot_instances=
spot_instance_max_wait_time=
max_run_time=
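
As a minimal sketch, the section above can be checked for completeness before a run. This uses Python's standard `configparser` purely to illustrate the shape of the file and is not how Everett or Raster Vision loads it. The helper name `missing_sagemaker_keys`, the account ID, image URIs, instance types, and the `no` value for `use_spot_instances` are all hypothetical placeholders.

```python
import configparser

# Keys listed in the [SAGEMAKER] section above.
REQUIRED_KEYS = [
    "role", "cpu_image", "cpu_instance_type", "gpu_image",
    "gpu_instance_type", "train_image", "train_instance_type",
    "train_instance_count", "use_spot_instances",
    "spot_instance_max_wait_time", "max_run_time",
]

# Hypothetical placeholder values, for illustration only.
EXAMPLE_CONFIG = """
[SAGEMAKER]
role=arn:aws:iam::123456789012:role/ExampleSageMakerRole
cpu_image=123456789012.dkr.ecr.us-east-1.amazonaws.com/example-rv:cpu
cpu_instance_type=ml.m5.xlarge
gpu_image=123456789012.dkr.ecr.us-east-1.amazonaws.com/example-rv:gpu
gpu_instance_type=ml.p3.2xlarge
train_image=123456789012.dkr.ecr.us-east-1.amazonaws.com/example-rv:gpu
train_instance_type=ml.p3.2xlarge
train_instance_count=1
use_spot_instances=no
spot_instance_max_wait_time=86400
max_run_time=86400
"""


def missing_sagemaker_keys(ini_text: str) -> list[str]:
    """Return the required [SAGEMAKER] keys that are absent or empty."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section("SAGEMAKER"):
        return list(REQUIRED_KEYS)
    section = parser["SAGEMAKER"]
    return [key for key in REQUIRED_KEYS if not section.get(key, "").strip()]
```

Calling `missing_sagemaker_keys` on a file that contains only the bare template (every key present but empty) reports all eleven keys as missing.
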
__init__()

Methods

__init__()

build_pipeline(cfg_json_uri, pipeline, commands)

Build a SageMaker Pipeline with each command as a step within it.

build_step(pipeline, step_name, job_name, ...)

Build appropriate SageMaker pipeline step.

get_split_ind()

Get the split_ind for the process.

run(cfg_json_uri, pipeline, commands[, ...])

Run commands in a Pipeline using a serialized PipelineConfig.

run_command(cmd[, use_gpu, image_uri, ...])

Run a single command as a SageMaker processing job.

build_pipeline(cfg_json_uri: str, pipeline: Pipeline, commands: list[str], num_splits: int = 1, cmd_prefix: list[str] = ['python', '-m', 'rastervision.pipeline.cli'], pipeline_run_name: str = 'rv') → SageMakerPipeline

Build a SageMaker Pipeline with each command as a step within it.

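Parameters:
  • cfg_json_uri (str)

  • pipeline (Pipeline)

  • commands (list[str])

  • num_splits (int)

  • cmd_prefix (list[str])

  • pipeline_run_name (str)
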
Return type:

SageMakerPipeline
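
The following is a rough sketch of how one command line per step might be assembled from `cmd_prefix`, with splittable commands expanded into `num_splits` parallel invocations. It does not use the SageMaker SDK and is not the runner's actual implementation. The helper name `sketch_step_commands`, the `split_commands` argument, the `run_command` subcommand, and the `--num-splits` flag are assumptions made for illustration; only `--split-ind` is mentioned on this page.

```python
DEFAULT_CMD_PREFIX = ["python", "-m", "rastervision.pipeline.cli"]


def sketch_step_commands(cfg_json_uri, commands, split_commands,
                         num_splits=1, cmd_prefix=None):
    """Map each command name to the list of command lines for its step(s).

    Hypothetical helper: `split_commands` is the set of command names that
    can be split. The subcommand and flag names below are assumptions.
    """
    prefix = list(DEFAULT_CMD_PREFIX if cmd_prefix is None else cmd_prefix)
    steps = {}
    for command in commands:
        base = prefix + ["run_command", cfg_json_uri, command]
        if command in split_commands and num_splits > 1:
            # One invocation per split, each told which share of work to do.
            steps[command] = [
                base + ["--split-ind", str(i), "--num-splits", str(num_splits)]
                for i in range(num_splits)
            ]
        else:
            steps[command] = [base]
    return steps
```
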

build_step(pipeline: RVPipeline, step_name: str, job_name: str, cmd: list[str], role: str, image_uri: str, instance_type: str, use_spot_instances: bool, sagemaker_session: PipelineSession, instance_count: int = 1, max_wait: int = 86400, max_run: int = 86400, **kwargs) → TrainingStep | ProcessingStep

Build appropriate SageMaker pipeline step.

If step_name=='train', builds a TrainingStep. Otherwise, a ProcessingStep.

Parameters:
  • pipeline (RVPipeline)

  • step_name (str)

  • job_name (str)

  • cmd (list[str])

  • role (str)

  • image_uri (str)

  • instance_type (str)

  • use_spot_instances (bool)

  • sagemaker_session (PipelineSession)

  • instance_count (int)

  • max_wait (int)

  • max_run (int)

Return type:

TrainingStep | ProcessingStep
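
The dispatch rule described for `build_step` can be sketched without the SageMaker SDK. The function `step_kind` below is a hypothetical stand-in that returns the name of the step class rather than constructing one; it only illustrates the `step_name == 'train'` check.

```python
def step_kind(step_name: str) -> str:
    """Return which kind of SageMaker pipeline step a step name maps to."""
    # Only the exact name 'train' selects a training step.
    return "TrainingStep" if step_name == "train" else "ProcessingStep"


def plan_steps(step_names: list[str]) -> list[tuple[str, str]]:
    """Pair each step name with the kind of step that would be built."""
    return [(name, step_kind(name)) for name in step_names]
```

Under this rule, every command other than `train` (for example a hypothetical `predict` step) would run as a processing job.
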

get_split_ind() → int | None

Get the split_ind for the process.

For split commands, the split_ind determines which split of work to perform within the current OS process. The CLI has a --split-ind option, but some runners may have their own means of communicating the split_ind, and this method should be overridden in such cases. If this method returns None, the --split-ind option is used instead. If both are None, the command cannot be run.

Return type:

int | None
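
The fallback described above can be sketched as a small resolver. The function name `resolve_split_ind` is hypothetical, and giving the runner's value precedence over the CLI option is an assumption inferred from the wording, not confirmed behavior.

```python
from typing import Optional


def resolve_split_ind(runner_split_ind: Optional[int],
                      cli_split_ind: Optional[int]) -> int:
    """Pick the split index from the runner, else from the CLI option."""
    # Compare against None explicitly so that a split index of 0 is kept.
    if runner_split_ind is not None:
        return runner_split_ind
    if cli_split_ind is not None:
        return cli_split_ind
    raise ValueError(
        "No split_ind available from the runner or the --split-ind option.")
```

The explicit `is not None` checks matter here: a truthiness test such as `runner_split_ind or cli_split_ind` would wrongly discard a valid index of 0.
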

run(cfg_json_uri: str, pipeline: Pipeline, commands: list[str], num_splits: int = 1, cmd_prefix: list[str] = ['python', '-m', 'rastervision.pipeline.cli'], pipeline_run_name: str = 'rv')

Run commands in a Pipeline using a serialized PipelineConfig.

Parameters:
  • cfg_json_uri (str) – URI of a JSON file with a serialized PipelineConfig

  • pipeline (Pipeline) – the Pipeline to run

  • commands (list[str]) – names of commands to run

  • num_splits (int) – number of splits to use for splittable commands

  • cmd_prefix (list[str])

  • pipeline_run_name (str)

run_command(cmd: list[str], use_gpu: bool = False, image_uri: str | None = None, instance_type: str | None = None, role: str | None = None, job_name: str | None = None, sagemaker_session: Session | None = None) → None

Run a single command as a SageMaker processing job.

Parameters:
  • cmd (list[str]) – The command to run.

  • use_gpu (bool) – Use the GPU instance type and image from the Everett config. This is ignored if image_uri and instance_type are provided. Defaults to False.

  • image_uri (str | None) – URI of docker image to use. If not provided, will be picked up from Everett config. Defaults to None.

  • instance_type (str | None) – AWS instance type to use. If not provided, will be picked up from Everett config. Defaults to None.

  • role (str | None) – AWS IAM role with SageMaker permissions. If not provided, will be picked up from Everett config. Defaults to None.

  • job_name (str | None) – Optional job name. Defaults to None.

  • sagemaker_session (Session | None) – SageMaker session. Defaults to None.

Return type:

None
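
The way `run_command` falls back to the Everett config can be sketched as follows. This is an illustration under stated assumptions, not the runner's code: the helper `resolve_job_settings` is hypothetical, and `config` is assumed to be a plain dict holding the `[SAGEMAKER]` keys shown at the top of this page.

```python
from typing import Optional


def resolve_job_settings(config: dict,
                         use_gpu: bool = False,
                         image_uri: Optional[str] = None,
                         instance_type: Optional[str] = None,
                         role: Optional[str] = None) -> dict:
    """Fill in any unset job settings from the config dict."""
    # use_gpu only selects which config keys act as fallbacks, so it has
    # no effect when both image_uri and instance_type are passed explicitly.
    device = "gpu" if use_gpu else "cpu"
    return {
        "image_uri": (image_uri if image_uri is not None
                      else config[f"{device}_image"]),
        "instance_type": (instance_type if instance_type is not None
                          else config[f"{device}_instance_type"]),
        "role": role if role is not None else config["role"],
    }
```
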