
Ray release docs #5844

Draft

jinyan-li1 wants to merge 63 commits into main from ray-release-docs

Conversation

@jinyan-li1 (Contributor)

Purpose

Test Plan

Test Result


Toggle if you are merging into master Branch

By default, docker image builds and tests are disabled. Two ways to run builds and tests:

  1. Using dlc_developer_config.toml
  2. Using this PR description (currently only supported for PyTorch, TensorFlow, vllm, and base images)
How to use the helper utility for updating dlc_developer_config.toml

Assuming your remote is called origin (you can find out more with git remote -v)...

  • Run default builds and tests for a particular buildspec - also commits and pushes changes to remote; Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -cp origin

  • Enable specific tests for a buildspec or set of buildspecs - also commits and pushes changes to remote; Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -t sanity_tests -cp origin

  • Restore TOML file when ready to merge

python src/prepare_dlc_dev_environment.py -rcp origin

NOTE: If you are creating a PR for a new framework version, please ensure success of the local, standard, rc, and efa sagemaker tests by updating the dlc_developer_config.toml file:

  • sagemaker_remote_tests = true
  • sagemaker_efa_tests = true
  • sagemaker_rc_tests = true
  • sagemaker_local_tests = true
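
The four flags above are booleans in dlc_developer_config.toml; a minimal sketch of the relevant fragment (the enclosing [test] table name is an assumption here — check the actual file in the repo for the exact layout):

```toml
# hypothetical fragment of dlc_developer_config.toml
[test]
sagemaker_remote_tests = true
sagemaker_efa_tests = true
sagemaker_rc_tests = true
sagemaker_local_tests = true
```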
How to use PR description

Use the code block below to uncomment commands and run the PR CodeBuild jobs. There are two commands available:
  • # /buildspec <buildspec_path>
    • e.g.: # /buildspec pytorch/training/buildspec.yml
    • If this line is commented out, dlc_developer_config.toml will be used.
  • # /tests <test_list>
    • e.g.: # /tests sanity security ec2
    • If this line is commented out, it will run the default set of tests (same as the defaults in dlc_developer_config.toml): sanity, security, ec2, ecs, eks, sagemaker, sagemaker-local.
# /buildspec <buildspec_path>
# /tests <test_list>
Toggle if you are merging into main Branch

PR Checklist

  • [ ] I ran pre-commit run --all-files locally before creating this PR. (Read DEVELOPMENT.md for details.)

---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 275
X-AI-Prompt: Looks right. Two additions: also create docs/releasenotes/ray/index.md as a
placeholder (the nav references it). And before writing the data YAML files,
read docs/src/data/sglang/0.5.9-gpu-ec2.yml to match the exact field format.
Proceed.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 35
X-AI-Prompt: docs/src/global.yml has a duplicate ray key, and use ray not ray serve
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 93
X-AI-Prompt: Make these fixes to docs/ray/index.md:

1. The nav label for the Ray page should be "Ray" not "Ray Serve" — update
docs/.nav.yml accordingly.

2. Pull commands: remove the auth step entirely — only show the docker pull
command. Look at how other framework pages in the repo show pull commands and
follow the same pattern.

3. Remove the Key Packages table from docs/ray/index.md entirely. Package
versions belong on the release notes page (per-version), not the framework
overview. Add a brief note like "For package versions, see the Release Notes."
with a link instead.

4. For the pull command URIs — check if the existing framework pages use macros
or hardcoded URIs, and follow the same approach. If there's a way to reference
the latest image URI dynamically from the data files, use it; otherwise use a
placeholder.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 108
X-AI-Prompt: 1. Add Ray to the macro system so pull commands use dynamic URIs. Check
docs/src/macros.py and docs/src/image_config.py to understand how other
frameworks do it, then add Ray entries following the same pattern. Update
docs/ray/index.md pull commands to use the macros.
2. In the Versioning Strategy section, remove the sentence "where <platform>
is ec2 or sagemaker" — keep the tag format as-is, just drop that explanation.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 64
X-AI-Prompt: get_latest_image_uri() only accepts 2 arguments. Check
docs/src/image_config.py to see the correct signature and fix the macros.py
entries accordingly.
Remove the "Image URIs are placeholders" note from docs/ray/index.md.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 19
X-AI-Prompt: Simplify the pull commands from 4 tabs to 2 tabs. Each tab shows both GPU
and CPU variants
---
X-AI-Tool: Human
X-AI-Prompt: Simplify the pull commands from 4 tabs to 2 tabs. Each tab shows both GPU
and CPU variants
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 30
X-AI-Prompt: Check the Ray upstream release date for version 2.54.0 on the Ray GitHub
repo (https://github.com/ray-project/ray/releases). Use that date as the GA date
and set EOP to exactly 1 year later. Update the support policy table in
docs/ray/index.md with the real dates.
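
The GA-to-EOP arithmetic this prompt asks for is a one-liner; a quick stdlib sketch, using the 2026-02-18 GA date that the support-policy table later in this PR settles on:

```python
from datetime import date

ga = date(2026, 2, 18)               # Ray 2.54.0 upstream release date used as GA
eop = ga.replace(year=ga.year + 1)   # End of Patch: exactly one year later
print(eop.isoformat())
```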
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 71
X-AI-Prompt: Update the Examples section in docs/ray/index.md to cover both EC2 and
SageMaker deployment. Read test/ray/ec2/common.py and
test/ray/sagemaker/common.py for the actual test patterns.

Structure:
- EC2 section: docker run + HTTP requests for the 4 categories (NLP/sentiment,
CV/DenseNet, Audio, Tabular/Iris)
- SageMaker section: SageMaker SDK deployment (Model.deploy() +
predictor.predict()) showing the same 4 categories

Use realistic request/response examples based on what the tests validate. Use
placeholder S3 paths and model names — no real bucket names from the test code.

Refer to the structure and style of
https://docs.ray.io/en/latest/serve/tutorials/text-classification.html
for how to structure each example — show the full flow from deployment to
inference request.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 15
X-AI-Prompt: In the Examples section, remove the "(NLP)" and "(CV)" suffixes from the EC2
section headings so they match the SageMaker section: "Sentiment Analysis",
"Image Classification", "Audio Transcription", "Tabular Classification".
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 58
X-AI-Prompt: Proceed with two corrections:
1. CV response: only show "predictions" — omit "model" and "device" fields
2. Audio response: change transcription to
"<transcription depends on audio input>" — the tests use sine waves so there's
no meaningful example output
---
X-AI-Tool: Human
X-AI-Prompt: Proceed with two corrections:
1. CV response: only show "predictions" — omit "model" and "device" fields
2. Audio response: change transcription to
"<transcription depends on audio input>" — the tests use sine waves so there's
no meaningful example output
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 37
X-AI-Prompt: In the config.yaml example in the Deployment Guide, change num_gpus: 0 to
num_gpus: 1 for GPU deployments. Add a note that num_gpus: 0 should be used for
CPU images and num_gpus: 1 for GPU images.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 19
X-AI-Prompt: Actually, the number of GPUs should be the number of GPUs allocated per replica, right?
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 17
X-AI-Prompt: The SageMaker examples use v2 SDK (sagemaker>=2,<3). Update any SageMaker SDK
documentation links to point to the v2 docs: https://sagemaker.readthedocs.io/en/v2/
and verify the code examples use v2 API patterns.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 36
X-AI-Prompt: Fix the SageMaker tabular example: the tabular model uses config.yaml (no
SM_RAYSERVE_APP). Remove env={"SM_RAYSERVE_APP": "deployment:app"} from the
tabular Model(). If you want to show the SM_RAYSERVE_APP pattern, add a separate
brief example showing it with a note that it's for models without a
config.yaml (like the MNIST test), check with test scripts again.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 30
X-AI-Prompt: Should the Deployment Guide clarify what import_path: deployment:app means?
Specifically: deployment refers to deployment.py in the model package, and app
is the variable defined at the bottom of that file (e.g.
app = MyDeployment.bind()). Also, should we add a note that on SageMaker, the
model tarball is automatically downloaded from S3 and extracted to
/opt/ml/model/ before the container starts? Check the actual deployment.py files
in markdown/rayserve-tars/ to see how app is defined there, and use that as the
example.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 43
X-AI-Prompt: In the "Direct App Import" example, add a note explaining RAYSERVE_NUM_GPUS:
it is a custom environment variable read by the deployment code to configure
GPU allocation per replica. It is only needed when using SM_RAYSERVE_APP without
a config.yaml — when using config.yaml, set num_gpus directly under
ray_actor_options instead. Check markdown/rayserve-tars/ mnist-direct-app
deployment.py to confirm how it's used.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 17
X-AI-Prompt: Fix the EC2 health check command — the /healthz endpoint returns HTTP 200 (not
a body containing "OK"). Check dlc-copy/test/ray/ec2/common.py again.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 18
X-AI-Prompt: In the EC2 Tabular Classification example, remove /opt/ml/model/config.yaml
from the end of the docker run command — the entrypoint auto-detects config.yaml
at /opt/ml/model/config.yaml when no CLI arg is provided.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 99
X-AI-Prompt:
The EC2 examples section needs the actual deployment.py and config.yaml code
inline so users can copy-paste and try the container immediately. Right now we
show docker run + curl but not the model package files.

For each EC2 example, add the code before the docker run command. Show users how
to mkdir, save the files, then run. Follow the pattern Ray uses:
https://docs.ray.io/en/latest/serve/tutorials/text-classification.html

The working files are in workspace/2-wk-challenge/markdown/rayserve-tars/ under
nlp/, cv-densenet/, audio-ffmpeg/, and tabular/. Read them and put the code
inline, but clean up the following before including:

Remove from all deployments:
- "device": str(self.device) from response dicts
- "model": "densenet161" / "model": "mnist_cnn" from response dicts
- "deployment_method": "SM_RAYSERVE_APP" from response dicts
- "audio_backend": "ffmpeg" from response dicts
- "installed_packages" field and the tabulate validation block in tabular
- Any logging that mentions test patterns, backends, or deployment methods (e.g.
f"loaded on {self.device}", "(SM_RAYSERVE_APP deployment)")

Keep the code functional — just strip the CI/test instrumentation so responses
match what the guide already documents.

For models that download weights at runtime (NLP, CV, Audio): provide the full
code inline — users just create the files and go.

For models with local weights (Tabular with iris_model.pth/norm_params.json,
MNIST with model.pth): these can't be copy-pasted since users need the actual
weight files. Either skip inline code for these and just describe the model
package structure, or add a small training script that generates the weights.
Your call on what fits the guide best.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 150
X-AI-Prompt: A few fixes and improvements:

Bug fixes:

1. Add weights_only=True to torch.load in the tabular deployment.py — PyTorch
2.6+ defaults to weights_only=True and will break without it.

2. The image classification example has max_replicas: 2 with num_gpus: 1 in
config.yaml. On single-GPU instances (g5.xlarge, g4dn.xlarge) the second replica
can never be placed. add a note immediately after the cv-model/config.yaml code block. The note should read something like: The autoscaling_config in deployment.py sets max_replicas: 2. Each replica requests 1 GPU, so this configuration requires a multi-GPU instance. On single-GPU instances, reduce max_replicas to 1.
Do not modify any code blocks or other sections.

Documentation:

3. The "Direct App Import" section references RAYSERVE_NUM_GPUS but never shows
how the deployment code reads it. Add a minimal snippet showing the pattern:
```python
import os

num_gpus = int(os.environ.get("RAYSERVE_NUM_GPUS", "0"))

@serve.deployment(ray_actor_options={"num_gpus": num_gpus})
class MyDeployment:
    ...
```
4. Remove the training script (train.py). Instead, note that this example requires pre-trained weights (iris_model.pth, norm_params.json) in the model directory. Keep deployment.py and config.yaml inline as reference for the expected structure.

Also update the intro line from:

"Each example below includes the full model package files so you can copy-paste and run immediately."

to something like:

"Each example below includes the full model package files. The first three download weights automatically on startup. The tabular example requires pre-trained weights — substitute your own trained model."
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 19
X-AI-Prompt: Revert the torch.load change in tabular-model/deployment.py — restore it to the original without weights_only=True.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 22
X-AI-Prompt: In the Deployment Paths table, "Place config.yaml at root of model package" is
ambiguous — users may not know if that means the mounted directory, inside the
tarball, or the final path on disk. Clarify this so it's unambiguous for both
EC2 (directory mount) and SageMaker (tarball extraction).
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 19
X-AI-Prompt: More concise but clear to fit the table better
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 57
X-AI-Prompt: Move "Direct App Import (No config.yaml)" out of the SageMaker section and
make it its own section after both EC2 and SageMaker. It applies to both
platforms — EC2 uses a CLI argument (docker run <image> deployment:app),
SageMaker uses the SM_RAYSERVE_APP env var. Show both mechanisms in the section.
The RAYSERVE_NUM_GPUS explanation applies to both since neither has a
config.yaml to set num_gpus in.
---
X-AI-Tool: Human
X-AI-Prompt: Move "Direct App Import (No config.yaml)" out of the SageMaker section and
make it its own section after both EC2 and SageMaker. It applies to both
platforms — EC2 uses a CLI argument (docker run <image> deployment:app),
SageMaker uses the SM_RAYSERVE_APP env var. Show both mechanisms in the section.
The RAYSERVE_NUM_GPUS explanation applies to both since neither has a
config.yaml to set num_gpus in.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 29
X-AI-Prompt: Add a "SageMaker Environment Variables" table next to the existing "EC2
Environment Variables" table. Include SM_RAYSERVE_APP and CA_REPOSITORY_ARN
with defaults and descriptions. Check the entrypoint code to confirm the
actual defaults before writing the table.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 20
X-AI-Prompt: Make it clear that those are both optional variables
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 89
X-AI-Prompt: Actually, let's remove the SageMaker Environment Variables table since they are both optional variables - it may confuse readers, and they are both mentioned in relevant sections.
---
X-AI-Tool: Human
X-AI-Prompt: Should we add a comment to let users know they would need to parse the response?
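
A minimal sketch of the parsing such a comment would describe, using the sentiment response shape documented earlier in this PR (the raw bytes payload here is illustrative):

```python
import json

# predictor.predict() returns raw bytes, not a dict — parse before use
raw = b'{"predictions": [{"label": "POSITIVE", "score": 0.9991}]}'
result = json.loads(raw)
top = result["predictions"][0]
print(top["label"], top["score"])
```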
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 24
X-AI-Prompt: In the Image Classification EC2 example, add a curl -O download step for the test image before the inference request. Use the public TorchServe kitten image at https://s3.amazonaws.com/model-server/inputs/kitten.jpg. Place it after the health check and before the inference curl. This makes the example fully copy-pasteable without requiring the user to supply their own image. The NLP and tabular examples already work out of the box because test data is inline — this makes the image the same as well.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 60
X-AI-Prompt: One minor optimization — move the download before the health check so it happens while the container is starting up, saving time.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 79
X-AI-Prompt: Two fixes needed for the SageMaker Deployment section based on testing:

1. Add SageMaker SDK prerequisite

At the top of the "SageMaker Deployment" section (before the tarball/upload
steps), add a note that users need to install the SageMaker Python SDK v2:

```bash
pip install 'sagemaker>=2,<3'
```

v3 drops the Model, Predictor, and Serializer APIs used in these examples.

2. Add inference_ami_version to GPU SageMaker deploy calls

GPU SageMaker deploys fail with CannotStartContainerError without specifying the
inference AMI version — the default SageMaker host AMI has incompatible NVIDIA
drivers for our CUDA 12.9 images. This is SageMaker-specific only; EC2 users
can pick their own compatible instance/AMI.

Add inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1" to all GPU
.deploy() calls in the SageMaker section. This applies to:
- The Sentiment Analysis example
- The Direct App Import SageMaker example

Do NOT add it to CPU deploys (tabular) or any EC2 examples.

Add a brief note explaining why, with a link to the API reference for valid
values:
https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariant.html

Here is the tested working sentiment example:

```python
import json

from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.serializers import JSONSerializer

predictor = Model(
    image_uri="763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda-v1.0.0",
    role="arn:aws:iam::<ACCOUNT>:role/SageMakerExecutionRole",
    model_data="s3://<BUCKET>/models/nlp-sentiment/model.tar.gz",
    predictor_cls=Predictor,
).deploy(
    instance_type="ml.g5.xlarge",
    initial_instance_count=1,
    endpoint_name="ray-serve-nlp",
    serializer=JSONSerializer(),
    inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1",
    wait=True,
)

response = predictor.predict({"text": "I love this so much, best purchase ever!"})
result = json.loads(response)  # predictor.predict() returns raw bytes
# {"predictions": [{"label": "POSITIVE", "score": 0.9991}]}
```

And the Direct App Import SageMaker example should also get it:

```python
predictor = model.deploy(
    instance_type="ml.g5.xlarge",
    initial_instance_count=1,
    endpoint_name="ray-serve-mnist",
    inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1",
    wait=True,
)
```

Double check the test scripts to confirm and verify correctness.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 37
X-AI-Prompt: Fix the format issue around predictor.delete_endpoint()
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 18
X-AI-Prompt: In the "Direct App Import" section, before the SageMaker code block, clarify two
things:

1. Connect it to the sentiment walkthrough: the tarball packaging and S3 upload
steps are the same.
2. Clarify that the SM_RAYSERVE_APP import path (e.g. deployment:app) resolves
relative to /opt/ml/model/, so deployment.py must be at the root of the tarball
— same as the config.yaml examples, just without the config.yaml.

Something like: "Package your model directory the same way as the sentiment
example (tarball uploaded to S3), but omit config.yaml. The deployment.py must
be at the tarball root — SM_RAYSERVE_APP=deployment:app resolves the module from
/opt/ml/model/."
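
The `module:attr` resolution described above can be sketched with the stdlib. This is a simplified stand-in for whatever the container entrypoint actually does — `resolve_app` and the default path are illustrative, not the real implementation:

```python
import importlib
import sys

def resolve_app(import_path: str, model_dir: str = "/opt/ml/model"):
    """Resolve an import path like 'deployment:app' to the `app` object
    defined in <model_dir>/deployment.py."""
    module_name, _, attr = import_path.partition(":")
    sys.path.insert(0, model_dir)  # so `deployment` imports from the model root
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```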
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 35
X-AI-Prompt: The Deployment Paths table is inaccurate — on EC2, a CLI argument takes precedence over config.yaml, but the table implies config.yaml is always the default. Update the table to reflect that CLI arg overrides config.yaml on EC2. Keep it concise.

The Direct App Import SageMaker code block is missing imports for Model and Predictor. Add them.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 24
X-AI-Prompt: Is there a way to make the desc more concise?
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 25
X-AI-Prompt: Update the Support Policy table to include the DLC version column. Users need to
map between the Ray version and the DLC image tag version (v1.0.0). The table
should be:

| Version | DLC Version | GA Date | End of Patch |
|---|---|---|---|
| Ray 2.54 | v1.0.0 | 2026-02-18 | 2027-02-18 |

This makes it clear that dlc's ray v1.0.0 corresponds
to Ray 2.54.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 20
X-AI-Prompt: Update the Support Policy table to include key version info that users need at
a glance. DLC version first (it's the image tag), then the three components
that drive major version bumps per the versioning strategy. Full package details
stay in Release Notes.

| DLC Version | Ray | Python | CUDA | GA Date | End of Patch |
|---|---|---|---|---|---|
| v1.0.0 | 2.54.0 | 3.13 | 12.9.1 | 2026-02-18 | 2027-02-18 |
docs: add Ray Serve DLC framework page and release notes infrastructure

Add comprehensive documentation for the Ray Serve inference DLC images,
including a per-framework page at docs/ray/index.md with deployment
guide, EC2 and SageMaker examples, versioning strategy, and support
policy. Also add release notes data files, table config, and macro
support for dynamic image URIs.

New files:
- docs/ray/index.md: Framework page with pull commands, deployment
  guide (model package structure, deployment paths, env vars, runtime
  deps), EC2 examples (sentiment, image classification, audio, tabular
  with full inline deployment code), SageMaker deployment walkthrough,
  and direct app import section.
- docs/releasenotes/ray/index.md: Placeholder release notes page.
- docs/src/data/ray/: Image config YAML files for Ray 2.54 (GPU/CPU x
  EC2/SageMaker).
- docs/src/tables/ray.yml: Table column config for Ray.

Updated files:
- docs/src/global.yml: Added ray display name and table_order entry.
- docs/src/macros.py: Added Ray image URI macros with accelerator
  filtering.
- docs/src/image_config.py: No changes (kept original 2-arg signature).
- docs/releasenotes/index.md: Added Ray link.
- docs/.nav.yml: Added Ray under Getting Started and Release Notes.

ai-dev-branch commit IDs:
- 7096f96
- 369cd33
- 19ad220
- ad4e26f
- 8e724be
- c15bbad
- 866794a
- eb0a311
- 200f84a
- b68464c
- bfe969d
- 7a2da45
- e256c17
- 8f1d183
- 0bd7491
- a0d26e2
- 21f0d4b
- ede013d
- fa54049
- f915294
- 1833e1d
- 070d2ef
- be67d50
- dee31d3
- 5123d7e
- 2ad7ba3
- 9f8a65d
- 923e120
- 46f71b1
- 199395f
- 62f49e1
- 877fa5c
- db00781
- 5cc4737
- 71a66aa
- 3a01673
- f51449a
- 6521964
- 2e4eb9a
- 1df8040
- 7f9483d
- 19015fe
- 95f4aad
- 8464858
- 826ee85

The prompts used are captured in the footers of those commits.
The initial prompt was: Looks right. Two additions: also create
docs/releasenotes/ray/index.md as a placeholder (the nav references
it). And before writing the data YAML files, read
docs/src/data/sglang/0.5.9-gpu-ec2.yml to match the exact field
format. Proceed.

---
X-AI-Handle-Time-Seconds: 2175
X-AI-Line-Changes: New:973, Altered:71, Deleted:174
X-Human-Line-Changes: New:0, Altered:4, Deleted:4
X-AI-Line-Changes-Kiro-cli: New:973, Altered:71, Deleted:174
X-AI-Handle-Time-Seconds-Kiro-cli: 2175
X-AI-Change-Count: 41
X-Human-Change-Count: 4
X-AI-Change-Count-Kiro-cli: 41
X-CR-Amendment: false
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 29
X-AI-Prompt: Remove model_weights.pth from the Model Package Structure tree diagram — 3 out
of 4 examples download weights at runtime, so showing it in the standard layout
is misleading. Add a note after the tree that model weights can optionally be
placed at the tarball root alongside config.yaml and deployment.py (extracted to
/opt/ml/model/ at runtime) if your model doesn't download them at startup.
---
X-AI-Tool: Human
X-AI-Prompt: Remove model_weights.pth from the Model Package Structure tree diagram — 3 out
of 4 examples download weights at runtime, so showing it in the standard layout
is misleading. Add a note after the tree that model weights can optionally be
placed at the tarball root alongside config.yaml and deployment.py (extracted to
/opt/ml/model/ at runtime) if your model doesn't download them at startup.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 297
X-AI-Prompt: 1. Image tag changes — remove ec2 platform from default images

- **Pull Commands section**: Rename the "EC2" tab to "Default". Update the image
tags to remove the ec2 platform segment — e.g. ray:serve-ml-cuda-v1.0.0 and
ray:serve-ml-cpu-v1.0.0 instead of ray:serve-ml-ec2-cuda-v1.0.0 and
ray:serve-ml-ec2-cpu-v1.0.0. Add a note after the pull commands that the Default
images were tested on EC2 instances.
- **All EC2 examples**: Update the docker run image URIs to use the new tags
without ec2 (e.g. ray:serve-ml-cuda-v1.0.0, ray:serve-ml-cpu-v1.0.0).
- **SageMaker images stay unchanged** — they still use
ray:serve-ml-sagemaker-cuda-v1.0.0 / ray:serve-ml-sagemaker-cpu-v1.0.0.
- **Versioning Strategy**: Update the tag format to show platform as optional:
ray:serve-ml-[<platform>-]{cpu|cuda}-v<MAJOR>.<MINOR>.<PATCH>. Explain that
<platform> is omitted for the default images and present for platform-specific
images (e.g. sagemaker).

2. Add Announcements section

Add an "Announcements" section right after the intro paragraph and before "Pull
Commands". Placeholder content for now:

## Announcements

*No announcements at this time.*
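
The optional-platform tag grammar described in point 1 can be expressed as a regex; this is a sketch of the naming rule only (the real tag scheme is whatever the buildspecs emit):

```python
import re

# ray:serve-ml-[<platform>-]{cpu|cuda}-v<MAJOR>.<MINOR>.<PATCH>
TAG_RE = re.compile(
    r"^serve-ml-"
    r"(?:(?P<platform>[a-z0-9]+)-)?"  # platform segment, omitted on default images
    r"(?P<accel>cpu|cuda)-"
    r"v(?P<version>\d+\.\d+\.\d+)$"
)

print(TAG_RE.match("serve-ml-cuda-v1.0.0").groupdict())
print(TAG_RE.match("serve-ml-sagemaker-cpu-v1.0.0").groupdict())
```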
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 83
X-AI-Prompt: remove ECS/EKS
references since those haven't been tested
---
X-AI-Tool: Human
X-AI-Prompt: remove ECS/EKS
references since those haven't been tested
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 59
X-AI-Prompt: The release notes title "EC2, ECS, EKS" comes from docs/src/global.yml which
maps platform: ec2 → "EC2, ECS, EKS". Don't change this global mapping (it
affects all DLCs). Instead:

1. Add a new platform mapping in global.yml: default: "EC2"
2. Update the Ray YAML data files (2.54-gpu-ec2.yml, 2.54-cpu-ec2.yml) to use
platform: default instead of platform: ec2

This way Ray release notes show "EC2" only while other DLCs keep "EC2, ECS,
EKS".
---
X-AI-Tool: Human
X-AI-Prompt: The release notes title "EC2, ECS, EKS" comes from docs/src/global.yml which
maps platform: ec2 → "EC2, ECS, EKS". Don't change this global mapping (it
affects all DLCs). Instead:

1. Add a new platform mapping in global.yml: default: "EC2"
2. Update the Ray YAML data files (2.54-gpu-ec2.yml, 2.54-cpu-ec2.yml) to use
platform: default instead of platform: ec2

This way Ray release notes show "EC2" only while other DLCs keep "EC2, ECS,
EKS".
---
X-AI-Tool: Human
X-AI-Prompt: Move all code snippets from the Ray index page into standalone files under
examples/ray/ so they can be run in CI while still displaying on the page.

1. Add pymdownx.snippets to mkdocs.yaml:

```yaml
markdown_extensions:
  - pymdownx.snippets:
      base_path: ["."]
```

2. Create the example files under examples/ray/:

Extract each code block from the Ray index page into its own file, preserving
the exact content. The structure should be:

examples/ray/
├── nlp-model/
│   ├── config.yaml
│   └── deployment.py
├── cv-model/
│   ├── config.yaml
│   └── deployment.py
├── audio-model/
│   ├── config.yaml
│   └── deployment.py
├── tabular-model/
│   ├── config.yaml
│   └── deployment.py
└── sagemaker/
    ├── deploy_sentiment.py
    └── deploy_direct_app.py

3. Replace inline code blocks in docs/ray/index.md with snippet references:

For example, replace:

```python
from ray import serve
from transformers import pipeline
...
```

With:

```python
--8<-- "examples/ray/nlp-model/deployment.py"
```

Do this for every config.yaml and deployment.py code block. The rendered page should look identical to before.

**4. Verify** by running
mkdocs build
---
X-AI-Tool: Human
X-AI-Prompt: The SageMaker deploy snippet that was moved to examples/ has unrendered
{{ images.latest_ray_sagemaker_gpu }} macros — pymdownx.snippets includes raw
file content before the macros plugin runs.

Fix: In all example files under examples/ray/, replace macro references with the
base tag (no version suffix). This way examples never go stale across releases:

- GPU default: 763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cuda
- CPU default: 763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cpu
- GPU SageMaker:
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda
- CPU SageMaker:
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cpu

Apply the same to any inline docker run commands in the markdown that were using
macros — replace them with the base tags too. Keep the Pull Commands section at
the top using macros to show the exact versioned tags.
Mar 27, 2026: aws-deep-learning-containers-ci bot added the authorized and Size:XL labels.
jinyan-li1 and others added 8 commits March 30, 2026 10:10
---
X-AI-Tool: Human
X-AI-Prompt: Do NOT change any inline code in the markdown files — those should keep using
{{ }} macros.
docs: extract Ray examples to standalone files, update image tags

Extract all inline deployment code from docs/ray/index.md into
standalone files under examples/ray/ using pymdownx.snippets, so
examples can be run in CI while still rendering on the docs page.
Update image tags to remove ec2 platform segment from default images
(serve-ml-cuda, serve-ml-cpu) while keeping sagemaker-prefixed tags
for SageMaker images. Add default platform mapping in global.yml so
Ray release notes show "EC2" only. Remove untested ECS/EKS references.

New files:
- examples/ray/{nlp,cv,audio,tabular}-model/{config.yaml,deployment.py}
- examples/ray/sagemaker/{deploy_sentiment,deploy_direct_app}.py
- mkdocs.yaml: added pymdownx.snippets extension

Updated files:
- docs/ray/index.md: replaced inline code with snippet references,
  updated image tags, removed model_weights.pth from package structure,
  added Announcements section
- docs/src/global.yml: added default platform mapping
- docs/src/data/ray/*.yml: updated ec2 files to platform: default,
  removed ec2 from tags
- docs/src/macros.py: renamed ec2 macros to default

ai-dev-branch commit IDs:
- 9763a98
- 0aa5c42
- bff86f9
- fea14f9
- 8f0787d
- 1f9c7dc
- 9c96a28
- 8d03ac1
- b49aee3
- 47d7583

The prompts used are captured in the footers of those commits.
The initial prompt was: Remove model_weights.pth from the Model
Package Structure tree diagram — 3 out of 4 examples download
weights at runtime, so showing it in the standard layout is
misleading.

---
X-AI-Handle-Time-Seconds: 468
X-AI-Line-Changes: New:6, Altered:18, Deleted:0
X-Human-Line-Changes: New:202794, Altered:2771, Deleted:0
X-AI-Line-Changes-Kiro-cli: New:6, Altered:18, Deleted:0
X-AI-Handle-Time-Seconds-Kiro-cli: 468
X-AI-Change-Count: 4
X-Human-Change-Count: 6
X-AI-Change-Count-Kiro-cli: 4
X-CR-Amendment: false
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 80
X-AI-Prompt: can you run precommit hooks
style: apply pre-commit formatting fixes

Run pre-commit hooks on all files. Fixes applied by ruff-format
(Python formatting), ruff isort (import sorting), and flowmark
(markdown line wrapping).

ai-dev-branch commit IDs:
- 6805d4f

The prompts used are captured in the footers of those commits.
The initial prompt was: can you run precommit hooks

---
X-AI-Handle-Time-Seconds: 80
X-AI-Line-Changes: New:26, Altered:37, Deleted:0
X-Human-Line-Changes: New:0, Altered:0, Deleted:0
X-AI-Line-Changes-Kiro-cli: New:26, Altered:37, Deleted:0
X-AI-Handle-Time-Seconds-Kiro-cli: 80
X-AI-Change-Count: 1
X-Human-Change-Count: 0
X-AI-Change-Count-Kiro-cli: 1
X-CR-Amendment: false
---
X-AI-Tool: Human
X-AI-Prompt: <none - new session>
docs: replace broken tabs with bold headers for pull commands

The flowmark markdown formatter broke the pymdownx.tabbed syntax
(=== "Tab") by removing the required indentation. Replace with
bold headers which render correctly without tab extension support.

ai-dev-branch commit IDs:
- 1444eed

The prompts used are captured in the footers of those commits.
The initial prompt was: <none - new session>

---
X-AI-Handle-Time-Seconds: 0
X-AI-Line-Changes: New:0, Altered:0, Deleted:0
X-Human-Line-Changes: New:0, Altered:2, Deleted:4
X-AI-Change-Count: 0
X-Human-Change-Count: 1
X-CR-Amendment: true