
Ray release docs #5844

Draft

jinyan-li1 wants to merge 63 commits into main from ray-release-docs

Conversation

@jinyan-li1 (Contributor)

Purpose

Test Plan

Test Result


Toggle if you are merging into master Branch

By default, docker image builds and tests are disabled. Two ways to run builds and tests:

  1. Using dlc_developer_config.toml
  2. Using this PR description (currently only supported for PyTorch, TensorFlow, vllm, and base images)
How to use the helper utility for updating dlc_developer_config.toml

Assuming your remote is called origin (you can find out more with git remote -v)...

  • Run default builds and tests for a particular buildspec - also commits and pushes changes to remote; Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -cp origin

  • Enable specific tests for a buildspec or set of buildspecs - also commits and pushes changes to remote; Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -t sanity_tests -cp origin

  • Restore TOML file when ready to merge

python src/prepare_dlc_dev_environment.py -rcp origin

NOTE: If you are creating a PR for a new framework version, please ensure success of the local, standard, rc, and efa sagemaker tests by updating the dlc_developer_config.toml file:

  • sagemaker_remote_tests = true
  • sagemaker_efa_tests = true
  • sagemaker_rc_tests = true
  • sagemaker_local_tests = true
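
The four flags above are booleans in dlc_developer_config.toml; a minimal sketch of the relevant fragment (the enclosing [test] table name is an assumption here — check the actual file in the repo for the exact layout):

```toml
# hypothetical fragment of dlc_developer_config.toml
[test]
sagemaker_remote_tests = true
sagemaker_efa_tests = true
sagemaker_rc_tests = true
sagemaker_local_tests = true
```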
How to use PR description

Use the code block below to uncomment commands and run the PR CodeBuild jobs. There are two commands available:
  • # /buildspec <buildspec_path>
    • e.g.: # /buildspec pytorch/training/buildspec.yml
    • If this line is commented out, dlc_developer_config.toml will be used.
  • # /tests <test_list>
    • e.g.: # /tests sanity security ec2
    • If this line is commented out, it will run the default set of tests (same as the defaults in dlc_developer_config.toml): sanity, security, ec2, ecs, eks, sagemaker, sagemaker-local.
# /buildspec <buildspec_path>
# /tests <test_list>
Toggle if you are merging into main Branch

PR Checklist

  • [ ] I ran pre-commit run --all-files locally before creating this PR. (Read DEVELOPMENT.md for details.)

---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 275
X-AI-Prompt: Looks right. Two additions: also create docs/releasenotes/ray/index.md as a
placeholder (the nav references it). And before writing the data YAML files,
read docs/src/data/sglang/0.5.9-gpu-ec2.yml to match the exact field format.
Proceed.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 35
X-AI-Prompt: docs/src/global.yml has a duplicate ray key, and use ray not ray serve
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 93
X-AI-Prompt: Make these fixes to docs/ray/index.md:

1. The nav label for the Ray page should be "Ray" not "Ray Serve" — update
docs/.nav.yml accordingly.

2. Pull commands: remove the auth step entirely — only show the docker pull
command. Look at how other framework pages in the repo show pull commands and
follow the same pattern.

3. Remove the Key Packages table from docs/ray/index.md entirely. Package
versions belong on the release notes page (per-version), not the framework
overview. Add a brief note like "For package versions, see the Release Notes."
with a link instead.

4. For the pull command URIs — check if the existing framework pages use macros
or hardcoded URIs, and follow the same approach. If there's a way to reference
the latest image URI dynamically from the data files, use it; otherwise use a
placeholder.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 108
X-AI-Prompt: 1. Add Ray to the macro system so pull commands use dynamic URIs. Check
docs/src/macros.py and docs/src/image_config.py to understand how other
frameworks do it, then add Ray entries following the same pattern. Update
docs/ray/index.md pull commands to use the macros.
2. In the Versioning Strategy section, remove the sentence "where <platform>
is ec2 or sagemaker" — keep the tag format as-is, just drop that explanation.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 64
X-AI-Prompt: get_latest_image_uri() only accepts 2 arguments. Check
docs/src/image_config.py to see the correct signature and fix the macros.py
entries accordingly.
Remove the "Image URIs are placeholders" note from docs/ray/index.md.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 19
X-AI-Prompt: Simplify the pull commands from 4 tabs to 2 tabs. Each tab shows both GPU
and CPU variants
---
X-AI-Tool: Human
X-AI-Prompt: Simplify the pull commands from 4 tabs to 2 tabs. Each tab shows both GPU
and CPU variants
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 30
X-AI-Prompt: Check the Ray upstream release date for version 2.54.0 on the Ray GitHub
repo (https://github.com/ray-project/ray/releases). Use that date as the GA date
and set EOP to exactly 1 year later. Update the support policy table in
docs/ray/index.md with the real dates.
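
The GA-to-EOP arithmetic this prompt asks for is a one-liner; a quick stdlib sketch, using the 2026-02-18 GA date that the support-policy table later in this PR settles on:

```python
from datetime import date

ga = date(2026, 2, 18)               # Ray 2.54.0 upstream release date used as GA
eop = ga.replace(year=ga.year + 1)   # End of Patch: exactly one year later
print(eop.isoformat())
```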
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 71
X-AI-Prompt: Update the Examples section in docs/ray/index.md to cover both EC2 and
SageMaker deployment. Read test/ray/ec2/common.py and
test/ray/sagemaker/common.py for the actual test patterns.

Structure:
- EC2 section: docker run + HTTP requests for the 4 categories (NLP/sentiment,
CV/DenseNet, Audio, Tabular/Iris)
- SageMaker section: SageMaker SDK deployment (Model.deploy() +
predictor.predict()) showing the same 4 categories

Use realistic request/response examples based on what the tests validate. Use
placeholder S3 paths and model names — no real bucket names from the test code.

Refer to the structure and style of
https://docs.ray.io/en/latest/serve/tutorials/text-classification.html
for how to structure each example — show the full flow from deployment to
inference request.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 15
X-AI-Prompt: In the Examples section, remove the "(NLP)" and "(CV)" suffixes from the EC2
section headings so they match the SageMaker section: "Sentiment Analysis",
"Image Classification", "Audio Transcription", "Tabular Classification".
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 58
X-AI-Prompt: Proceed with two corrections:
1. CV response: only show "predictions" — omit "model" and "device" fields
2. Audio response: change transcription to
"<transcription depends on audio input>" — the tests use sine waves so there's
no meaningful example output
---
X-AI-Tool: Human
X-AI-Prompt: Proceed with two corrections:
1. CV response: only show "predictions" — omit "model" and "device" fields
2. Audio response: change transcription to
"<transcription depends on audio input>" — the tests use sine waves so there's
no meaningful example output
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 37
X-AI-Prompt: In the config.yaml example in the Deployment Guide, change num_gpus: 0 to
num_gpus: 1 for GPU deployments. Add a note that num_gpus: 0 should be used for
CPU images and num_gpus: 1 for GPU images.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 19
X-AI-Prompt: Actually, the number of GPUs should be the number of GPUs allocated per replica, right?
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 17
X-AI-Prompt: The SageMaker examples use v2 SDK (sagemaker>=2,<3). Update any SageMaker SDK
documentation links to point to the v2 docs: https://sagemaker.readthedocs.io/en/v2/
and verify the code examples use v2 API patterns.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 36
X-AI-Prompt: Fix the SageMaker tabular example: the tabular model uses config.yaml (no
SM_RAYSERVE_APP). Remove env={"SM_RAYSERVE_APP": "deployment:app"} from the
tabular Model(). If you want to show the SM_RAYSERVE_APP pattern, add a separate
brief example showing it with a note that it's for models without a
config.yaml (like the MNIST test), check with test scripts again.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 30
X-AI-Prompt: Should the Deployment Guide clarify what import_path: deployment:app means?
Specifically: deployment refers to deployment.py in the model package, and app
is the variable defined at the bottom of that file (e.g.
app = MyDeployment.bind()). Also, should we add a note that on SageMaker, the
model tarball is automatically downloaded from S3 and extracted to
/opt/ml/model/ before the container starts? Check the actual deployment.py files
in markdown/rayserve-tars/ to see how app is defined there, and use that as the
example.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 43
X-AI-Prompt: In the "Direct App Import" example, add a note explaining RAYSERVE_NUM_GPUS:
it is a custom environment variable read by the deployment code to configure
GPU allocation per replica. It is only needed when using SM_RAYSERVE_APP without
a config.yaml — when using config.yaml, set num_gpus directly under
ray_actor_options instead. Check markdown/rayserve-tars/ mnist-direct-app
deployment.py to confirm how it's used.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 17
X-AI-Prompt: Fix the EC2 health check command — the /healthz endpoint returns HTTP 200 (not
a body containing "OK"). Check dlc-copy/test/ray/ec2/common.py again.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 18
X-AI-Prompt: In the EC2 Tabular Classification example, remove /opt/ml/model/config.yaml
from the end of the docker run command — the entrypoint auto-detects config.yaml
at /opt/ml/model/config.yaml when no CLI arg is provided.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 99
X-AI-Prompt:
The EC2 examples section needs the actual deployment.py and config.yaml code
inline so users can copy-paste and try the container immediately. Right now we
show docker run + curl but not the model package files.

For each EC2 example, add the code before the docker run command. Show users how
to mkdir, save the files, then run. Follow the pattern Ray uses:
https://docs.ray.io/en/latest/serve/tutorials/text-classification.html

The working files are in workspace/2-wk-challenge/markdown/rayserve-tars/ under
nlp/, cv-densenet/, audio-ffmpeg/, and tabular/. Read them and put the code
inline, but clean up the following before including:

Remove from all deployments:
- "device": str(self.device) from response dicts
- "model": "densenet161" / "model": "mnist_cnn" from response dicts
- "deployment_method": "SM_RAYSERVE_APP" from response dicts
- "audio_backend": "ffmpeg" from response dicts
- "installed_packages" field and the tabulate validation block in tabular
- Any logging that mentions test patterns, backends, or deployment methods (e.g.
f"loaded on {self.device}", "(SM_RAYSERVE_APP deployment)")

Keep the code functional — just strip the CI/test instrumentation so responses
match what the guide already documents.

For models that download weights at runtime (NLP, CV, Audio): provide the full
code inline — users just create the files and go.

For models with local weights (Tabular with iris_model.pth/norm_params.json,
MNIST with model.pth): these can't be copy-pasted since users need the actual
weight files. Either skip inline code for these and just describe the model
package structure, or add a small training script that generates the weights.
Your call on what fits the guide best.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 150
X-AI-Prompt: A few fixes and improvements:

Bug fixes:

1. Add weights_only=True to torch.load in the tabular deployment.py — PyTorch
2.6+ defaults to weights_only=True and will break without it.

2. The image classification example has max_replicas: 2 with num_gpus: 1 in
config.yaml. On single-GPU instances (g5.xlarge, g4dn.xlarge) the second replica
can never be placed. add a note immediately after the cv-model/config.yaml code block. The note should read something like: The autoscaling_config in deployment.py sets max_replicas: 2. Each replica requests 1 GPU, so this configuration requires a multi-GPU instance. On single-GPU instances, reduce max_replicas to 1.
Do not modify any code blocks or other sections.

Documentation:

3. The "Direct App Import" section references RAYSERVE_NUM_GPUS but never shows
how the deployment code reads it. Add a minimal snippet showing the pattern:
```python
import os

num_gpus = int(os.environ.get("RAYSERVE_NUM_GPUS", "0"))

@serve.deployment(ray_actor_options={"num_gpus": num_gpus})
class MyDeployment:
    ...
```
4. Remove the training script (train.py). Instead, note that this example requires pre-trained weights (iris_model.pth, norm_params.json) in the model directory. Keep deployment.py and config.yaml inline as reference for the expected structure.

Also update the intro line from:

"Each example below includes the full model package files so you can copy-paste and run immediately."

to something like:

"Each example below includes the full model package files. The first three download weights automatically on startup. The tabular example requires pre-trained weights — substitute your own trained model."
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 19
X-AI-Prompt: Revert the torch.load change in tabular-model/deployment.py — restore it to the original without weights_only=True.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 22
X-AI-Prompt: In the Deployment Paths table, "Place config.yaml at root of model package" is
ambiguous — users may not know if that means the mounted directory, inside the
tarball, or the final path on disk. Clarify this so it's unambiguous for both
EC2 (directory mount) and SageMaker (tarball extraction).
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 19
X-AI-Prompt: More concise but clear to fit the table better
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 57
X-AI-Prompt: Move "Direct App Import (No config.yaml)" out of the SageMaker section and
make it its own section after both EC2 and SageMaker. It applies to both
platforms — EC2 uses a CLI argument (docker run <image> deployment:app),
SageMaker uses the SM_RAYSERVE_APP env var. Show both mechanisms in the section.
The RAYSERVE_NUM_GPUS explanation applies to both since neither has a
config.yaml to set num_gpus in.
---
X-AI-Tool: Human
X-AI-Prompt: Move "Direct App Import (No config.yaml)" out of the SageMaker section and
make it its own section after both EC2 and SageMaker. It applies to both
platforms — EC2 uses a CLI argument (docker run <image> deployment:app),
SageMaker uses the SM_RAYSERVE_APP env var. Show both mechanisms in the section.
The RAYSERVE_NUM_GPUS explanation applies to both since neither has a
config.yaml to set num_gpus in.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 29
X-AI-Prompt: Add a "SageMaker Environment Variables" table next to the existing "EC2
Environment Variables" table. Include SM_RAYSERVE_APP and CA_REPOSITORY_ARN
with defaults and descriptions. Check the entrypoint code to confirm the
actual defaults before writing the table.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 20
X-AI-Prompt: Make it clear that those are both optional variables
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 89
X-AI-Prompt: Actually, let's remove the SageMaker Environment Variables table since they are both optional variables - it may confuse readers, and they are both mentioned in relevant sections.
---
X-AI-Tool: Human
X-AI-Prompt: Should we add a comment to let users know they would need to parse the response?
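
A minimal sketch of the parsing such a comment would describe, using the sentiment response shape documented earlier in this PR (the raw bytes payload here is illustrative):

```python
import json

# predictor.predict() returns raw bytes, not a dict — parse before use
raw = b'{"predictions": [{"label": "POSITIVE", "score": 0.9991}]}'
result = json.loads(raw)
top = result["predictions"][0]
print(top["label"], top["score"])
```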
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 24
X-AI-Prompt: In the Image Classification EC2 example, add a curl -O download step for the test image before the inference request. Use the public TorchServe kitten image at https://s3.amazonaws.com/model-server/inputs/kitten.jpg. Place it after the health check and before the inference curl. This makes the example fully copy-pasteable without requiring the user to supply their own image. The NLP and tabular examples already work out of the box because test data is inline — this makes the image the same as well.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 60
X-AI-Prompt: One minor optimization — move the download before the health check so it happens while the container is starting up, saving time.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 79
X-AI-Prompt: Two fixes needed for the SageMaker Deployment section based on testing:

1. Add SageMaker SDK prerequisite

At the top of the "SageMaker Deployment" section (before the tarball/upload
steps), add a note that users need to install the SageMaker Python SDK v2:

```bash
pip install 'sagemaker>=2,<3'
```

v3 drops the Model, Predictor, and Serializer APIs used in these examples.

2. Add inference_ami_version to GPU SageMaker deploy calls

GPU SageMaker deploys fail with CannotStartContainerError without specifying the
inference AMI version — the default SageMaker host AMI has incompatible NVIDIA
drivers for our CUDA 12.9 images. This is SageMaker-specific only; EC2 users
can pick their own compatible instance/AMI.

Add inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1" to all GPU
.deploy() calls in the SageMaker section. This applies to:
- The Sentiment Analysis example
- The Direct App Import SageMaker example

Do NOT add it to CPU deploys (tabular) or any EC2 examples.

Add a brief note explaining why, with a link to the API reference for valid
values:
https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariant.html

Here is the tested working sentiment example:

```python
import json

from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.serializers import JSONSerializer

predictor = Model(
    image_uri="763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda-v1.0.0",
    role="arn:aws:iam::<ACCOUNT>:role/SageMakerExecutionRole",
    model_data="s3://<BUCKET>/models/nlp-sentiment/model.tar.gz",
    predictor_cls=Predictor,
).deploy(
    instance_type="ml.g5.xlarge",
    initial_instance_count=1,
    endpoint_name="ray-serve-nlp",
    serializer=JSONSerializer(),
    inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1",
    wait=True,
)

response = predictor.predict({"text": "I love this so much, best purchase ever!"})
result = json.loads(response)  # predictor.predict() returns raw bytes
# {"predictions": [{"label": "POSITIVE", "score": 0.9991}]}
```

And the Direct App Import SageMaker example should also get it:

```python
predictor = model.deploy(
    instance_type="ml.g5.xlarge",
    initial_instance_count=1,
    endpoint_name="ray-serve-mnist",
    inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1",
    wait=True,
)
```

Double check the test scripts to confirm and verify correctness.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 37
X-AI-Prompt: Fix the format issue around predictor.delete_endpoint()
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 18
X-AI-Prompt: In the "Direct App Import" section, before the SageMaker code block, clarify two
things:

1. Connect it to the sentiment walkthrough: the tarball packaging and S3 upload
steps are the same.
2. Clarify that the SM_RAYSERVE_APP import path (e.g. deployment:app) resolves
relative to /opt/ml/model/, so deployment.py must be at the root of the tarball
— same as the config.yaml examples, just without the config.yaml.

Something like: "Package your model directory the same way as the sentiment
example (tarball uploaded to S3), but omit config.yaml. The deployment.py must
be at the tarball root — SM_RAYSERVE_APP=deployment:app resolves the module from
/opt/ml/model/."
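
The `module:attr` resolution described above can be sketched with the stdlib. This is a simplified stand-in for whatever the container entrypoint actually does — `resolve_app` and the default path are illustrative, not the real implementation:

```python
import importlib
import sys

def resolve_app(import_path: str, model_dir: str = "/opt/ml/model"):
    """Resolve an import path like 'deployment:app' to the `app` object
    defined in <model_dir>/deployment.py."""
    module_name, _, attr = import_path.partition(":")
    sys.path.insert(0, model_dir)  # so `deployment` imports from the model root
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```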
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 35
X-AI-Prompt: The Deployment Paths table is inaccurate — on EC2, a CLI argument takes precedence over config.yaml, but the table implies config.yaml is always the default. Update the table to reflect that CLI arg overrides config.yaml on EC2. Keep it concise.

The Direct App Import SageMaker code block is missing imports for Model and Predictor. Add them.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 24
X-AI-Prompt: Is there a way to make the desc more concise?
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 25
X-AI-Prompt: Update the Support Policy table to include the DLC version column. Users need to
map between the Ray version and the DLC image tag version (v1.0.0). The table
should be:

| Version | DLC Version | GA Date | End of Patch |
|---|---|---|---|
| Ray 2.54 | v1.0.0 | 2026-02-18 | 2027-02-18 |

This makes it clear that dlc's ray v1.0.0 corresponds
to Ray 2.54.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 20
X-AI-Prompt: Update the Support Policy table to include key version info that users need at
a glance. DLC version first (it's the image tag), then the three components
that drive major version bumps per the versioning strategy. Full package details
stay in Release Notes.

| DLC Version | Ray | Python | CUDA | GA Date | End of Patch |
|---|---|---|---|---|---|
| v1.0.0 | 2.54.0 | 3.13 | 12.9.1 | 2026-02-18 | 2027-02-18 |
docs: add Ray Serve DLC framework page and release notes infrastructure

Add comprehensive documentation for the Ray Serve inference DLC images,
including a per-framework page at docs/ray/index.md with deployment
guide, EC2 and SageMaker examples, versioning strategy, and support
policy. Also add release notes data files, table config, and macro
support for dynamic image URIs.

New files:
- docs/ray/index.md: Framework page with pull commands, deployment
  guide (model package structure, deployment paths, env vars, runtime
  deps), EC2 examples (sentiment, image classification, audio, tabular
  with full inline deployment code), SageMaker deployment walkthrough,
  and direct app import section.
- docs/releasenotes/ray/index.md: Placeholder release notes page.
- docs/src/data/ray/: Image config YAML files for Ray 2.54 (GPU/CPU x
  EC2/SageMaker).
- docs/src/tables/ray.yml: Table column config for Ray.

Updated files:
- docs/src/global.yml: Added ray display name and table_order entry.
- docs/src/macros.py: Added Ray image URI macros with accelerator
  filtering.
- docs/src/image_config.py: No changes (kept original 2-arg signature).
- docs/releasenotes/index.md: Added Ray link.
- docs/.nav.yml: Added Ray under Getting Started and Release Notes.

ai-dev-branch commit IDs:
- 7096f96
- 369cd33
- 19ad220
- ad4e26f
- 8e724be
- c15bbad
- 866794a
- eb0a311
- 200f84a
- b68464c
- bfe969d
- 7a2da45
- e256c17
- 8f1d183
- 0bd7491
- a0d26e2
- 21f0d4b
- ede013d
- fa54049
- f915294
- 1833e1d
- 070d2ef
- be67d50
- dee31d3
- 5123d7e
- 2ad7ba3
- 9f8a65d
- 923e120
- 46f71b1
- 199395f
- 62f49e1
- 877fa5c
- db00781
- 5cc4737
- 71a66aa
- 3a01673
- f51449a
- 6521964
- 2e4eb9a
- 1df8040
- 7f9483d
- 19015fe
- 95f4aad
- 8464858
- 826ee85

The prompts used are captured in the footers of those commits.
The initial prompt was: Looks right. Two additions: also create
docs/releasenotes/ray/index.md as a placeholder (the nav references
it). And before writing the data YAML files, read
docs/src/data/sglang/0.5.9-gpu-ec2.yml to match the exact field
format. Proceed.

---
X-AI-Handle-Time-Seconds: 2175
X-AI-Line-Changes: New:973, Altered:71, Deleted:174
X-Human-Line-Changes: New:0, Altered:4, Deleted:4
X-AI-Line-Changes-Kiro-cli: New:973, Altered:71, Deleted:174
X-AI-Handle-Time-Seconds-Kiro-cli: 2175
X-AI-Change-Count: 41
X-Human-Change-Count: 4
X-AI-Change-Count-Kiro-cli: 41
X-CR-Amendment: false
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 29
X-AI-Prompt: Remove model_weights.pth from the Model Package Structure tree diagram — 3 out
of 4 examples download weights at runtime, so showing it in the standard layout
is misleading. Add a note after the tree that model weights can optionally be
placed at the tarball root alongside config.yaml and deployment.py (extracted to
/opt/ml/model/ at runtime) if your model doesn't download them at startup.
---
X-AI-Tool: Human
X-AI-Prompt: Remove model_weights.pth from the Model Package Structure tree diagram — 3 out
of 4 examples download weights at runtime, so showing it in the standard layout
is misleading. Add a note after the tree that model weights can optionally be
placed at the tarball root alongside config.yaml and deployment.py (extracted to
/opt/ml/model/ at runtime) if your model doesn't download them at startup.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 297
X-AI-Prompt: 1. Image tag changes — remove ec2 platform from default images

- **Pull Commands section**: Rename the "EC2" tab to "Default". Update the image
tags to remove the ec2 platform segment — e.g. ray:serve-ml-cuda-v1.0.0 and
ray:serve-ml-cpu-v1.0.0 instead of ray:serve-ml-ec2-cuda-v1.0.0 and
ray:serve-ml-ec2-cpu-v1.0.0. Add a note after the pull commands that the Default
images were tested on EC2 instances.
- **All EC2 examples**: Update the docker run image URIs to use the new tags
without ec2 (e.g. ray:serve-ml-cuda-v1.0.0, ray:serve-ml-cpu-v1.0.0).
- **SageMaker images stay unchanged** — they still use
ray:serve-ml-sagemaker-cuda-v1.0.0 / ray:serve-ml-sagemaker-cpu-v1.0.0.
- **Versioning Strategy**: Update the tag format to show platform as optional:
ray:serve-ml-[<platform>-]{cpu|cuda}-v<MAJOR>.<MINOR>.<PATCH>. Explain that
<platform> is omitted for the default images and present for platform-specific
images (e.g. sagemaker).

2. Add Announcements section

Add an "Announcements" section right after the intro paragraph and before "Pull
Commands". Placeholder content for now:

## Announcements

*No announcements at this time.*
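
The optional-platform tag grammar described in point 1 can be expressed as a regex; this is a sketch of the naming rule only (the real tag scheme is whatever the buildspecs emit):

```python
import re

# ray:serve-ml-[<platform>-]{cpu|cuda}-v<MAJOR>.<MINOR>.<PATCH>
TAG_RE = re.compile(
    r"^serve-ml-"
    r"(?:(?P<platform>[a-z0-9]+)-)?"  # platform segment, omitted on default images
    r"(?P<accel>cpu|cuda)-"
    r"v(?P<version>\d+\.\d+\.\d+)$"
)

print(TAG_RE.match("serve-ml-cuda-v1.0.0").groupdict())
print(TAG_RE.match("serve-ml-sagemaker-cpu-v1.0.0").groupdict())
```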
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 83
X-AI-Prompt: remove ECS/EKS
references since those haven't been tested
---
X-AI-Tool: Human
X-AI-Prompt: remove ECS/EKS
references since those haven't been tested
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 59
X-AI-Prompt: The release notes title "EC2, ECS, EKS" comes from docs/src/global.yml which
maps platform: ec2 → "EC2, ECS, EKS". Don't change this global mapping (it
affects all DLCs). Instead:

1. Add a new platform mapping in global.yml: default: "EC2"
2. Update the Ray YAML data files (2.54-gpu-ec2.yml, 2.54-cpu-ec2.yml) to use
platform: default instead of platform: ec2

This way Ray release notes show "EC2" only while other DLCs keep "EC2, ECS,
EKS".
---
X-AI-Tool: Human
X-AI-Prompt: The release notes title "EC2, ECS, EKS" comes from docs/src/global.yml which
maps platform: ec2 → "EC2, ECS, EKS". Don't change this global mapping (it
affects all DLCs). Instead:

1. Add a new platform mapping in global.yml: default: "EC2"
2. Update the Ray YAML data files (2.54-gpu-ec2.yml, 2.54-cpu-ec2.yml) to use
platform: default instead of platform: ec2

This way Ray release notes show "EC2" only while other DLCs keep "EC2, ECS,
EKS".
---
X-AI-Tool: Human
X-AI-Prompt: Move all code snippets from the Ray index page into standalone files under
examples/ray/ so they can be run in CI while still displaying on the page.

1. Add pymdownx.snippets to mkdocs.yaml:

```yaml
markdown_extensions:
  - pymdownx.snippets:
      base_path: ["."]
```

2. Create the example files under examples/ray/:

Extract each code block from the Ray index page into its own file, preserving
the exact content. The structure should be:

examples/ray/
├── nlp-model/
│   ├── config.yaml
│   └── deployment.py
├── cv-model/
│   ├── config.yaml
│   └── deployment.py
├── audio-model/
│   ├── config.yaml
│   └── deployment.py
├── tabular-model/
│   ├── config.yaml
│   └── deployment.py
└── sagemaker/
    ├── deploy_sentiment.py
    └── deploy_direct_app.py

3. Replace inline code blocks in docs/ray/index.md with snippet references:

For example, replace:

```python
from ray import serve
from transformers import pipeline
...
```

With:

```python
--8<-- "examples/ray/nlp-model/deployment.py"
```

Do this for every config.yaml and deployment.py code block. The rendered page should look identical to before.

**4. Verify** by running
mkdocs build
---
X-AI-Tool: Human
X-AI-Prompt: The SageMaker deploy snippet that was moved to examples/ has unrendered
{{ images.latest_ray_sagemaker_gpu }} macros — pymdownx.snippets includes raw
file content before the macros plugin runs.

Fix: In all example files under examples/ray/, replace macro references with the
base tag (no version suffix). This way examples never go stale across releases:

- GPU default: 763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cuda
- CPU default: 763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cpu
- GPU SageMaker:
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda
- CPU SageMaker:
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cpu

Apply the same to any inline docker run commands in the markdown that were using
macros — replace them with the base tags too. Keep the Pull Commands section at
the top using macros to show the exact versioned tags.
Mar 27, 2026: aws-deep-learning-containers-ci bot added the authorized and Size:XL labels.
jinyan-li1 and others added 8 commits March 30, 2026 10:10
---
X-AI-Tool: Human
X-AI-Prompt: Do NOT change any inline code in the markdown files — those should keep using
{{ }} macros.
docs: extract Ray examples to standalone files, update image tags

Extract all inline deployment code from docs/ray/index.md into
standalone files under examples/ray/ using pymdownx.snippets, so
examples can be run in CI while still rendering on the docs page.
Update image tags to remove ec2 platform segment from default images
(serve-ml-cuda, serve-ml-cpu) while keeping sagemaker-prefixed tags
for SageMaker images. Add default platform mapping in global.yml so
Ray release notes show "EC2" only. Remove untested ECS/EKS references.

New files:
- examples/ray/{nlp,cv,audio,tabular}-model/{config.yaml,deployment.py}
- examples/ray/sagemaker/{deploy_sentiment,deploy_direct_app}.py
- mkdocs.yaml: added pymdownx.snippets extension

Updated files:
- docs/ray/index.md: replaced inline code with snippet references,
  updated image tags, removed model_weights.pth from package structure,
  added Announcements section
- docs/src/global.yml: added default platform mapping
- docs/src/data/ray/*.yml: updated ec2 files to platform: default,
  removed ec2 from tags
- docs/src/macros.py: renamed ec2 macros to default

ai-dev-branch commit IDs:
- 9763a98
- 0aa5c42
- bff86f9
- fea14f9
- 8f0787d
- 1f9c7dc
- 9c96a28
- 8d03ac1
- b49aee3
- 47d7583

The prompts used are captured in the footers of those commits.
The initial prompt was: Remove model_weights.pth from the Model
Package Structure tree diagram — 3 out of 4 examples download
weights at runtime, so showing it in the standard layout is
misleading.

---
X-AI-Handle-Time-Seconds: 468
X-AI-Line-Changes: New:6, Altered:18, Deleted:0
X-Human-Line-Changes: New:202794, Altered:2771, Deleted:0
X-AI-Line-Changes-Kiro-cli: New:6, Altered:18, Deleted:0
X-AI-Handle-Time-Seconds-Kiro-cli: 468
X-AI-Change-Count: 4
X-Human-Change-Count: 6
X-AI-Change-Count-Kiro-cli: 4
X-CR-Amendment: false
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 80
X-AI-Prompt: can you run precommit hooks
style: apply pre-commit formatting fixes

Run pre-commit hooks on all files. Fixes applied by ruff-format
(Python formatting), ruff isort (import sorting), and flowmark
(markdown line wrapping).

ai-dev-branch commit IDs:
- 6805d4f

The prompts used are captured in the footers of those commits.
The initial prompt was: can you run precommit hooks

---
X-AI-Handle-Time-Seconds: 80
X-AI-Line-Changes: New:26, Altered:37, Deleted:0
X-Human-Line-Changes: New:0, Altered:0, Deleted:0
X-AI-Line-Changes-Kiro-cli: New:26, Altered:37, Deleted:0
X-AI-Handle-Time-Seconds-Kiro-cli: 80
X-AI-Change-Count: 1
X-Human-Change-Count: 0
X-AI-Change-Count-Kiro-cli: 1
X-CR-Amendment: false
---
X-AI-Tool: Human
X-AI-Prompt: <none - new session>
docs: replace broken tabs with bold headers for pull commands

The flowmark markdown formatter broke the pymdownx.tabbed syntax
(=== "Tab") by removing the required indentation. Replace with
bold headers which render correctly without tab extension support.

ai-dev-branch commit IDs:
- 1444eed

The prompts used are captured in the footers of those commits.
The initial prompt was: <none - new session>

---
X-AI-Handle-Time-Seconds: 0
X-AI-Line-Changes: New:0, Altered:0, Deleted:0
X-Human-Line-Changes: New:0, Altered:2, Deleted:4
X-AI-Change-Count: 0
X-Human-Change-Count: 1
X-CR-Amendment: true