--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 275 X-AI-Prompt: Looks right. Two additions: also create docs/releasenotes/ray/index.md as a placeholder (the nav references it). And before writing the data YAML files, read docs/src/data/sglang/0.5.9-gpu-ec2.yml to match the exact field format. Proceed.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 35 X-AI-Prompt: docs/src/global.yml has a duplicate ray key, and use ray not ray serve
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 93 X-AI-Prompt: Make these fixes to docs/ray/index.md:
1. The nav label for the Ray page should be "Ray" not "Ray Serve" — update docs/.nav.yml accordingly.
2. Pull commands: remove the auth step entirely — only show the docker pull command. Look at how other framework pages in the repo show pull commands and follow the same pattern.
3. Remove the Key Packages table from docs/ray/index.md entirely. Package versions belong on the release notes page (per-version), not the framework overview. Add a brief note like "For package versions, see the Release Notes." with a link instead.
4. For the pull command URIs — check if the existing framework pages use macros or hardcoded URIs, and follow the same approach. If there's a way to reference the latest image URI dynamically from the data files, use it; otherwise use a placeholder.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 108 X-AI-Prompt: 1. Add Ray to the macro system so pull commands use dynamic URIs. Check docs/src/macros.py and docs/src/image_config.py to understand how other frameworks do it, then add Ray entries following the same pattern. Update docs/ray/index.md pull commands to use the macros. 2. In the Versioning Strategy section, remove the sentence "where <platform> is ec2 or sagemaker" — keep the tag format as-is, just drop that explanation.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 64 X-AI-Prompt: get_latest_image_uri() only accepts 2 arguments. Check docs/src/image_config.py to see the correct signature and fix the macros.py entries accordingly. Remove the "Image URIs are placeholders" note from docs/ray/index.md.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 19 X-AI-Prompt: Simplify the pull commands from 4 tabs to 2 tabs. Each tab shows both GPU and CPU variants
--- X-AI-Tool: Human X-AI-Prompt: Simplify the pull commands from 4 tabs to 2 tabs. Each tab shows both GPU and CPU variants
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 30 X-AI-Prompt: Check the Ray upstream release date for version 2.54.0 on the Ray GitHub repo (https://github.com/ray-project/ray/releases). Use that date as the GA date and set EOP to exactly 1 year later. Update the support policy table in docs/ray/index.md with the real dates.
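The date rule in the prompt above (EOP = GA date plus exactly one year) is simple enough to sketch. The GA date is the one the support policy table later settles on; the helper itself is illustrative, not repo code.

```python
from datetime import date

def end_of_patch(ga: date) -> date:
    """End of Patch is exactly one year after GA; a Feb 29 GA rolls to Mar 1."""
    try:
        return ga.replace(year=ga.year + 1)
    except ValueError:  # ga was Feb 29 and the next year is not a leap year
        return date(ga.year + 1, 3, 1)

ga = date(2026, 2, 18)   # GA date used in the support policy table
eop = end_of_patch(ga)   # one year later: 2027-02-18
```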
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 71 X-AI-Prompt: Update the Examples section in docs/ray/index.md to cover both EC2 and SageMaker deployment. Read test/ray/ec2/common.py and test/ray/sagemaker/common.py for the actual test patterns. Structure:
- EC2 section: docker run + HTTP requests for the 4 categories (NLP/sentiment, CV/DenseNet, Audio, Tabular/Iris)
- SageMaker section: SageMaker SDK deployment (Model.deploy() + predictor.predict()) showing the same 4 categories
Use realistic request/response examples based on what the tests validate. Use placeholder S3 paths and model names — no real bucket names from the test code. Refer to the structure and style of https://docs.ray.io/en/latest/serve/tutorials/text-classification.html for how to structure each example — show the full flow from deployment to inference request.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 15 X-AI-Prompt: In the Examples section, remove the "(NLP)" and "(CV)" suffixes from the EC2 section headings so they match the SageMaker section: "Sentiment Analysis", "Image Classification", "Audio Transcription", "Tabular Classification".
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 58 X-AI-Prompt: Proceed with two corrections: 1. CV response: only show "predictions" — omit "model" and "device" fields 2. Audio response: change transcription to "<transcription depends on audio input>" — the tests use sine waves so there's no meaningful example output
--- X-AI-Tool: Human X-AI-Prompt: Proceed with two corrections: 1. CV response: only show "predictions" — omit "model" and "device" fields 2. Audio response: change transcription to "<transcription depends on audio input>" — the tests use sine waves so there's no meaningful example output
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 37 X-AI-Prompt: In the config.yaml example in the Deployment Guide, change num_gpus: 0 to num_gpus: 1 for GPU deployments. Add a note that num_gpus: 0 should be used for CPU images and num_gpus: 1 for GPU images.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 19 X-AI-Prompt: Actually the numbers of GPUs should be the number of GPUs allocated per replica, right?
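Since num_gpus is allocated per replica, an instance must supply replicas x num_gpus GPUs for every replica to be placed. A quick arithmetic sketch (the function name is illustrative, not from the repo):

```python
def required_gpus(max_replicas: int, num_gpus_per_replica: float) -> float:
    """Total GPUs Ray may request: each replica claims its own share."""
    return max_replicas * num_gpus_per_replica

# max_replicas: 2 with num_gpus: 1 needs two GPUs; on a single-GPU
# instance the second replica can never be scheduled.
demand = required_gpus(2, 1)
```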
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 17 X-AI-Prompt: The SageMaker examples use v2 SDK (sagemaker>=2,<3). Update any SageMaker SDK documentation links to point to the v2 docs: https://sagemaker.readthedocs.io/en/v2/ and verify the code examples use v2 API patterns.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 36
X-AI-Prompt: Fix the SageMaker tabular example: the tabular model uses config.yaml (no
SM_RAYSERVE_APP). Remove env={"SM_RAYSERVE_APP": "deployment:app"} from the
tabular Model(). If you want to show the SM_RAYSERVE_APP pattern, add a separate
brief example showing it with a note that it's for models without a
config.yaml (like the MNIST test); check with the test scripts again.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 30 X-AI-Prompt: Should the Deployment Guide clarify what import_path: deployment:app means? Specifically: deployment refers to deployment.py in the model package, and app is the variable defined at the bottom of that file (e.g. app = MyDeployment.bind()). Also, should we add a note that on SageMaker, the model tarball is automatically downloaded from S3 and extracted to /opt/ml/model/ before the container starts? Check the actual deployment.py files in markdown/rayserve-tars/ to see how app is defined there, and use that as the example.
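The import_path: deployment:app convention described above splits on the colon: the left side names deployment.py in the model package, the right side names the variable bound at the bottom of that file (e.g. app = MyDeployment.bind()). A hypothetical parser, just to make the convention concrete — this is not the container's actual resolver:

```python
def parse_import_path(import_path: str) -> tuple[str, str]:
    """Split 'deployment:app' into (module, attribute).

    'deployment' -> deployment.py at the model package root;
    'app'        -> the variable defined at the bottom of that file,
                    e.g. app = MyDeployment.bind()
    """
    module, _, attr = import_path.partition(":")
    if not module or not attr:
        raise ValueError(f"expected '<module>:<variable>', got {import_path!r}")
    return module, attr

module, attr = parse_import_path("deployment:app")
```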
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 43 X-AI-Prompt: In the "Direct App Import" example, add a note explaining RAYSERVE_NUM_GPUS: it is a custom environment variable read by the deployment code to configure GPU allocation per replica. It is only needed when using SM_RAYSERVE_APP without a config.yaml — when using config.yaml, set num_gpus directly under ray_actor_options instead. Check markdown/rayserve-tars/ mnist-direct-app deployment.py to confirm how it's used.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 17 X-AI-Prompt: Fix the EC2 health check command — the /healthz endpoint returns HTTP 200 (not a body containing "OK"). Check dlc-copy/test/ray/ec2/common.py again.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 18 X-AI-Prompt: In the EC2 Tabular Classification example, remove /opt/ml/model/config.yaml from the end of the docker run command — the entrypoint auto-detects config.yaml at /opt/ml/model/config.yaml when no CLI arg is provided.
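The auto-detect behavior this prompt relies on — an explicit CLI argument first, otherwise the conventional /opt/ml/model/config.yaml if it exists — can be sketched as a simple fallback. The helper is illustrative, not the container's actual entrypoint; the filesystem check is injected so the sketch is testable.

```python
import os

DEFAULT_CONFIG = "/opt/ml/model/config.yaml"

def resolve_config(cli_arg, exists=os.path.exists):
    """Entrypoint sketch: an explicit CLI argument wins; otherwise fall
    back to the conventional config path if it is present on disk."""
    if cli_arg:
        return cli_arg
    if exists(DEFAULT_CONFIG):
        return DEFAULT_CONFIG
    return None

# No CLI argument and the file is mounted -> default path is used.
picked = resolve_config(None, exists=lambda p: p == DEFAULT_CONFIG)
```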
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 99
X-AI-Prompt:
The EC2 examples section needs the actual deployment.py and config.yaml code
inline so users can copy-paste and try the container immediately. Right now we
show docker run + curl but not the model package files.
For each EC2 example, add the code before the docker run command. Show users how
to mkdir, save the files, then run. Follow the pattern Ray uses:
https://docs.ray.io/en/latest/serve/tutorials/text-classification.html
The working files are in workspace/2-wk-challenge/markdown/rayserve-tars/ under
nlp/, cv-densenet/, audio-ffmpeg/, and tabular/. Read them and put the code
inline, but clean up the following before including:
Remove from all deployments:
- "device": str(self.device) from response dicts
- "model": "densenet161" / "model": "mnist_cnn" from response dicts
- "deployment_method": "SM_RAYSERVE_APP" from response dicts
- "audio_backend": "ffmpeg" from response dicts
- "installed_packages" field and the tabulate validation block in tabular
- Any logging that mentions test patterns, backends, or deployment methods (e.g.
f"loaded on {self.device}", "(SM_RAYSERVE_APP deployment)")
Keep the code functional — just strip the CI/test instrumentation so responses
match what the guide already documents.
For models that download weights at runtime (NLP, CV, Audio): provide the full
code inline — users just create the files and go.
For models with local weights (Tabular with iris_model.pth/norm_params.json,
MNIST with model.pth): these can't be copy-pasted since users need the actual
weight files. Either skip inline code for these and just describe the model
package structure, or add a small training script that generates the weights.
Your call on what fits the guide best.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 150
X-AI-Prompt: A few fixes and improvements:
Bug fixes:
1. Add weights_only=True to torch.load in the tabular deployment.py — PyTorch
2.6+ defaults to weights_only=True and will break without it.
2. The image classification example has max_replicas: 2 with num_gpus: 1 in
config.yaml. On single-GPU instances (g5.xlarge, g4dn.xlarge) the second replica
can never be placed. Add a note immediately after the cv-model/config.yaml code block. The note should read something like: The autoscaling_config in deployment.py sets max_replicas: 2. Each replica requests 1 GPU, so this configuration requires a multi-GPU instance. On single-GPU instances, reduce max_replicas to 1.
Do not modify any code blocks or other sections.
Documentation:
3. The "Direct App Import" section references RAYSERVE_NUM_GPUS but never shows
how the deployment code reads it. Add a minimal snippet showing the pattern:
    import os

    num_gpus = int(os.environ.get("RAYSERVE_NUM_GPUS", "0"))

    @serve.deployment(ray_actor_options={"num_gpus": num_gpus})
    class MyDeployment:
        ...
4. Remove the training script (train.py). Instead, note that this example requires pre-trained weights (iris_model.pth, norm_params.json) in the model directory. Keep deployment.py and config.yaml inline as reference for the expected structure.
Also update the intro line from:
"Each example below includes the full model package files so you can copy-paste and run immediately."
to something like:
"Each example below includes the full model package files. The first three download weights automatically on startup. The tabular example requires pre-trained weights — substitute your own trained model."
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 19 X-AI-Prompt: Revert the torch.load change in tabular-model/deployment.py — restore it to the original without weights_only=True.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 22 X-AI-Prompt: In the Deployment Paths table, "Place config.yaml at root of model package" is ambiguous — users may not know if that means the mounted directory, inside the tarball, or the final path on disk. Clarify this so it's unambiguous for both EC2 (directory mount) and SageMaker (tarball extraction).
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 19 X-AI-Prompt: More concise but clear to fit the table better
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 57 X-AI-Prompt: Move "Direct App Import (No config.yaml)" out of the SageMaker section and make it its own section after both EC2 and SageMaker. It applies to both platforms — EC2 uses a CLI argument (docker run <image> deployment:app), SageMaker uses the SM_RAYSERVE_APP env var. Show both mechanisms in the section. The RAYSERVE_NUM_GPUS explanation applies to both since neither has a config.yaml to set num_gpus in.
--- X-AI-Tool: Human X-AI-Prompt: Move "Direct App Import (No config.yaml)" out of the SageMaker section and make it its own section after both EC2 and SageMaker. It applies to both platforms — EC2 uses a CLI argument (docker run <image> deployment:app), SageMaker uses the SM_RAYSERVE_APP env var. Show both mechanisms in the section. The RAYSERVE_NUM_GPUS explanation applies to both since neither has a config.yaml to set num_gpus in.
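The two mechanisms described above supply the same import path — a CLI argument on EC2 (docker run <image> deployment:app) or the SM_RAYSERVE_APP env var on SageMaker. A hedged sketch of that resolution order, with argv/env injected for testability (the function itself is illustrative, not the entrypoint code):

```python
import os
import sys

def resolve_app_import(argv=None, environ=None):
    """Direct app import sketch: EC2 passes the import path as a CLI
    argument; SageMaker supplies it via the SM_RAYSERVE_APP env var."""
    argv = sys.argv[1:] if argv is None else argv
    environ = os.environ if environ is None else environ
    if argv:                                   # EC2-style CLI argument
        return argv[0]
    return environ.get("SM_RAYSERVE_APP")      # SageMaker-style env var

cli_result = resolve_app_import(["deployment:app"], {})
env_result = resolve_app_import([], {"SM_RAYSERVE_APP": "deployment:app"})
```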
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 29 X-AI-Prompt: Add a "SageMaker Environment Variables" table next to the existing "EC2 Environment Variables" table. Include SM_RAYSERVE_APP and CA_REPOSITORY_ARN with defaults and descriptions. Check the entrypoint code to confirm the actual defaults before writing the table.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 20 X-AI-Prompt: Make it clear that those are both optional variables
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 89 X-AI-Prompt: Actually, let's remove the SageMaker Environment Variables table since they are both optional variables - it may confuse readers, and they are both mentioned in relevant sections.
--- X-AI-Tool: Human X-AI-Prompt: Should we add a comment to let users know they would need to parse the response?
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 24 X-AI-Prompt: In the Image Classification EC2 example, add a curl -O download step for the test image before the inference request. Use the public TorchServe kitten image at https://s3.amazonaws.com/model-server/inputs/kitten.jpg. Place it after the health check and before the inference curl. This makes the example fully copy-pasteable without requiring the user to supply their own image. The NLP and tabular examples already work out of the box because test data is inline — this makes the image the same as well.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 60 X-AI-Prompt: One minor optimization — move the download before the health check so it happens while the container is starting up, saving time.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 79 X-AI-Prompt: Two fixes needed for the SageMaker Deployment section based on testing:
1. Add SageMaker SDK prerequisite. At the top of the "SageMaker Deployment" section (before the tarball/upload steps), add a note that users need to install the SageMaker Python SDK v2:
   pip install 'sagemaker>=2,<3'
v3 drops the Model, Predictor, and Serializer APIs used in these examples.
2. Add inference_ami_version to GPU SageMaker deploy calls. GPU SageMaker deploys fail with CannotStartContainerError without specifying the inference AMI version — the default SageMaker host AMI has incompatible NVIDIA drivers for our CUDA 12.9 images. This is SageMaker-specific only; EC2 users can pick their own compatible instance/AMI. Add inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1" to all GPU .deploy() calls in the SageMaker section. This applies to:
- The Sentiment Analysis example
- The Direct App Import SageMaker example
Do NOT add it to CPU deploys (tabular) or any EC2 examples.
Add a brief note explaining why, with a link to the API reference for valid values: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariant.html
Here is the tested working sentiment example:

    import json
    from sagemaker.model import Model
    from sagemaker.predictor import Predictor
    from sagemaker.serializers import JSONSerializer

    predictor = Model(
        image_uri="763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda-v1.0.0",
        role="arn:aws:iam::<ACCOUNT>:role/SageMakerExecutionRole",
        model_data="s3://<BUCKET>/models/nlp-sentiment/model.tar.gz",
        predictor_cls=Predictor,
    ).deploy(
        instance_type="ml.g5.xlarge",
        initial_instance_count=1,
        endpoint_name="ray-serve-nlp",
        serializer=JSONSerializer(),
        inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1",
        wait=True,
    )

    response = predictor.predict({"text": "I love this so much, best purchase ever!"})
    result = json.loads(response)  # predictor.predict() returns raw bytes
    # {"predictions": [{"label": "POSITIVE", "score": 0.9991}]}

And the Direct App Import SageMaker example should also get it:

    predictor = model.deploy(
        instance_type="ml.g5.xlarge",
        initial_instance_count=1,
        endpoint_name="ray-serve-mnist",
        inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1",
        wait=True,
    )

Double check the test scripts to confirm and verify correctness.
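As the tested example notes, predictor.predict() returns raw bytes, so the response must be parsed before use. A minimal stdlib illustration using the documented example payload:

```python
import json

# predictor.predict() returns raw bytes, not a dict
response = b'{"predictions": [{"label": "POSITIVE", "score": 0.9991}]}'

result = json.loads(response)  # json.loads accepts bytes directly
label = result["predictions"][0]["label"]
```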
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 37 X-AI-Prompt: Fix the format issue around predictor.delete_endpoint()
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 18 X-AI-Prompt: In the "Direct App Import" section, before the SageMaker code block, clarify two things: 1. Connect it to the sentiment walkthrough: the tarball packaging and S3 upload steps are the same. 2. Clarify that the SM_RAYSERVE_APP import path (e.g. deployment:app) resolves relative to /opt/ml/model/, so deployment.py must be at the root of the tarball — same as the config.yaml examples, just without the config.yaml. Something like: "Package your model directory the same way as the sentiment example (tarball uploaded to S3), but omit config.yaml. The deployment.py must be at the tarball root — SM_RAYSERVE_APP=deployment:app resolves the module from /opt/ml/model/."
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 35 X-AI-Prompt: The Deployment Paths table is inaccurate — on EC2, a CLI argument takes precedence over config.yaml, but the table implies config.yaml is always the default. Update the table to reflect that CLI arg overrides config.yaml on EC2. Keep it concise. The Direct App Import SageMaker code block is missing imports for Model and Predictor. Add them.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 24 X-AI-Prompt: Is there a way to make the desc more concise?
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 25 X-AI-Prompt: Update the Support Policy table to include the DLC version column. Users need to map between the Ray version and the DLC image tag version (v1.0.0). The table should be:
| Version  | DLC Version | GA Date    | End of Patch |
|----------|-------------|------------|--------------|
| Ray 2.54 | v1.0.0      | 2026-02-18 | 2027-02-18   |
This makes it clear that the DLC's Ray v1.0.0 corresponds to Ray 2.54.
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 20 X-AI-Prompt: Update the Support Policy table to include key version info that users need at a glance. DLC version first (it's the image tag), then the three components that drive major version bumps per the versioning strategy. Full package details stay in Release Notes.
| DLC Version | Ray    | Python | CUDA   | GA Date    | End of Patch |
|-------------|--------|--------|--------|------------|--------------|
| v1.0.0      | 2.54.0 | 3.13   | 12.9.1 | 2026-02-18 | 2027-02-18   |
docs: add Ray Serve DLC framework page and release notes infrastructure

Add comprehensive documentation for the Ray Serve inference DLC images, including a per-framework page at docs/ray/index.md with a deployment guide, EC2 and SageMaker examples, versioning strategy, and support policy. Also add release notes data files, table config, and macro support for dynamic image URIs.

New files:
- docs/ray/index.md: Framework page with pull commands, deployment guide (model package structure, deployment paths, env vars, runtime deps), EC2 examples (sentiment, image classification, audio, tabular with full inline deployment code), SageMaker deployment walkthrough, and direct app import section.
- docs/releasenotes/ray/index.md: Placeholder release notes page.
- docs/src/data/ray/: Image config YAML files for Ray 2.54 (GPU/CPU x EC2/SageMaker).
- docs/src/tables/ray.yml: Table column config for Ray.

Updated files:
- docs/src/global.yml: Added ray display name and table_order entry.
- docs/src/macros.py: Added Ray image URI macros with accelerator filtering.
- docs/src/image_config.py: No changes (kept original 2-arg signature).
- docs/releasenotes/index.md: Added Ray link.
- docs/.nav.yml: Added Ray under Getting Started and Release Notes.

ai-dev-branch commit IDs: 7096f96, 369cd33, 19ad220, ad4e26f, 8e724be, c15bbad, 866794a, eb0a311, 200f84a, b68464c, bfe969d, 7a2da45, e256c17, 8f1d183, 0bd7491, a0d26e2, 21f0d4b, ede013d, fa54049, f915294, 1833e1d, 070d2ef, be67d50, dee31d3, 5123d7e, 2ad7ba3, 9f8a65d, 923e120, 46f71b1, 199395f, 62f49e1, 877fa5c, db00781, 5cc4737, 71a66aa, 3a01673, f51449a, 6521964, 2e4eb9a, 1df8040, 7f9483d, 19015fe, 95f4aad, 8464858, 826ee85

The prompts used are captured in the footers of those commits. The initial prompt was: Looks right. Two additions: also create docs/releasenotes/ray/index.md as a placeholder (the nav references it). And before writing the data YAML files, read docs/src/data/sglang/0.5.9-gpu-ec2.yml to match the exact field format. Proceed.

---
X-AI-Handle-Time-Seconds: 2175
X-AI-Line-Changes: New:973, Altered:71, Deleted:174
X-Human-Line-Changes: New:0, Altered:4, Deleted:4
X-AI-Line-Changes-Kiro-cli: New:973, Altered:71, Deleted:174
X-AI-Handle-Time-Seconds-Kiro-cli: 2175
X-AI-Change-Count: 41
X-Human-Change-Count: 4
X-AI-Change-Count-Kiro-cli: 41
X-CR-Amendment: false
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 29 X-AI-Prompt: Remove model_weights.pth from the Model Package Structure tree diagram — 3 out of 4 examples download weights at runtime, so showing it in the standard layout is misleading. Add a note after the tree that model weights can optionally be placed at the tarball root alongside config.yaml and deployment.py (extracted to /opt/ml/model/ at runtime) if your model doesn't download them at startup.
--- X-AI-Tool: Human X-AI-Prompt: Remove model_weights.pth from the Model Package Structure tree diagram — 3 out of 4 examples download weights at runtime, so showing it in the standard layout is misleading. Add a note after the tree that model weights can optionally be placed at the tarball root alongside config.yaml and deployment.py (extracted to /opt/ml/model/ at runtime) if your model doesn't download them at startup.
---
X-AI-Tool: Kiro-cli
X-AI-Handle-Time-Seconds: 297
X-AI-Prompt: 1. Image tag changes — remove ec2 platform from default images
- **Pull Commands section**: Rename the "EC2" tab to "Default". Update the image
tags to remove the ec2 platform segment — e.g. ray:serve-ml-cuda-v1.0.0 and
ray:serve-ml-cpu-v1.0.0 instead of ray:serve-ml-ec2-cuda-v1.0.0 and
ray:serve-ml-ec2-cpu-v1.0.0. Add a note after the pull commands that the Default
images were tested on EC2 instances.
- **All EC2 examples**: Update the docker run image URIs to use the new tags
without ec2 (e.g. ray:serve-ml-cuda-v1.0.0, ray:serve-ml-cpu-v1.0.0).
- **SageMaker images stay unchanged** — they still use
ray:serve-ml-sagemaker-cuda-v1.0.0 / ray:serve-ml-sagemaker-cpu-v1.0.0.
- **Versioning Strategy**: Update the tag format to show platform as optional:
ray:serve-ml-[<platform>-]{cpu|cuda}-v<MAJOR>.<MINOR>.<PATCH>. Explain that
<platform> is omitted for the default images and present for platform-specific
images (e.g. sagemaker).
2. Add Announcements section
Add an "Announcements" section right after the intro paragraph and before "Pull
Commands". Placeholder content for now:
## Announcements
*No announcements at this time.*
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 83 X-AI-Prompt: remove ECS/EKS references since those haven't been tested
--- X-AI-Tool: Human X-AI-Prompt: remove ECS/EKS references since those haven't been tested
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 59 X-AI-Prompt: The release notes title "EC2, ECS, EKS" comes from docs/src/global.yml which maps platform: ec2 → "EC2, ECS, EKS". Don't change this global mapping (it affects all DLCs). Instead: 1. Add a new platform mapping in global.yml: default: "EC2" 2. Update the Ray YAML data files (2.54-gpu-ec2.yml, 2.54-cpu-ec2.yml) to use platform: default instead of platform: ec2 This way Ray release notes show "EC2" only while other DLCs keep "EC2, ECS, EKS".
--- X-AI-Tool: Human X-AI-Prompt: The release notes title "EC2, ECS, EKS" comes from docs/src/global.yml which maps platform: ec2 → "EC2, ECS, EKS". Don't change this global mapping (it affects all DLCs). Instead: 1. Add a new platform mapping in global.yml: default: "EC2" 2. Update the Ray YAML data files (2.54-gpu-ec2.yml, 2.54-cpu-ec2.yml) to use platform: default instead of platform: ec2 This way Ray release notes show "EC2" only while other DLCs keep "EC2, ECS, EKS".
--- X-AI-Tool: Human X-AI-Prompt: Move all code snippets from the Ray index page into standalone files under examples/ray/ so they can be run in CI while still displaying on the page.
1. Add pymdownx.snippets to mkdocs.yaml:
   markdown_extensions:
     - pymdownx.snippets:
         base_path: ["."]
2. Create the example files under examples/ray/: Extract each code block from the Ray index page into its own file, preserving the exact content. The structure should be:
   examples/ray/
   ├── nlp-model/
   │   ├── config.yaml
   │   └── deployment.py
   ├── cv-model/
   │   ├── config.yaml
   │   └── deployment.py
   ├── audio-model/
   │   ├── config.yaml
   │   └── deployment.py
   ├── tabular-model/
   │   ├── config.yaml
   │   └── deployment.py
   └── sagemaker/
       ├── deploy_sentiment.py
       └── deploy_direct_app.py
3. Replace inline code blocks in docs/ray/index.md with snippet references. For example, replace the inline Python block (from ray import serve / from transformers import pipeline / ...) with:
   --8<-- "examples/ray/nlp-model/deployment.py"
Do this for every config.yaml and deployment.py code block. The rendered page should look identical to before.
4. Verify by running mkdocs build.
---
X-AI-Tool: Human
X-AI-Prompt: The SageMaker deploy snippet that was moved to examples/ has unrendered
{{ images.latest_ray_sagemaker_gpu }} macros — pymdownx.snippets includes raw
file content before the macros plugin runs.
Fix: In all example files under examples/ray/, replace macro references with the
base tag (no version suffix). This way examples never go stale across releases:
- GPU default: 763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cuda
- CPU default: 763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cpu
- GPU SageMaker:
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda
- CPU SageMaker:
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cpu
Apply the same to any inline docker run commands in the markdown that were using
macros — replace them with the base tags too. Keep the Pull Commands section at
the top using macros to show the exact versioned tags.
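The substitution described above — macro references swapped for version-less base tags — could be mechanized roughly as below. The latest_ray_sagemaker_gpu macro name and the sagemaker-cuda base tag come from the prompt; the other dictionary key and the helper itself are hypothetical.

```python
import re

# Base tags (no version suffix) so example files never go stale
BASE_TAGS = {
    "latest_ray_gpu": "763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cuda",
    "latest_ray_sagemaker_gpu": "763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda",
}

def replace_macros(text: str) -> str:
    """Swap {{ images.<name> }} macro references for their base image tags."""
    def sub(match: re.Match) -> str:
        name = match.group(1)
        return BASE_TAGS.get(name, match.group(0))  # leave unknown macros as-is
    return re.sub(r"\{\{\s*images\.(\w+)\s*\}\}", sub, text)

line = 'image_uri="{{ images.latest_ray_sagemaker_gpu }}"'
fixed = replace_macros(line)
```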
---
X-AI-Tool: Human
X-AI-Prompt: Do NOT change any inline code in the markdown files — those should keep using
{{ }} macros.
docs: extract Ray examples to standalone files, update image tags
Extract all inline deployment code from docs/ray/index.md into
standalone files under examples/ray/ using pymdownx.snippets, so
examples can be run in CI while still rendering on the docs page.
Update image tags to remove ec2 platform segment from default images
(serve-ml-cuda, serve-ml-cpu) while keeping sagemaker-prefixed tags
for SageMaker images. Add default platform mapping in global.yml so
Ray release notes show "EC2" only. Remove untested ECS/EKS references.
New files:
- examples/ray/{nlp,cv,audio,tabular}-model/{config.yaml,deployment.py}
- examples/ray/sagemaker/{deploy_sentiment,deploy_direct_app}.py
- mkdocs.yaml: added pymdownx.snippets extension
Updated files:
- docs/ray/index.md: replaced inline code with snippet references,
updated image tags, removed model_weights.pth from package structure,
added Announcements section
- docs/src/global.yml: added default platform mapping
- docs/src/data/ray/*.yml: updated ec2 files to platform: default,
removed ec2 from tags
- docs/src/macros.py: renamed ec2 macros to default
ai-dev-branch commit IDs:
- 9763a98
- 0aa5c42
- bff86f9
- fea14f9
- 8f0787d
- 1f9c7dc
- 9c96a28
- 8d03ac1
- b49aee3
- 47d7583
The prompts used are captured in the footers of those commits.
The initial prompt was: Remove model_weights.pth from the Model
Package Structure tree diagram — 3 out of 4 examples download
weights at runtime, so showing it in the standard layout is
misleading.
---
X-AI-Handle-Time-Seconds: 468
X-AI-Line-Changes: New:6, Altered:18, Deleted:0
X-Human-Line-Changes: New:202794, Altered:2771, Deleted:0
X-AI-Line-Changes-Kiro-cli: New:6, Altered:18, Deleted:0
X-AI-Handle-Time-Seconds-Kiro-cli: 468
X-AI-Change-Count: 4
X-Human-Change-Count: 6
X-AI-Change-Count-Kiro-cli: 4
X-CR-Amendment: false
…ls/ to .gitignore
--- X-AI-Tool: Kiro-cli X-AI-Handle-Time-Seconds: 80 X-AI-Prompt: can you run precommit hooks
style: apply pre-commit formatting fixes

Run pre-commit hooks on all files. Fixes applied by ruff-format (Python formatting), ruff isort (import sorting), and flowmark (markdown line wrapping).

ai-dev-branch commit IDs:
- 6805d4f

The prompts used are captured in the footers of those commits. The initial prompt was: can you run precommit hooks

---
X-AI-Handle-Time-Seconds: 80
X-AI-Line-Changes: New:26, Altered:37, Deleted:0
X-Human-Line-Changes: New:0, Altered:0, Deleted:0
X-AI-Line-Changes-Kiro-cli: New:26, Altered:37, Deleted:0
X-AI-Handle-Time-Seconds-Kiro-cli: 80
X-AI-Change-Count: 1
X-Human-Change-Count: 0
X-AI-Change-Count-Kiro-cli: 1
X-CR-Amendment: false
--- X-AI-Tool: Human X-AI-Prompt: <none - new session>
docs: replace broken tabs with bold headers for pull commands

The flowmark markdown formatter broke the pymdownx.tabbed syntax (=== "Tab") by removing the required indentation. Replace with bold headers, which render correctly without tab extension support.

ai-dev-branch commit IDs:
- 1444eed

The prompts used are captured in the footers of those commits. The initial prompt was: <none - new session>

---
X-AI-Handle-Time-Seconds: 0
X-AI-Line-Changes: New:0, Altered:0, Deleted:0
X-Human-Line-Changes: New:0, Altered:2, Deleted:4
X-AI-Change-Count: 0
X-Human-Change-Count: 1
X-CR-Amendment: true