added einops for embedding models and simplified accuracy description#4207
dtrawins wants to merge 8 commits into
There was a problem hiding this comment.
Pull request overview
Updates demo documentation around accuracy evaluation and model export, and adds a missing Python dependency (einops) needed by some embedding/export workflows.
Changes:
- Simplifies continuous batching accuracy demo instructions by linking to other deployment demos and updates the VLM evaluation command.
- Adds `einops` to the export-models demo Python requirements.
- Replaces a long CLI help “Expected Output” block in the export-models README with a short compatibility note about `transformers` versions.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| demos/continuous_batching/accuracy/README.md | Simplifies server startup guidance (links to other demos) and adjusts VLM eval command; retains example outputs. |
| demos/common/export_models/requirements.txt | Adds einops dependency to export-model requirements. |
| demos/common/export_models/README.md | Removes verbose help output and adds a note about potential transformers version requirements. |
| ## Starting the model server | ||
| ### With Docker | ||
| ```bash | ||
| docker run -d --rm -p 8000:8000 -v $(pwd)/models:/workspace:ro openvino/model_server:latest --rest_port 8000 --config_path /workspace/config.json | ||
| ``` | ||
| ### On Baremetal | ||
| ```bash | ||
| ovms --rest_port 8000 --config_path ./models/config.json | ||
| ``` | ||
| Example of LLM and VLM models deployment is documented in other demos like | ||
| [Agentic usage for LLM models](../agentic_ai/README.md) | ||
| [Using VLM models](../vlm/README.md) |
| python -m lmms_eval \ | ||
| --model openai_compatible \ | ||
| --model_args model_version=OpenGVLab/InternVL2_5-8B,max_retries=1 \ | ||
| --model_args model_version=OpenVINO/InternVL2_5-8B_int4-ov,max_retries=1 \ | ||
| --tasks mme,mmmu_val \ | ||
| --batch_size 1 \ |
| --enable_tool_guided_generation | ||
| Enables enforcing tool schema during generation. Requires setting tool_parser | ||
| ``` | ||
| > Note: Exporting some models might require different transformers version than specified in requirements.txt Check [supported models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/). If custom transformers version is required, install it afterwards via `pip install transformers==<version>` |
| @@ -14,33 +14,17 @@ Install the framework via pip: | |||
There was a problem hiding this comment.
Do we want to check the pip install command if no other command is checked?
| sentencepiece # Required by: transformers` | ||
| torchvision | ||
| requests | ||
| einops |
There was a problem hiding this comment.
Alibaba model still wasn't exported:
python3 export_model.py embeddings_ov --source_model Alibaba-NLP/gte-large-en-v1.5 --extra_quantization_params "--library sentence_transformers" --weight-format fp16 --config_file_path models/config_all.json
RuntimeError: Couldn't get TorchScript module by tracing.
Exception:
index 2314885530818453536 is out of bounds for dimension 0 with size 16
Please check correctness of provided 'example_input'. Sometimes models can be converted in scripted mode, please try running conversion without 'example_input'.
You can also provide TorchScript module that you obtained yourself, please refer to PyTorch documentation: https://pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html.
Traceback (most recent call last):
File "/opt/home/k8sworker/ngroza/test/model_server/demos/common/export_models/export_model.py", line 687, in
export_embeddings_model_ov(args['model_repository_path'], args['source_model'], args['model_name'], args['precision'], template_parameters, args['config_file_path'], args['truncate'])
File "/opt/home/k8sworker/ngroza/test/model_server/demos/common/export_models/export_model.py", line 520, in export_embeddings_model_ov
raise ValueError("Failed to export embeddings model", source_model)
ValueError: ('Failed to export embeddings model', 'Alibaba-NLP/gte-large-en-v1.5')
There was a problem hiding this comment.
That is one of the models that require transformers<5.
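A minimal sketch of how the `transformers<5` constraint mentioned here could be checked before running the export. The helper names are hypothetical and not part of `export_model.py`, and the check assumes the version string starts with a plain numeric major component.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading numeric component of a version string such as '4.53.2'."""
    return int(version.split(".")[0])


def satisfies_below_major(version: str, limit: int = 5) -> bool:
    """True when `version` is below the given major version (transformers<5 by default)."""
    return major_version(version) < limit


def installed_transformers_ok() -> bool:
    """Check the installed transformers package; False if it is missing or too new."""
    try:
        return satisfies_below_major(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return False
```

If the check fails, the README note in this PR suggests installing the required version afterwards with `pip install transformers==<version>`.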
| python -m lmms_eval \ | ||
| --model openai_compatible \ | ||
| --model_args model_version=OpenGVLab/InternVL2_5-8B,max_retries=1 \ | ||
| --model_args model_version=OpenVINO/InternVL2_5-8B_int4-ov,max_retries=1 \ |
There was a problem hiding this comment.
There is no such model in the OpenVINO collection: https://huggingface.co/OpenVINO/models?search=intern
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| Install the framework via pip: | ||
| ```bash | ||
| ```text |
There was a problem hiding this comment.
Won't this break some CI automation for accuracy checks? @pgladkows
There was a problem hiding this comment.
No, accuracy checking will be disabled in demos.
| ```text | ||
| export OPENAI_BASE_URL=http://localhost:8000/v3 | ||
| bfcl generate --model ovms-model-stream --test-category simple_python,multiple --temperature 0.0 --num-threads 100 -o --result-dir model_name_dir | ||
| bfcl generate --model ovms-model-stream --test-category simple_python,multiple,multi_turn_base --temperature 0.0 --num-threads 10 -o --result-dir model_name_dir |
There was a problem hiding this comment.
Won't this be too much for a demo? Time-wise, it will take much longer to execute with multi-turn.
Also, you only added it for the streaming path. Shouldn't we align the unary path as well if we choose to go with multi-turn?
| dest='dataset') | ||
| parser.add_argument('--embed_dim', type=int, default=None, help='Embedding dimension. Auto-detected if not provided.', | ||
| dest='embed_dim') | ||
| parser.add_argument('--max_tokens', type=int, default=999999, help='Max input tokens for truncation. default: 512', |
There was a problem hiding this comment.
The default (999999) does not match the help description, which says 512.
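One way to keep the default and the help text in sync is the standard `argparse` `%(default)s` placeholder, sketched below. The value 512 is an assumption taken from the help text; the PR does not say which of the two numbers is intended, and `build_parser` is an illustrative name rather than code from the script under review.

```python
import argparse

# Assumption: 512 is the intended default (taken from the help text under review).
DEFAULT_MAX_TOKENS = 512


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser with a single --max_tokens option."""
    parser = argparse.ArgumentParser(description="Illustrative parser, not the script under review.")
    parser.add_argument(
        "--max_tokens",
        type=int,
        default=DEFAULT_MAX_TOKENS,
        # %(default)s is expanded by argparse when the help is rendered,
        # so the documented value always matches the real default.
        help="Max input tokens for truncation. default: %(default)s",
        dest="max_tokens",
    )
    return parser
```

With this pattern only the `default=` argument needs to change if a different limit is chosen, and the help output follows automatically.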
Co-authored-by: Trawinski, Dariusz <dariusz.trawinski@intel.com>
🛠 Summary
CVS-186324
🧪 Checklist