added einops for embedding models and simplified accuracy description by dtrawins · Pull Request #4207 · openvinotoolkit/model_server

dtrawins · 2026-05-13T15:07:25Z

🛠 Summary

CVS-186324

🧪 Checklist

Unit tests added.
The documentation updated.
Change follows security best practices.

Copilot

Pull request overview

Updates demo documentation around accuracy evaluation and model export, and adds a missing Python dependency (einops) needed by some embedding/export workflows.

Changes:

Simplifies continuous batching accuracy demo instructions by linking to other deployment demos and updates the VLM evaluation command.
Adds einops to the export-models demo Python requirements.
Replaces a long CLI help “Expected Output” block in the export-models README with a short compatibility note about transformers versions.
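A missing dependency such as `einops` usually only surfaces partway through an export. A minimal sketch of a pre-flight check is shown below, assuming the package list is supplied by hand. The `missing_packages` helper and the list passed to it are hypothetical and are not part of `export_model.py`.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be imported.

    Only top-level names are supported; dotted names are not handled here.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list for illustration; the real requirements live in
    # demos/common/export_models/requirements.txt.
    required = ["einops", "requests"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

Running this before an export reports every absent package at once instead of failing on the first import error.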

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `demos/continuous_batching/accuracy/README.md` | Simplifies server startup guidance (links to other demos) and adjusts VLM eval command; retains example outputs. |
| `demos/common/export_models/requirements.txt` | Adds `einops` dependency to export-model requirements. |
| `demos/common/export_models/README.md` | Removes verbose help output and adds a note about potential `transformers` version requirements. |

 ## Starting the model server

-### With Docker
-```bash
-docker run -d --rm -p 8000:8000 -v $(pwd)/models:/workspace:ro openvino/model_server:latest --rest_port 8000 --config_path /workspace/config.json
-```

-### On Baremetal
-```bash
-ovms --rest_port 8000 --config_path ./models/config.json
-```
+Example of LLM and VLM models deployment is documented in other demos like
+[Agentic usage for LLM models](../agentic_ai/README.md) 
+[Using VLM models](../vlm/README.md)


 python -m lmms_eval \
    --model openai_compatible \
-    --model_args model_version=OpenGVLab/InternVL2_5-8B,max_retries=1 \
+    --model_args model_version=OpenVINO/InternVL2_5-8B_int4-ov,max_retries=1 \
    --tasks mme,mmmu_val \
    --batch_size 1 \


-  --enable_tool_guided_generation
-                        Enables enforcing tool schema during generation. Requires setting tool_parser
-```
+> Note: Exporting some models might require different transformers version than specified in requirements.txt Check [supported models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/). If custom transformers version is required, install it afterwards via `pip install transformers==<version>`


ngrozae · 2026-05-18T10:13:35Z

do we want to check pip install command if no other command is checked?

ngrozae · 2026-05-18T10:29:23Z

 sentencepiece  # Required by: transformers`
 torchvision
 requests
+einops


Alibaba model still wasn't exported:
python3 export_model.py embeddings_ov --source_model Alibaba-NLP/gte-large-en-v1.5 --extra_quantization_params "--library sentence_transformers" --weight-format fp16 --config_file_path models/config_all.json

RuntimeError: Couldn't get TorchScript module by tracing.
Exception:
index 2314885530818453536 is out of bounds for dimension 0 with size 16
Please check correctness of provided 'example_input'. Sometimes models can be converted in scripted mode, please try running conversion without 'example_input'.
You can also provide TorchScript module that you obtained yourself, please refer to PyTorch documentation: https://pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html.
Traceback (most recent call last):
File "/opt/home/k8sworker/ngroza/test/model_server/demos/common/export_models/export_model.py", line 687, in
export_embeddings_model_ov(args['model_repository_path'], args['source_model'], args['model_name'], args['precision'], template_parameters, args['config_file_path'], args['truncate'])
File "/opt/home/k8sworker/ngroza/test/model_server/demos/common/export_models/export_model.py", line 520, in export_embeddings_model_ov
raise ValueError("Failed to export embeddings model", source_model)
ValueError: ('Failed to export embeddings model', 'Alibaba-NLP/gte-large-en-v1.5')

that is one of the models that require transformers<5
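The `transformers<5` constraint mentioned here could be checked before starting an export. The following is a rough sketch only, assuming a plain major-version comparison is sufficient. The helper names are hypothetical, and nothing like this is confirmed to exist in `export_model.py`.

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version string such as '4.57.1'."""
    return int(version_string.split(".")[0])


def satisfies_upper_bound(version_string, max_major_exclusive=5):
    """True when the major version is below the exclusive upper bound."""
    return major_version(version_string) < max_major_exclusive


def installed_version(package="transformers"):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("transformers is not installed")
    elif not satisfies_upper_bound(found):
        print(f"transformers {found} found; this model may need transformers<5")
    else:
        print(f"transformers {found} satisfies transformers<5")
```

A check like this would only flag the mismatch; installing the required version is still a manual step via `pip install transformers==<version>`.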

pgladkows · 2026-05-18T11:20:08Z

 python -m lmms_eval \
    --model openai_compatible \
-    --model_args model_version=OpenGVLab/InternVL2_5-8B,max_retries=1 \
+    --model_args model_version=OpenVINO/InternVL2_5-8B_int4-ov,max_retries=1 \


there is no such model in OV collection: https://huggingface.co/OpenVINO/models?search=intern

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

mzegla · 2026-05-20T09:59:58Z


 Install the framework via pip:
-```bash
+```text


Won't this break some CI automation for accuracy checks? @pgladkows

no, accuracy checking will be disabled in demos

mzegla · 2026-05-20T10:02:49Z

 ```text
 export OPENAI_BASE_URL=http://localhost:8000/v3
-bfcl generate --model ovms-model-stream --test-category simple_python,multiple --temperature 0.0 --num-threads 100 -o --result-dir model_name_dir
+bfcl generate --model ovms-model-stream --test-category simple_python,multiple,multi_turn_base --temperature 0.0 --num-threads 10 -o --result-dir model_name_dir


Won't this be too much for a demo? Time-wise, it will take much longer to execute with multi-turn.
Also, you only add it for the streaming path. Shouldn't we align unary as well if we choose to go with multi-turn?

mzegla · 2026-05-20T10:03:39Z

                    dest='dataset')
+parser.add_argument('--embed_dim', type=int, default=None, help='Embedding dimension. Auto-detected if not provided.',
+                    dest='embed_dim')
+parser.add_argument('--max_tokens', type=int, default=999999, help='Max input tokens for truncation. default: 512',


default does not match help description
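One way to keep the help text and the default from drifting apart is argparse's `%(default)s` placeholder, which is filled in from the `default` argument when the help is rendered. The sketch below assumes 512 is the intended value purely for illustration; the diff does not confirm whether 512 or 999999 is correct.

```python
import argparse

parser = argparse.ArgumentParser(description="Hypothetical sketch, not the PR's script")
parser.add_argument(
    '--max_tokens',
    type=int,
    default=512,  # assumed value for illustration only
    # %(default)s is expanded by argparse, so the help cannot disagree with the default
    help='Max input tokens for truncation. default: %(default)s',
    dest='max_tokens',
)

# Parse an empty argument list so the default is used
args = parser.parse_args([])
help_text = parser.format_help()
```

With this pattern, changing `default` in one place updates the rendered help automatically.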

Co-authored-by: Trawinski, Dariusz <dariusz.trawinski@intel.com>

added einops for embedding models and simplified accuracy description

59fa97a

dtrawins requested review from Copilot and ngrozae May 13, 2026 15:07

Copilot started reviewing on behalf of dtrawins May 13, 2026 15:08

Copilot AI reviewed May 13, 2026

dtrawins requested review from michalkulakowski and pgladkows May 15, 2026 13:51

ngrozae reviewed May 18, 2026

pgladkows reviewed May 18, 2026

review fixes

4fb1e5b

dtrawins requested review from ngrozae and pgladkows May 19, 2026 09:03

Apply suggestions from code review

f524f2f

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

michalkulakowski approved these changes May 19, 2026

dtrawins added 4 commits May 19, 2026 13:52

update readme

9f6683b

merge

2255a83

exception and skip tests for gte model

1b627f4

update to latest mteb

1b1eed2

dtrawins requested a review from mzegla May 20, 2026 09:48

mzegla reviewed May 20, 2026

dtrawins commented May 20, 2026

Comment thread demos/embeddings/README.md Outdated

dtrawins commented May 20, 2026

Comment thread demos/embeddings/README.md Outdated

Apply suggestions from code review

1dd3308

Co-authored-by: Trawinski, Dariusz <dariusz.trawinski@intel.com>

dtrawins wants to merge 8 commits into main from CVS-186324

6 participants
