[New Model]: Snowflake Arctic Embed (Family) #16649
Currently the following tests pass locally:
To get the model to pass CI, please add this model to the test files as described here: https://docs.vllm.ai/en/latest/contributing/model/tests.html
Overall this looks good to me, thanks for contributing. I left a suggestion and a question in the comments.
LGTM
Ready for final review.
That sounds good.
I think we should also add a guide for vLLM users to set that in the meantime.
I will draft a document soon.
This pull request has merge conflicts that must be resolved before it can be |
Thanks for reviewing.
Signed-off-by: Yang Wang <[email protected]>
Signed-off-by: Agata Dobrzyniewicz <[email protected]>
Signed-off-by: Mu Huai <[email protected]>
Summary
Background
The Snowflake Arctic Embed family spans several model architectures.
The following models use the BertModel architecture, which vLLM currently supports:
https://huggingface.co/Snowflake/snowflake-arctic-embed-xs
https://huggingface.co/Snowflake/snowflake-arctic-embed-s
https://huggingface.co/Snowflake/snowflake-arctic-embed-m
https://huggingface.co/Snowflake/snowflake-arctic-embed-l
https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5 <- is_matryoshka
The following model uses the XLMRobertaModel architecture, which vLLM currently supports:
https://huggingface.co/Snowflake/snowflake-arctic-embed-l-v2.0 <- is_matryoshka
The following model uses the NomicBertModel architecture, which vLLM does not support:
https://huggingface.co/Snowflake/snowflake-arctic-embed-m-long
The following model uses the GteModel architecture, which vLLM does not support:
https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v2.0 <- is_matryoshka
The three v1.5 and v2.0 models (marked is_matryoshka above) support Matryoshka embeddings.
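As a rough illustration of what Matryoshka support usually means for a caller, the sketch below keeps only the leading components of an embedding and re-normalizes them. It assumes these models follow the common Matryoshka convention that a prefix of the vector is itself a usable embedding. The helper name `truncate_embedding` and the sample vector are hypothetical; they are not part of vLLM or of this PR.

```python
import math


def truncate_embedding(embedding, dim):
    """Keep the first `dim` components and L2-normalize the result.

    Hypothetical helper for illustration only; not a vLLM API.
    """
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and len(embedding)")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return list(head)
    return [x / norm for x in head]


# Made-up 4-dimensional vector standing in for a model output.
full = [3.0, 4.0, 12.0, 0.0]
short = truncate_embedding(full, 2)
```

Re-normalizing after truncation keeps cosine similarity and dot product interchangeable on the shortened vectors, which is why the two steps are typically done together.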
NomicBertModel
GteModel
FIX #7792