Use async/multi-threaded requests #258
Replies: 3 comments 1 reply
-
Hi @krrishdholakia! Let me convert this into a discussion.
-
Thanks for offering to submit a PR! We've considered threading here. The issue is that some LLM providers rate-limit or block requests sent with the same token if too many occur in parallel. I know for sure that OpenAI does this, and I suspect that Anthropic and Cohere aren't much different in this regard. We are aware that executing prompts one after the other is a very unsatisfactory solution. OpenAI supports batching in their deprecated Completions endpoint.
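For what it's worth, if we do add parallelism, the shape of it would probably be capped concurrency plus backoff, so we stay under provider rate limits rather than firing all requests at once. A rough sketch with `asyncio` and `httpx`; the endpoint, headers, and response field are placeholders, not spacy-llm's actual internals:

```python
import asyncio

import httpx

API_URL = "https://api.provider.example/v1/complete"  # placeholder endpoint
HEADERS = {"Authorization": "Bearer <token>"}         # placeholder auth header


async def fetch(client: httpx.AsyncClient, sem: asyncio.Semaphore, prompt: str) -> str:
    # The semaphore caps in-flight requests, so we parallelize without
    # sending dozens of simultaneous calls under the same token.
    async with sem:
        for attempt in range(5):
            resp = await client.post(API_URL, headers=HEADERS, json={"prompt": prompt})
            if resp.status_code == 429:
                # Rate-limited: exponential backoff, then retry.
                await asyncio.sleep(2 ** attempt)
                continue
            resp.raise_for_status()
            return resp.json()["completion"]  # response field is a placeholder
        raise RuntimeError("Rate limit: retries exhausted")


async def run_all(prompts: list[str], max_concurrency: int = 8) -> list[str]:
    sem = asyncio.Semaphore(max_concurrency)
    async with httpx.AsyncClient(timeout=60.0) as client:
        # gather() preserves input order, so responses line up with prompts.
        return await asyncio.gather(*(fetch(client, sem, p) for p in prompts))


# results = asyncio.run(run_all(["prompt 1", "prompt 2"]))
```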
-
I'm hosting an LLM myself, and I wouldn't want to run requests sequentially. Right now the only way out seems to be running many parallel threads, each with different texts, but I'm hitting an OOM error. Can you tell me the easiest way to make the queries asynchronous, so that I don't end up rewriting the whole of spacy-llm to be async?
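In the meantime I'm considering something like the following: keep the library's blocking calls, run them from a thread pool, and process the corpus in fixed-size chunks so only one chunk of results is in memory at a time (which would also address the OOM). Rough sketch; `annotate` stands in for whatever performs one request, e.g. `nlp(text)`, assuming the pipeline turns out to be thread-safe:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Iterable, Iterator


def chunks(texts: Iterable[str], size: int) -> Iterator[list[str]]:
    # Yield fixed-size chunks so we never materialize the whole corpus of results.
    it = iter(texts)
    while chunk := list(islice(it, size)):
        yield chunk


def annotate(text: str):
    # Placeholder for one blocking call, e.g. `return nlp(text)`.
    # Thread-safety of the pipeline is an assumption that needs verifying.
    ...


def process_corpus(texts: Iterable[str], chunk_size: int = 32, workers: int = 8):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for chunk in chunks(texts, chunk_size):
            # Only `chunk_size` requests are in flight / in memory at once.
            yield from pool.map(annotate, chunk)
```

Would something like this be reasonable, or am I missing a simpler route?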
-
Hey @rmitsch / @kabirkhan
have y'all considered using async/threading here to make this call faster?
https://github.com/explosion/spacy-llm/blob/f03da9094ee49626ae3aaccd3129e7c3237454ee/spacy_llm/models/rest/anthropic/model.py#L96C1-L99C10
Happy to make a PR to help out here. I'm working on a library to simplify LLM API calling, and I noticed y'all call the REST endpoints directly, which is awesome!
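For concreteness, the kind of change I'd propose is roughly: replace the sequential loop over prompts with a thread-pool map over a per-prompt request function. This is only a sketch against a hypothetical `_request_one` helper with placeholder endpoint/headers, not spacy-llm's actual code:

```python
from concurrent.futures import ThreadPoolExecutor

import requests


def _request_one(prompt: str) -> dict:
    # Hypothetical per-prompt helper: one blocking POST to the provider's
    # REST endpoint. URL, headers, and payload shape are placeholders.
    resp = requests.post(
        "https://api.provider.example/v1/complete",
        headers={"X-API-Key": "<token>"},
        json={"prompt": prompt},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()


def request_all(prompts: list[str], max_workers: int = 8) -> list[dict]:
    # executor.map preserves input order, so responses still line up
    # with the docs they belong to.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(_request_one, prompts))
```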