
Implement support to BatchAPIs to gather evidence #687


Closed · wants to merge 32 commits

Conversation

Collaborator

@maykcaldas maykcaldas commented Nov 14, 2024

This PR implements support for sending requests to the OpenAI and Anthropic batch APIs. Because gathering evidence and summarizing all candidate papers is inherently parallel, we plan to use the batch API when possible.

Use of the batch API is gated by Settings.use_batch_in_summary, so paperqa workflows are unchanged when this setting is left at its default of False. Currently, using a batch keeps the process busy-waiting until the batch finishes on the LLM provider's side, which can take up to 24 hours. This scaling issue will be addressed in a follow-up PR.
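For illustration, here is a minimal sketch of that gate, reusing the gather_with_batch and gather_with_concurrency names that come up later in this review; the argument lists are assumptions:

```python
# Hedged sketch, not the PR's actual code: route summarization through a single
# provider-side batch job when enabled, otherwise use the existing concurrent
# path. The argument lists below are assumptions for illustration.
if settings.use_batch_in_summary:
    results = await gather_with_batch(matches, question, summary_llm)
else:
    results = await gather_with_concurrency(max_concurrent, summary_coros)
```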

Task list

  • Create a class to make batch calls to the OpenAI batch API (a usage sketch follows this list)
  • Create a class to make batch calls to the Anthropic batch API
  • Integrate the OpenAI class into the get_evidence method
  • Integrate the Anthropic class into the get_evidence method
  • Update get_summary_llm to decide which provider to use given the llm in the config
  • ❌ Use pytest.mark.vcr in the tests to avoid creating batches for every test
  • Implement mock servers for testing purposes
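
As a rough illustration of what the OpenAI-side class wraps, here is a self-contained sketch of the OpenAI Batch API flow (not the PR's actual class; the model name and prompts are placeholders): write one JSONL line per request, upload the file, create a batch with a 24-hour completion window, poll until a terminal status, then read the output file.

```python
import io
import json
import time

from openai import OpenAI  # installed via the optional dependency group this PR adds

client = OpenAI()

# One JSONL line per summarization request; custom_id lets results be matched
# back to their source papers later.
lines = [
    json.dumps({
        "custom_id": f"evidence-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",  # placeholder model
            "messages": [{"role": "user", "content": prompt}],
        },
    })
    for i, prompt in enumerate(["Summarize paper A ...", "Summarize paper B ..."])
]

batch_file = client.files.create(
    file=io.BytesIO("\n".join(lines).encode()), purpose="batch"
)
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# Poll until the batch reaches a terminal status. This is the busy-waiting the
# PR description mentions; a follow-up PR was meant to address it.
while batch.status not in {"completed", "failed", "expired", "cancelled"}:
    time.sleep(30)
    batch = client.batches.retrieve(batch.id)

if batch.status == "completed":
    output_jsonl = client.files.content(batch.output_file_id).text
```

The Anthropic path is analogous through its Message Batches API, except the requests list is submitted directly in the create call rather than via an uploaded file.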

This class is used to submit batch calls to the OpenAI batch API
@maykcaldas maykcaldas self-assigned this Nov 14, 2024
maykcaldas and others added 9 commits November 15, 2024 09:10
Also added a dependency group in pyproject.toml to install openai and anthropic only if the user wants to use batches, refactored the logic for summarizing evidence in batch, and moved the code to core.py
Also fixed bugs in tests and created Enums to avoid hardcoding the batch status identifiers
The time limit and the polling interval for the batches are now in the Settings
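For concreteness, a hedged sketch of how these knobs might sit on a pydantic Settings model; only use_batch_in_summary is named in this PR, so the other two field names are hypothetical stand-ins for the time limit and polling interval:

```python
from pydantic import BaseModel, Field


class Settings(BaseModel):
    # Named in the PR description:
    use_batch_in_summary: bool = Field(
        default=False,
        description="Route evidence summarization through provider batch APIs.",
    )
    # Hypothetical field names for the commit's time limit and polling interval:
    batch_summary_timelimit: int = Field(
        default=24 * 60 * 60,
        description="Seconds to wait for a batch before giving up.",
    )
    batch_polling_interval: int = Field(
        default=30,
        description="Seconds to sleep between batch status polls.",
    )
```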
@maykcaldas maykcaldas marked this pull request as ready for review November 19, 2024 17:55
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Nov 19, 2024
@dosubot dosubot bot added the enhancement New feature or request label Nov 19, 2024
@maykcaldas maykcaldas requested a review from whitead November 19, 2024 17:56
@maykcaldas maykcaldas changed the title from "[WIP] Implement support to BatchAPIs to gather evidence" to "Implement support to BatchAPIs to gather evidence" Nov 19, 2024

```diff
 for _, llm_result in results:
     session.add_tokens(llm_result)

-session.contexts += [r for r, _ in results if r is not None]
+session.contexts += [r for r, _ in results]
```
Collaborator
why did we cut the r is not None filter here? I would think that the results from gather_with_concurrency could still be None on failure, but maybe I'm wrong

Collaborator Author
This gets the Contexts from gather_with_concurrency or gather_with_batch, and both always return a list of (Context, LLMResult) tuples. What can happen is an empty Context.text, but it seems to me that r is always an instance of Context.
Also, I didn't see any case of map_fxn_summary returning None while studying the code, and mypy flags the r is not None check as redundant because r can never be None.

Maybe that's an edge case that I didn't see?

Collaborator

If we correctly type hinted gather_with_concurrency then this would be resolved. @maykcaldas can you adjust it to be this?

```python
from collections.abc import Awaitable, Iterable
from typing import TypeVar

T = TypeVar("T")


async def gather_with_concurrency(n: int, coros: Iterable[Awaitable[T]]) -> list[T]:
    ...
```
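
For context, here is a common way to implement a helper satisfying the suggested signature; a minimal sketch assuming paperqa's version follows the usual semaphore pattern, not a quote of its actual code:

```python
import asyncio
from collections.abc import Awaitable, Iterable
from typing import TypeVar

T = TypeVar("T")


async def gather_with_concurrency(n: int, coros: Iterable[Awaitable[T]]) -> list[T]:
    # A semaphore caps how many coroutines run at once; results come back
    # in input order.
    semaphore = asyncio.Semaphore(n)

    async def with_sem(coro: Awaitable[T]) -> T:
        async with semaphore:
            return await coro

    return await asyncio.gather(*(with_sem(c) for c in coros))
```

With return_exceptions left at its default of False, a failed coroutine raises instead of yielding None, which is consistent with dropping the is not None filter above.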

@maykcaldas maykcaldas requested a review from mskarlin November 19, 2024 19:46


Collaborator

@whitead whitead commented Mar 8, 2025

Closing for now - probably no longer relevant.

@whitead whitead closed this Mar 8, 2025
Labels
enhancement New feature or request · size:XXL This PR changes 1000+ lines, ignoring generated files.

4 participants