Removed `"cannot answer"` literals and added `reset` tool #698

jamesbraza · 2024-11-18T01:41:39Z

There were many places where we depended on the string literal "cannot answer" in the qa prompt, mainly to decide whether the environment was done (prior to #684) or whether the answer was considered unsure.

This environment check of "cannot answer" also has some downsides:

- A coupling of environment functionality to a caller-specified qa prompt
  - We don't validate that "cannot answer" is present in the qa prompt
- Added statefulness to our environment (checks for a string literal in answer)
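To make the coupling concrete, here is a minimal sketch of a substring-based check. It is not the actual paperqa code: `SENTINEL` and `is_unsure_old` are hypothetical names used only for illustration.

```python
# Hypothetical sketch; these names are illustrative, not paperqa's real API.
SENTINEL = "cannot answer"


def is_unsure_old(answer: str) -> bool:
    """Old-style check: the answer is unsure if it contains the sentinel."""
    return SENTINEL in answer.lower()


# With a prompt that tells the LLM to reply "I cannot answer.", the check fires.
assert is_unsure_old("I cannot answer.")

# A caller-specified prompt may request a different refusal phrase,
# in which case the check silently never fires.
assert not is_unsure_old("Insufficient information to respond.")
```

The second assertion is the failure mode: nothing validates that the caller's prompt and the environment agree on the phrase.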

So, what is "unsure"? Really it should be:

- `gen_answer` tool call updates the answer
- Given the answer, the agent (and not the environment's `gen_answer` tool) decides if an answer was successful
- If not successful, agent keeps trying to get better evidence, until it gives up

To resolve this, we moved the unsure decision directly into the `complete` tool. Now:

- When unsure: agent just keeps running
- When finally sure: agent calls `complete(has_successful_answer=True)`
- If giving up: agent calls `complete(has_successful_answer=False)`
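The three cases above can be sketched roughly as follows. This is a simplified illustration, not the tool as implemented in this PR: `SessionSketch` is a hypothetical stand-in for the real session object, and only the `has_successful_answer` argument name comes from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionSketch:
    """Hypothetical stand-in for the session state, for illustration only."""

    answer: str = ""
    has_successful_answer: Optional[bool] = None  # None until complete is called
    done: bool = False


def complete(session: SessionSketch, has_successful_answer: bool) -> str:
    """Terminate the run, recording the agent's own verdict on the answer."""
    session.has_successful_answer = has_successful_answer
    session.done = True
    status = "successful" if has_successful_answer else "unsuccessful"
    return f"Completed with a {status} answer."


session = SessionSketch(answer="Some drafted answer.")
# While unsure, the agent keeps calling other tools; no verdict is recorded.
assert not session.done
# Once sure (or giving up), the agent itself reports the outcome.
message = complete(session, has_successful_answer=True)
```

Note that the answer text is never inspected here: the verdict comes entirely from the agent's argument.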

The "cannot answer" check was mostly easy to remove, except for `AnswerSettings.wipe_context_on_answer_failure`, since we no longer have a way of checking for an unsure answer within `gen_answer`.

Since the agent now controls unsureness, we needed a new tool, `reset`, which covers the use case of `wipe_context_on_answer_failure`.
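As a rough sketch of what such a tool might do (assuming the session holds a list of gathered contexts and a drafted answer; `EvidenceSessionSketch` and its fields are hypothetical, not the real session class):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EvidenceSessionSketch:
    """Hypothetical session holding a question, evidence, and an answer."""

    question: str
    contexts: List[str] = field(default_factory=list)
    answer: str = ""


def reset(session: EvidenceSessionSketch) -> None:
    """Wipe gathered evidence and the drafted answer, keeping the question."""
    session.contexts.clear()
    session.answer = ""


evidence_session = EvidenceSessionSketch(
    question="What is X?",
    contexts=["weak evidence A", "weak evidence B"],
    answer="An unconvincing draft.",
)
# The agent, judging the draft unsuccessful, chooses to start over.
reset(evidence_session)
```

The difference from the old setting is who triggers the wipe: the agent calls `reset` when it decides to, rather than the environment wiping automatically on a detected failure.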

After this PR, we have:

- Removed dependence on the "cannot answer" string literal
- Deprecated `AnswerSettings.wipe_context_on_answer_failure`
- The agent defines unsureness, not the output of the environment's `gen_answer` tool
- A "learnable" dimension: the agent controls wiping contexts
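For the deprecation item, one common pattern is to emit a `DeprecationWarning` when the old setting is explicitly provided. The sketch below assumes that pattern. `AnswerSettingsSketch`, its `None` default, and the warning text are hypothetical and may differ from what this PR actually does.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnswerSettingsSketch:
    """Hypothetical settings class; only the field name comes from the PR."""

    wipe_context_on_answer_failure: Optional[bool] = None  # assumed default

    def __post_init__(self) -> None:
        # Warn only when the caller explicitly sets the deprecated field.
        if self.wipe_context_on_answer_failure is not None:
            warnings.warn(
                "wipe_context_on_answer_failure is deprecated;"
                " the agent can call the reset tool instead.",
                DeprecationWarning,
                stacklevel=2,
            )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    AnswerSettingsSketch(wipe_context_on_answer_failure=True)

assert issubclass(caught[0].category, DeprecationWarning)
```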

mskarlin · 2024-11-18T15:06:07Z

paperqa/agents/env.py

@@ -49,7 +50,7 @@ def settings_to_tools(
    embedding_model = embedding_model or settings.get_embedding_model()
    tools: list[Tool] = []
    for tool_type in (
-        (PaperSearch, GatherEvidence, GenerateAnswer, Complete)
+        (PaperSearch, GatherEvidence, GenerateAnswer, Reset, Complete)


We'll need to update the tool constructor in the server too.

cc. @nadolskit

paperqa/agents/tools.py

mskarlin · 2024-11-18T15:12:07Z

paperqa/prompts.py

 qa_prompt = (
    "Answer the question below with the context.\n\n"
    "Context (with relevance scores):\n\n{context}\n\n----\n\n"
    "Question: {question}\n\n"
    "Write an answer based on the context. "
    "If the context provides insufficient information reply "
-    '"I cannot answer."'
+    f'"{CANNOT_ANSWER_PHRASE}." '


jamesbraza added the enhancement New feature or request label Nov 18, 2024

jamesbraza requested review from whitead, sidnarayanan, mskarlin and nadolskit November 18, 2024 01:41

jamesbraza self-assigned this Nov 18, 2024

dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Nov 18, 2024

mskarlin reviewed Nov 18, 2024

mskarlin reviewed Nov 18, 2024

jamesbraza force-pushed the removing-cannot-answer branch from 51482c3 to 8e3d78d Compare November 18, 2024 23:17

jamesbraza requested review from mskarlin and maykcaldas November 18, 2024 23:17

jamesbraza added 7 commits November 18, 2024 22:35

Made PQASession.is_sure and integrated into complete tool/env

0087127

Made reset tool and integrated into env

a89801c

Removed all unsure sentinels from source and tests

295e217

Updated configs and added deprecation warning for wipe_context_on_answer_failure

a10ac5f

Created CANNOT_ANSWER_PHRASE to make tests more intuitive

e0abb2a

Updated wording on reset tool per good PR suggestion

1723205

Clarified the role of max_answer_attempts and removed it being a shortcut to is_sure=False

bad8195

jamesbraza force-pushed the removing-cannot-answer branch from 67ef8e5 to 8b6aa61 Compare November 19, 2024 06:40

jamesbraza mentioned this pull request Nov 19, 2024

Created complete tool to allow unsure answers #684

Merged

mskarlin approved these changes Nov 19, 2024

dosubot bot added the lgtm This PR has been approved by a maintainer label Nov 19, 2024

Moved from is_sure to has_successful_answer to avoid discrediting non-exhaustive answers

2e3edb3

jamesbraza force-pushed the removing-cannot-answer branch from 8b6aa61 to 2e3edb3 Compare November 19, 2024 18:19

jamesbraza merged commit 201d364 into main Nov 19, 2024
5 checks passed

jamesbraza deleted the removing-cannot-answer branch November 19, 2024 18:38
