
Pull Request: Update Hugging Face Integration for LLM Inference #383


Open: pravinpaudel wants to merge 4 commits into main.

Conversation


@pravinpaudel pravinpaudel commented Jun 9, 2025

Overview

This PR implements the Hugging Face provider for text generation using the Hugging Face Inference API. It includes:

  • A HuggingFaceProvider class that handles text generation with configurable parameters.
  • A HuggingFaceEmbeddingProvider class for text embeddings using SentenceTransformer.
  • Proper error handling and API request structuring.

Implementation Details

  • Uses direct HTTP requests to the Hugging Face Inference API.
  • Configurable Parameters: temperature, top_p, and max tokens.
  • Default Model: "microsoft/Phi-3-mini-4k-instruct" for text generation.
  • Default Embedding Model: "sentence-transformers/all-MiniLM-L6-v2".
  • Requires the HF_API_KEY environment variable.
  • Set the USE_HUGGINGFACE environment variable to true if you intend to use a Hugging Face model.
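
For orientation, the core request/response flow the provider implements can be sketched as follows. This is a minimal illustration, not the PR's exact code: the helper name, the module-level constant, and the max_new_tokens parameter name are assumptions; only the endpoint pattern, bearer-token header, defaults, and the [{"generated_text": ...}] response shape follow the description above.

import os
import requests

HF_INFERENCE_URL = os.getenv("HF_INFERENCE_URL", "https://api-inference.huggingface.co/models/")

def generate_text(prompt, model="microsoft/Phi-3-mini-4k-instruct",
                  temperature=0.7, top_p=0.9, max_tokens=512):
    # Illustrative sketch: POST the prompt and sampling parameters to the Inference API.
    headers = {"Authorization": f"Bearer {os.environ['HF_API_KEY']}"}
    payload = {
        "inputs": prompt,
        "parameters": {"temperature": temperature, "top_p": top_p, "max_new_tokens": max_tokens},
    }
    response = requests.post(f"{HF_INFERENCE_URL}{model}", headers=headers, json=payload, timeout=30)
    response.raise_for_status()
    data = response.json()
    # The API returns a list of generations; the first entry carries "generated_text".
    return data[0]["generated_text"]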

Testing

Please verify the implementation works with:

  • Different Hugging Face models (make sure the chosen model is accessible with your API key).
  • Various input prompts and parameter configurations.
  • Error scenarios (invalid API key, network issues).
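
For the error-scenario checks, a pytest-style sketch that stubs out the HTTP call is often the quickest starting point. The import path, the synchronous generate() method, and the constructor signature below are assumptions about the PR's code, so adjust them to the actual provider API:

from unittest.mock import MagicMock, patch

import pytest

def test_invalid_response_raises_provider_error(monkeypatch):
    # Assumed import path and method names; see the note above.
    from app.agent.providers.huggingface import HuggingFaceProvider, ProviderError

    monkeypatch.setenv("HF_API_KEY", "dummy-key")
    monkeypatch.setenv("HF_INFERENCE_URL", "https://api-inference.huggingface.co/models/")

    fake_response = MagicMock()
    fake_response.json.return_value = {"error": "model is loading"}  # not the expected list payload

    with patch("app.agent.providers.huggingface.requests.post", return_value=fake_response):
        provider = HuggingFaceProvider(model_name="microsoft/Phi-3-mini-4k-instruct")
        with pytest.raises(ProviderError):
            provider.generate("Hello, world")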

Dependencies

Added the sentence-transformers package for embeddings, so run the following command if you intend to use Hugging Face:

pip install -U sentence-transformers
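
For reference, the embedding side reduces to a SentenceTransformer encode call on the default model named above; a minimal sketch (the provider's wrapper class is omitted here):

from sentence_transformers import SentenceTransformer

# Downloads the model from the Hub on first use.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

embedding = model.encode("Experienced Python developer with an NLP background")
print(len(embedding))          # 384 dimensions for all-MiniLM-L6-v2
print(embedding[:5].tolist())  # first few components of the vector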

Summary by CodeRabbit

  • New Features

    • Added integration with Hugging Face for model inference and embedding generation, allowing users to select Hugging Face as a provider.
    • New configuration options for Hugging Face usage, including environment variables for enabling the provider, setting the API key, and specifying the inference URL.
  • Chores

    • Updated sample environment variable file to include Hugging Face-related entries.


coderabbitai bot commented Jun 9, 2025

Walkthrough

The changes introduce HuggingFace as a new provider for language model inference and embedding generation in the backend. This includes new provider classes, configuration options, environment variable samples, and updates to the agent manager logic to support selecting HuggingFace based on configuration or environment flags.

Changes

  • apps/backend/app/agent/providers/huggingface.py: Added new module with HuggingFaceProvider and HuggingFaceEmbeddingProvider classes for HuggingFace support.
  • apps/backend/app/agent/manager.py: Updated to support HuggingFace providers in provider selection logic and type annotations.
  • apps/backend/app/core/config.py: Added HuggingFace-related fields to the Settings configuration class.
  • apps/backend/.env.sample: Added HuggingFace configuration variables: usage flag, API key, and inference URL.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant AgentManager
    participant HuggingFaceProvider
    participant HuggingFace API

    User->>AgentManager: Request model inference
    AgentManager->>HuggingFaceProvider: Select provider (if use_huggingface)
    HuggingFaceProvider->>HuggingFace API: POST prompt and parameters
    HuggingFace API-->>HuggingFaceProvider: Return generated text
    HuggingFaceProvider-->>AgentManager: Return response
    AgentManager-->>User: Return generated text
sequenceDiagram
    participant User
    participant EmbeddingManager
    participant HuggingFaceEmbeddingProvider
    participant SentenceTransformer

    User->>EmbeddingManager: Request text embedding
    EmbeddingManager->>HuggingFaceEmbeddingProvider: Select provider (if use_huggingface)
    HuggingFaceEmbeddingProvider->>SentenceTransformer: Encode text
    SentenceTransformer-->>HuggingFaceEmbeddingProvider: Return embedding
    HuggingFaceEmbeddingProvider-->>EmbeddingManager: Return embedding
    EmbeddingManager-->>User: Return embedding

Poem

In the warren of code, a new path appears,
HuggingFace hops in, greeted with cheers!
Providers now choose with a flick of the ear,
Models and embeddings, HuggingFace draws near.
Configs and secrets, all neatly in place—
A leap for the backend, at a rabbit’s brisk pace!
🐇✨



coderabbitai bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (3)
apps/backend/.env.sample (1)

6-9: Inconsistent environment variable naming convention.

The environment variables use mixed naming conventions - USE_HUGGINGFACE and HF_API_KEY are uppercase while hf_inference_url is lowercase. Consider using consistent naming for better maintainability.

Apply this diff to standardize the naming:

-hf_inference_url="https://api-inference.huggingface.co/models/"
+HF_INFERENCE_URL="https://api-inference.huggingface.co/models/"
apps/backend/app/agent/providers/huggingface.py (2)

4-4: Remove unused import.

The OpenAI import is not used in this file and should be removed.

Apply this diff:

-from openai import OpenAI
🧰 Tools
🪛 Ruff (0.11.9)

4-4: openai.OpenAI imported but unused

Remove unused import: openai.OpenAI

(F401)


48-52: Improve response validation with better error messages.

The response validation is good but could provide more specific error messages for debugging.

Apply this diff:

if not response_data or not isinstance(response_data, list) or len(response_data) == 0:
-   raise ProviderError("Invalid response format from Hugging Face API")
+   raise ProviderError(f"Invalid response format from Hugging Face API: expected non-empty list, got {type(response_data)}")
if "generated_text" not in response_data[0]:
-   raise ProviderError("Missing 'generated_text' in Hugging Face API response")
+   raise ProviderError(f"Missing 'generated_text' in Hugging Face API response. Available keys: {list(response_data[0].keys())}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between efbc078 and 63187b9.

📒 Files selected for processing (4)
  • apps/backend/.env.sample (1 hunks)
  • apps/backend/app/agent/manager.py (3 hunks)
  • apps/backend/app/agent/providers/huggingface.py (1 hunks)
  • apps/backend/app/core/config.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.11.9)
apps/backend/app/agent/providers/huggingface.py

4-4: openai.OpenAI imported but unused

Remove unused import: openai.OpenAI

(F401)

🪛 Pylint (3.3.7)
apps/backend/app/agent/providers/huggingface.py

[error] 7-7: Attempted relative import beyond top-level package

(E0402)


[error] 8-8: Attempted relative import beyond top-level package

(E0402)


[refactor] 12-12: Too few public methods (1/2)

(R0903)


[refactor] 56-56: Too few public methods (1/2)

(R0903)

🔇 Additional comments (4)
apps/backend/app/agent/manager.py (2)

8-8: LGTM: Clean import addition.

The import statement properly adds the new HuggingFace provider classes.


31-31:

✅ Verification successful

Verify the default HuggingFace model availability.

The default model "microsoft/Phi-3-mini-4k-instruct" should be verified to ensure it's available via the HuggingFace Inference API.


🌐 Web query:

Is the model "microsoft/Phi-3-mini-4k-instruct" available on Hugging Face Inference API?

💡 Result:

Yes, the model "microsoft/Phi-3-mini-4k-instruct" is available on Hugging Face and can be accessed via the Hugging Face Inference API.

  • The official Hugging Face model card details the model and explicitly lists "4K [HF]" under its available formats, indicating support directly through Hugging Face[1].
  • The model is also included in announcement materials specifying its availability on Hugging Face, Azure AI Studio, and other platforms[5].
  • Multiple technical resources and guides reference using the model with Hugging Face, covering both Transformers library loading and ONNX Runtime support[7][10].

Therefore, "microsoft/Phi-3-mini-4k-instruct" is not only downloadable but also ready for inference via the Hugging Face Inference API and Hugging Face Spaces[1][5][7].



Default Hugging Face model availability confirmed
The model "microsoft/Phi-3-mini-4k-instruct" is hosted on Hugging Face (https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and fully supported by the Inference API. No changes required.

apps/backend/app/agent/providers/huggingface.py (2)

15-17: LGTM: Proper API key validation.

Good practice to validate the API key presence during initialization and raise a clear error.


65-66: LGTM: Proper async embedding generation.

Good use of thread pool for CPU-intensive embedding generation and proper conversion to list format.
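
For readers unfamiliar with the pattern being praised here, a sketch of offloading the blocking encode() call to a thread pool and converting the result to a plain list; the class and method names are illustrative, not the PR's exact code:

import asyncio

from sentence_transformers import SentenceTransformer

class EmbeddingSketch:
    def __init__(self, model_name="sentence-transformers/all-MiniLM-L6-v2"):
        self._model = SentenceTransformer(model_name)

    async def embed(self, text: str) -> list[float]:
        # encode() is CPU-bound, so run it off the event loop in a worker thread.
        vector = await asyncio.to_thread(self._model.encode, text)
        return vector.tolist()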

Comment on lines +18 to +21
use_huggingface: bool = False
hf_api_key: Optional[str]
hf_inference_url: str = "https://router.huggingface.co/hf-inference/models/"


💡 Verification agent

🧩 Analysis chain

URL inconsistency between config and environment sample.

The default hf_inference_url in config.py differs from the one in .env.sample:

  • Config: "https://router.huggingface.co/hf-inference/models/"
  • .env.sample: "https://api-inference.huggingface.co/models/"

This inconsistency could lead to configuration issues. Please verify which URL is correct and ensure consistency across both files.


🌐 Web query:

What is the correct base URL for Hugging Face Inference API in 2024?

💡 Result:

In 2024, the base URL for Hugging Face's Inference API is:

https://api-inference.huggingface.co/models/{model_id}

Replace {model_id} with the identifier of the model you wish to use. For example, to use the meta-llama/Meta-Llama-3-8B-Instruct model, the URL would be:

https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct

Ensure you include your Hugging Face API token in the request headers for authentication. For detailed information on using the Inference API, refer to the Hugging Face documentation. (huggingface.co)



Update default Hugging Face Inference API URL

Please align the default hf_inference_url in apps/backend/app/core/config.py with the official Hugging Face Inference API endpoint (https://api-inference.huggingface.co/models/), as confirmed by the docs. Also ensure .env.sample and any related documentation use the same URL.

Affected location:

  • apps/backend/app/core/config.py, line defining hf_inference_url

Suggested diff:

- hf_inference_url: str = "https://router.huggingface.co/hf-inference/models/"
+ hf_inference_url: str = "https://api-inference.huggingface.co/models/"
🤖 Prompt for AI Agents
In apps/backend/app/core/config.py around lines 18 to 21, the default
hf_inference_url is set to "https://router.huggingface.co/hf-inference/models/",
which is inconsistent with the official Hugging Face Inference API URL and the
.env.sample file. Update the hf_inference_url default value to
"https://api-inference.huggingface.co/models/" to match the official endpoint.
Also verify and update the .env.sample and any related documentation to use this
same URL for consistency.
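
To make the intended end state concrete, a sketch of the Settings fields with the aligned default URL. The field names follow the snippet quoted above; the pydantic-settings import, the None default for hf_api_key, and everything else about the class are assumptions:

from typing import Optional

from pydantic_settings import BaseSettings  # pydantic v2; in v1 this was `from pydantic import BaseSettings`

class Settings(BaseSettings):
    # Env var matching is case-insensitive by default, but keeping the .env keys
    # uppercase (USE_HUGGINGFACE, HF_API_KEY, HF_INFERENCE_URL) avoids confusion.
    use_huggingface: bool = False
    hf_api_key: Optional[str] = None  # default added here so the field is optional when HF is unused
    hf_inference_url: str = "https://api-inference.huggingface.co/models/"

settings = Settings()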

Comment on lines +29 to +32
use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE") == "true")
if use_hf:
    model = kwargs.get("model", "microsoft/Phi-3-mini-4k-instruct")
    return HuggingFaceProvider(model_name=model)

🛠️ Refactor suggestion

Improve boolean environment variable parsing.

The current implementation only accepts lowercase "true" which is fragile. Consider using more robust boolean parsing that accepts common variations.

Apply this diff to improve boolean parsing:

-use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE") == "true")
+use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE", "false").lower() in ("true", "1", "yes", "on"))
🤖 Prompt for AI Agents
In apps/backend/app/agent/manager.py around lines 29 to 32, the environment
variable USE_HUGGINGFACE is currently checked by comparing its value to the
lowercase string "true", which is fragile. Improve this by normalizing the
environment variable value to lowercase and checking if it matches any common
truthy strings like "true", "1", or "yes". Update the use_hf assignment to parse
the environment variable more robustly to handle these variations.
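
Since the same parsing appears again in the embedding manager (next comment), a small shared helper is one way to keep the two call sites consistent; this is a sketch of that idea, not code from the PR:

import os

def env_flag(name: str, default: bool = False) -> bool:
    # Treat common truthy spellings of an environment variable as True.
    value = os.getenv(name)
    if value is None:
        return default
    return value.strip().lower() in ("true", "1", "yes", "on")

# Both call sites could then read:
# use_hf = kwargs.get("use_huggingface", env_flag("USE_HUGGINGFACE"))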

Comment on lines +64 to +67
use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE") == "true")
if use_hf:
    model = kwargs.get("embedding_model", "sentence-transformers/all-MiniLM-L6-v2")
    return HuggingFaceEmbeddingProvider(embedding_model=model)

🛠️ Refactor suggestion

Consistent boolean parsing needed.

Same boolean parsing issue exists here as in the AgentManager. Apply the same improvement for consistency.

Apply this diff:

-use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE") == "true")
+use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE", "false").lower() in ("true", "1", "yes", "on"))
🤖 Prompt for AI Agents
In apps/backend/app/agent/manager.py around lines 64 to 67, the boolean parsing
for the "use_huggingface" flag is inconsistent and may cause errors. Update the
code to parse the environment variable "USE_HUGGINGFACE" using a consistent
method that correctly interprets string values like "true" or "false" as
booleans, matching the approach used in AgentManager. This ensures reliable and
uniform boolean handling across the codebase.

Comment on lines +41 to +45
response = requests.post(
    f"{os.getenv('HF_INFERENCE_URL')}{self._model_name}",
    headers=headers,
    json=payload
)

🛠️ Refactor suggestion

Add timeout to HTTP request for security and reliability.

The HTTP request lacks a timeout parameter, which could lead to hanging requests and potential security issues.

Apply this diff:

response = requests.post(
    f"{os.getenv('HF_INFERENCE_URL')}{self._model_name}",
    headers=headers,
-   json=payload
+   json=payload,
+   timeout=30
)
🤖 Prompt for AI Agents
In apps/backend/app/agent/providers/huggingface.py around lines 41 to 45, the
requests.post call lacks a timeout parameter, which can cause the request to
hang indefinitely. Add a timeout argument to the requests.post call with a
reasonable value (e.g., timeout=10) to ensure the request fails gracefully if
the server does not respond in time.
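
Beyond adding the timeout, request failures (timeouts, connection errors, non-2xx responses) are typically surfaced as the provider's own exception so callers see one error type. A sketch of that pattern, assuming the PR's ProviderError is in scope; the wrapper function itself is illustrative:

import requests

# ProviderError is the PR's exception type, assumed to be imported in the provider module.
def post_inference(url, headers, payload, timeout=30.0):
    # Illustrative wrapper: POST to the Inference API and normalize failures to ProviderError.
    try:
        response = requests.post(url, headers=headers, json=payload, timeout=timeout)
        response.raise_for_status()
    except requests.exceptions.Timeout as e:
        raise ProviderError(f"Hugging Face API request timed out after {timeout}s") from e
    except requests.exceptions.RequestException as e:
        raise ProviderError(f"Hugging Face API request failed: {e}") from e
    return response.json()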

self._api_key = api_key or os.getenv("HF_API_KEY")
if not self._api_key:
    raise ProviderError("Hugging Face API key is missing")
self._model = SentenceTransformer(embedding_model)

🛠️ Refactor suggestion

Add error handling for SentenceTransformer model loading.

The SentenceTransformer model loading could fail (network issues, invalid model name) but there's no explicit error handling during initialization.

Apply this diff:

-self._model = SentenceTransformer(embedding_model)
+try:
+    self._model = SentenceTransformer(embedding_model)
+except Exception as e:
+    raise ProviderError(f"Failed to load SentenceTransformer model '{embedding_model}': {e}") from e
🤖 Prompt for AI Agents
In apps/backend/app/agent/providers/huggingface.py at line 61, the
SentenceTransformer model loading lacks error handling, which can cause
unhandled exceptions if loading fails. Wrap the SentenceTransformer
initialization in a try-except block to catch exceptions such as network errors
or invalid model names, and handle them appropriately by logging the error or
raising a custom exception with a clear message.

opts = {
    "temperature": generation_args.get("temperature", 0.7),
    "top_p": generation_args.get("top_p", 0.9),
    "max_tokens": generation_args.get("max_length", 20000),

🛠️ Refactor suggestion

Parameter name inconsistency.

The parameter is named max_length in the generation args but max_tokens in the options dictionary. This inconsistency could cause confusion.

Apply this diff for consistency:

-"max_tokens": generation_args.get("max_length", 20000),
+"max_tokens": generation_args.get("max_tokens", 20000),
🤖 Prompt for AI Agents
In apps/backend/app/agent/providers/huggingface.py at line 24, the options
dictionary uses the key "max_tokens" while the value is read from
generation_args under the key "max_length". To fix this inconsistency, read the
value with generation_args.get("max_tokens", ...) as in the diff above (or,
alternatively, rename the option key to "max_length"), so that a single
parameter name is used consistently throughout the code.

}
# Send the request to the Hugging Face Inference API
response = requests.post(
    f"{os.getenv('HF_INFERENCE_URL')}{self._model_name}",

💡 Verification agent

🧩 Analysis chain

Environment variable name inconsistency.

The code uses HF_INFERENCE_URL but the config and .env.sample use different variations. This needs to be consistent across all files.


🏁 Script executed:

#!/bin/bash
# Check for environment variable usage consistency
echo "Checking HF_INFERENCE_URL usage:"
rg -n "HF_INFERENCE_URL|hf_inference_url" --type py

echo -e "\nChecking .env files:"
fd -e env -e sample | xargs cat | grep -i inference

Length of output: 476


🏁 Script executed:

#!/bin/bash
# Display the contents of config.py to verify how hf_inference_url is defined and its env var mapping
sed -n '1,200p' apps/backend/app/core/config.py

Length of output: 2000


#!/bin/bash
# Check for .env.sample for inference URL variable definitions
echo "Checking for .env.sample:"
if [[ -f .env.sample ]]; then
  echo ".env.sample contents (first 200 lines):"
  sed -n '1,200p' .env.sample
else
  echo ".env.sample not found at repository root"
fi

echo -e "\nSearching all files (including hidden) for 'inference_url':"
grep -R -n "inference_url" . --exclude-dir=.git --exclude-dir=node_modules

#!/bin/bash
# Locate and display any .env, .env.sample or similar files to verify inference URL vars
echo "Searching for environment files:"
find . -type f \( -name ".env.sample" -o -name ".env" -o -name ".env.*" \) \
  ! -path "./.git/*" ! -path "./node_modules/*" \
  -print \
  -exec echo "---- Contents of {} ----" \; \
  -exec sed -n '1,200p' {} \; \
  -exec echo "" \;

🏁 Script executed:

#!/bin/bash
# Locate and display any .env, .env.sample, or related environment files to verify variable names
find . -type f \( -iname ".env.sample" -o -iname ".env" -o -iname ".env.*" \) \
  -not -path "./.git/*" -not -path "./node_modules/*" | while read file; do
  echo "---- Contents of: $file ----"
  sed -n '1,200p' "$file"
  echo
done

Length of output: 806


Standardize HF_INFERENCE_URL across config, .env.sample, and provider

The backend config defines a hf_inference_url setting (env var HF_INFERENCE_URL), but the .env.sample uses lowercase hf_inference_url and the provider directly calls os.getenv. Let’s align everything:

  • apps/backend/.env.sample
    • Rename the key to HF_INFERENCE_URL to match Pydantic's expected env var.
  • apps/backend/app/agent/providers/huggingface.py
    • Import and use settings.hf_inference_url instead of os.getenv('HF_INFERENCE_URL') so all env var logic is centralized.

Locations to update:

  • apps/backend/.env.sample
  • apps/backend/app/agent/providers/huggingface.py

Suggested diffs:

--- a/apps/backend/.env.sample
+++ b/apps/backend/.env.sample
@@ # HuggingFace Configuration
-USE_HUGGINGFACE=false
-HF_API_KEY=""
-hf_inference_url="https://api-inference.huggingface.co/models/"
+USE_HUGGINGFACE=false
+HF_API_KEY=""
+HF_INFERENCE_URL="https://api-inference.huggingface.co/models/"
--- a/apps/backend/app/agent/providers/huggingface.py
+++ b/apps/backend/app/agent/providers/huggingface.py
@@
-from os import getenv
+from core.config import settings
@@ def _build_url(self) -> str:
-    return f"{os.getenv('HF_INFERENCE_URL')}{self._model_name}"
+    return f"{settings.hf_inference_url}{self._model_name}"

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In apps/backend/app/agent/providers/huggingface.py at line 42 and
apps/backend/.env.sample, standardize the environment variable name for the
inference URL to HF_INFERENCE_URL (uppercase) to match the backend config.
Update the .env.sample file to rename the key to HF_INFERENCE_URL, and in
huggingface.py replace direct os.getenv calls with importing and using
settings.hf_inference_url to centralize environment variable access and ensure
consistency.

@srbhr (Owner) commented Jun 29, 2025

Thanks for the PR and feature addition. I'll discuss this with the team whether we'll be implementing this feature or not.
