Pull Request: Update Hugging Face Integration for LLM Inference #383
base: main
Conversation
…ialization in HuggingFaceProvider
Walkthrough
The changes introduce HuggingFace as a new provider for language model inference and embedding generation in the backend. This includes new provider classes, configuration options, environment variable samples, and updates to the agent manager logic to support selecting HuggingFace based on configuration or environment flags.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant AgentManager
    participant HuggingFaceProvider
    participant HuggingFace API
    User->>AgentManager: Request model inference
    AgentManager->>HuggingFaceProvider: Select provider (if use_huggingface)
    HuggingFaceProvider->>HuggingFace API: POST prompt and parameters
    HuggingFace API-->>HuggingFaceProvider: Return generated text
    HuggingFaceProvider-->>AgentManager: Return response
    AgentManager-->>User: Return generated text
```

```mermaid
sequenceDiagram
    participant User
    participant EmbeddingManager
    participant HuggingFaceEmbeddingProvider
    participant SentenceTransformer
    User->>EmbeddingManager: Request text embedding
    EmbeddingManager->>HuggingFaceEmbeddingProvider: Select provider (if use_huggingface)
    HuggingFaceEmbeddingProvider->>SentenceTransformer: Encode text
    SentenceTransformer-->>HuggingFaceEmbeddingProvider: Return embedding
    HuggingFaceEmbeddingProvider-->>EmbeddingManager: Return embedding
    EmbeddingManager-->>User: Return embedding
```
Actionable comments posted: 7
🧹 Nitpick comments (3)
apps/backend/.env.sample (1)
6-9: Inconsistent environment variable naming convention.
The environment variables use mixed naming conventions - USE_HUGGINGFACE and HF_API_KEY are uppercase while hf_inference_url is lowercase. Consider using consistent naming for better maintainability.

Apply this diff to standardize the naming:

```diff
-hf_inference_url="https://api-inference.huggingface.co/models/"
+HF_INFERENCE_URL="https://api-inference.huggingface.co/models/"
```

apps/backend/app/agent/providers/huggingface.py (2)
4-4: Remove unused import.
The OpenAI import is not used in this file and should be removed.

Apply this diff:

```diff
-from openai import OpenAI
```
🧰 Tools
🪛 Ruff (0.11.9)
4-4: openai.OpenAI imported but unused
Remove unused import: openai.OpenAI
(F401)
48-52: Improve response validation with better error messages.
The response validation is good but could provide more specific error messages for debugging.

Apply this diff:

```diff
 if not response_data or not isinstance(response_data, list) or len(response_data) == 0:
-    raise ProviderError("Invalid response format from Hugging Face API")
+    raise ProviderError(f"Invalid response format from Hugging Face API: expected non-empty list, got {type(response_data)}")
 if "generated_text" not in response_data[0]:
-    raise ProviderError("Missing 'generated_text' in Hugging Face API response")
+    raise ProviderError(f"Missing 'generated_text' in Hugging Face API response. Available keys: {list(response_data[0].keys())}")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
apps/backend/.env.sample (1 hunks)
apps/backend/app/agent/manager.py (3 hunks)
apps/backend/app/agent/providers/huggingface.py (1 hunks)
apps/backend/app/core/config.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.11.9)
apps/backend/app/agent/providers/huggingface.py
4-4: openai.OpenAI imported but unused
Remove unused import: openai.OpenAI
(F401)
🪛 Pylint (3.3.7)
apps/backend/app/agent/providers/huggingface.py
[error] 7-7: Attempted relative import beyond top-level package
(E0402)
[error] 8-8: Attempted relative import beyond top-level package
(E0402)
[refactor] 12-12: Too few public methods (1/2)
(R0903)
[refactor] 56-56: Too few public methods (1/2)
(R0903)
🔇 Additional comments (4)
apps/backend/app/agent/manager.py (2)
8-8: LGTM: Clean import addition.
The import statement properly adds the new HuggingFace provider classes.
31-31: ✅ Verification successful
Verify the default HuggingFace model availability.
The default model "microsoft/Phi-3-mini-4k-instruct" should be verified to ensure it's available via the HuggingFace Inference API.
🌐 Web query:
Is the model "microsoft/Phi-3-mini-4k-instruct" available on Hugging Face Inference API?
💡 Result:
Yes, the model "microsoft/Phi-3-mini-4k-instruct" is available on Hugging Face and can be accessed via the Hugging Face Inference API.
- The official Hugging Face model card details the model and explicitly lists "4K [HF]" under its available formats, indicating support directly through Hugging Face[1].
- The model is also included in announcement materials specifying its availability on Hugging Face, Azure AI Studio, and other platforms[5].
- Multiple technical resources and guides reference using the model with Hugging Face, covering both Transformers library loading and ONNX Runtime support[7][10].
Therefore, "microsoft/Phi-3-mini-4k-instruct" is not only downloadable but also ready for inference via the Hugging Face Inference API and Hugging Face Spaces[1][5][7].
Citations:
- 1: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
- 2: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf
- 3: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx
- 4: https://huggingface.co/spaces/way2mhemanth/microsoft-Phi-3-mini-4k-instruct
- 5: https://azure.microsoft.com/en-us/blog/introducing-phi-3-redefining-whats-possible-with-slms/
- 6: https://techcommunity.microsoft.com/blog/machinelearningblog/affordable-innovation-unveiling-the-pricing-of-phi-3-slms-on-models-as-a-service/4156495
- 7: https://replicate.com/microsoft/phi-3-mini-4k-instruct/readme
- 8: https://huggingface.co/Sreenington/Phi-3-mini-4k-instruct-AWQ
- 9: https://github.com/marketplace/models/azureml/Phi-3-mini-4k-instruct
- 10: https://github.com/amd/RyzenAI-SW/blob/main/example/llm/cpu/Phi_3_mini_4k_instruct.md
Default Hugging Face model availability confirmed
The model "microsoft/Phi-3-mini-4k-instruct" is hosted on Hugging Face (https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and fully supported by the Inference API. No changes required.

apps/backend/app/agent/providers/huggingface.py (2)
15-17: LGTM: Proper API key validation.
Good practice to validate the API key presence during initialization and raise a clear error.

65-66: LGTM: Proper async embedding generation.
Good use of thread pool for CPU-intensive embedding generation and proper conversion to list format.
```python
use_huggingface: bool = False
hf_api_key: Optional[str]
hf_inference_url: str = "https://router.huggingface.co/hf-inference/models/"
```
💡 Verification agent
🧩 Analysis chain
URL inconsistency between config and environment sample.
The default hf_inference_url in config.py differs from the one in .env.sample:
- Config: "https://router.huggingface.co/hf-inference/models/"
- .env.sample: "https://api-inference.huggingface.co/models/"
This inconsistency could lead to configuration issues. Please verify which URL is correct and ensure consistency across both files.
🌐 Web query:
What is the correct base URL for Hugging Face Inference API in 2024?
💡 Result:
In 2024, the base URL for Hugging Face's Inference API is:
https://api-inference.huggingface.co/models/{model_id}
Replace {model_id} with the identifier of the model you wish to use. For example, to use the meta-llama/Meta-Llama-3-8B-Instruct model, the URL would be:
https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct
Ensure you include your Hugging Face API token in the request headers for authentication. For detailed information on using the Inference API, refer to the Hugging Face documentation. (huggingface.co)
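For reference, a minimal call against this endpoint looks roughly like the sketch below; the model id and prompt are placeholders, and the payload shape ({"inputs": ...}) follows the public Inference API documentation.

```python
# Hedged sketch of a direct Inference API call; model id and prompt are placeholders.
import os
import requests

url = "https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-4k-instruct"
headers = {"Authorization": f"Bearer {os.environ['HF_API_KEY']}"}
payload = {"inputs": "Explain retrieval-augmented generation in one sentence."}

resp = requests.post(url, headers=headers, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())  # typically a list like [{"generated_text": "..."}]
```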
Update default Hugging Face Inference API URL
Please align the default hf_inference_url in apps/backend/app/core/config.py with the official Hugging Face Inference API endpoint (https://api-inference.huggingface.co/models/), as confirmed by the docs. Also ensure .env.sample and any related documentation use the same URL.
Affected location:
apps/backend/app/core/config.py, line defining hf_inference_url
Suggested diff:
```diff
- hf_inference_url: str = "https://router.huggingface.co/hf-inference/models/"
+ hf_inference_url: str = "https://api-inference.huggingface.co/models/"
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-use_huggingface: bool = False
-hf_api_key: Optional[str]
-hf_inference_url: str = "https://router.huggingface.co/hf-inference/models/"
+use_huggingface: bool = False
+hf_api_key: Optional[str]
+hf_inference_url: str = "https://api-inference.huggingface.co/models/"
```
🤖 Prompt for AI Agents
In apps/backend/app/core/config.py around lines 18 to 21, the default
hf_inference_url is set to "https://router.huggingface.co/hf-inference/models/",
which is inconsistent with the official Hugging Face Inference API URL and the
.env.sample file. Update the hf_inference_url default value to
"https://api-inference.huggingface.co/models/" to match the official endpoint.
Also verify and update the .env.sample and any related documentation to use this
same URL for consistency.
```python
use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE") == "true")
if use_hf:
    model = kwargs.get("model", "microsoft/Phi-3-mini-4k-instruct")
    return HuggingFaceProvider(model_name=model)
```
🛠️ Refactor suggestion
Improve boolean environment variable parsing.
The current implementation only accepts lowercase "true" which is fragile. Consider using more robust boolean parsing that accepts common variations.
Apply this diff to improve boolean parsing:
```diff
-use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE") == "true")
+use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE", "false").lower() in ("true", "1", "yes", "on"))
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE") == "true")
+use_hf = kwargs.get(
+    "use_huggingface",
+    os.getenv("USE_HUGGINGFACE", "false")
+    .lower() in ("true", "1", "yes", "on")
+)
 if use_hf:
     model = kwargs.get("model", "microsoft/Phi-3-mini-4k-instruct")
     return HuggingFaceProvider(model_name=model)
```
🤖 Prompt for AI Agents
In apps/backend/app/agent/manager.py around lines 29 to 32, the environment
variable USE_HUGGINGFACE is currently checked by comparing its value to the
lowercase string "true", which is fragile. Improve this by normalizing the
environment variable value to lowercase and checking if it matches any common
truthy strings like "true", "1", or "yes". Update the use_hf assignment to parse
the environment variable more robustly to handle these variations.
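A shared helper along these lines would keep the parsing uniform across both call sites; the function name and placement are illustrative, not part of this PR.

```python
# Illustrative helper (not in this PR): centralizes truthy-string parsing of env flags.
import os

def env_flag(name: str, default: str = "false") -> bool:
    """Return True when the variable holds a common truthy value ("true", "1", "yes", "on")."""
    return os.getenv(name, default).strip().lower() in ("true", "1", "yes", "on")

# Both managers could then read:
# use_hf = kwargs.get("use_huggingface", env_flag("USE_HUGGINGFACE"))
```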
```python
use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE") == "true")
if use_hf:
    model = kwargs.get("embedding_model", "sentence-transformers/all-MiniLM-L6-v2")
    return HuggingFaceEmbeddingProvider(embedding_model=model)
```
🛠️ Refactor suggestion
Consistent boolean parsing needed.
Same boolean parsing issue exists here as in the AgentManager. Apply the same improvement for consistency.
Apply this diff:
```diff
-use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE") == "true")
+use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE", "false").lower() in ("true", "1", "yes", "on"))
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-use_hf = kwargs.get("use_huggingface", os.getenv("USE_HUGGINGFACE") == "true")
+use_hf = kwargs.get(
+    "use_huggingface",
+    os.getenv("USE_HUGGINGFACE", "false").lower() in ("true", "1", "yes", "on"),
+)
 if use_hf:
     model = kwargs.get("embedding_model", "sentence-transformers/all-MiniLM-L6-v2")
     return HuggingFaceEmbeddingProvider(embedding_model=model)
```
🤖 Prompt for AI Agents
In apps/backend/app/agent/manager.py around lines 64 to 67, the boolean parsing
for the "use_huggingface" flag is inconsistent and may cause errors. Update the
code to parse the environment variable "USE_HUGGINGFACE" using a consistent
method that correctly interprets string values like "true" or "false" as
booleans, matching the approach used in AgentManager. This ensures reliable and
uniform boolean handling across the codebase.
```python
response = requests.post(
    f"{os.getenv('HF_INFERENCE_URL')}{self._model_name}",
    headers=headers,
    json=payload
)
```
🛠️ Refactor suggestion
Add timeout to HTTP request for security and reliability.
The HTTP request lacks a timeout parameter, which could lead to hanging requests and potential security issues.
Apply this diff:
```diff
 response = requests.post(
     f"{os.getenv('HF_INFERENCE_URL')}{self._model_name}",
     headers=headers,
-    json=payload
+    json=payload,
+    timeout=30
 )
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 response = requests.post(
     f"{os.getenv('HF_INFERENCE_URL')}{self._model_name}",
     headers=headers,
-    json=payload
+    json=payload,
+    timeout=30
 )
```
🤖 Prompt for AI Agents
In apps/backend/app/agent/providers/huggingface.py around lines 41 to 45, the
requests.post call lacks a timeout parameter, which can cause the request to
hang indefinitely. Add a timeout argument to the requests.post call with a
reasonable value (e.g., timeout=10) to ensure the request fails gracefully if
the server does not respond in time.
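Beyond the timeout itself, the call could be wrapped so network failures surface as the provider's own ProviderError; this is a sketch only, assuming the surrounding method's headers, payload, and self._model_name.

```python
# Sketch only: headers, payload, self._model_name, and ProviderError come from the surrounding provider code.
try:
    response = requests.post(
        f"{os.getenv('HF_INFERENCE_URL')}{self._model_name}",
        headers=headers,
        json=payload,
        timeout=30,  # seconds; tune for your deployment
    )
    response.raise_for_status()
except requests.exceptions.Timeout as e:
    raise ProviderError("Hugging Face API request timed out") from e
except requests.exceptions.RequestException as e:
    raise ProviderError(f"Hugging Face API request failed: {e}") from e
```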
```python
self._api_key = api_key or os.getenv("HF_API_KEY")
if not self._api_key:
    raise ProviderError("Hugging Face API key is missing")
self._model = SentenceTransformer(embedding_model)
```
🛠️ Refactor suggestion
Add error handling for SentenceTransformer model loading.
The SentenceTransformer model loading could fail (network issues, invalid model name) but there's no explicit error handling during initialization.
Apply this diff:
```diff
-self._model = SentenceTransformer(embedding_model)
+try:
+    self._model = SentenceTransformer(embedding_model)
+except Exception as e:
+    raise ProviderError(f"Failed to load SentenceTransformer model '{embedding_model}': {e}") from e
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-self._model = SentenceTransformer(embedding_model)
+try:
+    self._model = SentenceTransformer(embedding_model)
+except Exception as e:
+    raise ProviderError(
+        f"Failed to load SentenceTransformer model '{embedding_model}': {e}"
+    ) from e
```
🤖 Prompt for AI Agents
In apps/backend/app/agent/providers/huggingface.py at line 61, the
SentenceTransformer model loading lacks error handling, which can cause
unhandled exceptions if loading fails. Wrap the SentenceTransformer
initialization in a try-except block to catch exceptions such as network errors
or invalid model names, and handle them appropriately by logging the error or
raising a custom exception with a clear message.
```python
opts = {
    "temperature": generation_args.get("temperature", 0.7),
    "top_p": generation_args.get("top_p", 0.9),
    "max_tokens": generation_args.get("max_length", 20000),
```
🛠️ Refactor suggestion
Parameter name inconsistency.
The parameter is named max_length
in the generation args but max_tokens
in the options dictionary. This inconsistency could cause confusion.
Apply this diff for consistency:
-"max_tokens": generation_args.get("max_length", 20000),
+"max_tokens": generation_args.get("max_tokens", 20000),
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
"max_tokens": generation_args.get("max_length", 20000), | |
"max_tokens": generation_args.get("max_tokens", 20000), |
🤖 Prompt for AI Agents
In apps/backend/app/agent/providers/huggingface.py at line 24, the parameter key
in the options dictionary is named "max_tokens" while the generation_args uses
"max_length". To fix this inconsistency, rename the key "max_tokens" to
"max_length" in the options dictionary to match the parameter name used in
generation_args, ensuring consistent naming throughout the code.
```python
}
# Send the request to the Hugging Face Inference API
response = requests.post(
    f"{os.getenv('HF_INFERENCE_URL')}{self._model_name}",
```
💡 Verification agent
🧩 Analysis chain
Environment variable name inconsistency.
The code uses HF_INFERENCE_URL but the config and .env.sample use different variations. This needs to be consistent across all files.
🏁 Script executed:
```bash
#!/bin/bash
# Check for environment variable usage consistency
echo "Checking HF_INFERENCE_URL usage:"
rg -n "HF_INFERENCE_URL|hf_inference_url" --type py
echo -e "\nChecking .env files:"
fd -e env -e sample | xargs cat | grep -i inference
```
Length of output: 476
🏁 Script executed:
```bash
#!/bin/bash
# Display the contents of config.py to verify how hf_inference_url is defined and its env var mapping
sed -n '1,200p' apps/backend/app/core/config.py
```
Length of output: 2000
```bash
#!/bin/bash
# Check for .env.sample for inference URL variable definitions
echo "Checking for .env.sample:"
if [[ -f .env.sample ]]; then
  echo ".env.sample contents (first 200 lines):"
  sed -n '1,200p' .env.sample
else
  echo ".env.sample not found at repository root"
fi
echo -e "\nSearching all files (including hidden) for 'inference_url':"
grep -R -n "inference_url" . --exclude-dir=.git --exclude-dir=node_modules
```
```bash
#!/bin/bash
# Locate and display any .env, .env.sample or similar files to verify inference URL vars
echo "Searching for environment files:"
find . -type f \( -name ".env.sample" -o -name ".env" -o -name ".env.*" \) \
  ! -path "./.git/*" ! -path "./node_modules/*" \
  -print \
  -exec echo "---- Contents of {} ----" \; \
  -exec sed -n '1,200p' {} \; \
  -exec echo "" \;
```
🏁 Script executed:
```bash
#!/bin/bash
# Locate and display any .env, .env.sample, or related environment files to verify variable names
find . -type f \( -iname ".env.sample" -o -iname ".env" -o -iname ".env.*" \) \
  -not -path "./.git/*" -not -path "./node_modules/*" | while read file; do
  echo "---- Contents of: $file ----"
  sed -n '1,200p' "$file"
  echo
done
```
Length of output: 806
Standardize HF_INFERENCE_URL across config, .env.sample, and provider
The backend config defines a hf_inference_url setting (env var HF_INFERENCE_URL), but the .env.sample uses lowercase hf_inference_url and the provider directly calls os.getenv. Let's align everything:
• apps/backend/.env.sample
  - Rename the key to HF_INFERENCE_URL to match Pydantic's expected env var.
• apps/backend/app/agent/providers/huggingface.py
  - Import and use settings.hf_inference_url instead of os.getenv('HF_INFERENCE_URL') so all env var logic is centralized.
Locations to update:
apps/backend/.env.sample
apps/backend/app/agent/providers/huggingface.py
Suggested diffs:
```diff
--- a/apps/backend/.env.sample
+++ b/apps/backend/.env.sample
@@ # HuggingFace Configuration
-USE_HUGGINGFACE=false
-HF_API_KEY=""
-hf_inference_url="https://api-inference.huggingface.co/models/"
+USE_HUGGINGFACE=false
+HF_API_KEY=""
+HF_INFERENCE_URL="https://api-inference.huggingface.co/models/"
--- a/apps/backend/app/agent/providers/huggingface.py
+++ b/apps/backend/app/agent/providers/huggingface.py
@@
-from os import getenv
+from core.config import settings
@@ def _build_url(self) -> str:
-    return f"{os.getenv('HF_INFERENCE_URL')}{self._model_name}"
+    return f"{settings.hf_inference_url}{self._model_name}"
```
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In apps/backend/app/agent/providers/huggingface.py at line 42 and
apps/backend/.env.sample, standardize the environment variable name for the
inference URL to HF_INFERENCE_URL (uppercase) to match the backend config.
Update the .env.sample file to rename the key to HF_INFERENCE_URL, and in
huggingface.py replace direct os.getenv calls with importing and using
settings.hf_inference_url to centralize environment variable access and ensure
consistency.
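For context, the lowercase field names in config.py map to the uppercase environment variables through pydantic's settings machinery (env lookup is case-insensitive by default); below is a minimal sketch of the assumed Settings shape, which may differ from the actual file.

```python
# Minimal sketch of the assumed Settings shape; field names mirror the snippet under review.
from typing import Optional
from pydantic_settings import BaseSettings  # pydantic v2; pydantic v1 exposes BaseSettings from pydantic itself

class Settings(BaseSettings):
    use_huggingface: bool = False
    hf_api_key: Optional[str] = None
    hf_inference_url: str = "https://api-inference.huggingface.co/models/"

settings = Settings()  # HF_INFERENCE_URL in the environment populates hf_inference_url
```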
Thanks for the PR and feature addition. I'll discuss this with the team whether we'll be implementing this feature or not.
Overview
This PR implements the Hugging Face provider for text generation using the Hugging Face Inference API. It includes:
- HuggingFaceProvider class that handles text generation with configurable parameters.
- HuggingFaceEmbeddingProvider class for text embeddings using SentenceTransformer.

Implementation Details
- Requires the HF_API_KEY environment variable.
- Set the USE_HUGGINGFACE environment variable to true if you intend to use a HuggingFace model.

Testing
Please verify the implementation works with:
Dependencies
Added sentence_transformers package for embeddings. Hence, run the following command if you intend to use HuggingFace.
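The command is most likely `pip install sentence-transformers` (or the project's equivalent dependency-manager invocation).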
Summary by CodeRabbit
New Features
Chores