
Commit ae98ae6

Merge branch 'develop' into 'fb-optic-2152/zoom-presets-option-not-optimized-for-dark-mode'
Workflow run: https://github.com/HumanSignal/label-studio/actions/runs/14932391987
2 parents 9e5f355 + 524f640 commit ae98ae6

32 files changed: +464 / -130 lines changed

docs/source/guide/helm_values.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ short: Available Helm values
tier: all
type: guide
order: 72
-order_enterprise: 72
+order_enterprise: 74
meta_title: Available Helm values for Label Studio Helm Chart
meta_description: For cases when you want to customize your Label Studio Kubernetes deployment, review these available Helm values that you can set in your Helm chart.
section: "Install & Setup"

docs/source/guide/install_k8s_airgapped.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ short: Airgapped Server
tier: all
type: guide
order: 71
-order_enterprise: 71
+order_enterprise: 73
meta_title: Install Label Studio without public internet access
meta_description: Install Label Studio without public internet access to create machine learning and data science projects in an airgapped environment.
section: "Install & Setup"

docs/source/guide/install_prompts.md

Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
---
title: Install Prompts in an on-prem environment (optional)
short: Install Prompts
type: guide
tier: enterprise
order: 0
order_enterprise: 71
meta_title: Install Prompts
meta_description: Install Prompts in a Label Studio Enterprise on-prem environment
section: "Install & Setup"
parent: "install_k8s"
parent_enterprise: "install_enterprise_k8s"
---

Installing Prompts in an on-prem environment requires installing Adala, our data labeling agent microservice.

You only need to complete these steps if you want to use Prompts. For more information, see our [Prompts overview](prompts_overview).

## Prerequisites

- Kubernetes cluster **v1.24** or later
- Helm **v3.8.0** or later
- Docker CLI (for logging in to Docker Hub)

## Resource requirements

Before installing, ensure your Kubernetes cluster can provide the following minimum resources for Adala:

| Resource | Requirement |
| --- | --- |
| CPU | 6 cores |
| Memory | 12 GB |

## 1. Authenticate to Docker Hub and validate access

You will need your Docker Hub username and password. If you do not have them, [request access from the HumanSignal team](mailto:[email protected]).

Log in to Docker Hub to access the private OCI repository:

```bash
docker login -u CUSTOMER_USERNAME
```

When prompted, enter your Docker Hub password.

Then verify your credentials and access:

```bash
helm pull oci://registry-1.docker.io/heartexlabs/adala
```

Expected output:

```bash
Pulled: registry-1.docker.io/heartexlabs/adala:X.X.X
Digest: sha256:***************************************************
```

## 2. Create a Kubernetes secret for image pulling

Create a Kubernetes secret to allow your cluster to pull private Adala images:

```bash
kubectl create secret docker-registry heartex-pull-key \
  --docker-server=https://index.docker.io/v2/ \
  --docker-username=CUSTOMER_USERNAME \
  --docker-password=CUSTOMER_PASSWORD
```

## 3. Prepare your custom values file

Create a file named `custom.values.yaml` with the following contents:

```yaml
adala-app:
  deployment:
    image:
      tag: 20250428.151611-master-592e818
    pullSecrets:
      - heartex-pull-key
adala-worker:
  deployment:
    image:
      tag: 20250428.151611-master-592e818
    pullSecrets:
      - heartex-pull-key
```

!!! note
    Replace the `image.tag` value with the appropriate version if necessary.

## 4. Create a dedicated namespace for Adala

Create a dedicated namespace `prompt` for Adala:

```bash
kubectl create namespace prompt
```

## 5. Install the Adala Helm chart

Run the following command to install **Adala** using your custom values:

```bash
helm install lse oci://registry-1.docker.io/heartexlabs/adala --values custom.values.yaml
```

## 6. Validate that Adala is running

Check that all pods in the `prompt` namespace are in the **Running** or **Completed** state:

```bash
kubectl get pods -n prompt
```

You should see output where all pods have `STATUS` set to `Running`, for example:

```
NAME                                  READY   STATUS    RESTARTS   AGE
adala-adala-app-d4564ffd7-gtmhx       1/1     Running   0          100m
adala-adala-kafka-controller-0        1/1     Running   0          110m
adala-adala-kafka-controller-1        1/1     Running   0          111m
adala-adala-kafka-controller-2        1/1     Running   0          113m
adala-adala-redis-master-0            1/1     Running   0          125m
adala-adala-worker-5d87f97f76-mq952   1/1     Running   0          111m
```

If any pod is not running, you can investigate further:

```bash
kubectl describe pod <pod-name> -n prompt
```

or

```bash
kubectl logs <pod-name> -n prompt
```

## 7. Update the Label Studio `values.yaml` file

You will need to update the `global` section of your Label Studio Enterprise `values.yaml` file to include the following:

* The Adala endpoint, which allows Label Studio to connect to Adala.
* The Prompts feature flag, which enables Prompts visibility within Label Studio.

```yaml
global:
  extraEnvironmentVars:
    PROMPTER_ADALA_URL: http://adala-adala-app.prompt:8000
  featureFlags:
    fflag_feat_all_dia_835_prompter_workflow_long: true
```

Note the following for `PROMPTER_ADALA_URL`:

- `prompt` is the namespace where Adala is installed.
- `adala-adala-app` is the name of the Adala service automatically created by the Helm release.
- Port `8000` is the default port where Adala listens.

After updating the values file, redeploy Label Studio to apply the changes.
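
Optionally, you can sanity-check that the Adala endpoint is reachable. The snippet below is only an illustrative sketch, not part of the official install steps: it assumes the service name, namespace, and port shown above, uses only the Python standard library, and should be run from a pod or host that can reach the `prompt` namespace (for example, the Label Studio app pod).

```python
# Illustrative connectivity check for the Adala endpoint (assumption: the
# service name, namespace, and port match the PROMPTER_ADALA_URL shown above).
import urllib.error
import urllib.request

ADALA_URL = "http://adala-adala-app.prompt:8000"

try:
    with urllib.request.urlopen(ADALA_URL, timeout=5) as resp:
        print(f"Adala reachable: HTTP {resp.status}")
except urllib.error.HTTPError as e:
    # Any HTTP response, even an error status, still proves the service answers.
    print(f"Adala reachable: HTTP {e.code}")
except (urllib.error.URLError, OSError) as e:
    print(f"Could not reach Adala at {ADALA_URL}: {e}")
```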

docs/source/guide/prompts_overview.md

Lines changed: 4 additions & 1 deletion
@@ -38,10 +38,13 @@ With Prompts, you can:
| **Network access** | If you are using a firewall or restricting network access to your OpenAI models, you will need to allow the following IPs: <br>3.219.3.197 <br>34.237.73.3 <br>4.216.17.242 |
| **Required permissions** | **Owners, Administrators, Managers** -- Can create Prompt models and update projects with auto-annotations. Managers can only apply models to projects in which they are already a member. <br><br>**Reviewers and Annotators** -- No access to the Prompts tool, but can see the predictions generated by the prompts from within the project (depending on your [project settings](project_settings_lse)). |
| **ML backend support** | Prompts should not be used with a project that is connected to an ML backend, as this can affect how certain evaluation metrics are calculated. |
-| **Enterprise vs. Open Source** | Label Studio Enterprise (Cloud only)<br />Starter Cloud|
+| **Enterprise vs. Open Source** | Label Studio Enterprise <br />Starter Cloud|

</div>

+!!! note
+    For information on installing Prompts for on-prem environments, see [Install Prompts](install_prompts).
+
## Supported base models

<div class="noheader rowheader">

label_studio/feature_flags.json

Lines changed: 30 additions & 3 deletions
@@ -2364,6 +2364,33 @@
    "version": 4,
    "deleted": false
  },
+  "fflag_feat_front_leap_2036_annotations_summary": {
+    "key": "fflag_feat_front_leap_2036_annotations_summary",
+    "on": false,
+    "prerequisites": [],
+    "targets": [],
+    "contextTargets": [],
+    "rules": [],
+    "fallthrough": {
+      "variation": 0
+    },
+    "offVariation": 1,
+    "variations": [
+      true,
+      false
+    ],
+    "clientSideAvailability": {
+      "usingMobileKey": false,
+      "usingEnvironmentId": false
+    },
+    "clientSide": false,
+    "salt": "a0c67bbfb44743d6903c9445033c59c3",
+    "trackEvents": false,
+    "trackEventsFallthrough": false,
+    "debugEventsUntilDate": null,
+    "version": 2,
+    "deleted": false
+  },
  "fflag_feat_front_leap_482_self_serve_short": {
    "key": "fflag_feat_front_leap_482_self_serve_short",
    "on": false,
@@ -4054,13 +4081,13 @@
  },
  "fflag_fix_leap_2052_detect_empty_filters_at_next_task_endpoint_short": {
    "key": "fflag_fix_leap_2052_detect_empty_filters_at_next_task_endpoint_short",
-    "on": false,
+    "on": true,
    "prerequisites": [],
    "targets": [],
    "contextTargets": [],
    "rules": [],
    "fallthrough": {
-      "variation": 1
+      "variation": 0
    },
    "offVariation": 1,
    "variations": [
@@ -4076,7 +4103,7 @@
    "trackEvents": false,
    "trackEventsFallthrough": false,
    "debugEventsUntilDate": null,
-    "version": 2,
+    "version": 3,
    "deleted": false
  },
  "fflag_fix_leap_246_multi_object_hotkeys_160124_short": {

label_studio/io_storages/README.md

Lines changed: 26 additions & 1 deletion
@@ -184,4 +184,29 @@ The Storage Proxy API behavior can be configured using the following environment
| `RESOLVER_PROXY_MAX_RANGE_SIZE` | Maximum size in bytes for a single range request | 7*1024*1024 |
| `RESOLVER_PROXY_CACHE_TIMEOUT` | Cache TTL in seconds for proxy responses | 3600 |

-These optimizations ensure that the Proxy API remains responsive and resource-efficient, even when handling large files or many concurrent requests.
+These optimizations ensure that the Proxy API remains responsive and resource-efficient, even when handling large files or many concurrent requests.
+
+## Multiple Storages and URL Resolving
+
+There are use cases where multiple storages can or must be used in a single project. This can cause some confusion as to which storage gets used when. Here are some common cases and how to set up multiple storages properly.
+
+### Case 1 - Tasks Referencing Other Buckets
+* bucket-A contains JSON tasks
+* bucket-B contains images/text/other data
+* Tasks synced from bucket-A reference data in bucket-B
+
+#### How to set up
+* Add storage 1 for bucket-A
+* Add storage 2 for bucket-B (it may use the same or different credentials than bucket-A)
+* Sync storage 1
+* All references to data in bucket-B will be resolved using storage 2 automatically
+
+### Case 2 - Buckets with Different Credentials
+* bucket-A is accessible with credentials 1
+* bucket-B is accessible with credentials 2
+
+#### How to set up
+* Add storage 1 for bucket-A with credentials 1
+* Add storage 2 for bucket-B with credentials 2
+* Sync both storages
+* The appropriate storage will be used to resolve URLs and generate presigned URLs
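
To make the resolution rule above concrete, here is a minimal, self-contained sketch. `FakeStorage` and `get_storage_by_url` are simplified stand-ins for Label Studio's storage classes, not the real implementation (which uses `parse_bucket_uri` and per-backend fields such as `bucket`, `container`, or `path`): a URL is only resolved by a storage whose scheme and bucket both match, so the first storage added no longer shadows the others.

```python
# Simplified illustration of bucket-aware storage resolution (not the real classes).
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class FakeStorage:
    url_scheme: str   # e.g. "s3", "gs"
    bucket: str       # bucket/container this storage is configured for

    def can_resolve_url(self, url: str) -> bool:
        # Both the scheme and the bucket must match for this storage to claim the URL.
        parsed = urlparse(url)
        return parsed.scheme == self.url_scheme and parsed.netloc == self.bucket


def get_storage_by_url(url: str, storages: List[FakeStorage]) -> Optional[FakeStorage]:
    # Only the first storage that matches both scheme and bucket is used.
    for storage in storages:
        if storage.can_resolve_url(url):
            return storage
    return None


storages = [
    FakeStorage(url_scheme="s3", bucket="bucket-A"),
    FakeStorage(url_scheme="s3", bucket="bucket-B"),
]

# A task synced from bucket-A that references an image in bucket-B
# is resolved by the bucket-B storage, even though bucket-A was added first.
match = get_storage_by_url("s3://bucket-B/images/cat.png", storages)
print(match.bucket if match else None)  # -> bucket-B
```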

label_studio/io_storages/base_models.py

Lines changed: 14 additions & 3 deletions
@@ -27,7 +27,7 @@
from django.utils import timezone
from django.utils.translation import gettext_lazy as _
from django_rq import job
-from io_storages.utils import get_uri_via_regex
+from io_storages.utils import get_uri_via_regex, parse_bucket_uri
from rq.job import Job
from tasks.models import Annotation, Task
from tasks.serializers import AnnotationSerializer, PredictionSerializer
@@ -255,8 +255,19 @@ def can_resolve_scheme(self, url: Union[str, None]) -> bool:
            return False
        # TODO: Search for occurrences inside string, e.g. for cases like "gs://bucket/file.pdf" or "<embed src='gs://bucket/file.pdf'/>"
        _, prefix = get_uri_via_regex(url, prefixes=(self.url_scheme,))
-        if prefix == self.url_scheme:
-            return True
+        bucket_uri = parse_bucket_uri(url, self)
+
+        # If there is a prefix and the bucket matches the storage's bucket/container/path
+        if prefix == self.url_scheme and bucket_uri:
+            # bucket is used for s3 and gcs
+            if hasattr(self, 'bucket') and bucket_uri.bucket == self.bucket:
+                return True
+            # container is used for azure blob
+            if hasattr(self, 'container') and bucket_uri.bucket == self.container:
+                return True
+            # path is used for redis
+            if hasattr(self, 'path') and bucket_uri.bucket == self.path:
+                return True
        # if not found any occurrences - this Storage can't resolve url
        return False

label_studio/io_storages/functions.py

Lines changed: 1 addition & 2 deletions
@@ -54,6 +54,5 @@ def get_storage_by_url(url: Union[str, List, Dict], storage_objects: Iterable[Im
    for storage_object in storage_objects:
        if storage_object.can_resolve_url(url):
            # note: only first found storage_object will be used for link resolving
-            # probably we need to use more advanced can_resolve_url mechanics
-            # that takes into account not only prefixes, but bucket path too
+            # can_resolve_url now checks both the scheme and the bucket to ensure the correct storage is used
            return storage_object

label_studio/io_storages/s3/serializers.py

Lines changed: 2 additions & 0 deletions
@@ -68,6 +68,8 @@ def validate(self, data):
        except TypeError as e:
            logger.info(f'It seems access keys are incorrect: {e}', exc_info=True)
            raise ValidationError('It seems access keys are incorrect')
+        except KeyError:
+            raise ValidationError(f'{storage.url_scheme}://{storage.bucket}/{storage.prefix} not found.')
        return data
label_studio/organizations/api.py

Lines changed: 20 additions & 15 deletions
@@ -117,8 +117,8 @@ class OrganizationMemberListAPI(generics.ListAPIView):
    pagination_class = OrganizationMemberListPagination

    def _get_created_projects_map(self):
-        members = self.get_queryset()
-        user_ids = members.values_list('user_id', flat=True)
+        members = self.paginate_queryset(self.filter_queryset(self.get_queryset()))
+        user_ids = [member.user_id for member in members]
        projects = (
            Project.objects.filter(created_by_id__in=user_ids, organization=self.request.user.active_organization)
            .values('created_by_id', 'id', 'title')
@@ -135,24 +135,29 @@ def _get_created_projects_map(self):
        return projects_map

    def _get_contributed_to_projects_map(self):
-        members = self.get_queryset()
-        user_ids = members.values_list('user_id', flat=True)
-        projects = (
-            Annotation.objects.filter(
-                completed_by__in=user_ids, project__organization=self.request.user.active_organization
-            )
-            .values('completed_by', 'project__id', 'project__title')
+        members = self.paginate_queryset(self.filter_queryset(self.get_queryset()))
+        user_ids = [member.user_id for member in members]
+        org_project_ids = Project.objects.filter(organization=self.request.user.active_organization).values_list(
+            'id', flat=True
+        )
+        annotations = (
+            Annotation.objects.filter(completed_by__in=list(user_ids), project__in=list(org_project_ids))
+            .values('completed_by', 'project_id')
            .distinct()
        )
-        projects_map = {}
-        for project in projects:
-            projects_map.setdefault(project['completed_by'], []).append(
+        project_ids = [annotation['project_id'] for annotation in annotations]
+        projects_map = Project.objects.in_bulk(id_list=project_ids, field_name='id')
+
+        contributed_to_projects_map = {}
+        for annotation in annotations:
+            project = projects_map[annotation['project_id']]
+            contributed_to_projects_map.setdefault(annotation['completed_by'], []).append(
                {
-                    'id': project['project__id'],
-                    'title': project['project__title'],
+                    'id': project.id,
+                    'title': project.title,
                }
            )
-        return projects_map
+        return contributed_to_projects_map

    def get_serializer_context(self):
        context = super().get_serializer_context()
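
For readers skimming the diff: the reworked `_get_contributed_to_projects_map` now paginates and filters the member queryset first, restricts annotations to projects in the active organization, batch-loads those projects with `in_bulk`, and then groups them per user. Below is a rough, plain-Python sketch of that final grouping step only, using illustrative stand-ins rather than the Django ORM.

```python
# Illustrative grouping step with plain data (not the ORM objects used above).
from collections import namedtuple

Project = namedtuple("Project", ["id", "title"])

# Mimics the distinct .values('completed_by', 'project_id') annotation rows.
annotations = [
    {"completed_by": 1, "project_id": 10},
    {"completed_by": 1, "project_id": 11},
    {"completed_by": 2, "project_id": 10},
]

# Mimics the {id: Project} dict returned by Project.objects.in_bulk(...).
projects_by_id = {10: Project(10, "Sentiment"), 11: Project(11, "NER")}

contributed_to_projects_map = {}
for annotation in annotations:
    project = projects_by_id[annotation["project_id"]]
    contributed_to_projects_map.setdefault(annotation["completed_by"], []).append(
        {"id": project.id, "title": project.title}
    )

print(contributed_to_projects_map)
# {1: [{'id': 10, 'title': 'Sentiment'}, {'id': 11, 'title': 'NER'}],
#  2: [{'id': 10, 'title': 'Sentiment'}]}
```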
