[OPIK-1512]: [SDK] Long running jobs span start and end batch creation with feature flag #2387

Open
wants to merge 24 commits into
base: OPIK-1511-long-running-traces

yaricom
Member

@yaricom yaricom commented Jun 4, 2025

Details

The Python SDK needs to support the ingestion of long-running jobs, for instance a job that takes 500 seconds.

For that:

  1. It should create spans when they start, taking advantage of the relaxed required input fields.
  2. Then, it should flush the complete details of the spans at finalisation.

For these two operations, it uses the batch createSpans endpoint as an upsert endpoint, providing a proper timestamp in the last_updated_at field for every span on each of those calls.
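
The two calls can be pictured with a minimal sketch. The helper names `build_start_payload` and `build_end_payload` are hypothetical and are not part of the SDK; the field names follow the batch payloads in this description. The point is only that both payloads share the same `id` and that the second one carries a newer `last_updated_at`.

```python
import datetime


def _now_iso() -> str:
    """Current UTC time as a fixed-width ISO 8601 string with a trailing Z."""
    now = datetime.datetime.now(datetime.timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%S.%fZ")


def build_start_payload(span_id, trace_id, name, input_data):
    """Hypothetical helper: partial span sent when the span starts."""
    return {
        "id": span_id,
        "trace_id": trace_id,
        "name": name,
        "type": "general",
        "start_time": _now_iso(),
        "input": input_data,
        "last_updated_at": _now_iso(),
    }


def build_end_payload(start_payload, output_data):
    """Hypothetical helper: complete span sent at finalisation.

    Reuses the id and start_time of the start payload and stamps a
    newer last_updated_at so the second call supersedes the first.
    """
    payload = dict(start_payload)
    payload["end_time"] = _now_iso()
    payload["output"] = output_data
    payload["last_updated_at"] = _now_iso()
    return payload


start = build_start_payload("span-1", "trace-1", "f_inner", {"x": "inner-input"})
end = build_end_payload(start, {"output": "inner-output"})
```

Copying the start payload rather than mutating it keeps the first message unchanged if it is still queued for sending.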

After this PR, the following code snippet sends the spans and traces batches shown below to the server.

import os

# Set the feature flags before importing opik.
os.environ["OPIK_LOG_START_TRACE"] = "True"
os.environ["OPIK_LOG_START_SPAN"] = "True"

import time

from opik.decorator import tracker


@tracker.track
def f_inner(x):
    time.sleep(1)
    return "inner-output"


@tracker.track
def f_outer(x):
    f_inner("inner-input")
    time.sleep(1)
    return "outer-output"


for i in range(1):
    f_outer(f"outer-input-{i}")

# Flush pending messages to the server before the script exits.
tracker.flush_tracker()

The spans batch sent to the server as a result of executing the code snippet:

{
  "spans": [
    {
      "id": "019740a7-0ac2-751d-9da7-498e0f749c70",
      "project_name": "Default Project",
      "trace_id": "019740a7-0ac1-77a7-8452-dfcce8804ee2",
      "name": "f_outer",
      "type": "general",
      "start_time": "2025-06-05T15:12:58.561472Z",
      "input": {
        "x": "outer-input-0"
      },
      "last_updated_at": "2025-06-05T15:12:58.566395Z"
    },
    {
      "id": "019740a7-0ac6-77ad-a6f2-02a79a81f849",
      "project_name": "Default Project",
      "trace_id": "019740a7-0ac1-77a7-8452-dfcce8804ee2",
      "name": "f_inner",
      "type": "general",
      "start_time": "2025-06-05T15:12:58.566663Z",
      "input": {
        "x": "inner-input"
      },
      "last_updated_at": "2025-06-05T15:12:58.566672Z"
    },
    {
      "id": "019740a7-0ac6-77ad-a6f2-02a79a81f849",
      "project_name": "Default Project",
      "trace_id": "019740a7-0ac1-77a7-8452-dfcce8804ee2",
      "parent_span_id": "019740a7-0ac2-751d-9da7-498e0f749c70",
      "name": "f_inner",
      "type": "general",
      "start_time": "2025-06-05T15:12:58.566663Z",
      "end_time": "2025-06-05T15:12:59.572681Z",
      "input": {
        "x": "inner-input"
      },
      "output": {
        "output": "inner-output"
      },
      "last_updated_at": "2025-06-05T15:12:59.572754Z"
    },
    {
      "id": "019740a7-0ac2-751d-9da7-498e0f749c70",
      "project_name": "Default Project",
      "trace_id": "019740a7-0ac1-77a7-8452-dfcce8804ee2",
      "name": "f_outer",
      "type": "general",
      "start_time": "2025-06-05T15:12:58.561472Z",
      "end_time": "2025-06-05T15:13:00.575364Z",
      "input": {
        "x": "outer-input-0"
      },
      "output": {
        "output": "outer-output"
      },
      "last_updated_at": "2025-06-05T15:13:00.575462Z"
    }
  ]
}
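
The batch above contains two entries per span id, one from the start and one from finalisation. As a rough illustration of the upsert semantics, here is a last-write-wins sketch keyed on `last_updated_at`. The function `latest_per_id` is hypothetical, and whether the backend replaces the whole record or merges individual fields is not stated in this PR, so treat the replace-whole-record behaviour as an assumption.

```python
def latest_per_id(records):
    """Keep, for each id, the record with the greatest last_updated_at.

    Assumes every timestamp uses the same fixed-width UTC format as in the
    batch above, so plain string comparison matches chronological order.
    """
    latest = {}
    for record in records:
        current = latest.get(record["id"])
        if current is None or record["last_updated_at"] >= current["last_updated_at"]:
            latest[record["id"]] = record
    return latest


# Two entries for the f_inner span, trimmed from the batch above.
batch = [
    {
        "id": "019740a7-0ac6-77ad-a6f2-02a79a81f849",
        "name": "f_inner",
        "start_time": "2025-06-05T15:12:58.566663Z",
        "last_updated_at": "2025-06-05T15:12:58.566672Z",
    },
    {
        "id": "019740a7-0ac6-77ad-a6f2-02a79a81f849",
        "name": "f_inner",
        "start_time": "2025-06-05T15:12:58.566663Z",
        "end_time": "2025-06-05T15:12:59.572681Z",
        "output": {"output": "inner-output"},
        "last_updated_at": "2025-06-05T15:12:59.572754Z",
    },
]

resolved = latest_per_id(batch)
```

Because the comparison uses the timestamp and not the arrival order, the finalised record wins even if the two entries are processed in reverse.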

The traces batch sent to the server:

{
  "traces": [
    {
      "id": "019740a7-0ac1-77a7-8452-dfcce8804ee2",
      "project_name": "Default Project",
      "name": "f_outer",
      "start_time": "2025-06-05T15:12:58.561453Z",
      "input": {
        "x": "outer-input-0"
      },
      "last_updated_at": "2025-06-05T15:12:58.566413Z"
    },
    {
      "id": "019740a7-0ac1-77a7-8452-dfcce8804ee2",
      "project_name": "Default Project",
      "name": "f_outer",
      "start_time": "2025-06-05T15:12:58.561453Z",
      "end_time": "2025-06-05T15:13:00.575530Z",
      "input": {
        "x": "outer-input-0"
      },
      "output": {
        "output": "outer-output"
      },
      "last_updated_at": "2025-06-05T15:13:00.575550Z"
    }
  ]
}

Issues

Resolves #1610

Testing

Implemented related test cases.

yaricom added 2 commits June 4, 2025 18:22
…for improved parameter handling; updated OpikClient and decorator logic to support logging of span start and end based on configuration.
…ce start based on configuration across integrations.
Contributor

github-actions bot commented Jun 4, 2025

SDK Unit Tests Results

600 tests   595 ✅  29s ⏱️
  1 suites    0 💤
  1 files      5 ❌

For more details on these failures, see this check.

Results for commit cc8e061.

♻️ This comment has been updated with latest results.

yaricom added 19 commits June 4, 2025 19:29
…nd `_after_call` into dedicated methods with enhanced exception handling and error logging.
…tion and improve maintainability. Added support for attachments during span creation and updated tests accordingly.
…ng consistency, and refactored attachment handling logic in span creation. Updated associated tests.
…ted end time handling in `fake_message_factory`.
@yaricom yaricom marked this pull request as ready for review June 5, 2025 15:21
@yaricom yaricom requested a review from a team as a code owner June 5, 2025 15:21