[MLOB-2522] Add --llmobs flag for instrumenting Lambdas with LLM Observability #1603

Merged
sabrenner merged 8 commits into master from sabrenner/lambda-instrument-llmobs
Apr 9, 2025

Conversation

sabrenner
Contributor

@sabrenner sabrenner commented Apr 2, 2025

What and why?

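This PR adds a --llmobs flag to the AWS Lambda instrument/uninstrument commands, which takes a string. This sets the DD_LLMOBS_ENABLED, DD_LLMOBS_AGENTLESS_ENABLED, and DD_LLMOBS_ML_APP environment variables on the instrumented function, with the provided string used as the ML application name.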

Some additional changes/explanations/clarifications

  1. Our documentation will highlight intended usage as:
datadog-ci lambda instrument -f <YOUR_LAMBDA_FUNCTION_NAME> -r <AWS_REGION> -v 106 -e 73 --llmobs <YOUR_ML_APP>

i.e., this should be used to instrument one layer, with both the language layer and extension layer specified (layer versions will be auto-populated).

  2. With both layers, these three variables are the only ones needed to enable LLM Observability. DD_LLMOBS_AGENTLESS_ENABLED="false" uses the extension layer's agent as a proxy, which is enabled by datadog-lambda-extension#628 (feat: add llmobs proxy paths to trace agent).

How?

Adds an --llmobs option and parses its value into the llmobsMlApp setting. If that setting is present, the command sets DD_LLMOBS_ENABLED, DD_LLMOBS_AGENTLESS_ENABLED, and DD_LLMOBS_ML_APP. This is my first time contributing, so I did my best to add tests where I saw appropriate. Happy to add more if I missed some spots!
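For readers who want a concrete picture, here is a rough sketch of the mapping described above. It is not the code from this PR: `LlmobsSettings`, `EnvVars`, and `buildLlmobsEnvVars` are hypothetical names chosen for illustration, and only the `llmobsMlApp` setting and the three environment variable names come from the description.

```typescript
// Hypothetical names for illustration only; the real datadog-ci identifiers may differ.
interface LlmobsSettings {
  llmobsMlApp?: string
}

type EnvVars = Record<string, string>

// Returns a copy of the function's environment with the LLM Observability
// variables added when an ML app name was provided via --llmobs.
function buildLlmobsEnvVars(settings: LlmobsSettings, existing: EnvVars = {}): EnvVars {
  const env: EnvVars = {...existing}

  // Leave the environment untouched when --llmobs was not passed.
  if (settings.llmobsMlApp === undefined) {
    return env
  }

  env.DD_LLMOBS_ENABLED = 'true'
  // "false" means data goes through the extension layer's agent as a proxy.
  env.DD_LLMOBS_AGENTLESS_ENABLED = 'false'
  env.DD_LLMOBS_ML_APP = settings.llmobsMlApp

  return env
}

const llmobsEnv = buildLlmobsEnvVars({llmobsMlApp: 'my-ml-app'}, {DD_SERVICE: 'middletier'})
console.log(llmobsEnv)
```

Returning a copy rather than mutating the input keeps the sketch easy to test; whether the actual implementation mutates or copies is not stated here.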

Review checklist

  • Feature or bugfix MUST have appropriate tests (unit, integration)

@sabrenner sabrenner added the serverless and documentation labels Apr 2, 2025
@datadog-datadog-prod-us1

Datadog Report

Branch report: sabrenner/lambda-instrument-llmobs
Commit report: f9cfa24
Test service: datadog-ci-tests

✅ 0 Failed, 1272 Passed, 0 Skipped, 1m 46.23s Total duration (34.48s time saved)

@sabrenner sabrenner marked this pull request as ready for review April 7, 2025 14:18
@sabrenner sabrenner requested review from a team as code owners April 7, 2025 14:18
@sabrenner sabrenner requested a review from hannahqjiang April 7, 2025 14:18
@@ -106,6 +106,7 @@ You can pass the following arguments to `instrument` to specify its behavior. Th
| `--upload-git-metadata` | `-u` | Whether to enable Git metadata uploading, as a part of source code integration. Git metadata uploading is only required if you don't have the Datadog Github Integration installed. | `true` |
| `--no-upload-git-metadata` | | Disables Git metadata uploading, as a part of source code integration. Use this flag if you have the Datadog Github Integration installed, as it renders Git metadata uploading unnecessary. | |
| `--apm-flush-deadline` | | Used to determine when to submit spans before a timeout occurs, in milliseconds. When the remaining time in an AWS Lambda invocation is less than the value set, the tracer attempts to submit the current active spans and all finished spans. Supported for NodeJS and Python. Defaults to `100` milliseconds. | |
| `--llmobs` | | If specified, enables LLM Observability for the instrumented function(s) with the provided ML application name. | |
Contributor

Just curious, what happens if you don't set the app name but enable LLMObs? as in, just the two env vars

      "DD_LLMOBS_ENABLED": "true",
      "DD_LLMOBS_AGENTLESS_ENABLED": "false"

What would happen to the ML_APP name?

Contributor Author

yeah good question. when I tried doing this in the tests, ie

const code = await cli.run(
  [
    'lambda',
    'instrument',
    // ...
    '--llmobs',
    // 'my-ml-app',
  ],
  context
)

i got a mismatched output of

Unknown Syntax Error: Command not found; did you mean one of:
...
While running lambda instrument -f arn:aws:lambda:us-east-1:123456789012:function:lambda-hello-world --dry-run --layerVersion 10 --logLevel debug --service middletier --env staging --version 0.2 --extra-tags layer:api,team:intake --no-source-code-integration --llmobs

and assumed that Options.String('llmobs') expects a value and otherwise causes a failure somewhere. But I'm not too well-versed in this behavior.

but, in general, if ml_app is not set, the LLMObs SDKs will throw/raise at runtime.
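To make the failure mode concrete, here is a minimal sketch of a check one could run against a function's environment before deploying. `findLlmobsConfigProblems` is a hypothetical helper, not part of datadog-ci or the LLMObs SDKs, and it assumes that only the literal string "true" counts as enabled.

```typescript
type LambdaEnv = Record<string, string | undefined>

// Hypothetical helper: reports the case where LLM Observability is enabled
// but no ML app name is configured.
function findLlmobsConfigProblems(env: LambdaEnv): string[] {
  const problems: string[] = []

  // Assumption: only the literal "true" is treated as enabled in this sketch.
  const enabled = env.DD_LLMOBS_ENABLED === 'true'
  const mlApp = (env.DD_LLMOBS_ML_APP ?? '').trim()

  if (enabled && mlApp === '') {
    problems.push('DD_LLMOBS_ENABLED is "true" but DD_LLMOBS_ML_APP is not set')
  }

  return problems
}

// Only the two variables from the question above, with no ML app name.
const incompleteConfig = findLlmobsConfigProblems({
  DD_LLMOBS_ENABLED: 'true',
  DD_LLMOBS_AGENTLESS_ENABLED: 'false',
})

// All three variables, as the --llmobs flag would set them.
const completeConfig = findLlmobsConfigProblems({
  DD_LLMOBS_ENABLED: 'true',
  DD_LLMOBS_AGENTLESS_ENABLED: 'false',
  DD_LLMOBS_ML_APP: 'my-ml-app',
})

console.log(incompleteConfig, completeConfig)
```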

Contributor

Oh yeah, I didn't expect it to work in the CI command, just curious about what would happen in LLMObs in general, but

if ml_app is not set, the LLMObs SDKs will throw/raise at runtime.

answers my question.

I'm just curious because it's interesting to see a product that requires a pair of env vars to work properly, normally I'd just see the enabling and that's it – thanks!

@sabrenner sabrenner merged commit fea1d86 into master Apr 9, 2025
15 checks passed
@sabrenner sabrenner deleted the sabrenner/lambda-instrument-llmobs branch April 9, 2025 18:07
@mtalec mtalec mentioned this pull request Apr 15, 2025
Labels
documentation Improvements or additions to documentation serverless Related to [cloud-run, lambda, stepfunctions]
4 participants