Metadata Generator

Status

  • Stability: alpha (metrics)
  • Code Owners: @dmitryax

Every component's documentation should include a brief description of the component and guidance on how to use it. It should also include metadata that helps end users understand the current state of the component and whether it is right for their use case. Examples of this metadata are:

  • its stability level
  • the distributions containing it
  • the types of pipelines it supports
  • metrics emitted in the case of a scraping receiver, a scraper, or a connector

The metadata generator defines a schema for specifying this information to ensure it is complete and well-formed. The metadata generator is then able to ingest the metadata, validate it against the schema and produce documentation in a standardized format. An example of how this generated documentation looks can be found in documentation.md.

Using the Metadata Generator

For a component to use the metadata generator (mdatagen), two requirements must be met:

  1. The component provides a YAML file (metadata.yaml) containing its metadata
  2. The component declares a go:generate mdatagen directive, which tells mdatagen what to generate

As an example, here is a minimal metadata.yaml for the OTLP receiver:

type: otlp
status:
  class: receiver
  stability:
    beta: [logs]
    stable: [metrics, traces]

Detailed information about the schema of metadata.yaml can be found in metadata-schema.yaml.

The go:generate mdatagen directive is usually defined in a doc.go file in the same package as the component, for example:

//go:generate mdatagen metadata.yaml

package main

Below are some more examples that can be used for reference:

  • The Elasticsearch receiver has an extensive metadata.yaml
  • The host metrics receiver has internal subcomponents, each with their own metadata.yaml and doc.go. See cpuscraper for example.

To install the mdatagen tool into GOBIN, run cd cmd/mdatagen && $(GOCMD) install . (here $(GOCMD) is the Makefile variable for the Go command, typically go). Then run mdatagen metadata.yaml to generate documentation for a specific component, or run make generate to generate documentation for all components.

Component Config Documentation

The metadata generator supports automatic generation of configuration schemas for components. This generates JSON Schema files that enable IDE autocompletion, validation, and documentation for component configuration. In the future, it will also generate Go config structs and human-readable documentation for configuration options.

To define a configuration schema, add a config section to your metadata.yaml:

type: myreceiver
status:
  class: receiver
  stability:
    beta: [metrics, traces]

config:
  type: object
  properties:
    endpoint:
      type: string
      description: The endpoint to listen on
      default: "localhost:4317"
    timeout:
      type: string
      format: duration
      description: Request timeout duration
      default: "30s"
    tls:
      $ref: go.opentelemetry.io/collector/config/configtls.server_config
  required: [endpoint]

The config section is based on the JSON Schema standard (draft 2020-12) and supports:

  • Standard JSON Schema types: string, number, integer, boolean, object, array, null
  • Validation constraints: minLength, maxLength, pattern, minimum, maximum, enum, etc.
  • References: Internal ($ref: definition_name), external ($ref: package.path.type), or relative ($ref: ./internal/config.type)
  • Reusable definitions: Define common schemas in $defs and reference them with $ref
  • Schema composition: Use allOf for complex configurations
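
To make the schema concrete, here is a minimal hand-written Go sketch of a struct that mirrors the endpoint and timeout properties from the example schema, including their declared defaults and the required constraint. The names Config, defaultConfig, and Validate are illustrative assumptions; this is not output produced by mdatagen, and the tls reference is left out.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Config is a hypothetical, hand-written mirror of the example schema.
// It is NOT generated by mdatagen.
type Config struct {
	Endpoint string        // schema: type string, listed under "required"
	Timeout  time.Duration // schema: type string, format duration
}

// defaultConfig applies the "default" values declared in the schema.
func defaultConfig() Config {
	return Config{
		Endpoint: "localhost:4317",
		Timeout:  30 * time.Second,
	}
}

// Validate enforces the schema's "required: [endpoint]" constraint.
func (c Config) Validate() error {
	if c.Endpoint == "" {
		return errors.New("endpoint is required")
	}
	return nil
}

func main() {
	cfg := defaultConfig()
	fmt.Println(cfg.Endpoint, cfg.Timeout, cfg.Validate())
}
```

In this sketch an empty endpoint fails validation, which corresponds to a schema validator rejecting a config that omits a required property.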

Metrics Builder Configuration

For receivers, scrapers, and other components that emit metrics, mdatagen can generate metrics builder configuration from metadata.yaml.

type: myreceiver
status:
  class: receiver
  stability:
    beta: [metrics]

resource_attributes:
  transport:
    description: Transport used by the request.
    type: string
    enabled: true

attributes:
  status_code:
    description: Response status code.
    type: int
    requirement_level: opt_in

metrics:
  http.server.request.count:
    enabled: true
    description: Number of received requests.
    unit: "{request}"
    sum:
      value_type: int
      monotonic: true
      aggregation_temporality: cumulative
    attributes: [status_code]

This lets users:

  • enable or disable individual metrics
  • enable or disable resource attributes
  • use metrics_include and metrics_exclude on resource attributes to only emit metrics with matching resource attribute values
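
The include/exclude behavior can be pictured with a small Go sketch. It assumes exact (strict) string matching and assumes that a value must pass the include list (when one is set) and must not match the exclude list. The attrFilter type and shouldEmit method are hypothetical names for illustration, not part of the generated metrics builder.

```go
package main

import "fmt"

// attrFilter is a hypothetical stand-in for the metrics_include and
// metrics_exclude settings of one resource attribute. Only exact (strict)
// matching is modeled here.
type attrFilter struct {
	include []string // empty means "no include restriction"
	exclude []string
}

// contains reports whether value equals any entry of list.
func contains(list []string, value string) bool {
	for _, v := range list {
		if v == value {
			return true
		}
	}
	return false
}

// shouldEmit reports whether metrics for a resource with the given attribute
// value pass the filter: the include list (if any) must match, and the
// exclude list must not.
func (f attrFilter) shouldEmit(value string) bool {
	if len(f.include) > 0 && !contains(f.include, value) {
		return false
	}
	return !contains(f.exclude, value)
}

func main() {
	f := attrFilter{include: []string{"http", "grpc"}, exclude: []string{"grpc"}}
	for _, transport := range []string{"http", "grpc", "udp"} {
		fmt.Printf("%s: %t\n", transport, f.shouldEmit(transport))
	}
}
```

With these assumed semantics, only resources whose transport is http would have their metrics emitted.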

Metric Reaggregation Configuration

Set reaggregation_enabled: true to let users reduce metric cardinality by dropping selected metric attributes and aggregating the resulting datapoints.

reaggregation_enabled: true

attributes:
  transport:
    description: Transport used by the request.
    type: string
    requirement_level: recommended
  status_code:
    description: Response status code.
    type: int
    requirement_level: opt_in

This adds two per-metric settings for metrics that declare attributes:

  • attributes: the subset of metric attributes to keep in the emitted metric stream
  • aggregation_strategy: how collapsed datapoints are merged, using sum, avg, min, or max

Defaults:

  • sum metrics use sum; gauge metrics use avg
  • required attributes are always kept
  • recommended and conditionally_required attributes are kept by default, but users can remove them
  • opt_in attributes are omitted by default, so that dimension is aggregated unless the user adds it
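
The defaults above can be sketched in Go as follows. The attribute type and the functions defaultAggregation and resolveAttributes are hypothetical names introduced for illustration. How mdatagen reacts when a user list leaves out a required attribute is not stated here, so the sketch simply assumes it is kept.

```go
package main

import "fmt"

// attribute is a hypothetical, simplified view of one metric attribute
// declared in metadata.yaml.
type attribute struct {
	name             string
	requirementLevel string // required, conditionally_required, recommended, opt_in
}

// defaultAggregation returns the default strategy for a metric type.
func defaultAggregation(metricType string) string {
	switch metricType {
	case "sum":
		return "sum"
	case "gauge":
		return "avg"
	default:
		return "" // other metric types are not covered by the defaults above
	}
}

// resolveAttributes returns the attributes kept for a metric. A nil user list
// means "use the defaults" (everything except opt_in). Otherwise the user's
// choice is honored, except that required attributes are always kept.
func resolveAttributes(declared []attribute, user []string) []string {
	chosen := map[string]bool{}
	for _, n := range user {
		chosen[n] = true
	}
	var kept []string
	for _, a := range declared {
		switch {
		case a.requirementLevel == "required":
			kept = append(kept, a.name)
		case user == nil && a.requirementLevel != "opt_in":
			kept = append(kept, a.name)
		case user != nil && chosen[a.name]:
			kept = append(kept, a.name)
		}
	}
	return kept
}

func main() {
	declared := []attribute{
		{name: "transport", requirementLevel: "recommended"},
		{name: "status_code", requirementLevel: "opt_in"},
	}
	fmt.Println(defaultAggregation("sum"), resolveAttributes(declared, nil))
	fmt.Println(resolveAttributes(declared, []string{"status_code"}))
}
```

With no user override, only transport is kept. Listing status_code explicitly opts that dimension back in and drops transport, since transport is recommended rather than required.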

Example user configuration:

receivers:
  myreceiver:
    metrics:
      http.server.request.count:
        enabled: true
        aggregation_strategy: sum
        attributes: [transport]

In this example, datapoints that only differ by status_code are aggregated together, while transport remains part of the output identity.
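
The effect of this configuration can be illustrated with a simplified Go sketch of sum reaggregation. The dataPoint type and reaggregateSum function are hypothetical and do not reflect the generated code. The sketch only shows the idea of dropping attributes and merging the datapoints that become identical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// dataPoint is a hypothetical, simplified datapoint: a value plus attributes.
type dataPoint struct {
	attrs map[string]string
	value int64
}

// reaggregateSum drops every attribute not listed in keep and sums the values
// of datapoints that become identical after the drop. The result is keyed by
// the remaining attributes rendered as "name=value" pairs.
func reaggregateSum(points []dataPoint, keep []string) map[string]int64 {
	sorted := append([]string(nil), keep...)
	sort.Strings(sorted) // stable key order regardless of how keep is listed
	out := map[string]int64{}
	for _, p := range points {
		parts := make([]string, 0, len(sorted))
		for _, k := range sorted {
			parts = append(parts, k+"="+p.attrs[k])
		}
		out[strings.Join(parts, ",")] += p.value
	}
	return out
}

func main() {
	points := []dataPoint{
		{attrs: map[string]string{"transport": "http", "status_code": "200"}, value: 5},
		{attrs: map[string]string{"transport": "http", "status_code": "500"}, value: 2},
		{attrs: map[string]string{"transport": "grpc", "status_code": "200"}, value: 3},
	}
	// Keep only transport, as in the user configuration above.
	fmt.Println(reaggregateSum(points, []string{"transport"}))
}
```

Here the two http datapoints merge into a single datapoint with value 7, while the grpc datapoint keeps its value of 3.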

Feature Gates Documentation

The metadata generator supports automatic documentation generation for feature gates used by components. Feature gates are documented by adding a feature_gates section to your metadata.yaml:

type: mycomponent
status:
  class: receiver
  stability:
    beta: [metrics, traces]

feature_gates:
  - id: mycomponent.newFeature
    description: 'Enables new feature functionality that improves performance'
    stage: alpha
    from_version: 'v0.100.0'
    reference_url: 'https://github.com/open-telemetry/opentelemetry-collector/issues/12345'

  - id: mycomponent.stableFeature
    description: 'A feature that has reached stability'
    stage: stable
    from_version: 'v0.90.0'
    to_version: 'v0.95.0'
    reference_url: 'https://github.com/open-telemetry/opentelemetry-collector/issues/11111'

This will generate a "Feature Gates" section in the component's documentation.md file with a table containing:

  • Feature Gate: The gate identifier
  • Stage: The lifecycle stage (alpha, beta, stable, deprecated)
  • Description: Brief description of what the gate controls
  • From Version: Version when the gate was introduced
  • To Version: Version when stable/deprecated gates will be removed (if applicable)
  • Reference: Link to additional contextual information
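
As a rough illustration of how the columns above map to the feature_gates fields, the following Go sketch renders such a table as Markdown. The featureGate type and renderTable function are hypothetical, and the exact layout (including the "N/A" placeholder for a missing to_version) is an assumption rather than the template mdatagen actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// featureGate mirrors the fields of one feature_gates entry in metadata.yaml.
type featureGate struct {
	ID, Description, Stage, FromVersion, ToVersion, ReferenceURL string
}

// renderTable builds a Markdown table with one row per gate.
// A missing to_version is rendered as "N/A" (an assumption of this sketch).
func renderTable(gates []featureGate) string {
	var b strings.Builder
	b.WriteString("| Feature Gate | Stage | Description | From Version | To Version | Reference |\n")
	b.WriteString("| --- | --- | --- | --- | --- | --- |\n")
	for _, g := range gates {
		to := g.ToVersion
		if to == "" {
			to = "N/A"
		}
		fmt.Fprintf(&b, "| `%s` | %s | %s | %s | %s | [Link](%s) |\n",
			g.ID, g.Stage, g.Description, g.FromVersion, to, g.ReferenceURL)
	}
	return b.String()
}

func main() {
	fmt.Print(renderTable([]featureGate{{
		ID:           "mycomponent.newFeature",
		Description:  "Enables new feature functionality that improves performance",
		Stage:        "alpha",
		FromVersion:  "v0.100.0",
		ReferenceURL: "https://github.com/open-telemetry/opentelemetry-collector/issues/12345",
	}}))
}
```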

The feature gate definitions should correspond to actual gates registered in your component code using the Feature Gates API.

Generate multiple metadata packages

By default, mdatagen will generate a package called metadata in the internal directory. If you want to generate a package with a different name, you can use the generated_package_name configuration field to provide an alternate name.

type: otlp
generated_package_name: customname
status:
  class: receiver
  stability:
    beta: [logs]
    stable: [metrics, traces]

The most common scenario for this would be making major changes to a receiver's metadata without breaking what exists. In this scenario, mdatagen could produce separate packages for different metadata specs in the same receiver:

//go:generate mdatagen metadata.yaml
//go:generate mdatagen custom.yaml

package main

With two different packages generated, the choice of which metadata is used can be controlled via a feature gate or a similar mechanism.

Contributing to the Metadata Generator

The code for generating the documentation can be found in loader.go and the templates for rendering the documentation can be found in templates. When making updates to the metadata generator or introducing support for new functionality:

  1. Ensure the metadata-schema.yaml and metadata.yaml files reflect the changes.
  2. Run make mdatagen-test.
  3. Make sure all tests pass, including generated tests.
  4. Run make generate.