Clarify plugin-first OCR extension docs #1651

Open
jimmyzhuu wants to merge 1 commit into microsoft:main from jimmyzhuu:codex/docs-plugin-first-ocr

@jimmyzhuu

Summary

This PR makes the plugin extension path clearer in the documentation, especially for OCR-related integrations.

It focuses on three small documentation improvements:

  • clarify in the main README that plugins are the recommended path for optional or backend-specific functionality
  • update the markitdown-ocr README so its usage guidance matches what the built-in CLI and the Python API currently support
  • document in the sample plugin README that plugins can read optional keyword arguments from register_converters(markitdown, **kwargs)

Problem

While exploring OCR extension paths, I found a few places where the current docs can be read as more permissive or more direct than the current implementation actually is:

  • the main README does not explicitly say when plugin-first is the preferred extension model
  • the markitdown-ocr README includes a CLI example that suggests the built-in CLI can fully configure an LLM client for the plugin
  • the sample plugin docs do not explicitly mention that plugin authors can read optional kwargs passed through MarkItDown(enable_plugins=True, **kwargs)

These are small issues, but together they make it harder to understand how third-party OCR or backend-specific extensions should be integrated.

What this PR changes

Main README

Adds a short note in the Plugins section explaining that plugins are the recommended extension path when an extension:

  • adds non-default dependencies
  • depends on external services or model runtimes
  • changes converter behavior only for opt-in users

packages/markitdown-ocr/README.md

  • removes the command-line example that could be interpreted as fully configuring an LLM-backed plugin via the built-in CLI
  • clarifies that the plugin is primarily configured through the Python API, where llm_client and llm_model can be passed directly
  • adds a short sentence positioning the OCR package as an example of the broader plugin-first extension model
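
For reviewers, the Python API path described above can be sketched roughly as follows. This is a minimal illustration, not text from the README: the helper name make_ocr_markitdown is hypothetical, and it assumes markitdown and markitdown-ocr are installed and that the plugin picks up llm_client and llm_model from the keyword arguments given to MarkItDown.

```python
def make_ocr_markitdown(llm_client, llm_model):
    """Build a MarkItDown instance with plugins enabled and LLM settings attached.

    Hypothetical helper for illustration only. It assumes the OCR plugin reads
    llm_client and llm_model from the keyword arguments passed to MarkItDown.
    """
    # Imported lazily so this sketch can be loaded where markitdown is absent.
    from markitdown import MarkItDown

    return MarkItDown(
        enable_plugins=True,    # plugins are opt-in
        llm_client=llm_client,  # e.g. an OpenAI-compatible client object
        llm_model=llm_model,    # model name understood by that client
    )
```

A caller would then build the client in Python (for example with the openai package, which is an assumption here), pass it to make_ocr_markitdown, and call convert on the returned object. The built-in CLI has no equivalent way to hand over a client object, which is why the CLI example is removed.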

packages/markitdown-sample-plugin/README.md

Adds a short note that plugin authors can read optional configuration from register_converters(markitdown, **kwargs).
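
As a rough sketch of what that note describes, the entry point below reads two optional settings from kwargs and ignores anything it does not recognize. FakeHost and ExampleOcrConverter are hypothetical stand-ins so the snippet runs without markitdown installed; a real plugin would receive an actual MarkItDown instance and register a real converter class.

```python
class FakeHost:
    """Hypothetical stand-in for a MarkItDown instance; it only records registrations."""

    def __init__(self):
        self.converters = []

    def register_converter(self, converter):
        self.converters.append(converter)


class ExampleOcrConverter:
    """Hypothetical converter that simply stores the options it was given."""

    def __init__(self, llm_client=None, llm_model=None):
        self.llm_client = llm_client
        self.llm_model = llm_model


def register_converters(markitdown, **kwargs):
    """Plugin entry point: read optional settings, then register a converter."""
    converter = ExampleOcrConverter(
        llm_client=kwargs.get("llm_client"),  # None when the caller did not pass one
        llm_model=kwargs.get("llm_model"),
    )
    markitdown.register_converter(converter)


# Simulate the host forwarding its keyword arguments to the plugin.
host = FakeHost()
register_converters(host, llm_model="example-model", unrelated_option=True)
```

Using kwargs.get keeps every setting optional, so the plugin still loads when a caller enables plugins without passing any configuration.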

Non-goals

This PR does not:

  • change any runtime behavior
  • add new plugin APIs
  • add any new dependencies
  • add or integrate any provider-specific OCR backend

Why this design

I kept this PR documentation-only to make the current extension model easier to understand without changing behavior.

Backward compatibility

No behavior changes. Documentation only.

Related context

Related discussion: #1650

@jimmyzhuu
Author

@microsoft-github-policy-service agree
