Chore: Importing from OneNote: Add debug tool for inspecting .one files #15084

Merged
laurent22 merged 14 commits into laurent22:dev from personalizedrefrigerator:pr/chore/tool-for-inspecting-one-files
Apr 14, 2026

Conversation

@personalizedrefrigerator
Collaborator

Problem

With the current onenote-converter/ logic, it's difficult to inspect the content of a .one file. Manually inspecting the parsed .one file can be useful when debugging OneNote import failures.

Solution

Add a tool to onenote-converter/ to display the structures parsed from a .one file.

This is helpful, for example, to compare the OneNote file structures parsed by Joplin to the Onetastic debug XML posted by @juliusgodo in #13549. (Onetastic is a OneNote extension).

Sample output

self@fedora:~/Documents/joplin/packages/onenote-converter$ cargo run -- ./test-data/onenote-2016/OneWithFileData.one --section
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/inspect ./test-data/onenote-2016/OneWithFileData.one --section`
Reading ./test-data/onenote-2016/OneWithFileData.one
Section {
    display_name: "OneWithFileData",
    page_series: [
        PageSeries {
            pages: [
                Page {
                    entity_id: Guid {D9FB6B6A-AB0C-42DF-A684-4FADF122B557},
                    title: Some(
                        Title {
                            contents: [
...

@coderabbitai
Contributor

coderabbitai Bot commented Apr 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3f986e93-00c4-4afe-abbf-c6fe681dcd8b

📥 Commits

Reviewing files that changed from the base of the PR and between d753e62 and 099fff3.

📒 Files selected for processing (3)
  • packages/onenote-converter/README.md
  • packages/onenote-converter/parser/src/local_onestore/file_node/file_node.rs
  • packages/onenote-converter/parser/src/local_onestore/objects/revision_manifest_list.rs
✅ Files skipped from review due to trivial changes (1)
  • packages/onenote-converter/README.md

📝 Walkthrough

Adds an inspect CLI for dumping OneStore/section data, new debug helper type, enhanced UTF‑16 error conversion, adjusted visibility and Debug bounds for OneStore/Object, improved PropertySet debug formatting, small parsing comment/log tweaks, README docs, and a cspell entry.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CLI Inspection Tool: packages/onenote-converter/parser/src/bin/inspect.rs, packages/onenote-converter/parser/Cargo.toml, packages/onenote-converter/README.md | New inspect binary and Cargo [[bin]] entry; CLI parses args (--onestore/--section), reads file, invokes parser entrypoints, prints debug output; README adds “Inspecting .one files” usage and stability note. |
| Parser API & Traits: packages/onenote-converter/parser/src/onenote/mod.rs, packages/onenote-converter/parser/src/onestore/mod.rs | Added Parser::parse_onestore_raw(&mut, &[u8]) -> Result<Rc<dyn OneStore>>; OneStore trait now requires std::fmt::Debug. |
| Object Visibility: packages/onenote-converter/parser/src/onestore/object.rs | Made Object type public (pub struct Object) while restricting id() and props() to crate‑private. |
| Debug & Formatting Helpers: packages/onenote-converter/parser-utils/src/debug.rs, packages/onenote-converter/parser/src/shared/prop_set.rs | Added pub mod debug and pub struct DebugOutput<'a>(&'a str) with From<&str> and Debug impl; PropertySet::Debug now heuristically decodes PropertyValue::Vec as UTF‑16 for display and emits values via DebugOutput. |
| Error Handling: packages/onenote-converter/parser-utils/src/errors.rs, packages/onenote-converter/parser-utils/src/lib.rs | Added From<widestring::error::Utf16Error> for Error and ErrorKind::Utf16LibError; UTF‑16 conversion now maps errors instead of unwrapping. |
| Local OneStore Parsing Tweaks: packages/onenote-converter/parser/src/local_onestore/.../object_group_list.rs, packages/onenote-converter/parser/src/local_onestore/file_node/file_node.rs, packages/onenote-converter/parser/src/local_onestore/objects/revision_manifest_list.rs | Removed a logging call in favour of an inline comment for DataSignatureGroupDefinitionFND; made RevisionRoleDeclarationFND::revision_role field pub; conditionalised a TODO log to include role and only log when role != 0x1. |
| Tools: packages/tools/cspell/dictionary4.txt | Added onestore spelling entry. |
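
For readers unfamiliar with the wrapper pattern mentioned under "Debug & Formatting Helpers", here is a minimal sketch of what a `DebugOutput`-style type could look like. Only the type name, its `&str` field, and the `From<&str>`/`Debug` impls come from the summary; the assumption that its `Debug` impl prints the string verbatim (rather than quoted and escaped) is a guess for illustration, not the PR's confirmed behaviour.

```rust
use std::fmt;

/// Illustrative stand-in for the `DebugOutput` helper named in the summary.
/// Assumption: `{:?}` prints the wrapped text as-is, without the quoting and
/// escaping that `str`'s own `Debug` impl applies.
pub struct DebugOutput<'a>(&'a str);

impl<'a> From<&'a str> for DebugOutput<'a> {
    fn from(value: &'a str) -> Self {
        DebugOutput(value)
    }
}

impl fmt::Debug for DebugOutput<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Write the pre-formatted text directly.
        f.write_str(self.0)
    }
}

fn main() {
    let raw = "as text: \"hi\"";
    // Debug output of a plain &str is quoted and escaped.
    println!("{:?}", raw);
    // Debug output of the wrapper is the text itself.
    println!("{:?}", DebugOutput::from(raw));
}
```

A wrapper like this is useful inside `debug_struct`/`debug_map` builders, which only accept values implementing `Debug`.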

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI as Inspect Binary
    participant FS as File System
    participant Parser as Parser
    participant Out as Stdout/Stderr

    User->>CLI: run inspect input.one [--onestore|--section]
    CLI->>CLI: parse arguments -> Config

    alt invalid config
        CLI->>Out: print usage to stderr (exit 1)
    else valid config
        CLI->>FS: read file bytes
        alt read error
            CLI->>Out: print error + usage to stderr (exit 2)
        else read success
            CLI->>Out: print "Reading ..." to stderr
            CLI->>Parser: parse_onestore_raw(data) or parse_section_from_data(data)
            alt parse error
                Parser-->>CLI: Err
                CLI->>Out: print "Parse error: ..." to stderr (exit 3)
            else parse success
                Parser-->>CLI: OneStore or Section
                CLI->>Out: print debug-formatted result (exit 0)
            end
        end
    end
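
The control flow in the diagram above can be sketched roughly as follows. This is not the PR's `inspect.rs`: `parse_stub` is a hypothetical placeholder for the real parser entrypoints (`parse_onestore_raw`, `parse_section_from_data`), and both the default of `--section` when no flag is given and the treatment of the two flags as mutually exclusive are assumptions. The exit codes (1, 2, 3) follow the diagram.

```rust
use std::{env, fs, process};

#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    OneStore,
    Section,
}

#[derive(Debug, PartialEq)]
struct Config {
    input_path: String,
    mode: Mode,
}

/// Parses `<input_file> [--onestore|--section]`; `args` excludes the program name.
/// Assumption: the flags are mutually exclusive and `--section` is the default.
fn parse_args(args: &[String]) -> Option<Config> {
    let mut input_path: Option<String> = None;
    let mut mode: Option<Mode> = None;
    for arg in args {
        match arg.as_str() {
            "--onestore" if mode.is_none() => mode = Some(Mode::OneStore),
            "--section" if mode.is_none() => mode = Some(Mode::Section),
            // Unknown flag, or a second mode flag.
            flag if flag.starts_with("--") => return None,
            path if input_path.is_none() => input_path = Some(path.to_string()),
            // More than one input path.
            _ => return None,
        }
    }
    Some(Config {
        input_path: input_path?,
        mode: mode.unwrap_or(Mode::Section),
    })
}

fn usage(program_name: &str) -> String {
    format!("Usage: {program_name} <input_file> [--onestore|--section]")
}

/// Hypothetical placeholder for the real parser; it only reports the byte count.
fn parse_stub(data: &[u8], mode: Mode) -> Result<String, String> {
    if data.is_empty() {
        return Err("empty input".to_string());
    }
    Ok(format!("{:?} dump of {} bytes", mode, data.len()))
}

/// Returns the process exit code: 0 ok, 1 bad arguments, 2 read error, 3 parse error.
fn run(program_name: &str, args: &[String]) -> i32 {
    let Some(config) = parse_args(args) else {
        eprintln!("{}", usage(program_name));
        return 1;
    };
    let data = match fs::read(&config.input_path) {
        Ok(data) => data,
        Err(err) => {
            eprintln!("Failed to read {}: {err}", config.input_path);
            eprintln!("{}", usage(program_name));
            return 2;
        }
    };
    eprintln!("Reading {}", config.input_path);
    match parse_stub(&data, config.mode) {
        Ok(dump) => {
            println!("{dump}");
            0
        }
        Err(err) => {
            eprintln!("Parse error: {err}");
            3
        }
    }
}

fn main() {
    let mut args = env::args();
    let program_name = args.next().unwrap_or_else(|| "inspect".to_string());
    let rest: Vec<String> = args.collect();
    process::exit(run(&program_name, &rest));
}
```

Keeping diagnostics on stderr and the dump on stdout means the debug output can be redirected to a file or piped into a search tool without the "Reading ..." line mixed in.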

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Pr Description Must Follow Guidelines | ❌ Error | PR description lacks a dedicated Test Plan or verification steps section required by guidelines. | Add a Test Plan section detailing invocation methods, expected output verification, error handling tests, and comparison against known good samples. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 27.78% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: adding a debug tool for inspecting OneNote files to the onenote-converter package. |
| Description check | ✅ Passed | The description clearly explains the problem, solution, and includes sample output demonstrating the tool's functionality and usage. |

@coderabbitai coderabbitai Bot added labels on Apr 13, 2026: enhancement (Feature requests and code enhancements), documentation (Documentation, web site, README), import (Related to importing files such as ENEX, JEX, etc.)
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/onenote-converter/README.md (1)

98-100: Tighten CLI examples: add fence languages and consistent run context.

Both fenced blocks should declare bash (MD040), and examples should use the same working-directory assumption to avoid ambiguity.

Proposed README patch
-```
-bash$ cargo run -- ./test-data/ink.one --onestore
-```
+```bash
+cd parser/
+cargo run -- ./test-data/ink.one --onestore
+```

-```
-bash$ cd parser/
-bash$ cargo run -- ./test-data/ink.one --section
-```
+```bash
+cd parser/
+cargo run -- ./test-data/ink.one --section
+```

Also applies to: 103-106

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/onenote-converter/README.md` around lines 98 - 100, Update the
README.md CLI examples to use fenced code blocks with the bash language and make
the working-directory consistent by adding an explicit cd parser/ before running
cargo; specifically change the two examples that run cargo run
./test-data/ink.one --onestore and cargo run ./test-data/ink.one --section so
each is wrapped in ```bash fences and includes the cd parser/ line prior to the
cargo run command.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/onenote-converter/parser/src/bin/inspect.rs`:
- Line 56: Update the CLI usage string so it lists all accepted flags
consistently: change the eprintln! call that currently prints "Usage:
{program_name} <input_file> [--section]" to include both "--onestore" and
"--section" (e.g., "[--onestore] [--section]") to match the argument parsing
logic that checks for the "--onestore" and "--section" flags; ensure the change
is made where the usage is printed (the eprintln! invocation) so users see the
correct flag options.

---

Nitpick comments:
In `@packages/onenote-converter/README.md`:
- Around line 98-100: Update the README.md CLI examples to use fenced code
blocks with the bash language and make the working-directory consistent by
adding an explicit cd parser/ before running cargo; specifically change the two
examples that run cargo run ./test-data/ink.one --onestore and cargo run
./test-data/ink.one --section so each is wrapped in ```bash fences and includes
the cd parser/ line prior to the cargo run command.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 827744f8-f441-4455-bbd4-9ae037ade867

📥 Commits

Reviewing files that changed from the base of the PR and between 1bbd603 and 8c96f33.

📒 Files selected for processing (12)
  • packages/onenote-converter/README.md
  • packages/onenote-converter/parser-utils/src/debug.rs
  • packages/onenote-converter/parser-utils/src/errors.rs
  • packages/onenote-converter/parser-utils/src/lib.rs
  • packages/onenote-converter/parser/Cargo.toml
  • packages/onenote-converter/parser/src/bin/inspect.rs
  • packages/onenote-converter/parser/src/local_onestore/objects/object_group_list.rs
  • packages/onenote-converter/parser/src/onenote/mod.rs
  • packages/onenote-converter/parser/src/onestore/mod.rs
  • packages/onenote-converter/parser/src/onestore/object.rs
  • packages/onenote-converter/parser/src/shared/prop_set.rs
  • packages/tools/cspell/dictionary4.txt

Comment thread packages/onenote-converter/parser/src/bin/inspect.rs Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/onenote-converter/README.md`:
- Around line 98-105: Add the missing fenced code block language tags so both
command examples are tagged as bash: update the two triple-backtick blocks that
contain "bash$ cargo run -- ./test-data/ink.one --onestore" and "bash$ cargo run
-- ./test-data/ink.one --section" to start with ```bash (i.e., replace ``` with
```bash for those blocks) to satisfy markdownlint MD040.
- Around line 97-105: Update the README examples to make the cargo invocation
explicit for the inspect binary: replace the ambiguous "cargo run --
./test-data/..." calls with "cargo run --bin inspect -- ./test-data/..." for
both the --onestore and --section examples, and if this is a workspace where the
inspect binary lives in a different crate, add the appropriate -p <crate>
selector (e.g., the parser crate) so the command is copy/paste reliable; ensure
both example lines referencing the inspect binary are updated consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1390c987-ac51-4601-82b5-d368108a3cd1

📥 Commits

Reviewing files that changed from the base of the PR and between 6d05536 and d753e62.

📒 Files selected for processing (1)
  • packages/onenote-converter/README.md

Comment thread packages/onenote-converter/README.md
Comment thread packages/onenote-converter/README.md Outdated
Comment on lines +76 to -77
// Marks the end of a signature block. Ignored.
// See https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-onestore/0fa4c886-011a-4c19-9651-9a69e43a19c6
iterator.next();
log!("Ignoring DataSignatureGroupDefinitionFND");
Collaborator Author

This change removes an unnecessary log statement that cluttered the tool's output.

Comment on lines +63 to +71

// According to MS-ONESTORE 2.1.12, revision_role *should* always be 0x1
if data.base.revision_role != 0x1 {
    // TODO: Find a test .one file that uses this and implement:
    log_warn!(
        "TO-DO: Apply the new role and context to the revision (role {:x})",
        data.base.revision_role
    );
}
Collaborator Author

This change removes an unnecessary warning statement that cluttered the tool's output. The warning is now only emitted if the assigned revision role is different from the expected value documented in MS-ONESTORE 2.1.12.


impl Debug for PropertySet {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        fn format_value(value: &PropertyValue) -> String {
Collaborator Author

This change includes the string representation of some Vec()s in the debug output (Vecs sometimes store Strings). This is useful when searching for a particular string in the lower-level --onestore debug output.
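
As a rough illustration of the idea (not the code in `prop_set.rs`), a heuristic of this kind could look like the sketch below. It assumes the `Vec` holds raw bytes in UTF-16LE order; the function names and the plausibility rules (even length, valid UTF-16, no non-whitespace control characters) are made up for this example.

```rust
/// Tries to interpret `bytes` as UTF-16LE text. Returns `None` when the data
/// does not look like a string. The plausibility rules are illustrative
/// assumptions, not the rules used by the PR.
fn decode_utf16_heuristic(bytes: &[u8]) -> Option<String> {
    if bytes.is_empty() || bytes.len() % 2 != 0 {
        return None;
    }
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    // Invalid UTF-16 (for example a lone surrogate) means "not text".
    let text = String::from_utf16(&units).ok()?;
    // Stored strings are often NUL-terminated; ignore trailing NULs.
    let text = text.trim_end_matches('\0');
    if text.is_empty() || text.chars().any(|c| c.is_control() && !c.is_whitespace()) {
        return None;
    }
    Some(text.to_string())
}

/// Formats a byte vector for debug output, appending the decoded text if any.
fn format_bytes(bytes: &[u8]) -> String {
    match decode_utf16_heuristic(bytes) {
        Some(text) => format!("Vec({} bytes, as UTF-16: {:?})", bytes.len(), text),
        None => format!("Vec({} bytes)", bytes.len()),
    }
}

fn main() {
    let text_bytes: Vec<u8> = "Hi"
        .encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect();
    println!("{}", format_bytes(&text_bytes));
    // Odd length, so this cannot be UTF-16.
    println!("{}", format_bytes(&[0x00, 0xD8, 0x01]));
}
```

Because the decoding is only a guess, falling back to a byte-count summary keeps the output honest for vectors that hold binary data.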

Comment on lines -29 to +30
- Ok(value.to_string().unwrap())
+ value.to_string().map_err(|err| err.into())
Collaborator Author

@personalizedrefrigerator personalizedrefrigerator Apr 13, 2026


This change prevents the UTF-16 decoder from panicking when it encounters invalid Unicode. It instead returns an error that the caller can handle. This allows the debug tool to safely include string representations of certain fields that might not contain valid UTF-16.
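
The same unwrap-to-`map_err` pattern can be shown with the standard library alone. In this sketch, `std::string::FromUtf16Error` stands in for `widestring::error::Utf16Error`, and the `Error` enum and `decode` function are hypothetical simplifications of the crate's own types.

```rust
use std::fmt;
use std::string::FromUtf16Error;

/// Simplified, hypothetical error type for this example.
#[derive(Debug)]
enum Error {
    Utf16(FromUtf16Error),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Utf16(err) => write!(f, "invalid UTF-16: {err}"),
        }
    }
}

impl std::error::Error for Error {}

// The `From` impl is what lets `err.into()` (or `?`) convert the library error.
impl From<FromUtf16Error> for Error {
    fn from(err: FromUtf16Error) -> Self {
        Error::Utf16(err)
    }
}

fn decode(units: &[u16]) -> Result<String, Error> {
    // `String::from_utf16(units).unwrap()` would panic on a lone surrogate;
    // mapping the error hands the decision to the caller instead.
    String::from_utf16(units).map_err(|err| err.into())
}

fn main() {
    let valid: Vec<u16> = "note".encode_utf16().collect();
    println!("{:?}", decode(&valid));
    // 0xD800 is an unpaired high surrogate, which is not valid UTF-16.
    match decode(&[0xD800]) {
        Ok(text) => println!("decoded: {text}"),
        Err(err) => println!("error: {err}"),
    }
}
```

A caller such as a debug formatter can then fall back to a raw byte summary when decoding fails, rather than aborting the whole dump.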

@laurent22 laurent22 merged commit 0a94d02 into laurent22:dev Apr 14, 2026
12 checks passed
@personalizedrefrigerator personalizedrefrigerator deleted the pr/chore/tool-for-inspecting-one-files branch April 14, 2026 14:52
2 participants