Desktop: Importing from OneNote: Fix large .one files with many attachments failing to import (#15117)
Refactors reader.rs
…mport/fix-large-file-handling
This reverts commit d3db685.
This reverts commit 3b83843.
Credit: Issues were flagged by Claude Code during a review. (The implemented solutions are my own.)
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

Replaces eager in-memory file reads with streamed, seekable I/O using u64/BigInt sizes across Node/WASM/native layers; Reader and FileBlob adopt file-backed, lazy-loading models; FileHandle exposes byte_length(); a new error variant for integer conversion is added.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant JS as Node (node_functions.js)
    participant WASM as WASM bindings
    participant Rust as parser-utils (Reader / FileHandle)
    participant FS as Filesystem
    JS->>WASM: fileReader(path) -> create handle
    JS-->>WASM: size() returns BigInt
    JS->>WASM: read(bigPosition, bigLength) (BigInt)
    WASM->>Rust: read(offset:u64, size:u64)
    Rust->>Rust: Reader.seek/read -> Vec<u8>
    Rust->>FS: fs.read via native driver (BufReader)
    Rust-->>WASM: Uint8Array
    WASM-->>JS: bytes returned
```
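The walkthrough mentions a new error variant for integer conversion. A minimal sketch of the kind of checked u64-to-usize conversion such a variant supports (the function name and error type here are illustrative, not the crate's API):

```rust
// Hypothetical helper: convert a decoded u64 size to usize, failing loudly
// instead of truncating. On wasm32 (usize = 32 bits) an oversized value
// would otherwise wrap around silently.
fn checked_size(size: u64) -> Result<usize, String> {
    usize::try_from(size).map_err(|_| format!("size {size} does not fit in usize"))
}

fn main() {
    assert_eq!(checked_size(42), Ok(42));
    println!("ok");
}
```

The same shape applies anywhere a parsed 64-bit length is about to index or allocate host memory.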
Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 5
🧹 Nitpick comments (5)
packages/onenote-converter/parser/src/shared/compact_u64.rs (1)
23-82: Please add regression cases for every width branch.

This parser now has separate decoding and EOF paths for 7/14/21/28/35/42/49/64-bit values, but the exercised tests in this patch still only cover the zero case. A small table-driven suite here would make the refactor much safer.
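A table-driven suite along those lines could look like this sketch. The decoder below is a simplified stand-in (one tag bit selecting a 7-bit immediate versus a full-width value), not the real `CompactU64::parse`; the point is the test shape:

```rust
// Stand-in decoder, NOT the real CompactU64::parse: bit 0 of the first byte
// selects a 7-bit immediate, otherwise an 8-byte little-endian value follows.
fn parse(input: &[u8]) -> Result<u64, &'static str> {
    let first = *input.first().ok_or("unexpected EOF")?;
    if first & 1 != 0 {
        Ok((first >> 1) as u64)
    } else {
        let rest: [u8; 8] = input
            .get(1..9)
            .ok_or("unexpected EOF")?
            .try_into()
            .unwrap();
        Ok(u64::from_le_bytes(rest))
    }
}

fn main() {
    // One row per branch, plus truncated inputs to pin the EOF behaviour.
    let cases: &[(&[u8], Result<u64, &'static str>)] = &[
        (&[0b0000_0001], Ok(0)),               // zero case
        (&[0b1010_1011], Ok(0x55)),            // 7-bit branch
        (&[0, 1, 0, 0, 0, 0, 0, 0, 0], Ok(1)), // full-width branch
        (&[0, 1, 2], Err("unexpected EOF")),   // truncated full-width input
        (&[], Err("unexpected EOF")),          // empty input
    ];
    for &(input, expected) in cases {
        assert_eq!(parse(input), expected, "input: {input:?}");
    }
    println!("all cases passed");
}
```

A real suite would add one row per width (7/14/21/28/35/42/49/64 bits) against the actual parser.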
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/onenote-converter/parser/src/shared/compact_u64.rs` around lines 23 - 82: Add table-driven regression tests that cover every width branch in CompactU64::parse: test the zero-case, the 7-bit branch (first_byte & 1), 14-bit (first_byte & 2), 21-bit (first_byte & 4), 28-bit (8), 35-bit (16), 42-bit (32), 49-bit (64) and the 64-bit full path (first_byte & 128). For each case construct a Reader input that exercises both normal decoding and EOF/error behavior (e.g., truncated byte slices) and assert the returned CompactU64 value or the error; reference the parse function and Reader.get_u8/get_u64/read methods to locate where to craft inputs and expected outputs.

packages/onenote-converter/renderer/src/page/image.rs (1)
11-15: Stream image payloads straight to disk.

Line 11 still materialises the whole attachment before Line 15 writes it, so one very large image can still hit the same memory ceiling this PR is trying to remove. A chunked write path here would close most of the remaining gap for single huge attachments.
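A chunked copy loop of the kind suggested might look like this sketch (the 3 MB buffer mirrors the learning quoted below; function name and buffer size are illustrative):

```rust
use std::io::{Read, Write};

// Sketch: copy from any reader to any writer in fixed-size chunks so the
// full payload is never resident in memory at once.
fn write_streamed(mut src: impl Read, mut dst: impl Write) -> std::io::Result<u64> {
    let mut buf = vec![0u8; 3 * 1024 * 1024]; // 3 MB chunks
    let mut total = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break; // EOF
        }
        dst.write_all(&buf[..n])?;
        total += n as u64;
    }
    Ok(total)
}

fn main() -> std::io::Result<()> {
    let data = vec![7u8; 10_000];
    let mut out = Vec::new();
    assert_eq!(write_streamed(&data[..], &mut out)?, 10_000);
    assert_eq!(out, data);
    println!("ok");
    Ok(())
}
```

`std::io::copy` provides the same behaviour with its own default buffer, if no custom chunk size is needed.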
Based on learnings: "The fix uses `streamFileDataUri` to read large files in 3 MB chunks ... so the full file content is never in memory at once."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/onenote-converter/renderer/src/page/image.rs` around lines 11 - 15: The current code calls image.data()? which materializes the entire payload before writing; change to a streaming/chunked write so large images are never fully in memory: use determine_image_filename(image)? to get filename and fs_driver().join(&self.output, &filename) for path, then open/create a writable file handle via the fs_driver API (or add a write_file_stream method) and iterate over the image's streaming API (e.g., streamFileDataUri or whatever yields chunks from image) writing each chunk to the file handle until finished; remove the image.data()? call and replace fs_driver().write_file(&path, &data[..])? with the chunked write loop so only a small buffer is allocated at a time.

packages/onenote-converter/parser-utils/src/file_api/native_driver.rs (1)
91-95: Consider propagating metadata errors instead of defaulting to 0.

If `metadata()` fails on an already-opened file, returning `0` will cause subsequent seeks to fail with misleading "out-of-bounds" errors. Whilst this scenario is rare, propagating the actual error would aid debugging.

Note: This would require changing the `FileHandle` trait's `byte_length()` signature to return `Result<u64>`, which may be a larger refactor.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/onenote-converter/parser-utils/src/file_api/native_driver.rs` around lines 91 - 95: The current FileHandle implementation for BufReader<fs::File> swallows metadata() errors by returning 0 in byte_length(), causing misleading out-of-bounds seeks; update the FileHandle::byte_length signature to return Result<u64, std::io::Error> (or the crate's Error type) and change the impl for BufReader<fs::File> to propagate get_ref().metadata()?.len() as Ok(len) (or map_ok) so metadata errors are returned instead of defaulting to 0; update callers of FileHandle::byte_length (and any trait impls) to handle the Result accordingly.

packages/onenote-converter/parser/src/shared/file_data_ref.rs (2)
15-18: `PartialEq` now means "same loader instance", not "same blob".

Two `FileBlob`s with identical bytes but different loaders now compare unequal, including independently-created empty blobs. If callers still treat this as a value type, that behavioural shift is going to be surprising. I'd either drop `PartialEq` or make the identity check explicit via a separate helper.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/onenote-converter/parser/src/shared/file_data_ref.rs` around lines 15 - 18, The current PartialEq impl on FileBlob compares loader identity (Rc::ptr_eq) rather than blob bytes, which makes value-equal blobs compare unequal; remove the impl PartialEq for FileBlob to avoid surprising behavior and instead add an explicit identity helper like FileBlob::is_same_loader(&self, other: &FileBlob) -> bool that returns Rc::ptr_eq(&self.loader, &other.loader); if true value-equality semantics are required elsewhere, add a separate method FileBlob::equals_bytes(&self, other: &FileBlob) -> bool that compares size and actual bytes (reading via the loader) rather than reintroducing PartialEq.
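The explicit identity helper described above could be sketched like this (`FileBlob` here is a stand-in holding only a loader field; the real type wraps a `FileDataLoader` trait object):

```rust
use std::rc::Rc;

// Stand-in for FileBlob; the real loader is Rc<dyn FileDataLoader>.
struct FileBlob {
    loader: Rc<Vec<u8>>,
}

impl FileBlob {
    /// Explicit loader-identity check, instead of a surprising PartialEq impl.
    fn is_same_loader(&self, other: &FileBlob) -> bool {
        Rc::ptr_eq(&self.loader, &other.loader)
    }
}

fn main() {
    let loader = Rc::new(vec![1u8, 2, 3]);
    let a = FileBlob { loader: Rc::clone(&loader) };
    let b = FileBlob { loader: Rc::clone(&loader) };
    let c = FileBlob { loader: Rc::new(vec![1u8, 2, 3]) }; // equal bytes, different loader
    assert!(a.is_same_loader(&b));
    assert!(!a.is_same_loader(&c)); // identity, not value, equality
    println!("ok");
}
```

Naming the method `is_same_loader` keeps callers from mistaking pointer identity for byte equality.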
21-23: Avoid letting `size` drift from the loader output.

`FileBlob` now has two sources of truth for length: the stored `size` and whatever `load()` returns. If they ever disagree, `len()` lies and the mismatch is only discovered much later. Consider deriving the size from the loader or validating `data.len() == self.size` in `load()`.

Also applies to: 53-65

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/onenote-converter/parser/src/shared/file_data_ref.rs` around lines 21 - 23, FileBlob currently trusts a stored size field while load() from FileDataLoader can return a different byte vector, leading len() being wrong; update FileBlob::load() to read the data via the FileDataLoader::load() implementation, then validate and reconcile the size: after obtaining data from load(), either set self.size = data.len() (deriving the canonical size from the loader output) or return an error if data.len() != self.size (choose consistent behavior for your domain), and ensure FileBlob::len() uses the reconciled self.size; refer to the FileDataLoader trait, FileBlob::load(), and FileBlob::len() when making the change.
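The validation variant of that suggestion is a one-line check on the load path; a sketch with stand-in types (the real loader is a `FileDataLoader` trait object, simulated here with a function pointer):

```rust
// Stand-in blob: `loader` simulates FileDataLoader::load().
struct FileBlob {
    size: usize,
    loader: fn() -> Vec<u8>,
}

impl FileBlob {
    fn load(&self) -> Result<Vec<u8>, String> {
        let data = (self.loader)();
        // Fail fast if the two sources of truth disagree, instead of
        // letting len() lie until much later.
        if data.len() != self.size {
            return Err(format!(
                "size mismatch: expected {}, loader returned {}",
                self.size,
                data.len()
            ));
        }
        Ok(data)
    }
}

fn main() {
    let good = FileBlob { size: 3, loader: || vec![1, 2, 3] };
    let bad = FileBlob { size: 5, loader: || vec![1, 2, 3] };
    assert!(good.load().is_ok());
    assert!(bad.load().is_err());
    println!("ok");
}
```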
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/onenote-converter/parser-utils/node_functions.js`:
- Around line 35-49: The read() and size() helpers currently coerce
bigPosition/bigLength to Number which can silently lose precision; preserve
BigInt end-to-end by passing bigPosition (and bigLength where supported) as
BigInt to fs.readSync (or validate and explicitly range-check before coercing to
Number), and have size() return the BigInt obtained from fs.fstatSync(fd, {
bigint: true }) rather than constructing BigInt(size) from a lost-precision
Number; update references in read, size, and any fstat usage to use the bigint
path and only convert to Number when you have validated the value fits safely.
In `@packages/onenote-converter/parser-utils/src/reader.rs`:
- Around line 152-159: The buffer-backed branch of as_data_ref (when matching
ReaderData::BufferRef in reader.rs) slices buffer[start..start + size] without
bounds checking and can panic; add a check comparing size to self.remaining()
(using self.data_offset) and if size > remaining() return Err(UnexpectedEof)
(consistent with read_exact behavior) instead of slicing, otherwise proceed to
create ReaderDataRef::Vec from the slice; ensure the same validation logic
applies conceptually to the BufferRef path to prevent malformed .one files from
causing a panic.
- Around line 230-236: The From<Box<dyn FileHandle>> impl for Reader<'a> assumes
the handle is at offset 0 but sets data_offset = 0 without synchronising the
actual FileHandle; update this constructor to either (A) seek/rewind the
underlying handle to position 0 and read its current position to set data_offset
consistently (use the FileHandle seek/rewind API before wrapping in
ReaderData::File), or (B) replace the From impl with a fallible constructor
(e.g., Reader::from_file(handle) -> Result<Reader, Error>) that queries and/or
seeks the handle to enforce the invariant and returns an error on failure;
ensure you update data_len via value.byte_length() and set data_offset from the
actual file cursor after the seek/query so Reader bookkeeping and the real file
cursor stay in sync.
- Around line 263-275: The FilePointer branch in ReaderDataRef uses
file.borrow_mut() and returns early on errors from file.seek or file.read_exact,
leaving the shared file cursor moved; fix by performing the seek/read inside a
sub-scope/closure or using a local Result (e.g. let read_result = (|| { ...
})();) so the `?` inside that closure returns an Err into read_result instead of
exiting the function, then always restore the cursor with
file.seek(SeekFrom::Start(original_offset))?; update the
ReaderDataRef::FilePointer code around original_offset, file.seek, and
read_result so cursor restoration runs regardless of read success or failure.
In `@packages/onenote-converter/parser/src/fsshttpb/data/binary_item.rs`:
- Around line 15-16: The code casts CompactU64::parse(...).value() (named size)
directly to usize before calling reader.read, which can truncate on platforms
where usize is smaller than u64; change the unchecked cast to a checked
conversion (e.g. use usize::try_from(size) or size.try_into()) and return an
error if the conversion fails, then pass the safely converted usize into
reader.read; update the CompactU64::parse usage and reader.read(size) call sites
(variables size and reader.read) to use the checked conversion and an
appropriate parse/error return on overflow.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1fb11b20-240c-46de-bbc5-f07d09094c32
📒 Files selected for processing (26)
- packages/onenote-converter/parser-utils/node_functions.js
- packages/onenote-converter/parser-utils/src/errors.rs
- packages/onenote-converter/parser-utils/src/file_api/api.rs
- packages/onenote-converter/parser-utils/src/file_api/native_driver.rs
- packages/onenote-converter/parser-utils/src/file_api/wasm_driver.rs
- packages/onenote-converter/parser-utils/src/reader.rs
- packages/onenote-converter/parser/src/bin/inspect.rs
- packages/onenote-converter/parser/src/fsshttpb/data/binary_item.rs
- packages/onenote-converter/parser/src/fsshttpb/data/stream_object.rs
- packages/onenote-converter/parser/src/fsshttpb/data_element/data_element_fragment.rs
- packages/onenote-converter/parser/src/fsshttpb/data_element/object_data_blob.rs
- packages/onenote-converter/parser/src/local_onestore/common/file_chunk_reference.rs
- packages/onenote-converter/parser/src/local_onestore/file_node/file_node.rs
- packages/onenote-converter/parser/src/local_onestore/file_structure/file_node_list.rs
- packages/onenote-converter/parser/src/local_onestore/file_structure/file_node_list_fragment.rs
- packages/onenote-converter/parser/src/local_onestore/one_store_file.rs
- packages/onenote-converter/parser/src/onenote/embedded_file.rs
- packages/onenote-converter/parser/src/onenote/image.rs
- packages/onenote-converter/parser/src/onenote/mod.rs
- packages/onenote-converter/parser/src/onestore/mod.rs
- packages/onenote-converter/parser/src/shared/compact_u64.rs
- packages/onenote-converter/parser/src/shared/file_data_ref.rs
- packages/onenote-converter/parser/src/shared/property.rs
- packages/onenote-converter/renderer/src/lib.rs
- packages/onenote-converter/renderer/src/page/embedded_file.rs
- packages/onenote-converter/renderer/src/page/image.rs
Has this now been fixed in the current beta? Where can I download it for macOS (Apple Silicon)? Then I will test it again for my 1.95 onepkg import into Joplin.
This has not yet been fixed in the current beta. (This change will need to be merged before it is available in a prerelease.)
Actionable comments posted: 1
♻️ Duplicate comments (2)
packages/onenote-converter/parser-utils/src/reader.rs (2)
152-164: ⚠️ Potential issue | 🟠 Major

Reject oversized `as_data_ref` requests before slicing.

`read_exact()` and `advance()` already normalise short input to `UnexpectedEof`, but Line 158 still panics on buffer-backed readers when `size > remaining()`, and Line 160 can still hand back a deferred invalid file range. Please validate `size` up front here as well.

Proposed fix

```diff
 pub fn as_data_ref(&mut self, size: usize) -> Result<ReaderDataRef> {
+    if self.remaining() < size as u64 {
+        return Err(
+            ErrorKind::UnexpectedEof("Unexpected EOF (Reader.as_data_ref)".into()).into(),
+        );
+    }
+
     match &mut self.data {
         ReaderData::BufferRef { buffer } => {
             let start = self.data_offset as usize;
             // Cloning needs to be done early with BufferRef, since we don't own the original
             // data. Large data should generally use `ReaderData::File`.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/onenote-converter/parser-utils/src/reader.rs` around lines 152 - 164, In Reader::as_data_ref, reject requests where size > self.remaining() up front to avoid panics and invalid file ranges: before matching on self.data (and before slicing buffer[start..start+size] or returning a FilePointer), check if size <= self.remaining() and return the same UnexpectedEof error variant used by read_exact()/advance() when the request is too large; apply this validation for both ReaderData::BufferRef (so slicing can't panic) and ReaderData::File (so the returned FilePointer does not expose an invalid offset/size).
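The up-front guard amounts to two lines; a runnable sketch with a buffer-only stand-in for the reader (the real type also has a file-backed variant that needs the same check):

```rust
// Stand-in for the buffer-backed reader variant.
struct Reader {
    data: Vec<u8>,
    data_offset: u64,
}

impl Reader {
    fn remaining(&self) -> u64 {
        self.data.len() as u64 - self.data_offset
    }

    fn as_data_ref(&mut self, size: usize) -> Result<Vec<u8>, &'static str> {
        // Same normalisation read_exact()/advance() use: reject oversized
        // requests instead of panicking on the slice below.
        if self.remaining() < size as u64 {
            return Err("Unexpected EOF (Reader.as_data_ref)");
        }
        let start = self.data_offset as usize;
        Ok(self.data[start..start + size].to_vec())
    }
}

fn main() {
    let mut r = Reader { data: vec![1, 2, 3, 4], data_offset: 1 };
    assert_eq!(r.as_data_ref(3), Ok(vec![2, 3, 4]));
    assert!(r.as_data_ref(4).is_err()); // would have panicked without the guard
    println!("ok");
}
```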
230-236: ⚠️ Potential issue | 🟠 Major

Synchronise the file handle before wrapping it.

This constructor sets `data_offset` to `0`, but never rewinds or checks the underlying `FileHandle`. If the caller passes a pre-seeked handle, the invariant documented on Lines 52-53 is broken immediately and later reads/peeks start from the wrong place.

Proposed fix

```diff
-impl<'a> From<Box<dyn FileHandle>> for Reader<'a> {
-    fn from(value: Box<dyn FileHandle>) -> Self {
-        Self {
+impl<'a> Reader<'a> {
+    pub fn from_file_handle(mut value: Box<dyn FileHandle>) -> Result<Self> {
+        value.seek(SeekFrom::Start(0))?;
+        Ok(Self {
             data_len: value.byte_length(),
             data_offset: 0,
             data: ReaderData::File(Rc::new(RefCell::new(value))),
-        }
+        })
     }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/onenote-converter/parser-utils/src/reader.rs` around lines 230 - 236, The From<Box<dyn FileHandle>> for Reader<'a> impl must ensure the incoming FileHandle is positioned at the start before assuming data_offset = 0; rewind or seek the boxed handle to position 0 on the FileHandle trait (e.g. call its rewind/seek-to-start method) before wrapping it in ReaderData::File and setting data_offset to 0, and handle/report any error from that operation so the Reader invariant (start at file beginning) holds for later reads/peeks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/onenote-converter/parser/src/shared/compact_u64.rs`:
- Around line 23-24: Change the parse function to accept a mutable reference to
the Reader (e.g. fn parse(reader: &mut Reader) -> Result<CompactU64>) because
Reader::get_u8() and Reader::read() require &mut self; update the function
signature for CompactU64::parse and any call sites (tests already pass &mut
Reader) so internal calls to reader.get_u8() and reader.read() compile.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: badd912d-42c7-481a-a9ad-ed5bb1249633
📒 Files selected for processing (4)
- packages/onenote-converter/parser-utils/src/errors.rs
- packages/onenote-converter/parser-utils/src/reader.rs
- packages/onenote-converter/parser/src/fsshttpb/data/binary_item.rs
- packages/onenote-converter/parser/src/shared/compact_u64.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/onenote-converter/parser/src/fsshttpb/data/binary_item.rs
- packages/onenote-converter/parser-utils/src/errors.rs
🧹 Nitpick comments (1)
packages/onenote-converter/parser/src/onenote/mod.rs (1)
78-83: Breaking API change: `parse_onestore_raw` now accepts a path instead of raw bytes.

The signature changed from `parse_onestore_raw(&mut self, data: &[u8])` to `parse_onestore_raw(&mut self, path: &str)`. While the doc comment indicates this is "for debugging purposes", any existing callers passing raw byte data will break.

If raw byte parsing is still needed for debugging or testing, consider keeping a separate method or adding an overload that accepts `&[u8]`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/onenote-converter/parser/src/onenote/mod.rs` around lines 78 - 83, The public API was changed by replacing parse_onestore_raw(&mut self, data: &[u8]) with a path-based variant; restore backwards compatibility by keeping the existing path-based version (which uses fs_driver().open_file and Reader::try_from) and add back an overload parse_onestore_raw(&mut self, data: &[u8]) that constructs a Reader from the byte slice and delegates to parse_onestore (using Reader::try_from or a byte-cursor equivalent), so callers that pass raw bytes continue to work while the path-based helper remains available for debugging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 89d44d2e-f0a2-4cec-998e-7baea4a17cac
📒 Files selected for processing (2)
- packages/onenote-converter/parser-utils/src/reader.rs
- packages/onenote-converter/parser/src/onenote/mod.rs
Problem
Previously, Joplin would attempt to load entire `.one` files into memory before import. This could cause import to fail for large `.one` files. This could be a problem, for example, for `.one` files that include many large attachments, since attachments are bundled within the file.

In particular, `.one` files larger than 2 GB would fail to import completely, with a "RangeError: File size (3180223840) is greater than 2 GiB".

Notes:

- This issue is specific to `.one` files.
- `.onepkg` and `.zip` files are archives that can include one or more `.one` files. As such, `.onepkg` and `.zip` files larger than 2 GiB could previously be imported, provided that none of the `.one` files they contain are large.
Refactor to avoid loading full
.onefiles into memory during import.Remaining issues:
Testing
Windows:
- `.one` file and import it.

Screencast.from.2026-04-15.21-24-47.webm