[BUG] Transfer Manager removes entries from FileCache if CircuitBreaker trips while downloading blob #18658

Open
@rayshrey

Describe the bug

In TransferManager, the fetchBlob function always calls the compute method of the FileCache, even when the entry is already present in the FileCache.

This becomes a problem when the CircuitBreaker starts tripping. The flow in FileCache compute is to first compute the entry for the key and only then check whether the circuit breaker is tripping; if it is, the computed entry is removed from the cache. So when fetchBlob is called for an entry that is already present in the FileCache, the entry is recomputed, and because the CircuitBreaker is tripping, the file entry that was already present is evicted.
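The flow described above can be modelled with a small toy cache. This is only an illustrative sketch, not the OpenSearch code: ToyFileCache, EvictionDemo, and the BooleanSupplier standing in for the CircuitBreaker check are hypothetical names, and the real FileCache also handles reference counting and capacity, which are left out here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BooleanSupplier;

/** Toy model of the compute-then-check flow; NOT the real OpenSearch FileCache. */
class ToyFileCache {
    private final Map<String, String> entries = new HashMap<>();
    // Hypothetical stand-in for the CircuitBreaker check.
    private final BooleanSupplier breakerTripped;

    ToyFileCache(BooleanSupplier breakerTripped) {
        this.breakerTripped = breakerTripped;
    }

    String get(String key) {
        return entries.get(key);
    }

    /** Computes the entry first, checks the breaker second, removes the key if it is tripping. */
    String compute(String key, BiFunction<String, String, String> remapping) {
        String value = entries.compute(key, remapping);
        if (breakerTripped.getAsBoolean()) {
            // This removal also drops an entry that existed before this call.
            entries.remove(key);
            throw new IllegalStateException("circuit breaker tripped");
        }
        return value;
    }
}

class EvictionDemo {
    /** Returns true if a previously cached entry is still present after a tripped compute. */
    static boolean entrySurvives() {
        boolean[] tripped = { false };
        ToyFileCache cache = new ToyFileCache(() -> tripped[0]);
        String key = "_2t.cfs_block_1";

        // First fetch: breaker is healthy, the entry is cached.
        cache.compute(key, (k, old) -> old == null ? "cached-block" : old);

        // Second fetch of the same key while the breaker is tripping.
        tripped[0] = true;
        try {
            cache.compute(key, (k, old) -> old == null ? "cached-block" : old);
        } catch (IllegalStateException expected) {
            // The caller sees the breaker failure, but the old entry is gone too.
        }
        return cache.get(key) != null;
    }
}
```

In this model, EvictionDemo.entrySurvives() returns false: the second call did not need to download anything, yet the existing entry was removed.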

As a result, an entry can be evicted from the FileCache while its IndexInput is not yet closed. The exception that then surfaces points towards an altogether different area and says nothing about the eviction, which can derail debugging.

Example exception when the IndexInput is not closed but the file has been deleted:

Caused by: java.io.EOFException: seek past EOF (pos=7987177): MemorySegmentIndexInput(path="/<path>/var/es/data/nodes/0/indices/000010UHjS4EilSI2zEIIZ7IGZ1Q/1/index/_2t.cfs_block_1")

Related component

Storage:Remote

To Reproduce

Prerequisite for reproduction: high JVM memory usage on the node, so that the CircuitBreaker trips.

Expected behavior

For entries already present in FileCache that are not closed, TransferManager should simply return those entries instead of recomputing.
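One way to picture the expected behavior is a lookup-first fetch, sketched below on a toy cache. This is a simplified illustration under stated assumptions, not a proposed patch: LookupFirstCache, LookupFirstDemo, and the fetch method are hypothetical names, the "not closed" check is omitted, and the sketch is single-threaded, so it ignores the atomicity a real cache would need between the lookup and the insert.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Toy cache that returns an existing entry before doing any recompute; NOT the real FileCache. */
class LookupFirstCache {
    private final Map<String, String> entries = new HashMap<>();
    // Hypothetical stand-in for the CircuitBreaker check.
    private final BooleanSupplier breakerTripped;

    LookupFirstCache(BooleanSupplier breakerTripped) {
        this.breakerTripped = breakerTripped;
    }

    /** Returns the cached value if present; otherwise downloads, checks the breaker, then caches. */
    String fetch(String key, Supplier<String> downloader) {
        String existing = entries.get(key);
        if (existing != null) {
            // Already cached: no recompute, so the breaker check cannot evict it.
            return existing;
        }
        String value = downloader.get();
        if (breakerTripped.getAsBoolean()) {
            // Only the new, not-yet-cached value is rejected.
            throw new IllegalStateException("circuit breaker tripped");
        }
        entries.put(key, value);
        return value;
    }
}

class LookupFirstDemo {
    /** Fetches a key once while healthy, then again while the breaker is tripping. */
    static String fetchWhileTripped() {
        boolean[] tripped = { false };
        LookupFirstCache cache = new LookupFirstCache(() -> tripped[0]);
        cache.fetch("_2t.cfs_block_1", () -> "cached-block");

        tripped[0] = true;
        // The existing entry is returned; the downloader is never invoked.
        return cache.fetch("_2t.cfs_block_1", () -> "re-downloaded");
    }
}
```

Here the second fetch returns the value cached by the first one even though the breaker is tripping, and a tripped breaker only rejects blobs that are not yet in the cache.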

Additional Details

N/A
