fix: improve error handling for deleted apps #413

Open
ajasnosz wants to merge 1 commit into develop from fix/deleted-apps-error-handle

Conversation

@ajasnosz
Collaborator

Fix excessive log noise when the nozzle processes firehose events referencing deleted CF apps. Previously, every event for a deleted app produced a log line — either an ERROR (when IsResourceNotFound failed to match) or a repeated INFO. With high event volume, this could generate thousands of identical log entries per deleted app.

Changes

cache/boltdb.go

  • Added new sentinel error ErrMissingAlreadyCached to distinguish between two distinct cache-miss scenarios:
      • ErrMissingAndIgnored: returned on the first discovery, when getAppFromRemote confirms the app is deleted (CF API returns ResourceNotFound). The app GUID is added to the missingApps set at this point.
      • ErrMissingAlreadyCached: returned on subsequent lookups, when the app GUID is already present in the missingApps set.
  • Updated getAppFromCache to return ErrMissingAlreadyCached (instead of reusing ErrMissingAndIgnored) when the app is already recorded in the missingApps map.

events/events.go

  • Updated AnnotateWithAppData error handling to check for ErrMissingAlreadyCached first and skip logging entirely for that case, eliminating repeated log entries for the same deleted app.
  • The first discovery (ErrMissingAndIgnored) still logs once at INFO level so operators have visibility into the event.
  • The existing IsResourceNotFound and transient error branches are unchanged.

@ajasnosz ajasnosz requested a review from sbylica-splunk March 20, 2026 14:15
@ajasnosz ajasnosz deployed to workflow-approval March 20, 2026 14:15 — with GitHub Actions Active
// Not able to find the app from remote. App may be deleted.
// Check if the app is available in boltdb cache

if IsResourceNotFound(err) {
Collaborator

I think this branch is never reached: we replace this error with ErrMissingAndIgnored during AnnotateWithAppData higher up.

Context("When orphan app is requested", func() {

It("Should found app in cache", func() {
It("Should not find deleted app after cache invalidation", func() {
Collaborator

I don't really get it. So previously, if an orphan app was requested, it would be found in the cache anyway?

if IsResourceNotFound(err) {
// App is confirmed deleted by CF — clean up stale BoltDB entry
// and record in missingApps so we don't keep hitting the API.
c.removeAppFromDatabase(appGuid)
Collaborator

This should have a logger message, imo.

}

func (c *Boltdb) removeAppFromDatabase(appGuid string) {
c.appdb.Update(func(tx *bolt.Tx) error {
Collaborator

Probably a log message here too

}
return nil
})
for _, k := range staleKeys {
Collaborator

Error handling here?


2 participants