Implement In-Memory Storage for v2 Storage Layer #305

@jigar-joshi-nirmata

🎯 Objective

Implement a complete in-memory storage backend for the v2 storage layer to provide a lightweight, zero-dependency alternative to PostgreSQL for development, testing, and small deployments.

📋 Background

The v2 storage layer introduced a generic IRepository[T] interface so that storage backends can be plugged in interchangeably. Currently, only PostgreSQL is implemented (PR: #304). An in-memory storage option would:

  1. Enable faster development - No database setup required
  2. Simplify testing - Isolated, fast test environments
  3. Support small deployments - Where persistence isn't critical
  4. Match existing patterns - We already have pkg/storage/inmemory/ for v1

🏗️ Architecture

Current State

pkg/v2/storage/
├── IRepository.go          # Generic interface
├── filter.go               # Query filtering
├── errors.go              # Error types
└── postgres/              # PostgreSQL implementation ✅
    ├── repository.go
    ├── create.go
    ├── retrieve.go
    ├── update.go
    └── delete.go

Proposed State

pkg/v2/storage/
├── IRepository.go          # Generic interface
├── filter.go               # Query filtering
├── errors.go              # Error types
├── postgres/              # PostgreSQL implementation ✅
│   └── ...
└── inmemory/              # In-memory implementation ✅ (NEW)
    ├── repository.go      # Map-based storage with RWMutex
    ├── create.go          # Strict create semantics
    ├── retrieve.go        # Get/List with concurrent reads
    ├── update.go          # Strict update semantics
    └── delete.go          # Delete with proper validation

🔧 Implementation Details

Data Structure

type InMemoryRepository[T metav1.Object] struct {
    mu           sync.RWMutex      // Thread-safe concurrent access
    db           map[string]T      // In-memory storage
    namespaced   bool              // Resource scope
    resourceType string            // For logging
    gr           schema.GroupResource // For K8s errors
}

Key Generation Strategy

  • Namespaced resources: "namespace/name" (e.g., "default/my-report")
  • Cluster-scoped resources: "name" (e.g., "my-cluster-report")
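
A minimal sketch of this key scheme, under some assumptions: Object is a local stand-in for the two metav1.Object getters the key needs, and storageKey and report are hypothetical names used only for illustration, not the names in the actual implementation.

```go
package main

import "fmt"

// Object is a hypothetical stand-in for the two metav1.Object getters used here.
type Object interface {
	GetName() string
	GetNamespace() string
}

// report is a toy resource used only for this illustration.
type report struct{ name, namespace string }

func (r report) GetName() string      { return r.name }
func (r report) GetNamespace() string { return r.namespace }

// storageKey (hypothetical helper) returns "namespace/name" for namespaced
// resources and plain "name" for cluster-scoped ones.
func storageKey(namespaced bool, obj Object) string {
	if namespaced {
		return obj.GetNamespace() + "/" + obj.GetName()
	}
	return obj.GetName()
}

func main() {
	fmt.Println(storageKey(true, report{name: "my-report", namespace: "default"})) // default/my-report
	fmt.Println(storageKey(false, report{name: "my-cluster-report"}))              // my-cluster-report
}
```

Because "/" is not a valid character in Kubernetes resource names or namespaces, the two key forms should not collide within a single repository.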

Thread Safety

  • Read operations (Get, List): Use RLock() for concurrent reads
  • Write operations (Create, Update, Delete): Use Lock() for exclusive access
  • Critical for web servers handling concurrent HTTP requests
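
The locking pattern described above could look roughly like the following. This is a simplified sketch: store, get, put, and size are hypothetical names, and values are plain strings rather than the generic T used by InMemoryRepository.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a simplified, hypothetical stand-in for InMemoryRepository[T].
type store struct {
	mu sync.RWMutex
	db map[string]string
}

func newStore() *store { return &store{db: make(map[string]string)} }

// get takes the shared read lock, so many readers can proceed at once.
func (s *store) get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.db[key]
	return v, ok
}

// put takes the exclusive write lock, blocking readers and other writers.
func (s *store) put(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.db[key] = value
}

// size reads shared state, so it also takes the read lock.
func (s *store) size() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.db)
}

func main() {
	s := newStore()
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			key := fmt.Sprintf("default/report-%d", i)
			s.put(key, "data") // exclusive access
			s.get(key)         // shared access
		}(i)
	}
	wg.Wait()
	fmt.Println(s.size()) // 50
}
```

Running a test like this with `go test -race` is a cheap way to catch any map access that slipped outside the lock.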

Semantics

Follows strict Kubernetes API semantics:

  • Create: Fails if resource already exists (returns ErrAlreadyExists)
  • Update: Fails if resource doesn't exist (returns ErrNotFound)
  • Delete: Fails if resource doesn't exist (returns ErrNotFound)
  • Get: Returns resource or ErrNotFound
  • List: Returns empty slice if no matches (never errors for empty results)
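
To make these rules concrete, here is a small sketch. It assumes a simplified repository keyed by a plain string; repo and newRepo are hypothetical names, and ErrAlreadyExists and ErrNotFound are redefined locally as stand-ins for the values the real code would take from pkg/v2/storage/errors.go.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Local stand-ins for the error values named above (assumption).
var (
	ErrAlreadyExists = errors.New("resource already exists")
	ErrNotFound      = errors.New("resource not found")
)

// repo is a hypothetical, simplified repository keyed by a plain string.
type repo[T any] struct {
	mu sync.RWMutex
	db map[string]T
}

func newRepo[T any]() *repo[T] { return &repo[T]{db: make(map[string]T)} }

// Create fails if the key is already present; it never overwrites.
func (r *repo[T]) Create(key string, v T) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.db[key]; ok {
		return ErrAlreadyExists
	}
	r.db[key] = v
	return nil
}

// Update fails if the key is absent; it never creates.
func (r *repo[T]) Update(key string, v T) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.db[key]; !ok {
		return ErrNotFound
	}
	r.db[key] = v
	return nil
}

// Delete fails if the key is absent.
func (r *repo[T]) Delete(key string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.db[key]; !ok {
		return ErrNotFound
	}
	delete(r.db, key)
	return nil
}

// Get returns the stored value or ErrNotFound.
func (r *repo[T]) Get(key string) (T, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	v, ok := r.db[key]
	if !ok {
		var zero T
		return zero, ErrNotFound
	}
	return v, nil
}

// List returns a non-nil, possibly empty slice; no matches is not an error.
func (r *repo[T]) List(match func(T) bool) []T {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make([]T, 0, len(r.db))
	for _, v := range r.db {
		if match(v) {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	r := newRepo[string]()
	fmt.Println(r.Create("default/my-report", "v1"))                              // <nil>
	fmt.Println(errors.Is(r.Create("default/my-report", "v1"), ErrAlreadyExists)) // true
	fmt.Println(errors.Is(r.Update("default/other", "v2"), ErrNotFound))          // true
	fmt.Println(len(r.List(func(string) bool { return false })))                  // 0
}
```

The existence check and the write happen under the same exclusive lock, so two concurrent Create calls for one key cannot both succeed.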

🎁 Benefits

For Development

  • ✅ Zero configuration - no database setup
  • ✅ Instant startup - no migrations needed
  • ✅ Easy debugging - simple map structure

For Testing

  • ✅ Isolated tests - each test gets its own instance
  • ✅ Fast execution - no network/disk I/O
  • ✅ Deterministic - no database state leakage

For Small Deployments

  • ✅ No external dependencies
  • ✅ Lower resource usage
  • ✅ Simpler deployment

🔄 Comparison with PostgreSQL

| Feature        | In-Memory                  | PostgreSQL        |
|----------------|----------------------------|-------------------|
| Persistence    | No                         | Yes               |
| Concurrency    | RWMutex                    | Connection pool   |
| Setup          | None                       | Database required |
| Dependencies   | Zero                       | PostgreSQL        |
| Performance    | Fastest                    | Network overhead  |
| Use Case       | Dev/Test/Small             | Production        |
| Data Loss Risk | High (restart = data loss) | Low               |
| Scalability    | Memory-limited             | Database-limited  |

🐛 Additional Fix Required

While implementing this, we discovered that the PostgreSQL implementation has broken imports:

// ❌ These packages don't exist:
serverMetrics "github.com/kyverno/reports-server/pkg/server/metrics"
storageMetrics "github.com/kyverno/reports-server/pkg/storage/metrics"

These need to be removed until the metrics functions are properly implemented.

✅ Acceptance Criteria

  • Implement InMemoryRepository[T] struct with thread-safe map storage
  • Implement all IRepository[T] methods (Create, Get, List, Update, Delete)
  • Use sync.RWMutex for proper concurrent access
  • Follow strict Kubernetes API semantics
  • Handle both namespaced and cluster-scoped resources
  • No linter errors
  • Code builds successfully
  • Fix broken metrics imports in postgres implementation
  • Add unit tests for concurrent access
  • Add integration tests
  • Update documentation

📚 Example Usage

// Create repository for PolicyReports (namespaced)
repo := inmemory.NewInMemoryRepository[*v1alpha2.PolicyReport](
    "PolicyReport",
    true, // namespaced
    schema.GroupResource{Group: "wgpolicyk8s.io", Resource: "policyreports"},
)

// CRUD operations
report := &v1alpha2.PolicyReport{}
report.SetName("my-report")
report.SetNamespace("default")

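// Error handling is omitted for brevity; check err after each call.
err := repo.Create(ctx, report)
// report and err are already declared above, so assign with = rather than :=
report, err = repo.Get(ctx, storage.NewFilter("my-report", "default"))
reports, err := repo.List(ctx, storage.Filter{Namespace: "default"})
err = repo.Update(ctx, report)
err = repo.Delete(ctx, storage.NewFilter("my-report", "default"))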

🔮 Future Enhancements

  1. Metrics Integration: Add metrics once the functions are implemented
  2. Size Limits: Add configurable memory limits and eviction policies
  3. Persistence Options: Optional snapshot/restore capabilities
  4. Cluster ID Support: Similar to PostgreSQL implementation
  5. Performance Monitoring: Track operation latencies

🔗 Related

  • Original v1 in-memory: pkg/storage/inmemory/
  • V2 interface: pkg/v2/storage/IRepository.go
  • PostgreSQL implementation: pkg/v2/storage/postgres/
