panurus

Keystore Cleanup Service

The Keystore Cleanup Service provides automatic deletion of cryptographic keys from the keystore for tokens that have been deleted (spent, expired, or invalidated). This ensures that the keystore doesn’t accumulate stale keys indefinitely, improving security and reducing storage overhead.

Overview

The cleanup system consists of four main components:

  1. Manager: Orchestrates the cleanup process with periodic scanning and distributed coordination
  2. SKI Extraction System: Pluggable architecture for deriving Subject Key Identifiers (SKIs) from owner identities
  3. Storage Interface: Provides database operations for querying deleted tokens and tracking cleanup state
  4. Keystore Interface: Abstracts key deletion operations

Architecture

Cleanup Manager

The Manager runs in the background and periodically scans for deleted tokens that are eligible for key cleanup. It uses distributed locking (PostgreSQL advisory locks) to ensure only one replica in a multi-instance deployment performs cleanup at a time.

Key Features:

Lifecycle:

SKI Extraction Architecture

The SKI (Subject Key Identifier) extraction system uses a pluggable provider architecture to support different identity types. This allows identity-type-specific logic to be encapsulated in separate providers.

Components:

  1. TypedSKIProvider Interface: Defines the contract for identity-type-specific SKI extraction
    type TypedSKIProvider interface {
        // GetSKIsFromIdentity derives one or more SKIs from an identity's raw bytes
        GetSKIsFromIdentity(ctx context.Context, identity []byte) ([]string, error)
    }
    
  2. SKIExtractor: Orchestrates SKI extraction by maintaining a registry of providers
    • Maps identity types (e.g., “idemix”, “x509”) to their specific providers
    • Routes extraction requests to the appropriate provider
    • Falls back to default provider for unknown types
    • Thread-safe for concurrent use after initialization
  3. Built-in Providers:
    • IdemixSKIProvider: Extracts SKI from Idemix NymPublicKey
    • IdemixNymSKIProvider: Extracts SKI from Idemix pseudonym identities
    • NoopSKIProvider: Returns empty SKI list (used for X.509)
    • FallbackSKIProvider: Computes SHA256 hash of identity bytes as SKI (default)

Provider Registration:

extractor := NewSKIExtractor()
extractor.RegisterProvider("idemix", idemix.NewSKIProvider())
extractor.RegisterProvider("idemixnym", idemixnym.NewSKIProvider(identityStore))
extractor.RegisterProvider("x509", NewNoopSKIProvider())
// Fallback provider is used for any unregistered types

SKI Derivation Process:

  1. SKIExtractor receives identity bytes and type
  2. Looks up registered provider for that type
  3. If found, delegates to type-specific provider
  4. If not found, uses fallback provider (SHA256 hash)
  5. Returns list of SKI strings in hexadecimal format

Interfaces

Storage Interface

The Storage interface abstracts database operations needed for cleanup:

type Storage interface {
    // AcquireCleanupLeadership obtains distributed lock for leader election
    AcquireCleanupLeadership(ctx context.Context, lockID int64) (Leadership, bool, error)
    
    // GetDeletedTokensPendingSKICleanup queries deleted tokens older than TTL that haven't been cleaned
    GetDeletedTokensPendingSKICleanup(ctx context.Context, olderThan time.Duration, limit int) ([]DeletedToken, error)
    
    // MarkTokenCleaned records successful key cleanup to prevent reprocessing
    MarkTokenCleaned(ctx context.Context, txID string, index uint64, cleanedBy string) error
}

type Leadership interface {
    // Close releases the leadership lock
    Close() error
}

type DeletedToken struct {
    TxID          string
    Index         uint64
    OwnerIdentity []byte
    OwnerType     string
    DeletedAt     time.Time
}

SKI Provider Interface

type SKIProvider interface {
    // GetSKIsFromIdentity derives one or more SKIs from an owner identity
    GetSKIsFromIdentity(ctx context.Context, identity []byte, identityType string) ([]string, error)
}

Keystore Interfaces

type Keystore interface {
    // Delete removes the key with the given identifier
    Delete(id string) error
    // Close closes the keystore
    Close() error
}

type KeystoreProvider interface {
    // Keystore returns the keystore for the given TMS
    Keystore(tmsID token.TMSID) (Keystore, error)
}

Configuration

Cleanup behavior is controlled via configuration (see Configuration):

services:
  storage:
    cleanup:
      enabled: false             # Disabled by default - must be explicitly enabled
      ttl: 24h                   # Minimum age before cleanup
      scanInterval: 1h           # How often to scan
      batchSize: 100            # Max tokens per scan
      workerCount: 1            # Parallel workers (default: 1)
      advisoryLockID: 0x74746b636c65616e  # Lock ID for leader election ("ttkclean" in hex)
      instanceID: "cleanup-1"   # Instance identifier (auto-generated if empty)

Configuration Details:

Configuration Loading:

The configuration is loaded from the TMS configuration using the key services.storage.cleanup. The LoadConfig() function merges provided values with defaults, preserving defaults for any unset values.

Usage

Creating a Cleanup Manager

import (
    "github.com/LFDT-Panurus/panurus/token"
    "github.com/LFDT-Panurus/panurus/token/services/storage/services/cleanup"
)

// Create configuration
config := cleanup.Config{
    Enabled:         true,
    TTL:             24 * time.Hour,
    ScanInterval:    1 * time.Hour,
    BatchSize:       100,
    WorkerCount:     4,
    AdvisoryLockID:  0x74746b636c65616e,
    InstanceID:      "cleanup-instance-1",
}

// Create TMS ID
tmsID := token.TMSID{
    Network:   "fabric",
    Channel:   "mychannel",
    Namespace: "token-chaincode",
}

// Create manager
manager := cleanup.NewManager(
    logger,              // logging.Logger
    storage,             // Storage interface implementation
    skiProvider,         // SKIProvider interface implementation
    keystoreProvider,    // KeystoreProvider interface implementation
    tmsID,               // Token Management System ID
    config,              // Configuration
)

// Start cleanup (runs initial sweep immediately)
if err := manager.Start(); err != nil {
    return err
}

// Stop cleanup gracefully
defer manager.Stop()

Registering Custom SKI Providers

To add support for a custom identity type:

// Implement TypedSKIProvider interface
type CustomSKIProvider struct {
    // your fields
}

func (p *CustomSKIProvider) GetSKIsFromIdentity(ctx context.Context, identity []byte) ([]string, error) {
    // Parse identity and extract SKIs
    // Return SKIs as hexadecimal strings
    return []string{"ski1", "ski2"}, nil
}

// Register with extractor
extractor := cleanup.NewSKIExtractor()
extractor.RegisterProvider("custom-type", &CustomSKIProvider{})

Integration with TMS

The cleanup service is typically integrated automatically through the service manager:

import (
    "github.com/LFDT-Panurus/panurus/token/services/storage/services/cleanup"
)

// Create service manager (manages cleanup instances per TMS)
cleanupManager := cleanup.NewServiceManager(
    configuration,           // Configuration interface
    identityStorageProvider, // Identity storage provider
    tokensProvider,          // Tokens service manager
)

// Service manager automatically:
// - Creates cleanup manager per TMS
// - Registers built-in SKI providers (idemix, idemixnym, x509)
// - Sets up storage and keystore adapters
// - Starts the cleanup manager

Cleanup Process Flow

  1. Startup: Manager starts and runs initial sweep immediately
  2. Leadership Acquisition: Manager attempts to acquire PostgreSQL advisory lock
  3. Query Eligible Tokens: If leadership acquired, query deleted tokens older than TTL that haven’t been cleaned yet
  4. Create Worker Pool: Spawn configured number of workers for this sweep
  5. Distribute Work: Fan out tokens to worker pool via channel
  6. Per Token Processing:
    • Get keystore for TMS
    • Derive SKIs from owner identity using appropriate provider
    • Delete each SKI from keystore
    • Mark token as cleaned in database (even on partial success)
  7. Release Leadership: Close leadership lock
  8. Wait: Sleep until next scan interval
  9. Repeat: Go to step 2

Timing Details:

Error Handling

The cleanup service handles errors gracefully with specific retry behavior:

Key Deletion Errors

Rationale

This error handling strategy balances reliability with forward progress:

Other Errors

Token Lifecycle States

A token transitions through the following states related to cleanup:

The cleanup service only processes tokens in the “Eligible for Cleanup” state.

Database Schema

The cleanup service uses a dedicated tracking table to record cleanup operations:

Token SKI Cleanups Table:

CREATE TABLE IF NOT EXISTS token_ski_cleanups (
    tx_id TEXT NOT NULL,
    idx INT NOT NULL,
    cleaned_at TIMESTAMP NOT NULL,
    cleaned_by TEXT NOT NULL,
    PRIMARY KEY (tx_id, idx),
    FOREIGN KEY (tx_id, idx) REFERENCES tokens
);
CREATE INDEX IF NOT EXISTS idx_cleaned_at_token_ski_cleanups ON token_ski_cleanups ( cleaned_at );

This table tracks when each token’s cryptographic keys were cleaned from the keystore, preventing reprocessing. The cleaned_by field records which cleanup instance performed the operation, useful for debugging in multi-instance deployments.

The token_ski_cleanups table is automatically created by the schema initialization and does not require manual database alterations.

Distributed Deployment

PostgreSQL

SQLite

Configuration Guidelines

Default Values

Tuning Recommendations

  1. For High-Volume Environments:
    • Increase batchSize to 200-500 for more tokens per sweep
    • Increase workerCount to 4-16 for faster parallel processing
    • Decrease scanInterval to 30m for more frequent cleanup
  2. For Resource-Constrained Systems:
    • Keep workerCount at 1 to minimize CPU usage
    • Increase scanInterval to 2-4h to reduce database load
    • Keep default batchSize to limit memory usage
  3. For Security-Sensitive Deployments:
    • Decrease ttl to 12h for faster key removal
    • Decrease scanInterval to 30m for more frequent cleanup
    • Monitor cleanup metrics to ensure timely processing
  4. For Multi-Instance Deployments:
    • PostgreSQL Required: Multi-instance deployments require PostgreSQL for distributed coordination
    • Keep default advisoryLockID unless running multiple independent cleanup systems
    • Consider setting explicit instanceID values for easier debugging and monitoring

Performance Considerations

Monitoring

Key metrics to monitor:

Log Levels:

Security Considerations

Comparison with Recovery Service

Feature Recovery Service Cleanup Service
Purpose Re-register finality listeners Delete stale cryptographic keys
Frequency Every 5 seconds Every 1 hour
TTL 30 seconds 24 hours
Target Pending transactions Deleted tokens
Urgency High (affects finality) Low (housekeeping)
Batch Size 100 100
Workers 4 1 (default)
Initial Sweep Immediate Immediate
Leadership Per sweep Per sweep

See Also