Safety Guarantees

Sluice runs inside your Celery workers. A bug in our SDK could take down your payment queue, your email pipeline, or your ML inference jobs. We take that responsibility seriously.

The production-cannot-fail philosophy

The SDK operates under five non-negotiable safety rules:

1. Never crash the worker

All runtime code is wrapped in try/except. Network failures, API errors, serialization issues, and unexpected exceptions are caught and logged — they never propagate to your Celery worker.

# What you'll see in logs if something goes wrong:
[sluice] Failed to set up Celery integration. Monitoring will not be active.
         See: https://docs.sluice.sh/troubleshooting/setup

The only exception the SDK raises is SluiceConfigError, and that happens at startup during init() — before your worker starts processing tasks. If you misconfigure the API key or connection ID, you’ll know immediately.
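The catch-and-log rule above can be illustrated with a small decorator. This is a minimal sketch of the general pattern, not the SDK's actual code: `never_raise` and `forward_event` are hypothetical names introduced here, and the log message is made up for the example.

```python
import functools
import logging

logger = logging.getLogger("sluice")


def never_raise(func):
    """Run func, logging any exception instead of letting it propagate."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Log with traceback and swallow the error, so the caller
            # (the Celery worker) never sees it.
            logger.exception("[sluice] %s failed; continuing", func.__name__)
            return None

    return wrapper


@never_raise
def forward_event(event):
    # Hypothetical runtime hook; the raise stands in for a network failure.
    raise ConnectionError("Sluice API unreachable")


# The call returns None instead of raising into the worker.
result = forward_event({"task": "send_email"})
```

Catching `Exception` rather than using a bare `except` leaves `KeyboardInterrupt` and `SystemExit` alone, so worker shutdown signals still behave normally.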

2. Never slow your tasks

The SDK captures events asynchronously. Event forwarding happens in a background thread and does not block task execution, so task latency does not depend on how quickly the Sluice API responds.
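To show what background forwarding looks like in general, here is a minimal sketch built on the standard library. It is an assumption about the shape of the design, not the SDK's implementation: `capture`, `forwarder`, and the `sent` list (which stands in for the remote API) are hypothetical.

```python
import queue
import threading

events = queue.Queue()  # unbounded here for brevity
sent = []               # stands in for the remote API in this sketch


def forwarder():
    """Drain the queue on a background thread until a None sentinel arrives."""
    while True:
        event = events.get()
        if event is None:
            break
        sent.append(event)  # a real forwarder would make a network call here


def capture(event):
    """Called on the task's thread: enqueue the event and return immediately."""
    events.put_nowait(event)


worker = threading.Thread(target=forwarder, daemon=True)
worker.start()

capture({"task": "send_email", "state": "SUCCESS"})
capture({"task": "charge_card", "state": "FAILURE"})

events.put(None)        # ask the forwarder to stop
worker.join(timeout=5)
```

The task thread only pays for a queue insert; any slow I/O happens on the forwarder thread.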

3. Never leak memory

The internal event buffer is bounded. If the Sluice API is unreachable and the buffer fills up, the SDK drops the oldest events rather than growing unboundedly. Your worker’s memory footprint stays stable.
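The drop-oldest behavior can be sketched with `collections.deque`, which discards items from the opposite end once `maxlen` is reached. The `EventBuffer` class and its `dropped` counter are hypothetical names for illustration; the SDK's real buffer and its size are not documented here.

```python
from collections import deque


class EventBuffer:
    """Bounded buffer that discards the oldest event when full."""

    def __init__(self, max_events):
        self._events = deque(maxlen=max_events)
        self.dropped = 0

    def add(self, event):
        # When the deque is full, append() evicts the oldest entry.
        if len(self._events) == self._events.maxlen:
            self.dropped += 1
        self._events.append(event)

    def drain(self):
        """Return all buffered events, oldest first, and empty the buffer."""
        items = list(self._events)
        self._events.clear()
        return items


buffer = EventBuffer(max_events=3)
for i in range(5):
    buffer.add({"id": i})

remaining = [event["id"] for event in buffer.drain()]  # [2, 3, 4]
```

Memory use is capped at `max_events` entries no matter how long the API stays unreachable.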

4. Never block the event loop

For Celery workers using gevent or eventlet concurrency, the SDK avoids blocking I/O in the event loop. Network calls use non-blocking HTTP transport.

5. Never log sensitive data

Task arguments and return values are not sent to Sluice by default. The event stream captures task metadata — name, state, queue, timestamps, errors — but not the actual data your tasks process.

Sluice V0 does not support opt-in argument/result capture. This is planned for a future release with per-task redaction controls.
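One common way to enforce a metadata-only policy is an allowlist: copy only known-safe fields and ignore everything else. The sketch below assumes hypothetical field names (`name`, `state`, `queue`, `timestamp`, `error`) and a hypothetical `to_metadata_event` helper; the SDK's actual event schema may differ.

```python
# Hypothetical allowlist of metadata fields that are safe to forward.
ALLOWED_FIELDS = ("name", "state", "queue", "timestamp", "error")


def to_metadata_event(raw):
    """Copy only allowlisted fields; args, kwargs, and results are never copied."""
    return {key: raw[key] for key in ALLOWED_FIELDS if key in raw}


raw = {
    "name": "charge_card",
    "state": "FAILURE",
    "queue": "payments",
    "timestamp": 1700000000.0,
    "error": "TimeoutError",
    "args": ("customer-123",),     # task input: must not leave the worker
    "kwargs": {"amount": 4999},    # task input: must not leave the worker
    "result": None,
}

event = to_metadata_event(raw)
```

An allowlist fails closed: a new field added to the raw event stays out of the outgoing payload until someone explicitly permits it.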

What happens when things go wrong

Scenario	SDK behavior
Sluice API is unreachable	Events are buffered in memory and retried with backoff. If the buffer fills, the oldest events are dropped. The worker is unaffected.
API key is invalid	`SluiceConfigError` at startup. Worker starts normally, monitoring is not active.
Network timeout	Retry with exponential backoff. No impact on task processing.
Invalid event data	Event is skipped and logged. Other events continue normally.
`init()` called twice	Warning logged, second call is ignored.
Celery upgrade changes event format	SDK gracefully handles unknown fields via the `extensions` map.
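The retry-with-exponential-backoff rows in the table can be sketched as follows. This is an illustration under stated assumptions, not the SDK's retry policy: the base delay, factor, cap, and attempt count are made-up values, and `backoff_delays`, `send_with_retry`, and `flaky_send` are hypothetical names.

```python
import time


def backoff_delays(base=0.5, factor=2.0, cap=30.0, attempts=6):
    """Yield exponentially growing delays, each limited to cap seconds."""
    for attempt in range(attempts):
        yield min(cap, base * factor ** attempt)


def send_with_retry(send, event, delays, sleep=time.sleep):
    """Try send(event) once per delay; wait after each failure. Never raises OSError."""
    for delay in delays:
        try:
            send(event)
            return True
        except OSError:  # covers ConnectionError and TimeoutError
            sleep(delay)
    return False


calls = []


def flaky_send(event):
    """Hypothetical transport that fails twice, then succeeds."""
    calls.append(event)
    if len(calls) < 3:
        raise TimeoutError("simulated network timeout")


waits = []  # record the delays instead of actually sleeping
ok = send_with_retry(flaky_send, {"task": "send_email"}, backoff_delays(), sleep=waits.append)
```

In a real forwarder this loop would run on the background thread, so the waits never touch task processing. Adding random jitter to each delay is a common refinement when many workers retry at once.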

Graceful degradation

If the SDK encounters an unrecoverable error during setup, it logs the error and disables itself. Your Celery worker continues running exactly as it would without Sluice installed — there is no scenario where having the SDK installed causes your worker to fail to start or process tasks.
