Rapid spam filtering system https://rspamd.com/

Latest commit: 5c415d3338 by Vsevolod Stakhov, 4 months ago. [Fix] Add explicit console logging configuration for Docker container. Add logging.inc to ensure rspamd logs are properly captured by Docker when running in foreground mode.

Rspamd Integration and Load Testing

Comprehensive integration and load testing for Rspamd using Docker Compose.

Description

This test creates a complete Rspamd environment with:

Scanner workers for processing emails (with encryption)
Controller worker for management
Proxy worker for proxying requests (with encryption)
Fuzzy storage with encryption
Redis for data storage
Bayes classifier

The test performs the following steps:

Downloads an email corpus from a given URL (or uses local test emails)
Trains Fuzzy storage on 10% of emails
Trains Bayes classifier on 10% of emails (spam and ham)
Scans the entire corpus
Validates that detection works correctly (~10% detection rate)
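
The training split and the final check come down to simple arithmetic. The following is a minimal Python sketch, assuming that a seeded shuffle picks the training subset and that the detection rate is the fraction of scanned messages that were flagged. `split_corpus` and `detection_rate` are hypothetical names and are not taken from the test script:

```python
import random
from typing import List, Tuple


def split_corpus(files: List[str], train_ratio: float = 0.1,
                 seed: int = 0) -> Tuple[List[str], List[str]]:
    """Return (training subset, full list to scan)."""
    ordered = sorted(files)                # stable starting order
    random.Random(seed).shuffle(ordered)   # reproducible shuffle
    n_train = int(len(ordered) * train_ratio)
    return ordered[:n_train], ordered


def detection_rate(detected: int, total: int) -> float:
    """Fraction of scanned messages that were flagged."""
    return detected / total if total else 0.0


# Hypothetical corpus of 1000 file names.
corpus = [f"msg{i:04d}.eml" for i in range(1000)]
train, scan = split_corpus(corpus)
print(len(train), len(scan))
```

If only the trained messages are expected to match, the detection rate over the whole corpus should land near the training ratio, which is where the ~10% figure comes from.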

Requirements

Docker and Docker Compose
Python 3.8+
rspamadm (for key generation)

Features

This test uses AddressSanitizer (ASan) to detect:

Memory leaks
Buffer overflows
Use-after-free errors
Other memory issues

Docker image: rspamd/rspamd:asan-latest

Quick Start

1. Generate encryption keys

cd test/integration
./scripts/generate-keys.sh

2. Start environment

docker compose up -d

3. Check readiness

docker compose ps
docker compose logs rspamd
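
Instead of checking by hand, readiness can also be polled. The sketch below is one possible approach, assuming the controller answers `/ping` on port 50002 as listed under Ports. `wait_until_ready` and `ping_controller` are hypothetical helpers, not part of the test suite:

```python
import time
import urllib.request
from typing import Callable


def ping_controller(url: str = "http://localhost:50002/ping",
                    timeout: float = 2.0) -> bool:
    """Return True if the controller answers the ping with HTTP 200."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status == 200


def wait_until_ready(probe: Callable[[], bool], attempts: int = 30,
                     delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        try:
            if probe():
                return True
        except OSError:
            pass  # connection refused or timed out: not up yet
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Usage (needs the environment from step 2 to be running):
#   ready = wait_until_ready(ping_controller)
```

The probe is passed in as a function so the retry loop can be exercised without a running container.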

4. Run test

# With local corpus (uses test/functional/messages by default)
./scripts/integration-test.sh

# With rspamd-test-corpus (recommended)
# Download and extract the corpus
curl -L https://github.com/rspamd/rspamd-test-corpus/releases/latest/download/rspamd-test-corpus.zip -o corpus.zip
unzip corpus.zip
export CORPUS_DIR="$(pwd)/corpus/corpus"
./scripts/integration-test.sh

# With local directory
export CORPUS_DIR=/path/to/emails
./scripts/integration-test.sh

5. Check for memory leaks

make check-asan

This target analyzes the AddressSanitizer logs and reports any detected memory leaks.

6. Stop

docker compose down

Test Parameters

The test script uses environment variables for configuration:

# Configuration via environment variables
export RSPAMD_HOST=localhost        # Rspamd host
export CONTROLLER_PORT=50002        # Controller port
export PROXY_PORT=50004             # Proxy port
export PASSWORD=q1                  # API password
export PARALLEL=10                  # Parallel requests
export TRAIN_RATIO=0.1              # Training ratio (10%)
export TEST_PROXY=true              # Test via proxy worker
export CORPUS_DIR=/path/to/emails   # Corpus directory

./scripts/integration-test.sh
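
For tooling that wraps the test script, the same variables can be read in Python. This is a sketch only: it assumes the values shown above serve as fallbacks, and it assumes `TEST_PROXY` is off unless set, which the README does not state. `IntegrationConfig` and `load_config` are hypothetical names:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class IntegrationConfig:
    rspamd_host: str
    controller_port: int
    proxy_port: int
    password: str
    parallel: int
    train_ratio: float
    test_proxy: bool
    corpus_dir: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> IntegrationConfig:
    """Build the configuration from environment variables."""
    return IntegrationConfig(
        rspamd_host=env.get("RSPAMD_HOST", "localhost"),
        controller_port=int(env.get("CONTROLLER_PORT", "50002")),
        proxy_port=int(env.get("PROXY_PORT", "50004")),
        password=env.get("PASSWORD", "q1"),
        parallel=int(env.get("PARALLEL", "10")),
        train_ratio=float(env.get("TRAIN_RATIO", "0.1")),
        # Assumption: proxy testing is opt-in.
        test_proxy=env.get("TEST_PROXY", "false").lower() == "true",
        corpus_dir=env.get("CORPUS_DIR"),  # None when unset
    )


print(load_config({"PARALLEL": "4"}))
```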

Project Structure

test/integration/
├── docker-compose.yml          # Docker Compose configuration
├── configs/                    # Rspamd configurations
│   ├── worker-normal.inc      # Scanner worker
│   ├── worker-controller.inc  # Controller worker
│   ├── worker-proxy.inc       # Proxy worker
│   ├── worker-fuzzy.inc       # Fuzzy storage worker
│   ├── fuzzy_check.conf       # fuzzy_check module
│   ├── redis.conf             # Redis settings
│   ├── statistic.conf         # Bayes classifier
│   ├── lsan.supp              # LeakSanitizer suppressions
│   └── fuzzy-keys.conf        # Encryption keys (generated)
├── scripts/
│   ├── generate-keys.sh       # Key generation
│   ├── integration-test.sh    # Test script
│   └── check-asan-logs.sh     # ASan log checker
├── data/                       # Data (corpus, results)
└── README.md

Configuration

Ports

50001 - Normal worker (scanning)
50002 - Controller (API)
50003 - Fuzzy storage
50004 - Proxy worker

Environment Variables

In docker-compose.yml you can configure:

REDIS_ADDR - Redis address
REDIS_PORT - Redis port
ASAN_OPTIONS - AddressSanitizer options
LSAN_OPTIONS - LeakSanitizer options

Encryption

Fuzzy storage uses encryption. Running scripts/generate-keys.sh generates the keys and writes them to configs/fuzzy-keys.conf.

Results

Results are saved in data/results.json in the following format:

[
  {
    "file": "message1.eml",
    "score": 5.2,
    "symbols": {
      "FUZZY_SPAM": 2.5,
      "BAYES_SPAM": 3.0
    }
  },
  ...
]
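
A results file in this format is easy to post-process. The sketch below counts how often each symbol fired and finds the highest score. It assumes only the `file`, `score`, and `symbols` keys shown above; the second sample entry and the `summarize` helper are made up for illustration:

```python
import json
from collections import Counter
from typing import Any, Dict, List

# Sample data in the documented format (message2.eml is hypothetical).
SAMPLE = """
[
  {"file": "message1.eml", "score": 5.2,
   "symbols": {"FUZZY_SPAM": 2.5, "BAYES_SPAM": 3.0}},
  {"file": "message2.eml", "score": 0.1, "symbols": {}}
]
"""


def summarize(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Count symbol occurrences and report the highest score."""
    symbol_counts: Counter = Counter()
    for entry in results:
        symbol_counts.update(entry.get("symbols", {}).keys())
    scores = [entry["score"] for entry in results]
    return {
        "total": len(results),
        "max_score": max(scores) if scores else None,
        "symbol_counts": dict(symbol_counts),
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```

To run it against a real file, replace `json.loads(SAMPLE)` with `json.load(open(path))` for the results path the test run produced.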

Debugging

Check logs

# All logs
docker compose logs

# Only Rspamd
docker compose logs rspamd

# Follow logs
docker compose logs -f rspamd

Connect to container

docker compose exec rspamd /bin/sh

Check Rspamd operation

# Ping (Controller)
curl http://localhost:50002/ping

# Ping (Proxy)
curl http://localhost:50004/ping

# Statistics
curl -H "Password: q1" http://localhost:50002/stat

# Scan test email (via Controller)
curl -H "Password: q1" --data-binary @test.eml http://localhost:50002/checkv2

# Scan via Proxy
curl -H "Password: q1" --data-binary @test.eml http://localhost:50004/checkv2
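
The scan requests above can also be built with the Python standard library. This sketch mirrors the curl calls (same `/checkv2` path, `Password` header, and ports). `build_scan_request` is a hypothetical helper, and the sketch only constructs the request; sending it is left to the caller:

```python
import urllib.request


def build_scan_request(message: bytes, host: str = "localhost",
                       port: int = 50002,
                       password: str = "q1") -> urllib.request.Request:
    """Build a POST to /checkv2 with the raw message as the body."""
    return urllib.request.Request(
        url=f"http://{host}:{port}/checkv2",
        data=message,
        headers={"Password": password},
        method="POST",
    )


# Hypothetical minimal message; use port 50004 to go through the proxy.
req = build_scan_request(b"Subject: test\r\n\r\nhello\r\n", port=50004)
print(req.get_method(), req.full_url)

# To send it (needs a running environment):
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       body = resp.read()
```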

Check Fuzzy storage

# Fuzzy statistics
curl -H "Password: q1" http://localhost:50002/fuzzystats

Test via Proxy

# Run test with proxy check
export TEST_PROXY=true
./scripts/integration-test.sh

# Results will be saved in:
# - data/scan_results.json (via controller)
# - data/proxy_results.json (via proxy)

Email Corpus

Using rspamd-test-corpus

The recommended way to run integration tests is with the rspamd-test-corpus repository.

This corpus contains:

~1000 base messages from the SpamAssassin public corpus
Regression tests from real bug reports
Messages that exercise edge cases

Download the latest release:

curl -L https://github.com/rspamd/rspamd-test-corpus/releases/latest/download/rspamd-test-corpus.zip -o corpus.zip
unzip corpus.zip
export CORPUS_DIR="$(pwd)/corpus/corpus"

Using Local Messages

By default, the test uses the test/functional/messages directory. However, these messages are often too small or synthetic for realistic testing.

Adding Regression Tests

If you find a problematic email that causes a bug:

Report the issue on GitHub
Add the email to rspamd-test-corpus
The corpus will be automatically used in CI tests

CI/CD

See .github/workflows/integration-test.yml for automated runs in GitHub Actions.

The CI automatically downloads the latest corpus from the rspamd-test-corpus repository.

AddressSanitizer

View ASan logs

# Logs are saved in data/asan.log*
cat data/asan.log*

# Automatic check
make check-asan
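
For a quick look at the logs without the Makefile, leak summaries can be pulled out with a regular expression. This is a sketch, not the logic of check-asan-logs.sh: it assumes the usual `SUMMARY: ... N byte(s) leaked in M allocation(s)` line that the sanitizer prints after a leak report, and the sample log text is made up:

```python
import re
from pathlib import Path
from typing import Dict

# Summary line printed at the end of a leak report.
LEAK_SUMMARY = re.compile(
    r"SUMMARY: (?:AddressSanitizer|LeakSanitizer): "
    r"(\d+) byte\(s\) leaked in (\d+) allocation\(s\)"
)


def summarize_leaks(log_text: str) -> Dict[str, int]:
    """Count leak reports and total leaked bytes and allocations."""
    totals = {"reports": 0, "bytes": 0, "allocations": 0}
    for match in LEAK_SUMMARY.finditer(log_text):
        totals["reports"] += 1
        totals["bytes"] += int(match.group(1))
        totals["allocations"] += int(match.group(2))
    return totals


def summarize_leak_logs(directory: str = "data") -> Dict[str, int]:
    """Apply summarize_leaks to every asan.log* file in a directory."""
    text = "\n".join(
        path.read_text(errors="replace")
        for path in sorted(Path(directory).glob("asan.log*"))
    )
    return summarize_leaks(text)


# Hypothetical log excerpt, for illustration only.
SAMPLE_LOG = """\
==42==ERROR: LeakSanitizer: detected memory leaks
SUMMARY: AddressSanitizer: 48 byte(s) leaked in 1 allocation(s).
==43==ERROR: LeakSanitizer: detected memory leaks
SUMMARY: AddressSanitizer: 16 byte(s) leaked in 2 allocation(s).
"""
print(summarize_leaks(SAMPLE_LOG))
```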

ASan Configuration

In docker-compose.yml the following options are configured:

ASAN_OPTIONS=detect_leaks=1:halt_on_error=0:abort_on_error=0:print_stats=1:log_path=/data/asan.log

detect_leaks=1 - detect memory leaks
halt_on_error=0 - don't stop on first error
abort_on_error=0 - don't call abort()
print_stats=1 - print statistics
log_path=/data/asan.log - log file path
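
The option string is a colon-separated list of `key=value` pairs. A small parser makes that structure explicit. This sketch handles only the form used above, and `parse_sanitizer_options` is a hypothetical helper:

```python
from typing import Dict


def parse_sanitizer_options(value: str) -> Dict[str, str]:
    """Split 'a=1:b=2' into {'a': '1', 'b': '2'}."""
    options: Dict[str, str] = {}
    for item in value.split(":"):
        if not item:
            continue  # tolerate a leading or trailing colon
        key, sep, val = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        options[key] = val
    return options


ASAN_OPTIONS = (
    "detect_leaks=1:halt_on_error=0:abort_on_error=0:"
    "print_stats=1:log_path=/data/asan.log"
)
opts = parse_sanitizer_options(ASAN_OPTIONS)
print(opts["log_path"])
```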

Suppress False Positives

Edit configs/lsan.supp:

leak:function_name_to_suppress

Troubleshooting

Rspamd doesn't start

Check that keys are generated: ls configs/fuzzy-keys.conf
Check logs: docker compose logs rspamd
Check ASan logs: cat data/asan.log*

Redis unavailable

docker compose exec redis redis-cli ping

Low detection rate

Increase corpus size
Verify training completed successfully
Check Rspamd logs

Performance

For load testing you can:

Increase the number of scanner workers in configs/worker-normal.inc
Increase corpus size
Run multiple parallel test instances
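
On the client side, load is driven by how many requests are in flight at once (the `PARALLEL` setting under Test Parameters). The sketch below illustrates that idea with a thread pool. It is an assumption about how such a driver could look, not the test script's implementation, and `scan_all` and `fake_scan` are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, Iterable, List


def scan_all(files: Iterable[str],
             scan_one: Callable[[str], Dict[str, Any]],
             parallel: int = 10) -> List[Dict[str, Any]]:
    """Scan files with up to `parallel` concurrent requests.

    Results come back in the same order as the input files.
    """
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(scan_one, files))


def fake_scan(path: str) -> Dict[str, Any]:
    """Stand-in for a real HTTP scan so the sketch runs offline."""
    return {"file": path, "score": 0.0, "symbols": {}}


results = scan_all([f"msg{i}.eml" for i in range(25)], fake_scan, parallel=10)
print(len(results))
```

Passing the scan function in keeps the concurrency logic separate from the HTTP call, so the pool size can be tuned independently.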