This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Production Deployment Considerations
Purpose and Scope
This document provides technical guidance for deploying the Docker MQTT Mosquitto with Cloudflare Tunnel system in production environments. It covers security hardening, scalability, high availability, resource management, monitoring, and disaster recovery strategies that extend beyond the basic development setup.
For information about the advanced security features in the alternative branch, see Protected Branch Features. For monitoring implementation details, see Monitoring and Health Checks. For secret management during development, see Version Control Best Practices.
The development configuration provided in docker-compose.yml:1-18 is suitable for testing and proof-of-concept deployments, but requires several modifications for production use. This document identifies these gaps and provides implementation guidance.
Security Hardening
Authentication and Authorization
The default configuration in mosquitto.conf:2 sets allow_anonymous true, which permits unauthenticated client connections. This is unacceptable for production deployments.
Production Configuration Requirements:
| Security Control | Development | Production Required |
|---|---|---|
| Anonymous Access | Enabled | Disabled |
| Password Authentication | None | Required |
| ACL (Access Control Lists) | None | Topic-level restrictions |
| TLS Encryption | Cloudflare-managed | End-to-end recommended |
| Connection Limits | Unlimited | Rate-limited |
Implementing Password Authentication
Modify mosquitto.conf:1-6 to include authentication:
# Authentication settings are global by default, so declare them once for all listeners
allow_anonymous false
password_file /mosquitto/config/password_file
listener 1883
listener 9001
protocol websockets
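Create the password file using mosquitto_passwd. For example, running mosquitto_passwd -c /mosquitto/config/password_file admin inside the broker container creates the file and prompts for the admin user's password.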
Update docker-compose.yml:7-8 to mount the password file:
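A minimal sketch of that mount is shown here. The host paths (./mosquitto.conf and ./password_file) and the surrounding service keys are assumptions about the development file, not a copy of it; only the container paths come from the configuration above.

```yaml
services:
  mosquitto:
    image: eclipse-mosquitto:latest
    container_name: mosquitto
    volumes:
      # Broker configuration, mounted read-only (host path is an assumption)
      - ./mosquitto.conf:/mosquitto/config/mosquitto.conf:ro
      # Password file generated with mosquitto_passwd (host path is an assumption)
      - ./password_file:/mosquitto/config/password_file:ro
    restart: unless-stopped
```

Mounting the file read-only keeps the broker from modifying credentials at runtime; regenerate the file on the host and restart the broker to apply changes.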
Implementing Topic-Based ACL
Create an ACL file to restrict topic access per user. The [protected-no-wildcard branch](https://github.com/jzombie/docker-mqtt-mosquitto-cloudflare-tunnel/blob/59f1274c/protected-no-wildcard branch) demonstrates a username-based topic hierarchy where the first topic level represents the username.
Production ACL Pattern (enable it by pointing the acl_file option in mosquitto.conf at this file):
# Admin users - full access
user admin
topic readwrite #
# IoT devices - restricted to device-specific topics
user device001
topic readwrite devices/device001/#
user device002
topic readwrite devices/device002/#
# Read-only monitoring users
user monitor
topic read #
Sources: mosquitto.conf:1-6 README.md:9-10
Secret Management
The development approach using .env files is insufficient for production. The CLOUDFLARE_TUNNEL_TOKEN in docker-compose.yml:17 grants tunnel routing access and must be protected.
Production Secret Management Flow
Kubernetes Secret Management Example:
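One possible shape for this, sketched under assumptions: the Secret name cloudflare-tunnel-token and its key token are hypothetical, and the container receives the value through the TUNNEL_TOKEN environment variable that cloudflared tunnel run reads. In practice the Secret would usually be created out of band (or synced from an external secrets manager) rather than committed as a manifest.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-tunnel-token   # hypothetical name
type: Opaque
stringData:
  token: "<tunnel-token>"         # placeholder; never commit a real token
---
apiVersion: v1
kind: Pod
metadata:
  name: cloudflared
spec:
  containers:
    - name: cloudflared
      image: cloudflare/cloudflared:latest
      args: ["tunnel", "run"]
      env:
        # cloudflared reads the tunnel token from TUNNEL_TOKEN
        - name: TUNNEL_TOKEN
          valueFrom:
            secretKeyRef:
              name: cloudflare-tunnel-token
              key: token
```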
Docker Swarm Secret Management Example:
Sources: docker-compose.yml:14-17 .env.sample:1 .gitignore:1
Network Security
The current configuration in docker-compose.yml:1-18 does not implement network policies. Production deployments should restrict inter-container communication.
graph TB
subgraph "External Network"
CF_EDGE["Cloudflare Edge Network"]
end
subgraph "Docker Host"
subgraph "frontend_network"
CFD["cloudflared\ncontainer_name: cloudflared\nimage: cloudflare/cloudflared:latest"]
end
subgraph "backend_network"
MOSQ["mosquitto\ncontainer_name: mosquitto\nimage: eclipse-mosquitto:latest"]
end
subgraph "Shared Network: tunnel_network"
CFD_BRIDGE["cloudflared bridge"]
MOSQ_BRIDGE["mosquitto bridge"]
end
end
CF_EDGE <-->|outbound tunnel| CFD
CFD_BRIDGE <-->|port 9001| MOSQ_BRIDGE
CFD -.->|connected to tunnel_network| CFD_BRIDGE
MOSQ -.->|connected to tunnel_network| MOSQ_BRIDGE
Docker Network Segmentation
Production docker-compose.yml Network Configuration:
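The following is a sketch rather than the project's actual file. It assumes the tunnel token is passed as the TUNNEL_TOKEN environment variable, that the subnet 172.28.0.0/24 is free on the host, and that the named volumes and host paths shown are acceptable; all of those are illustrative choices. The broker joins only an internal network, while cloudflared also joins a second network so it can open its outbound tunnel.

```yaml
services:
  mosquitto:
    image: eclipse-mosquitto:latest
    container_name: mosquitto
    read_only: true                 # root filesystem is read-only
    security_opt:
      - no-new-privileges:true      # block privilege escalation
    tmpfs:
      - /tmp                        # writable scratch space
    volumes:
      - ./mosquitto.conf:/mosquitto/config/mosquitto.conf:ro
      - mosquitto_data:/mosquitto/data
      - mosquitto_log:/mosquitto/log
    networks:
      tunnel_network:
        ipv4_address: 172.28.0.10   # assumed static address
    restart: unless-stopped

  cloudflared:
    image: cloudflare/cloudflared:latest
    container_name: cloudflared
    command: tunnel run
    environment:
      - TUNNEL_TOKEN=${CLOUDFLARE_TUNNEL_TOKEN}
    read_only: true
    security_opt:
      - no-new-privileges:true
    networks:
      frontend_network: {}          # outbound access to the Cloudflare edge
      tunnel_network:
        ipv4_address: 172.28.0.11   # assumed static address
    restart: unless-stopped

networks:
  frontend_network:
    driver: bridge
  tunnel_network:
    driver: bridge
    internal: true                  # no route to the outside world
    ipam:
      config:
        - subnet: 172.28.0.0/24     # assumed subnet

volumes:
  mosquitto_data:
  mosquitto_log:
```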
Key Security Enhancements:
- read_only: true: Container filesystems are read-only
- security_opt: no-new-privileges: Prevents privilege escalation
- tmpfs: /tmp: Writable temporary directory
- Named volumes: Persistent data storage
- Static IP addresses: Predictable network topology
Sources: docker-compose.yml:1-18
Persistent Storage and Data Management
The development configuration in docker-compose.yml:4-9 does not provision persistent volumes for Mosquitto data. Message retention, subscriptions, and authentication data require persistent storage.
graph LR
subgraph "Host Filesystem"
COMPOSE["docker-compose.yml"]
MOSQ_CONF_FILE["mosquitto.conf"]
end
subgraph "Docker Volumes"
DATA_VOL["mosquitto_data\n(retained messages,\nsubscriptions)"]
LOG_VOL["mosquitto_log\n(broker logs)"]
CONFIG_VOL["config volume\n(mounted configs)"]
end
subgraph "mosquitto Container"
MOSQ_PROC["mosquitto process"]
DATA_MOUNT["/mosquitto/data"]
LOG_MOUNT["/mosquitto/log"]
CONF_MOUNT["/mosquitto/config"]
end
subgraph "Backup Systems"
BACKUP_CRON["cron job"]
BACKUP_STORAGE["Remote Storage\n(S3, NFS, etc.)"]
end
MOSQ_CONF_FILE -->|bind mount| CONFIG_VOL
CONFIG_VOL -->|mounted at| CONF_MOUNT
DATA_VOL -->|mounted at| DATA_MOUNT
LOG_VOL -->|mounted at| LOG_MOUNT
CONF_MOUNT --> MOSQ_PROC
DATA_MOUNT --> MOSQ_PROC
LOG_MOUNT --> MOSQ_PROC
DATA_VOL -.->|backup| BACKUP_CRON
LOG_VOL -.->|backup| BACKUP_CRON
BACKUP_CRON -->|archive| BACKUP_STORAGE
Storage Architecture
Volume Configuration for Production
Modify docker-compose.yml:7-8 to include persistent volumes:
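A minimal sketch of such a change, assuming the volume names mosquitto_data and mosquitto_log from the storage diagram and a bind-mounted ./mosquitto.conf on the host (the host path is an assumption):

```yaml
services:
  mosquitto:
    image: eclipse-mosquitto:latest
    container_name: mosquitto
    volumes:
      # Configuration stays a read-only bind mount
      - ./mosquitto.conf:/mosquitto/config/mosquitto.conf:ro
      # Retained messages and subscriptions survive container recreation
      - mosquitto_data:/mosquitto/data
      # Broker logs
      - mosquitto_log:/mosquitto/log
    restart: unless-stopped

volumes:
  mosquitto_data:
  mosquitto_log:
```

Named volumes only help if mosquitto.conf also enables persistence and points persistence_location at /mosquitto/data/.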
Backup Strategy
Automated Backup Script:
Restoration Procedure:
Sources: docker-compose.yml:7-8
Resource Management and Scaling
Container Resource Limits
The current docker-compose.yml:4-18 does not specify resource constraints. Production deployments must prevent resource exhaustion.
Production Resource Configuration:
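As a sketch, the allocations from the resource diagram (1.0 CPU and 1 GB for mosquitto, 0.5 CPU and 512 MB for cloudflared) could be expressed with the Compose deploy.resources keys. This assumes a Compose implementation that applies deploy.resources outside Swarm mode, and the figures themselves are starting points to be tuned against observed usage.

```yaml
services:
  mosquitto:
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 1G
        reservations:
          memory: 512M

  cloudflared:
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
        reservations:
          memory: 256M
```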
graph TB
subgraph "Host Resources"
CPU["CPU Cores"]
MEMORY["System Memory"]
DISK["Disk I/O"]
end
subgraph "Container Allocations"
MOSQ_CPU["mosquitto\nCPUs: 1.0\nMemory: 1GB\nreservations.memory: 512MB"]
CFD_CPU["cloudflared\nCPUs: 0.5\nMemory: 512MB\nreservations.memory: 256MB"]
end
subgraph "Monitoring Limits"
ALERTS["Resource Alerts\n(>80% usage)"]
METRICS["Prometheus Metrics\ncontainer_memory_usage_bytes\ncontainer_cpu_usage_seconds_total"]
end
CPU -->|allocated| MOSQ_CPU
CPU -->|allocated| CFD_CPU
MEMORY -->|allocated| MOSQ_CPU
MEMORY -->|allocated| CFD_CPU
MOSQ_CPU -.->|export metrics| METRICS
CFD_CPU -.->|export metrics| METRICS
METRICS -->|trigger| ALERTS
Mosquitto Performance Tuning
Extend mosquitto.conf:1-6 with production performance settings:
# Queue limits
max_queued_messages 1000
max_inflight_messages 20
# Persistence settings
persistence true
persistence_location /mosquitto/data/
autosave_interval 300
autosave_on_changes false
# Memory management
max_keepalive 60
message_size_limit 0
# Logging (reduced verbosity for production)
log_dest file /mosquitto/log/mosquitto.log
log_type error
log_type warning
log_timestamp true
# Authentication (global by default, so declared once for all listeners)
allow_anonymous false
password_file /mosquitto/config/password_file
# Listeners (max_connections is a per-listener option, so it follows each listener line)
listener 1883
max_connections 1000
listener 9001
protocol websockets
max_connections 1000
Key Configuration Parameters:
| Parameter | Development | Production | Purpose |
|---|---|---|---|
| max_connections | Unlimited | 1000 | Prevent resource exhaustion (limit applies per listener) |
| persistence | False (default) | True | Retain messages across restarts |
| autosave_interval | N/A | 300 | Write the in-memory database to disk every 5 minutes |
| max_keepalive | 65535 | 60 | Detect dead connections faster |
| log_type | All | error, warning | Reduce log volume |
Sources: mosquitto.conf:1-6 docker-compose.yml:4-9
High Availability and Clustering
Multi-Instance Deployment Architecture
The single-instance architecture in docker-compose.yml:4-9 provides no redundancy. Production systems require high availability.
Cloudflare Load Balancer Configuration:
- Create multiple Cloudflare Tunnels (one per region/instance)
- Configure DNS load balancing in Cloudflare dashboard:
  - Geographic routing: Route based on client location
  - Health checks: Monitor tunnel availability
  - Failover: Automatic failover to healthy tunnels
Multi-Tunnel docker-compose.yml:
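A rough sketch of what running two tunnel connectors side by side might look like. The service names and the variables CLOUDFLARE_TUNNEL_TOKEN_PRIMARY and CLOUDFLARE_TUNNEL_TOKEN_SECONDARY are hypothetical, and the example assumes each token belongs to a separate tunnel created in the Cloudflare dashboard. In a real multi-region deployment each connector would normally run on its own host next to its own broker.

```yaml
services:
  mosquitto:
    image: eclipse-mosquitto:latest
    container_name: mosquitto
    volumes:
      - ./mosquitto.conf:/mosquitto/config/mosquitto.conf:ro
    restart: unless-stopped

  cloudflared-primary:            # hypothetical service name
    image: cloudflare/cloudflared:latest
    command: tunnel run
    environment:
      - TUNNEL_TOKEN=${CLOUDFLARE_TUNNEL_TOKEN_PRIMARY}
    depends_on:
      - mosquitto
    restart: unless-stopped

  cloudflared-secondary:          # hypothetical service name
    image: cloudflare/cloudflared:latest
    command: tunnel run
    environment:
      - TUNNEL_TOKEN=${CLOUDFLARE_TUNNEL_TOKEN_SECONDARY}
    depends_on:
      - mosquitto
    restart: unless-stopped
```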
Mosquitto Bridge Configuration
Configure message synchronization between instances by adding to mosquitto.conf:
# Bridge configuration for message replication
connection bridge-to-replica
address mosquitto-replica:1883
topic # both 0
cleansession false
try_private false
bridge_attempt_unsubscribe true
bridge_protocol_version mqttv311
Sources: docker-compose.yml:4-18 mosquitto.conf:1-6
Container Orchestration
Kubernetes Deployment
For production-scale deployments, replace docker-compose.yml:1-18 with Kubernetes manifests.
Kubernetes Deployment Manifest (mosquitto):
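A sketch of such a manifest, not a tested production configuration. It assumes a ConfigMap named mosquitto-config holding mosquitto.conf and a PersistentVolumeClaim named mosquitto-data already exist; both names are hypothetical. A single replica with the Recreate strategy is used because the broker keeps its state on one volume.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mosquitto
  labels:
    app: mosquitto
spec:
  replicas: 1
  strategy:
    type: Recreate              # avoid two brokers sharing one data volume
  selector:
    matchLabels:
      app: mosquitto
  template:
    metadata:
      labels:
        app: mosquitto
    spec:
      containers:
        - name: mosquitto
          image: eclipse-mosquitto:latest
          ports:
            - name: mqtt
              containerPort: 1883
            - name: websockets
              containerPort: 9001
          resources:
            requests:
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi
          volumeMounts:
            - name: config
              mountPath: /mosquitto/config/mosquitto.conf
              subPath: mosquitto.conf
              readOnly: true
            - name: data
              mountPath: /mosquitto/data
      volumes:
        - name: config
          configMap:
            name: mosquitto-config        # assumed to exist
        - name: data
          persistentVolumeClaim:
            claimName: mosquitto-data     # assumed to exist
---
apiVersion: v1
kind: Service
metadata:
  name: mosquitto
spec:
  selector:
    app: mosquitto
  ports:
    - name: mqtt
      port: 1883
      targetPort: mqtt
    - name: websockets
      port: 9001
      targetPort: websockets
```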
Kubernetes Deployment Manifest (cloudflared):
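A sketch under similar assumptions: the tunnel token is stored in a Secret named cloudflare-tunnel-token under the key token (both hypothetical), and the tunnel's public hostname is routed to the in-cluster broker service in the Cloudflare dashboard. Two replicas of the same tunnel are shown so that one connector can fail without dropping the route.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudflared
  labels:
    app: cloudflared
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cloudflared
  template:
    metadata:
      labels:
        app: cloudflared
    spec:
      containers:
        - name: cloudflared
          image: cloudflare/cloudflared:latest
          args: ["tunnel", "run"]
          env:
            # Token injected from a Secret instead of a .env file
            - name: TUNNEL_TOKEN
              valueFrom:
                secretKeyRef:
                  name: cloudflare-tunnel-token   # assumed to exist
                  key: token
          resources:
            requests:
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
```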
PersistentVolumeClaim:
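A minimal claim for the broker's data directory might look like the following. The name mosquitto-data and the 1Gi size are assumptions, and the cluster's default StorageClass is used because none is named.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mosquitto-data      # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce         # a single broker pod mounts the volume
  resources:
    requests:
      storage: 1Gi          # assumed size; size to retained-message volume
```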
Sources: docker-compose.yml:1-18
Monitoring and Logging
Logging Configuration
The default mosquitto.conf:1-6 does not configure logging. Production deployments require comprehensive logging.
Production Logging Configuration (mosquitto.conf):
graph LR
subgraph "Log Sources"
MOSQ_PROC["mosquitto process"]
CFD_PROC["cloudflared process"]
DOCKER_DAEMON["Docker Daemon"]
end
subgraph "Log Collection"
MOSQ_LOG_VOL["/mosquitto/log/mosquitto.log"]
DOCKER_JSON_LOG["container stdout/stderr\n(JSON driver)"]
end
subgraph "Log Aggregation"
FLUENTD["Fluentd\n(log shipper)"]
FILEBEAT["Filebeat\n(log shipper)"]
end
subgraph "Log Storage & Analysis"
ELK["Elasticsearch\n(log indexing)"]
KIBANA["Kibana\n(visualization)"]
LOKI["Grafana Loki\n(log aggregation)"]
SPLUNK["Splunk\n(SIEM)"]
end
subgraph "Alerting"
ALERT_MGR["AlertManager\n(alert routing)"]
PAGERDUTY["PagerDuty"]
SLACK["Slack"]
end
MOSQ_PROC -->|writes to| MOSQ_LOG_VOL
MOSQ_PROC -->|stdout| DOCKER_JSON_LOG
CFD_PROC -->|stdout| DOCKER_JSON_LOG
DOCKER_DAEMON -->|container logs| DOCKER_JSON_LOG
MOSQ_LOG_VOL -->|tail| FLUENTD
DOCKER_JSON_LOG -->|docker logs| FILEBEAT
FLUENTD --> ELK
FILEBEAT --> ELK
FLUENTD --> LOKI
ELK --> KIBANA
LOKI --> ALERT_MGR
ELK --> ALERT_MGR
ALERT_MGR --> PAGERDUTY
ALERT_MGR --> SLACK
# Comprehensive logging
log_dest file /mosquitto/log/mosquitto.log
log_dest stdout
log_type error
log_type warning
log_type notice
log_type information
log_type subscribe
log_type unsubscribe
log_timestamp true
log_timestamp_format %Y-%m-%dT%H:%M:%S
connection_messages true
Docker Logging Driver Configuration:
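As an illustration, the json-file driver can be capped per service so container logs do not fill the host disk. The rotation values below are assumptions to adjust for the deployment's log volume.

```yaml
services:
  mosquitto:
    logging:
      driver: json-file
      options:
        max-size: "10m"   # rotate after 10 MB
        max-file: "5"     # keep at most five rotated files

  cloudflared:
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "5"
```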
Metrics and Monitoring
Implement Prometheus-compatible metrics exporters:
docker-compose.yml with Monitoring:
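A sketch of the additional services, with several assumptions stated up front: MOSQUITTO_EXPORTER_IMAGE is a hypothetical variable standing in for whichever MQTT-to-Prometheus exporter is chosen (the project does not name one), CADVISOR_VERSION must be set to a released cAdvisor tag, and ./prometheus.yml is an assumed host path. cAdvisor is used here as one common source of the container_* metrics.

```yaml
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus_data:/prometheus
    ports:
      - "127.0.0.1:9090:9090"     # expose the UI on localhost only
    restart: unless-stopped

  cadvisor:
    # Fails fast if no version is pinned
    image: gcr.io/cadvisor/cadvisor:${CADVISOR_VERSION:?pin a released version}
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    restart: unless-stopped

  mosquitto-exporter:
    # Hypothetical placeholder: substitute the exporter image you select
    image: ${MOSQUITTO_EXPORTER_IMAGE:?set an exporter image}
    depends_on:
      - mosquitto
    restart: unless-stopped

volumes:
  prometheus_data:
```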
Prometheus Configuration (prometheus.yml):
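A minimal scrape configuration sketch. The target hostnames assume Compose services named cadvisor and mosquitto-exporter, and port 9234 is an assumed exporter port; substitute the values that match the exporter actually deployed.

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  # Container-level CPU and memory metrics
  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]

  # Broker-level metrics (target name and port are assumptions)
  - job_name: mosquitto
    static_configs:
      - targets: ["mosquitto-exporter:9234"]
```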
Key Metrics to Monitor:
| Metric | Description | Alert Threshold |
|---|---|---|
| mosquitto_connected_clients | Active client connections | > 80% of max_connections |
| mosquitto_messages_received_total | Total messages received | Rate < 1/min (potential downtime) |
| mosquitto_messages_sent_total | Total messages sent | Rate < 1/min (potential issue) |
| mosquitto_retained_messages | Number of retained messages | > 90% of storage capacity |
| container_memory_usage_bytes | Container memory usage | > 80% of limit |
| container_cpu_usage_seconds_total | Container CPU usage | > 80% of limit |
Sources: mosquitto.conf:1-6 docker-compose.yml:1-18
Disaster Recovery and Business Continuity
Backup Automation
Implement automated backup schedules using cron or Kubernetes CronJobs.
Kubernetes CronJob for Backups:
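One way this could look, as a sketch: a daily job (matching the 24-hour RPO in the table below) that archives the broker's data volume into a second volume. The claim names mosquitto-data and mosquitto-backup are hypothetical, and shipping the archive to remote storage is left out. Note that with a ReadWriteOnce data volume the backup pod must be scheduled on the same node as the broker.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mosquitto-backup
spec:
  schedule: "0 2 * * *"           # daily at 02:00
  concurrencyPolicy: Forbid       # never run two backups at once
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: alpine:3
              command:
                - /bin/sh
                - -c
                # Write a timestamped archive of the data directory
                - tar -czf "/backup/mosquitto-$(date +%Y%m%d-%H%M%S).tar.gz" -C /mosquitto/data .
              volumeMounts:
                - name: data
                  mountPath: /mosquitto/data
                  readOnly: true
                - name: backup
                  mountPath: /backup
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: mosquitto-data     # assumed to exist
            - name: backup
              persistentVolumeClaim:
                claimName: mosquitto-backup   # assumed to exist
```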
Recovery Time Objectives
| Failure Scenario | RTO (Recovery Time Objective) | RPO (Recovery Point Objective) | Recovery Procedure |
|---|---|---|---|
| Container crash | < 1 minute | 0 (no data loss) | Automatic restart via restart: unless-stopped |
| Node failure | < 5 minutes | 0 (no data loss) | Kubernetes pod rescheduling |
| Data corruption | < 30 minutes | 24 hours | Restore from latest backup |
| Regional outage | < 15 minutes | 0 (no data loss) | Cloudflare automatic failover |
| Complete disaster | < 2 hours | 24 hours | Deploy new infrastructure, restore backups |
Disaster Recovery Testing
Quarterly DR Test Procedure:
- Simulate Container Failure
- Simulate Data Corruption
- Simulate Network Partition
- Verify Backup Integrity
Sources: docker-compose.yml:9 docker-compose.yml:15
Security Compliance and Hardening
Container Image Security
The default images specified in docker-compose.yml:5 and docker-compose.yml:12 should be validated and scanned for vulnerabilities.
Image Security Checklist:
| Security Control | Implementation |
|---|---|
| Base Image Verification | Use official images with verified publishers |
| Vulnerability Scanning | Run Docker Scout (the successor to the retired docker scan command) or Trivy before deployment |
| Image Signing | Verify Docker Content Trust signatures |
| Version Pinning | Use specific version tags, not latest |
| Minimal Base | Prefer Alpine-based images |
| Read-Only Filesystem | Set read_only: true in docker-compose.yml |
Production Image References:
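A sketch of pinned references. Rather than guessing at release numbers, the example reads them from the variables MOSQUITTO_VERSION and CLOUDFLARED_VERSION (hypothetical names) and makes Compose fail if either is unset; a digest can be appended for stricter pinning.

```yaml
services:
  mosquitto:
    # Compose aborts with this message if the variable is missing
    image: eclipse-mosquitto:${MOSQUITTO_VERSION:?set a pinned mosquitto version}

  cloudflared:
    image: cloudflare/cloudflared:${CLOUDFLARED_VERSION:?set a pinned cloudflared version}
```

Keeping the versions in one place (for example the deployment environment or a checked-in, non-secret env file) makes upgrades an explicit, reviewable change.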
Security Scanning Integration
Add security scanning to CI/CD pipeline:
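As one illustration, assuming the pipeline runs on GitHub Actions (the source does not name a CI system), a workflow could scan both images with the Trivy action and fail on high-severity findings. The workflow name, schedule, and severity threshold are assumptions.

```yaml
name: image-scan

on:
  pull_request:
  schedule:
    - cron: "0 6 * * 1"     # weekly rescan for newly disclosed CVEs

jobs:
  trivy:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        image:
          - eclipse-mosquitto:latest
          - cloudflare/cloudflared:latest
    steps:
      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ matrix.image }}
          severity: CRITICAL,HIGH
          ignore-unfixed: true
          exit-code: "1"    # fail the job when findings remain
```

Once images are pinned to specific versions, the matrix entries should list those same pinned references so the scan covers what is actually deployed.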
Compliance Requirements
GDPR/Privacy Compliance:
- Enable audit logging in mosquitto.conf:1-6
- Implement data retention policies
- Configure message encryption
- Document data flows
SOC 2 Compliance:
- Implement access controls via ACL
- Enable comprehensive logging
- Implement change management procedures
- Document disaster recovery procedures
PCI DSS (if handling payment data):
- Implement network segmentation
- Enable encryption in transit and at rest
- Implement strong authentication
- Regular security assessments
Sources: docker-compose.yml:5 docker-compose.yml:12 mosquitto.conf:1-6
Performance Optimization
Connection Pooling and Load Distribution
The docker-compose.yml:4-9 single-instance architecture does not scale for high-throughput scenarios.
Mosquitto Performance Benchmarks
Expected Performance Metrics:
| Metric | Development (Single Instance) | Production (Clustered) |
|---|---|---|
| Max Concurrent Connections | ~1,000 | ~10,000 |
| Messages/sec (publish) | ~5,000 | ~50,000 |
| Messages/sec (subscribe) | ~10,000 | ~100,000 |
| Latency (p99) | < 100ms | < 50ms |
| Memory per connection | ~2KB | ~2KB |
Performance Testing:
Tuning mosquitto.conf for High Throughput
# High-performance configuration
listener 1883
max_connections 10000
max_queued_messages 10000
max_inflight_messages 100
# Disable persistence for high-throughput, low-retention scenarios
# (Use only if message loss is acceptable)
persistence false
# Cap client keepalive so dead connections are detected sooner
max_keepalive 30
# Limit message payloads to 1 MiB (the default of 0 accepts the MQTT maximum of 268435455 bytes)
message_size_limit 1048576
# Optimize memory usage
memory_limit 2147483648
# Queue settings
max_queued_bytes 0
queue_qos0_messages false
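# Websocket settings
listener 9001
protocol websockets
# websocket_timeout does not appear in the mosquitto.conf man page; confirm your broker version accepts it before enabling
# websocket_timeout 300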
Sources: mosquitto.conf:1-6 docker-compose.yml:4-9
Deployment Checklist
Pre-Production Validation
Infrastructure Readiness:
- Container orchestration platform configured (Kubernetes/Docker Swarm)
- Persistent storage provisioned with backup strategy
- Secrets management system configured
- Monitoring and logging infrastructure deployed
- Disaster recovery procedures documented and tested
Security Configuration:
- Authentication enabled in mosquitto.conf:2
- ACL file configured with topic-level restrictions
- Anonymous access disabled
- TLS certificates provisioned (if not using Cloudflare Tunnel)
- Network policies implemented
- Container security contexts configured (read_only, no-new-privileges)
- Image vulnerability scanning completed
Cloudflare Configuration:
- Multiple tunnels created for high availability
- DNS load balancing configured
- Health checks enabled
- Rate limiting configured
- DDoS protection verified
- WAF rules configured
Performance Tuning:
- Resource limits configured in docker-compose.yml:4-18
- Connection limits set in mosquitto.conf:1-6
- Persistence settings optimized
- Load testing completed
- Performance benchmarks documented
Operational Readiness:
- Backup automation tested
- Restore procedures validated
- Monitoring dashboards created
- Alert rules configured
- On-call runbooks documented
- Incident response procedures defined
Compliance:
- Audit logging enabled
- Data retention policies implemented
- Privacy impact assessment completed
- Security assessment completed
- Compliance documentation prepared
Sources: docker-compose.yml:1-18 mosquitto.conf:1-6 README.md:1-93
Migration from Development to Production
Transition Strategy
Migration Steps
Phase 1: Staging Environment Setup
- Deploy staging environment with production-like configuration
- Migrate from .env to secrets manager
- Enable authentication and ACL
- Configure persistent volumes
- Deploy monitoring stack
- Run load tests
Phase 2: Production Infrastructure
- Provision Kubernetes cluster or production Docker hosts
- Configure external secrets management
- Set up persistent storage with replication
- Deploy monitoring and logging infrastructure
- Configure backup automation
- Implement network policies
Phase 3: Service Migration
- Create multiple Cloudflare Tunnels for production regions
- Deploy Mosquitto cluster with bridge configuration
- Configure load balancing and health checks
- Migrate client connections gradually
- Monitor performance and adjust resources
- Validate disaster recovery procedures
Phase 4: Optimization
- Tune Mosquitto configuration based on production metrics
- Adjust resource limits based on actual usage
- Optimize backup schedules
- Refine alerting thresholds
- Document operational procedures
Sources: docker-compose.yml:1-18 mosquitto.conf:1-6 .env.sample:1
Conclusion
Production deployment of the Docker MQTT Mosquitto with Cloudflare Tunnel system requires significant enhancements beyond the development configuration in docker-compose.yml:1-18 and mosquitto.conf:1-6. Key production requirements include:
- Security: Disable anonymous access, implement authentication and ACL, use secrets managers
- Reliability: Deploy multiple instances, implement health checks, configure automated backups
- Scalability: Use container orchestration (Kubernetes), implement load balancing, tune resource limits
- Observability: Deploy comprehensive logging and monitoring, configure alerting, track SLIs/SLOs
- Operations: Document runbooks, test disaster recovery, implement change management
The transition from development to production should be gradual, with thorough testing in staging environments before production deployment. Regular security assessments, performance tuning, and disaster recovery testing ensure ongoing operational excellence.
Sources: README.md:1-93 docker-compose.yml:1-18 mosquitto.conf:1-6 .env.sample:1 .gitignore:1-2