This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Production Deployment Considerations

Purpose and Scope

This document provides technical guidance for deploying the Docker MQTT Mosquitto with Cloudflare Tunnel system in production environments. It covers security hardening, scalability, high availability, resource management, monitoring, and disaster recovery strategies that extend beyond the basic development setup.

For information about the advanced security features in the alternative branch, see Protected Branch Features. For monitoring implementation details, see Monitoring and Health Checks. For secret management during development, see Version Control Best Practices.

The development configuration provided in docker-compose.yml:1-18 is suitable for testing and proof-of-concept deployments, but requires several modifications for production use. This document identifies these gaps and provides implementation guidance.


Security Hardening

Authentication and Authorization

The default configuration in mosquitto.conf:2 sets allow_anonymous true, which permits unauthenticated client connections. This is unacceptable for production deployments.

Production Configuration Requirements:

| Security Control | Development | Production Required |
| --- | --- | --- |
| Anonymous Access | Enabled | Disabled |
| Password Authentication | None | Required |
| ACL (Access Control Lists) | None | Topic-level restrictions |
| TLS Encryption | Cloudflare-managed | End-to-end recommended |
| Connection Limits | Unlimited | Rate-limited |

Implementing Password Authentication

Modify mosquitto.conf:1-6 to include authentication:

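# Authentication is global while per_listener_settings is false (the default),
# so these options are declared once; repeating password_file under each
# listener is rejected as a duplicate value.
allow_anonymous false
password_file /mosquitto/config/password_file

listener 1883

listener 9001
protocol websockets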

Create the password file using mosquitto_passwd:

Update docker-compose.yml:7-8 to mount the password file:
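A minimal sketch of such a mount, assuming the password file (and an optional ACL file) live in a local ./config directory. The host paths and the acl_file name are illustrative assumptions, not taken from the repository.

```yaml
services:
  mosquitto:
    image: eclipse-mosquitto:latest
    container_name: mosquitto
    volumes:
      # Broker configuration (host path is an assumption)
      - ./mosquitto.conf:/mosquitto/config/mosquitto.conf:ro
      # Password file generated with mosquitto_passwd
      - ./config/password_file:/mosquitto/config/password_file:ro
      # Optional ACL file (hypothetical file name)
      - ./config/acl_file:/mosquitto/config/acl_file:ro
```

The container paths must match the password_file value in mosquitto.conf. An ACL file is only read if mosquitto.conf also names it with the acl_file option.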

Implementing Topic-Based ACL

Create an ACL file to restrict topic access per user. The [protected-no-wildcard branch](https://github.com/jzombie/docker-mqtt-mosquitto-cloudflare-tunnel/blob/59f1274c/protected-no-wildcard branch) demonstrates a username-based topic hierarchy where the first topic level represents the username.

Production ACL Pattern:

# Admin users - full access
user admin
topic readwrite #

# IoT devices - restricted to device-specific topics
user device001
topic readwrite devices/device001/#

user device002
topic readwrite devices/device002/#

# Read-only monitoring users
user monitor
topic read #

Sources: mosquitto.conf:1-6 README.md:9-10


Secret Management

The development approach using .env files is insufficient for production. The CLOUDFLARE_TUNNEL_TOKEN in docker-compose.yml:17 grants tunnel routing access and must be protected.

Production Secret Management Flow

Kubernetes Secret Management Example:
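One possible shape for this, sketched under assumptions: the Secret name cloudflared-token is hypothetical, the token value is a placeholder, and the Deployment relies on cloudflared reading its token from the TUNNEL_TOKEN environment variable instead of a --token argument.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cloudflared-token          # hypothetical name
type: Opaque
stringData:
  # Placeholder only; create the real Secret out of band and never commit it
  CLOUDFLARE_TUNNEL_TOKEN: "<your-tunnel-token>"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudflared
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cloudflared
  template:
    metadata:
      labels:
        app: cloudflared
    spec:
      containers:
        - name: cloudflared
          image: cloudflare/cloudflared:latest
          args: ["tunnel", "--no-autoupdate", "run"]
          env:
            # Token is injected from the Secret, not from the manifest
            - name: TUNNEL_TOKEN
              valueFrom:
                secretKeyRef:
                  name: cloudflared-token
                  key: CLOUDFLARE_TUNNEL_TOKEN
```

Keeping the token in an environment variable sourced from a Secret keeps it out of the pod's command line and out of version control.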

Docker Swarm Secret Management Example:

Sources: docker-compose.yml:14-17 .env.sample:1 .gitignore:1


Network Security

The current configuration in docker-compose.yml:1-18 does not implement network policies. Production deployments should restrict inter-container communication.

graph TB
    subgraph "External Network"
        CF_EDGE["Cloudflare Edge Network"]
    end

    subgraph "Docker Host"
        subgraph "frontend_network"
            CFD["cloudflared\ncontainer_name: cloudflared\nimage: cloudflare/cloudflared:latest"]
        end

        subgraph "backend_network"
            MOSQ["mosquitto\ncontainer_name: mosquitto\nimage: eclipse-mosquitto:latest"]
        end

        subgraph "Shared Network: tunnel_network"
            CFD_BRIDGE["cloudflared bridge"]
            MOSQ_BRIDGE["mosquitto bridge"]
        end
    end

    CF_EDGE <-->|outbound tunnel| CFD
    CFD_BRIDGE <-->|port 9001| MOSQ_BRIDGE
    CFD -.->|connected to tunnel_network| CFD_BRIDGE
    MOSQ -.->|connected to tunnel_network| MOSQ_BRIDGE

Docker Network Segmentation

Production docker-compose.yml Network Configuration:

Key Security Enhancements:

  • read_only: true: Container filesystems are read-only
  • security_opt: no-new-privileges: Prevents privilege escalation
  • tmpfs: /tmp: Writable temporary directory
  • Named volumes: Persistent data storage
  • Static IP addresses: Predictable network topology
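The enhancements listed above could be combined roughly as follows. This is a sketch, not the repository's file: the subnet, static addresses, host paths, and the use of the TUNNEL_TOKEN environment variable are assumptions chosen for illustration.

```yaml
services:
  mosquitto:
    image: eclipse-mosquitto:latest
    container_name: mosquitto
    restart: unless-stopped
    read_only: true                  # root filesystem is read-only
    security_opt:
      - no-new-privileges:true       # block privilege escalation
    tmpfs:
      - /tmp                         # writable scratch space
    volumes:
      - ./mosquitto.conf:/mosquitto/config/mosquitto.conf:ro
      - mosquitto_data:/mosquitto/data
      - mosquitto_log:/mosquitto/log
    networks:
      tunnel_network:
        ipv4_address: 172.28.0.10    # example static address

  cloudflared:
    image: cloudflare/cloudflared:latest
    container_name: cloudflared
    restart: unless-stopped
    read_only: true
    security_opt:
      - no-new-privileges:true
    command: tunnel --no-autoupdate run
    environment:
      - TUNNEL_TOKEN=${CLOUDFLARE_TUNNEL_TOKEN}
    networks:
      frontend_network: {}           # outbound path to the Cloudflare edge
      tunnel_network:
        ipv4_address: 172.28.0.11    # example static address

networks:
  frontend_network:
    driver: bridge
  tunnel_network:
    driver: bridge
    internal: true                   # no route to the outside world
    ipam:
      config:
        - subnet: 172.28.0.0/24      # example subnet

volumes:
  mosquitto_data:
  mosquitto_log:
```

With this layout only cloudflared has a route out of the host, and the broker is reachable solely over the internal tunnel_network.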

Sources: docker-compose.yml:1-18


Persistent Storage and Data Management

The development configuration in docker-compose.yml:4-9 does not provision persistent volumes for Mosquitto data. Message retention, subscriptions, and authentication data require persistent storage.

graph LR
    subgraph "Host Filesystem"
        COMPOSE["docker-compose.yml"]
        MOSQ_CONF_FILE["mosquitto.conf"]
    end

    subgraph "Docker Volumes"
        DATA_VOL["mosquitto_data\n(retained messages,\nsubscriptions)"]
        LOG_VOL["mosquitto_log\n(broker logs)"]
        CONFIG_VOL["config volume\n(mounted configs)"]
    end

    subgraph "mosquitto Container"
        MOSQ_PROC["mosquitto process"]
        DATA_MOUNT["/mosquitto/data"]
        LOG_MOUNT["/mosquitto/log"]
        CONF_MOUNT["/mosquitto/config"]
    end

    subgraph "Backup Systems"
        BACKUP_CRON["cron job"]
        BACKUP_STORAGE["Remote Storage\n(S3, NFS, etc.)"]
    end

    MOSQ_CONF_FILE -->|bind mount| CONFIG_VOL
    CONFIG_VOL -->|mounted at| CONF_MOUNT
    DATA_VOL -->|mounted at| DATA_MOUNT
    LOG_VOL -->|mounted at| LOG_MOUNT

    CONF_MOUNT --> MOSQ_PROC
    DATA_MOUNT --> MOSQ_PROC
    LOG_MOUNT --> MOSQ_PROC

    DATA_VOL -.->|backup| BACKUP_CRON
    LOG_VOL -.->|backup| BACKUP_CRON
    BACKUP_CRON -->|archive| BACKUP_STORAGE

Storage Architecture

Volume Configuration for Production

Modify docker-compose.yml:7-8 to include persistent volumes:
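A short sketch of the volume section, using the volume names from the storage diagram; the host path of mosquitto.conf is an assumption.

```yaml
services:
  mosquitto:
    volumes:
      - ./mosquitto.conf:/mosquitto/config/mosquitto.conf:ro
      - mosquitto_data:/mosquitto/data   # persistence database
      - mosquitto_log:/mosquitto/log     # broker log files

volumes:
  mosquitto_data:
  mosquitto_log:
```

The data volume is only useful once mosquitto.conf enables persistence and points persistence_location at /mosquitto/data/.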

Backup Strategy

Automated Backup Script:

Restoration Procedure:

Sources: docker-compose.yml:7-8


Resource Management and Scaling

Container Resource Limits

The current docker-compose.yml:4-18 does not specify resource constraints. Production deployments must prevent resource exhaustion.

Production Resource Configuration:

graph TB
    subgraph "Host Resources"
        CPU["CPU Cores"]
        MEMORY["System Memory"]
        DISK["Disk I/O"]
    end

    subgraph "Container Allocations"
        MOSQ_CPU["mosquitto\nCPUs: 1.0\nMemory: 1GB\nreservations.memory: 512MB"]
        CFD_CPU["cloudflared\nCPUs: 0.5\nMemory: 512MB\nreservations.memory: 256MB"]
    end

    subgraph "Monitoring Limits"
        ALERTS["Resource Alerts\n(>80% usage)"]
        METRICS["Prometheus Metrics\ncontainer_memory_usage_bytes\ncontainer_cpu_usage_seconds_total"]
    end

    CPU -->|allocated| MOSQ_CPU
    CPU -->|allocated| CFD_CPU
    MEMORY -->|allocated| MOSQ_CPU
    MEMORY -->|allocated| CFD_CPU

    MOSQ_CPU -.->|export metrics| METRICS
    CFD_CPU -.->|export metrics| METRICS
    METRICS -->|trigger| ALERTS
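The allocations shown in the diagram might be expressed in docker-compose.yml as below. This is a sketch of the limits only; all other service settings are omitted.

```yaml
services:
  mosquitto:
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 1G
        reservations:
          memory: 512M

  cloudflared:
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
        reservations:
          memory: 256M
```

Docker Compose v2 applies deploy.resources limits to plain containers; the legacy docker-compose v1 binary needs the --compatibility flag to honor them outside Swarm.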

Mosquitto Performance Tuning

Extend mosquitto.conf:1-6 with production performance settings:

# Authentication (global while per_listener_settings is false, the default;
# declare password_file only once)
allow_anonymous false
password_file /mosquitto/config/password_file

# Queue limits
max_queued_messages 1000
max_inflight_messages 20

# Persistence settings
persistence true
persistence_location /mosquitto/data/
autosave_interval 300
autosave_on_changes false

# Keepalive and message size
max_keepalive 60
message_size_limit 0

# Logging (reduced verbosity for production)
log_dest file /mosquitto/log/mosquitto.log
log_type error
log_type warning
log_timestamp true

# Listeners (max_connections is a per-listener option, so the limit
# of 1000 applies to each listener separately)
listener 1883
max_connections 1000

listener 9001
protocol websockets
max_connections 1000

Key Configuration Parameters:

| Parameter | Development | Production | Purpose |
| --- | --- | --- | --- |
| max_connections | Unlimited | 1000 (per listener) | Prevent resource exhaustion |
| persistence | False (default) | True | Retain messages across restarts |
| autosave_interval | N/A | 300 | Write the in-memory database to disk every 5 minutes |
| max_keepalive | 65535 | 60 | Detect dead connections faster |
| log_type | All | error, warning | Reduce log volume |

Sources: mosquitto.conf:1-6 docker-compose.yml:4-9


High Availability and Clustering

Multi-Instance Deployment Architecture

The single-instance architecture in docker-compose.yml:4-9 provides no redundancy. Production systems require high availability.

Cloudflare Load Balancer Configuration:

  1. Create multiple Cloudflare Tunnels (one per region/instance)
  2. Configure DNS load balancing in Cloudflare dashboard:
    • Geographic routing: Route based on client location
    • Health checks: Monitor tunnel availability
    • Failover: Automatic failover to healthy tunnels

Multi-Tunnel docker-compose.yml:
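A minimal sketch of two tunnel connectors in front of one broker. The service names and the two token variable names are hypothetical, and each token is assumed to belong to a separate Cloudflare Tunnel.

```yaml
services:
  mosquitto:
    image: eclipse-mosquitto:latest
    container_name: mosquitto
    restart: unless-stopped

  cloudflared-primary:
    image: cloudflare/cloudflared:latest
    restart: unless-stopped
    command: tunnel --no-autoupdate run
    environment:
      # Hypothetical variable holding the first tunnel's token
      - TUNNEL_TOKEN=${CLOUDFLARE_TUNNEL_TOKEN_PRIMARY}
    depends_on:
      - mosquitto

  cloudflared-secondary:
    image: cloudflare/cloudflared:latest
    restart: unless-stopped
    command: tunnel --no-autoupdate run
    environment:
      # Hypothetical variable holding the second tunnel's token
      - TUNNEL_TOKEN=${CLOUDFLARE_TUNNEL_TOKEN_SECONDARY}
    depends_on:
      - mosquitto
```

Two connectors on one host only protect against a connector failure. Regional redundancy requires running each tunnel, and a broker instance, on separate hosts.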

Mosquitto Bridge Configuration

Configure message synchronization between instances by adding to mosquitto.conf:

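# Bridge configuration for message replication
connection bridge-to-replica
address mosquitto-replica:1883
# Replicate every topic in both directions at QoS 0
topic # both 0
cleansession false
# Identify this connection as a bridge so the remote Mosquitto broker can
# detect loops, which matters when "#" is bridged in both directions
try_private true
bridge_attempt_unsubscribe true
bridge_protocol_version mqttv311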

Sources: docker-compose.yml:4-18 mosquitto.conf:1-6


Container Orchestration

Kubernetes Deployment

For production-scale deployments, replace docker-compose.yml:1-18 with Kubernetes manifests.

Kubernetes Deployment Manifest (mosquitto):

Kubernetes Deployment Manifest (cloudflared):

PersistentVolumeClaim:
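A minimal sketch of the mosquitto manifests, assuming a ConfigMap named mosquitto-config (hypothetical) that holds mosquitto.conf and an example storage request of 1Gi.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mosquitto-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi                 # example size
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mosquitto
spec:
  replicas: 1
  strategy:
    type: Recreate                 # a ReadWriteOnce volume cannot be shared during a rolling update
  selector:
    matchLabels:
      app: mosquitto
  template:
    metadata:
      labels:
        app: mosquitto
    spec:
      containers:
        - name: mosquitto
          image: eclipse-mosquitto:latest
          ports:
            - containerPort: 1883
            - containerPort: 9001
          volumeMounts:
            - name: config
              mountPath: /mosquitto/config/mosquitto.conf
              subPath: mosquitto.conf
            - name: data
              mountPath: /mosquitto/data
      volumes:
        - name: config
          configMap:
            name: mosquitto-config # hypothetical ConfigMap
        - name: data
          persistentVolumeClaim:
            claimName: mosquitto-data
---
apiVersion: v1
kind: Service
metadata:
  name: mosquitto
spec:
  selector:
    app: mosquitto
  ports:
    - name: mqtt
      port: 1883
    - name: websockets
      port: 9001
```

The Service gives cloudflared a stable in-cluster address (mosquitto:9001) to use as the tunnel's origin.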

Sources: docker-compose.yml:1-18


Monitoring and Logging

Logging Configuration

The default mosquitto.conf:1-6 does not configure logging. Production deployments require comprehensive logging.

Production Logging Configuration (mosquitto.conf):

graph LR
    subgraph "Log Sources"
        MOSQ_PROC["mosquitto process"]
        CFD_PROC["cloudflared process"]
        DOCKER_DAEMON["Docker Daemon"]
    end

    subgraph "Log Collection"
        MOSQ_LOG_VOL["/mosquitto/log/mosquitto.log"]
        DOCKER_JSON_LOG["container stdout/stderr\n(JSON driver)"]
    end

    subgraph "Log Aggregation"
        FLUENTD["Fluentd\n(log shipper)"]
        FILEBEAT["Filebeat\n(log shipper)"]
    end

    subgraph "Log Storage & Analysis"
        ELK["Elasticsearch\n(log indexing)"]
        KIBANA["Kibana\n(visualization)"]
        LOKI["Grafana Loki\n(log aggregation)"]
        SPLUNK["Splunk\n(SIEM)"]
    end

    subgraph "Alerting"
        ALERT_MGR["AlertManager\n(alert routing)"]
        PAGERDUTY["PagerDuty"]
        SLACK["Slack"]
    end

    MOSQ_PROC -->|writes to| MOSQ_LOG_VOL
    MOSQ_PROC -->|stdout| DOCKER_JSON_LOG
    CFD_PROC -->|stdout| DOCKER_JSON_LOG
    DOCKER_DAEMON -->|container logs| DOCKER_JSON_LOG

    MOSQ_LOG_VOL -->|tail| FLUENTD
    DOCKER_JSON_LOG -->|docker logs| FILEBEAT

    FLUENTD --> ELK
    FILEBEAT --> ELK
    FLUENTD --> LOKI

    ELK --> KIBANA
    LOKI --> ALERT_MGR
    ELK --> ALERT_MGR

    ALERT_MGR --> PAGERDUTY
    ALERT_MGR --> SLACK

# Comprehensive logging
log_dest file /mosquitto/log/mosquitto.log
log_dest stdout

log_type error
log_type warning
log_type notice
log_type information
log_type subscribe
log_type unsubscribe

log_timestamp true
log_timestamp_format %Y-%m-%dT%H:%M:%S

connection_messages true

Docker Logging Driver Configuration:
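A sketch of log rotation with Docker's json-file driver; the size and file-count values are illustrative and should be sized to the host's disk.

```yaml
services:
  mosquitto:
    logging:
      driver: json-file
      options:
        max-size: "10m"   # rotate after 10 MB
        max-file: "5"     # keep at most five rotated files

  cloudflared:
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "5"
```

Without max-size, the json-file driver lets container logs grow without bound, which is a common cause of full disks on long-running hosts.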

Metrics and Monitoring

Implement Prometheus-compatible metrics exporters:

docker-compose.yml with Monitoring:

Prometheus Configuration (prometheus.yml):
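A minimal prometheus.yml sketch. Both scrape targets are assumptions: mosquitto-exporter:9234 stands in for whichever Mosquitto exporter is deployed, and cadvisor:8080 assumes a cAdvisor container supplies the container_* metrics.

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  # Broker metrics from a Mosquitto exporter (hypothetical host and port)
  - job_name: mosquitto
    static_configs:
      - targets: ["mosquitto-exporter:9234"]

  # Container CPU and memory metrics from cAdvisor (assumed to be deployed)
  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]
```

Metric names differ between exporters, so the names used in dashboards and alerts should be checked against the exporter's actual /metrics output.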

Key Metrics to Monitor:

| Metric | Description | Alert Threshold |
| --- | --- | --- |
| mosquitto_connected_clients | Active client connections | > 80% of max_connections |
| mosquitto_messages_received_total | Total messages received | Rate < 1/min (potential downtime) |
| mosquitto_messages_sent_total | Total messages sent | Rate < 1/min (potential issue) |
| mosquitto_retained_messages | Number of retained messages | > 90% of storage capacity |
| container_memory_usage_bytes | Container memory usage | > 80% of limit |
| container_cpu_usage_seconds_total | Container CPU usage | > 80% of limit |
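Two of these thresholds could be written as Prometheus alerting rules along the following lines. The sketch assumes max_connections is 1000, that the exporter publishes a metric named mosquitto_connected_clients, and that cAdvisor labels the broker container with name="mosquitto"; the alert names are hypothetical.

```yaml
groups:
  - name: mosquitto-production
    rules:
      - alert: MosquittoConnectionsHigh
        # 80% of an assumed max_connections of 1000
        expr: mosquitto_connected_clients > 800
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Client connections above 80% of max_connections"

      - alert: MosquittoMemoryHigh
        # Usage relative to the container's configured memory limit
        expr: >
          container_memory_usage_bytes{name="mosquitto"}
          / container_spec_memory_limit_bytes{name="mosquitto"} > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "mosquitto container memory above 80% of its limit"
```

The for: 5m clause suppresses alerts on short spikes; it is a starting point to tune, not a recommendation from the source.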

Sources: mosquitto.conf:1-6 docker-compose.yml:1-18


Disaster Recovery and Business Continuity

Backup Automation

Implement automated backup schedules using cron or Kubernetes CronJobs.

Kubernetes CronJob for Backups:
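A sketch of such a CronJob, assuming the broker data lives in a PersistentVolumeClaim named mosquitto-data and that a second claim, mosquitto-backup (hypothetical), receives the archives. The daily schedule is chosen to match a 24-hour RPO.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mosquitto-backup
spec:
  schedule: "0 2 * * *"            # daily at 02:00
  concurrencyPolicy: Forbid        # never run two backups at once
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: busybox:1.36
              command:
                - /bin/sh
                - -c
                # Archive the persistence directory under a timestamped name
                - tar czf /backup/mosquitto-$(date +%Y%m%d-%H%M%S).tar.gz -C /mosquitto/data .
              volumeMounts:
                - name: data
                  mountPath: /mosquitto/data
                  readOnly: true
                - name: backup
                  mountPath: /backup
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: mosquitto-data
            - name: backup
              persistentVolumeClaim:
                claimName: mosquitto-backup   # hypothetical claim
```

If mosquitto-data is ReadWriteOnce, the backup pod must be scheduled on the same node as the broker. Shipping the archives to remote storage is left out of this sketch.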

Recovery Time Objectives

| Failure Scenario | RTO (Recovery Time Objective) | RPO (Recovery Point Objective) | Recovery Procedure |
| --- | --- | --- | --- |
| Container crash | < 1 minute | 0 (no data loss) | Automatic restart via restart: unless-stopped |
| Node failure | < 5 minutes | 0 (no data loss) | Kubernetes pod rescheduling |
| Data corruption | < 30 minutes | 24 hours | Restore from latest backup |
| Regional outage | < 15 minutes | 0 (no data loss) | Cloudflare automatic failover |
| Complete disaster | < 2 hours | 24 hours | Deploy new infrastructure, restore backups |

Disaster Recovery Testing

Quarterly DR Test Procedure:

  1. Simulate Container Failure:

  2. Simulate Data Corruption:

  3. Simulate Network Partition:

  4. Verify Backup Integrity:

Sources: docker-compose.yml:9 docker-compose.yml:15


Security Compliance and Hardening

Container Image Security

The default images specified in docker-compose.yml:5 and docker-compose.yml:12 should be validated and scanned for vulnerabilities.

Image Security Checklist:

| Security Control | Implementation |
| --- | --- |
| Base Image Verification | Use official images with verified publishers |
| Vulnerability Scanning | Run Trivy or Docker Scout (the successor to the retired docker scan command) before deployment |
| Image Signing | Verify Docker Content Trust signatures |
| Version Pinning | Use specific version tags, not latest |
| Minimal Base | Prefer Alpine-based images |
| Read-Only Filesystem | Set read_only: true in docker-compose.yml |

Production Image References:
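A sketch of pinned image references. The tags below are examples only; confirm the current release tags in each registry before use.

```yaml
services:
  mosquitto:
    # Example tag; replace with the release you have validated
    image: eclipse-mosquitto:2.0.18

  cloudflared:
    # Example tag; replace with the release you have validated
    image: cloudflare/cloudflared:2024.1.5
```

For stricter reproducibility, an image can also be referenced by digest (name@sha256:...), which stays fixed even if a tag is later moved.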

Security Scanning Integration

Add security scanning to CI/CD pipeline:
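One way to wire this in, sketched as a GitHub Actions workflow. The source does not name a CI system, so the platform, the workflow name, and the weekly schedule are assumptions; the scan itself uses the aquasecurity/trivy-action action.

```yaml
name: image-scan

on:
  pull_request:
  schedule:
    - cron: "0 6 * * 1"            # weekly rescan, Mondays 06:00 UTC

jobs:
  trivy:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        image:
          - eclipse-mosquitto:latest
          - cloudflare/cloudflared:latest
    steps:
      - name: Scan image with Trivy
        # Pin this to a released version or commit in real pipelines
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ matrix.image }}
          severity: CRITICAL,HIGH
          ignore-unfixed: true
          exit-code: "1"           # fail the job when findings remain
```

The matrix should list the same pinned tags that the deployment uses, so the pipeline scans what actually runs.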

Compliance Requirements

GDPR/Privacy Compliance:

  • Enable audit logging in mosquitto.conf:1-6
  • Implement data retention policies
  • Configure message encryption
  • Document data flows

SOC 2 Compliance:

  • Implement access controls via ACL
  • Enable comprehensive logging
  • Implement change management procedures
  • Document disaster recovery procedures

PCI DSS (if handling payment data):

  • Implement network segmentation
  • Enable encryption in transit and at rest
  • Implement strong authentication
  • Regular security assessments

Sources: docker-compose.yml:5 docker-compose.yml:12 mosquitto.conf:1-6


Performance Optimization

Connection Pooling and Load Distribution

The docker-compose.yml:4-9 single-instance architecture does not scale for high-throughput scenarios.

Mosquitto Performance Benchmarks

Expected Performance Metrics:

| Metric | Development (Single Instance) | Production (Clustered) |
| --- | --- | --- |
| Max Concurrent Connections | ~1,000 | ~10,000 |
| Messages/sec (publish) | ~5,000 | ~50,000 |
| Messages/sec (subscribe) | ~10,000 | ~100,000 |
| Latency (p99) | < 100ms | < 50ms |
| Memory per connection | ~2KB | ~2KB |

Performance Testing:

Tuning mosquitto.conf for High Throughput

# High-performance configuration
listener 1883
max_connections 10000
max_queued_messages 10000
max_inflight_messages 100

# Disable persistence for high-throughput, low-retention scenarios
# (Use only if message loss is acceptable)
persistence false

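# Detect dead connections sooner (a shorter keepalive means more PINGREQ traffic)
max_keepalive 30

# Cap publish payload size at 1 MiB (the default, 0, accepts any valid
# MQTT message, up to 268435455 bytes)
message_size_limit 1048576

# Cap broker heap usage at 2 GiB (requires a build with memory tracking)
memory_limit 2147483648

# Queue settings
max_queued_bytes 0
queue_qos0_messages false

# Websocket settings
listener 9001
protocol websockets
# websocket_timeout does not appear to be a documented mosquitto.conf option;
# confirm your build accepts it, because an unknown option stops the broker
# from starting
websocket_timeout 300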

Sources: mosquitto.conf:1-6 docker-compose.yml:4-9


Deployment Checklist

Pre-Production Validation

Infrastructure Readiness:

  • Container orchestration platform configured (Kubernetes/Docker Swarm)
  • Persistent storage provisioned with backup strategy
  • Secrets management system configured
  • Monitoring and logging infrastructure deployed
  • Disaster recovery procedures documented and tested

Security Configuration:

  • Authentication enabled in mosquitto.conf:2
  • ACL file configured with topic-level restrictions
  • Anonymous access disabled
  • TLS certificates provisioned (if not using Cloudflare Tunnel)
  • Network policies implemented
  • Container security contexts configured (read_only, no-new-privileges)
  • Image vulnerability scanning completed

Cloudflare Configuration:

  • Multiple tunnels created for high availability
  • DNS load balancing configured
  • Health checks enabled
  • Rate limiting configured
  • DDoS protection verified
  • WAF rules configured

Performance Tuning:

Operational Readiness:

  • Backup automation tested
  • Restore procedures validated
  • Monitoring dashboards created
  • Alert rules configured
  • On-call runbooks documented
  • Incident response procedures defined

Compliance:

  • Audit logging enabled
  • Data retention policies implemented
  • Privacy impact assessment completed
  • Security assessment completed
  • Compliance documentation prepared

Sources: docker-compose.yml:1-18 mosquitto.conf:1-6 README.md:1-93


Migration from Development to Production

Transition Strategy

Migration Steps

Phase 1: Staging Environment Setup

  1. Deploy staging environment with production-like configuration
  2. Migrate from .env to secrets manager
  3. Enable authentication and ACL
  4. Configure persistent volumes
  5. Deploy monitoring stack
  6. Run load tests

Phase 2: Production Infrastructure

  1. Provision Kubernetes cluster or production Docker hosts
  2. Configure external secrets management
  3. Set up persistent storage with replication
  4. Deploy monitoring and logging infrastructure
  5. Configure backup automation
  6. Implement network policies

Phase 3: Service Migration

  1. Create multiple Cloudflare Tunnels for production regions
  2. Deploy Mosquitto cluster with bridge configuration
  3. Configure load balancing and health checks
  4. Migrate client connections gradually
  5. Monitor performance and adjust resources
  6. Validate disaster recovery procedures

Phase 4: Optimization

  1. Tune Mosquitto configuration based on production metrics
  2. Adjust resource limits based on actual usage
  3. Optimize backup schedules
  4. Refine alerting thresholds
  5. Document operational procedures

Sources: docker-compose.yml:1-18 mosquitto.conf:1-6 .env.sample:1


Conclusion

Production deployment of the Docker MQTT Mosquitto with Cloudflare Tunnel system requires significant enhancements beyond the development configuration in docker-compose.yml:1-18 and mosquitto.conf:1-6. Key production requirements include:

  • Security: Disable anonymous access, implement authentication and ACL, use secrets managers
  • Reliability: Deploy multiple instances, implement health checks, configure automated backups
  • Scalability: Use container orchestration (Kubernetes), implement load balancing, tune resource limits
  • Observability: Deploy comprehensive logging and monitoring, configure alerting, track SLIs/SLOs
  • Operations: Document runbooks, test disaster recovery, implement change management

The transition from development to production should be gradual, with thorough testing in staging environments before production deployment. Regular security assessments, performance tuning, and disaster recovery testing ensure ongoing operational excellence.

Sources: README.md:1-93 docker-compose.yml:1-18 mosquitto.conf:1-6 .env.sample:1 .gitignore:1-2
