mWatcher
Server health monitoring for CPU, memory, disk with alerting
mWatcher is a server health monitoring tool that tracks CPU, memory, disk usage, and other system metrics in real time. Threshold-based alerting and a web dashboard help you catch resource problems before they cause downtime.
Overview
mWatcher provides:
- Real-time Monitoring: Live tracking of system resources
- Intelligent Alerting: Smart notifications based on thresholds
- Beautiful Dashboards: Modern web interface for monitoring
- Historical Data: Long-term trend analysis and reporting
- Multi-Server Support: Monitor multiple servers from one dashboard
Features
📊 System Metrics
- CPU Usage: Per-core and overall CPU utilization
- Memory Monitoring: RAM usage, swap, and memory pressure
- Disk I/O: Read/write operations and disk space
- Network Traffic: Inbound/outbound network monitoring
- Process Monitoring: Top processes by resource usage
🚨 Smart Alerting
- Threshold-based Alerts: Customizable warning and critical levels
- Trend Analysis: Predictive alerts based on usage patterns
- Multiple Channels: Email, Slack, webhook notifications
- Alert Suppression: Prevent alert spam with intelligent grouping
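The source does not describe how mWatcher computes its predictive alerts. As a minimal sketch of the idea only, the hypothetical helper below fits a least-squares line through recent (timestamp, value) samples and estimates how long until the fitted line reaches a threshold. The function name and the sample format are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def seconds_until_threshold(
    samples: Sequence[Tuple[float, float]], threshold: float
) -> Optional[float]:
    """Estimate seconds from the last sample until `threshold` is reached.

    `samples` holds (timestamp_seconds, value) pairs. Returns None when there
    is too little data or the fitted trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        return None  # all samples share one timestamp
    # Ordinary least-squares slope and intercept.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / denom
    if slope <= 0:
        return None  # usage is not growing
    intercept = mean_v - slope * mean_t
    crossing_time = (threshold - intercept) / slope
    last_t = samples[-1][0]
    # A threshold that is already exceeded yields 0.0.
    return max(0.0, crossing_time - last_t)
```

A caller could raise a warning when the estimate drops below some lead time, for example one hour for disk usage approaching the critical level.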
📈 Dashboards
- Real-time Charts: Live updating system metrics
- Historical Views: Long-term trend analysis
- Custom Dashboards: Create personalized monitoring views
- Mobile Responsive: Monitor from any device
Installation
Quick Install
# Install via pip
pip install mwatcher
# Or via Docker
docker run -d \
  --name mwatcher \
  -p 8080:8080 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  harrythedevopsguy/mwatcher:latest
Manual Installation
# Clone repository
git clone https://github.com/HarryTheDevOpsGuy/mWatcher.git
cd mWatcher
# Install dependencies
pip install -r requirements.txt
# Install mWatcher
pip install -e .
# Initialize configuration
mwatcher init --config /etc/mwatcher/mwatcher.yml
Systemd Service
# Create systemd service file
sudo tee /etc/systemd/system/mwatcher.service > /dev/null <<EOF
[Unit]
Description=mWatcher Server Monitor
After=network.target
[Service]
Type=simple
User=mwatcher
Group=mwatcher
ExecStart=/usr/local/bin/mwatcher daemon --config /etc/mwatcher/mwatcher.yml
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
# Enable and start service
sudo systemctl enable mwatcher
sudo systemctl start mwatcher
Configuration
Basic Configuration
Create /etc/mwatcher/mwatcher.yml:
# Global settings
global:
  log_level: "INFO"
  log_file: "/var/log/mwatcher/mwatcher.log"
  data_dir: "/var/lib/mwatcher"
  web_port: 8080
  web_host: "0.0.0.0"
# Monitoring intervals
monitoring:
  system_check_interval: 10 # seconds
  process_check_interval: 30 # seconds
  disk_check_interval: 60 # seconds
  network_check_interval: 5 # seconds
# Alert thresholds
thresholds:
  cpu:
    warning: 70
    critical: 90
  memory:
    warning: 80
    critical: 95
  disk:
    warning: 85
    critical: 95
  load_average:
    warning: 2.0
    critical: 4.0
# Monitored processes
processes:
  - name: "nginx"
    pattern: "nginx.*master"
    alert_on_crash: true
  - name: "mysql"
    pattern: "mysqld"
    alert_on_crash: true
  - name: "redis"
    pattern: "redis-server"
    alert_on_crash: true
# Disk monitoring
disks:
  monitor_all: true
  exclude_paths:
    - "/proc"
    - "/sys"
    - "/dev"
    - "/run"
  include_paths:
    - "/"
    - "/var"
    - "/home"
    - "/opt"
# Network monitoring
network:
  interfaces:
    - "eth0"
    - "wlan0"
  monitor_connections: true
  alert_on_port_scan: true
# Notifications
notifications:
  email:
    enabled: true
    smtp_server: "smtp.gmail.com"
    smtp_port: 587
    username: "${EMAIL_USERNAME}"
    password: "${EMAIL_PASSWORD}"
    from_address: "mwatcher@company.com"
    to_addresses:
      - "admin@company.com"
      - "devops@company.com"
  slack:
    enabled: true
    webhook_url: "${SLACK_WEBHOOK_URL}"
    channel: "#server-alerts"
    username: "mWatcher"
    icon_emoji: ":computer:"
  webhook:
    enabled: true
    url: "https://your-webhook.com/server-alerts"
    headers:
      Authorization: "Bearer ${WEBHOOK_TOKEN}"
# Dashboard settings
dashboard:
  theme: "dark" # light, dark, auto
  refresh_interval: 5 # seconds
  max_data_points: 1000
  chart_colors:
    - "#3b82f6" # Blue
    - "#ef4444" # Red
    - "#10b981" # Green
    - "#f59e0b" # Yellow
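A mistyped threshold (for example a warning level above the critical level) would silently change which alerts fire. The sketch below shows one way such a check could look for the thresholds section above. The function name, the PERCENT_METRICS set, and the rules themselves are assumptions for illustration; the source does not say what mwatcher test-config actually verifies.

```python
from typing import Dict, List

# Assumption: these metrics are percentages, as the example values suggest.
PERCENT_METRICS = {"cpu", "memory", "disk"}


def validate_thresholds(thresholds: Dict[str, Dict[str, float]]) -> List[str]:
    """Return a list of problems found; an empty list means the section looks sane."""
    problems: List[str] = []
    for metric, levels in thresholds.items():
        warning = levels.get("warning")
        critical = levels.get("critical")
        if warning is None or critical is None:
            problems.append(f"{metric}: both 'warning' and 'critical' are required")
            continue
        if warning >= critical:
            problems.append(
                f"{metric}: warning ({warning}) must be below critical ({critical})"
            )
        if metric in PERCENT_METRICS and not (
            0 <= warning <= 100 and 0 <= critical <= 100
        ):
            problems.append(f"{metric}: percentage thresholds must be between 0 and 100")
    return problems
```

Run against the values shown in the configuration above, the function returns an empty list.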
Environment Variables
# Email Configuration
export EMAIL_USERNAME="mwatcher@company.com"
export EMAIL_PASSWORD="your-app-password"
# Slack Configuration
export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/..."
# Webhook Configuration
export WEBHOOK_TOKEN="your-webhook-token"
# Database Configuration (optional)
export DATABASE_URL="postgresql://user:password@localhost/mwatcher"
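The configuration file refers to these variables with ${NAME} placeholders. How mWatcher resolves them internally is not documented here, so the following is only a sketch of the general technique: walk the parsed configuration and substitute each placeholder from the environment, failing loudly when a variable is missing. The helper name expand_env is hypothetical.

```python
import os
import re
from typing import Any, Mapping

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Recursively replace ${NAME} placeholders in dicts, lists, and strings."""
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    if isinstance(value, str):
        def lookup(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in env:
                # Fail early rather than send alerts with a literal "${...}".
                raise KeyError(f"environment variable {name} is not set")
            return env[name]

        return _PLACEHOLDER.sub(lookup, value)
    return value  # numbers, booleans, None pass through unchanged
```

Raising on a missing variable is a deliberate choice in this sketch: an unset SMTP password is easier to diagnose at startup than when the first alert fails to send.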
Usage
Command Line Interface
# Start monitoring daemon
mwatcher daemon --config /etc/mwatcher/mwatcher.yml
# Start web dashboard
mwatcher web --port 8080 --host 0.0.0.0
# Check current status
mwatcher status
# Generate system report
mwatcher report --format html --output system-report.html
# Test configuration
mwatcher test-config
# View logs
mwatcher logs --tail 100 --follow
Python API
from mwatcher import SystemMonitor, AlertManager

# Initialize monitor
monitor = SystemMonitor(config_file='/etc/mwatcher/mwatcher.yml')

# Get current system status
status = monitor.get_system_status()
print(f"CPU Usage: {status['cpu']['usage']}%")
print(f"Memory Usage: {status['memory']['usage']}%")
print(f"Disk Usage: {status['disk']['usage']}%")

# Get process information
processes = monitor.get_top_processes(limit=10)
for proc in processes:
    print(f"{proc['name']}: {proc['cpu']}% CPU, {proc['memory']}% Memory")

# Check alerts
alerts = monitor.get_active_alerts()
for alert in alerts:
    print(f"Alert: {alert['message']} - {alert['severity']}")

# Send custom alert
alert_manager = AlertManager(config_file='/etc/mwatcher/mwatcher.yml')
alert_manager.send_alert(
    level='warning',
    message='Custom alert message',
    metric='cpu',
    value=85.5
)
Monitoring Features
System Metrics Collection
import psutil
import time
from datetime import datetime


class SystemMetrics:
    def __init__(self):
        self.metrics = {}

    def collect_cpu_metrics(self):
        """Collect CPU usage metrics"""
        cpu_percent = psutil.cpu_percent(interval=1)
        cpu_count = psutil.cpu_count()
        load_avg = psutil.getloadavg()
        return {
            'usage_percent': cpu_percent,
            'core_count': cpu_count,
            'load_1min': load_avg[0],
            'load_5min': load_avg[1],
            'load_15min': load_avg[2],
            'timestamp': datetime.now().isoformat()
        }

    def collect_memory_metrics(self):
        """Collect memory usage metrics"""
        memory = psutil.virtual_memory()
        swap = psutil.swap_memory()
        return {
            'total': memory.total,
            'available': memory.available,
            'used': memory.used,
            'free': memory.free,
            'usage_percent': memory.percent,
            'swap_total': swap.total,
            'swap_used': swap.used,
            'swap_free': swap.free,
            'swap_percent': swap.percent,
            'timestamp': datetime.now().isoformat()
        }

    def collect_disk_metrics(self):
        """Collect disk usage metrics"""
        disk_usage = psutil.disk_usage('/')
        disk_io = psutil.disk_io_counters()
        return {
            'total': disk_usage.total,
            'used': disk_usage.used,
            'free': disk_usage.free,
            'usage_percent': (disk_usage.used / disk_usage.total) * 100,
            'read_bytes': disk_io.read_bytes,
            'write_bytes': disk_io.write_bytes,
            'read_count': disk_io.read_count,
            'write_count': disk_io.write_count,
            'timestamp': datetime.now().isoformat()
        }
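The read_bytes, write_bytes, read_count, and write_count values returned by psutil.disk_io_counters() are cumulative counters, so a single sample says little about current activity. One way to turn two consecutive samples into per-second rates is sketched below. The helper name io_rates and the output key names are illustrative assumptions, not part of mWatcher.

```python
from typing import Dict

COUNTER_KEYS = ("read_bytes", "write_bytes", "read_count", "write_count")


def io_rates(
    prev: Dict[str, float], curr: Dict[str, float], elapsed_seconds: float
) -> Dict[str, float]:
    """Convert two cumulative disk I/O samples into per-second rates."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rates: Dict[str, float] = {}
    for key in COUNTER_KEYS:
        delta = curr[key] - prev[key]
        # A counter can go backwards after a reboot or wrap; report 0
        # for that interval instead of a negative rate.
        rates[f"{key}_per_sec"] = max(delta, 0) / elapsed_seconds
    return rates
```

With the 60-second disk_check_interval from the configuration, elapsed_seconds would normally be about 60, but measuring the real elapsed time between samples is safer than assuming it.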
Alert System
class AlertManager:
    def __init__(self, config):
        self.config = config
        self.alert_history = []
        self.suppressed_alerts = set()

    def check_thresholds(self, metrics):
        """Check metrics against configured thresholds"""
        alerts = []
        # CPU alerts
        if metrics['cpu']['usage_percent'] > self.config['thresholds']['cpu']['critical']:
            alerts.append({
                'type': 'cpu',
                'level': 'critical',
                'message': f"CPU usage critical: {metrics['cpu']['usage_percent']:.1f}%",
                'value': metrics['cpu']['usage_percent'],
                'threshold': self.config['thresholds']['cpu']['critical']
            })
        elif metrics['cpu']['usage_percent'] > self.config['thresholds']['cpu']['warning']:
            alerts.append({
                'type': 'cpu',
                'level': 'warning',
                'message': f"CPU usage high: {metrics['cpu']['usage_percent']:.1f}%",
                'value': metrics['cpu']['usage_percent'],
                'threshold': self.config['thresholds']['cpu']['warning']
            })
        # Memory alerts
        if metrics['memory']['usage_percent'] > self.config['thresholds']['memory']['critical']:
            alerts.append({
                'type': 'memory',
                'level': 'critical',
                'message': f"Memory usage critical: {metrics['memory']['usage_percent']:.1f}%",
                'value': metrics['memory']['usage_percent'],
                'threshold': self.config['thresholds']['memory']['critical']
            })
        return alerts

    def send_alert(self, alert):
        """Send alert through configured channels"""
        alert_id = f"{alert['type']}_{alert['level']}"
        # Check if alert is suppressed
        if alert_id in self.suppressed_alerts:
            return
        # Send via email
        if self.config['notifications']['email']['enabled']:
            self._send_email_alert(alert)
        # Send via Slack
        if self.config['notifications']['slack']['enabled']:
            self._send_slack_alert(alert)
        # Send via webhook
        if self.config['notifications']['webhook']['enabled']:
            self._send_webhook_alert(alert)
        # Add to history
        self.alert_history.append(alert)
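The class above checks suppressed_alerts but does not show how entries get into that set. A common way to prevent alert spam is a per-alert cooldown, sketched below under that assumption. AlertSuppressor, its cooldown default, and the injectable clock are hypothetical names for illustration; only the alert ID format (type and level joined by an underscore) comes from the code above.

```python
import time
from typing import Callable, Dict


class AlertSuppressor:
    """Allow each alert ID through at most once per cooldown window."""

    def __init__(
        self,
        cooldown_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock  # injectable so tests can control time
        self._last_sent: Dict[str, float] = {}

    def should_send(self, alert: dict) -> bool:
        # Same ID scheme as send_alert above: "<type>_<level>".
        alert_id = f"{alert['type']}_{alert['level']}"
        now = self._clock()
        last = self._last_sent.get(alert_id)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # still inside the cooldown window
        self._last_sent[alert_id] = now
        return True
```

Because the level is part of the ID, an escalation from warning to critical is not suppressed by an earlier warning for the same metric.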
Web Dashboard
from flask import Flask, render_template, jsonify, request
from mwatcher import SystemMonitor

app = Flask(__name__)


@app.route('/')
def dashboard():
    """Main dashboard page"""
    return render_template('dashboard.html')


@app.route('/api/metrics')
def api_metrics():
    """API endpoint for current metrics"""
    monitor = SystemMonitor()
    metrics = monitor.get_current_metrics()
    return jsonify(metrics)


@app.route('/api/metrics/history')
def api_metrics_history():
    """API endpoint for historical metrics"""
    # ?hours=N selects the look-back window; defaults to 24
    hours = request.args.get('hours', 24, type=int)
    monitor = SystemMonitor()
    history = monitor.get_metrics_history(hours=hours)
    return jsonify(history)


@app.route('/api/alerts')
def api_alerts():
    """API endpoint for active alerts"""
    monitor = SystemMonitor()
    alerts = monitor.get_active_alerts()
    return jsonify(alerts)


@app.route('/api/processes')
def api_processes():
    """API endpoint for process information"""
    monitor = SystemMonitor()
    processes = monitor.get_top_processes(limit=20)
    return jsonify(processes)


if __name__ == '__main__':
    # Keep debug off when binding to 0.0.0.0: debug mode exposes the interactive debugger
    app.run(host='0.0.0.0', port=8080, debug=False)
Dashboard Features
Real-time Charts
// Chart.js configuration for real-time monitoring
const chartConfig = {
  type: 'line',
  data: {
    labels: [],
    datasets: [
      {
        label: 'CPU Usage %',
        data: [],
        borderColor: '#3b82f6',
        backgroundColor: 'rgba(59, 130, 246, 0.1)',
        tension: 0.4
      },
      {
        label: 'Memory Usage %',
        data: [],
        borderColor: '#ef4444',
        backgroundColor: 'rgba(239, 68, 68, 0.1)',
        tension: 0.4
      },
      {
        label: 'Disk Usage %',
        data: [],
        borderColor: '#10b981',
        backgroundColor: 'rgba(16, 185, 129, 0.1)',
        tension: 0.4
      }
    ]
  },
  options: {
    responsive: true,
    maintainAspectRatio: false,
    scales: {
      y: {
        beginAtZero: true,
        max: 100
      }
    },
    plugins: {
      legend: {
        position: 'top'
      }
    }
  }
};
// Create the chart instance used by updateCharts()
// (assumes the page contains a <canvas id="metrics-chart"> element; the id is illustrative)
const chart = new Chart(document.getElementById('metrics-chart'), chartConfig);
// Update charts with real-time data
function updateCharts() {
  fetch('/api/metrics')
    .then(response => response.json())
    .then(data => {
      const now = new Date().toLocaleTimeString();
      // Add new data point
      chartConfig.data.labels.push(now);
      chartConfig.data.datasets[0].data.push(data.cpu.usage_percent);
      chartConfig.data.datasets[1].data.push(data.memory.usage_percent);
      chartConfig.data.datasets[2].data.push(data.disk.usage_percent);
      // Keep only last 50 data points
      if (chartConfig.data.labels.length > 50) {
        chartConfig.data.labels.shift();
        chartConfig.data.datasets.forEach(dataset => {
          dataset.data.shift();
        });
      }
      chart.update();
    });
}
// Update every 5 seconds
setInterval(updateCharts, 5000);
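The chart keeps a sliding window of the 50 most recent points. The same bounded-history idea can be expressed on the Python side with collections.deque, which discards the oldest entry automatically once maxlen is reached. MetricHistory below is a hypothetical helper written for illustration and is not part of the mWatcher API.

```python
from collections import deque
from typing import Deque, Dict, List


class MetricHistory:
    """Fixed-size window of recent samples, oldest dropped first."""

    def __init__(self, max_points: int = 50) -> None:
        self._points: Deque[dict] = deque(maxlen=max_points)

    def add(self, label: str, cpu: float, memory: float, disk: float) -> None:
        # Appending to a full deque silently evicts the leftmost item.
        self._points.append(
            {"label": label, "cpu": cpu, "memory": memory, "disk": disk}
        )

    def as_series(self) -> Dict[str, List]:
        """Return parallel lists in the shape a line chart expects."""
        return {
            "labels": [p["label"] for p in self._points],
            "cpu": [p["cpu"] for p in self._points],
            "memory": [p["memory"] for p in self._points],
            "disk": [p["disk"] for p in self._points],
        }
```

Setting max_points from the dashboard's max_data_points setting would keep server-side memory use bounded in the same way the browser code bounds the chart.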
Integration Examples
Prometheus Integration
import time
from prometheus_client import Counter, Gauge, start_http_server

# Prometheus metrics
cpu_usage = Gauge('mwatcher_cpu_usage_percent', 'CPU usage percentage')
memory_usage = Gauge('mwatcher_memory_usage_percent', 'Memory usage percentage')
disk_usage = Gauge('mwatcher_disk_usage_percent', 'Disk usage percentage')
alert_count = Counter('mwatcher_alerts_total', 'Total number of alerts', ['level', 'type'])


def expose_prometheus_metrics():
    """Expose metrics to Prometheus"""
    # 9090 is also the Prometheus server's default port; pick another if both share a host
    start_http_server(9090)
    # SystemMetrics is the collector class from "System Metrics Collection" above
    collector = SystemMetrics()
    while True:
        # Update Prometheus metrics
        cpu_usage.set(collector.collect_cpu_metrics()['usage_percent'])
        memory_usage.set(collector.collect_memory_metrics()['usage_percent'])
        disk_usage.set(collector.collect_disk_metrics()['usage_percent'])
        time.sleep(10)
Grafana Dashboard
{
  "dashboard": {
    "title": "mWatcher System Monitoring",
    "panels": [
      {
        "title": "CPU Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "mwatcher_cpu_usage_percent",
            "legendFormat": "CPU Usage %"
          }
        ]
      },
      {
        "title": "Memory Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "mwatcher_memory_usage_percent",
            "legendFormat": "Memory Usage %"
          }
        ]
      },
      {
        "title": "Disk Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "mwatcher_disk_usage_percent",
            "legendFormat": "Disk Usage %"
          }
        ]
      }
    ]
  }
}
Troubleshooting
Common Issues
High CPU Usage
# Check mWatcher process usage
ps aux | grep mwatcher
# Check system load
uptime
top
# Reduce monitoring frequency
# Edit /etc/mwatcher/mwatcher.yml
monitoring:
  system_check_interval: 30 # Increase from 10
Memory Leaks
# Monitor mWatcher memory usage
ps -o pid,vsz,rss,comm -p $(pgrep mwatcher)
# Check for memory leaks in logs
grep -i "memory" /var/log/mwatcher/mwatcher.log
# Restart service if needed
sudo systemctl restart mwatcher
Dashboard Not Loading
# Check web server status
curl http://localhost:8080/api/metrics
# Check firewall settings
sudo ufw status
sudo ufw allow 8080
# Check logs
sudo journalctl -u mwatcher -f
API Reference
REST API Endpoints
Endpoint | Method | Description |
---|---|---|
/api/metrics | GET | Get current system metrics |
/api/metrics/history | GET | Get historical metrics |
/api/processes | GET | Get process information |
/api/alerts | GET | Get active alerts |
/api/health | GET | Health check endpoint |
WebSocket Events
// Real-time updates via WebSocket
const ws = new WebSocket('ws://localhost:8080/ws');
ws.onmessage = function(event) {
  const data = JSON.parse(event.data);
  switch(data.type) {
    case 'metrics_update':
      updateDashboard(data.metrics);
      break;
    case 'alert':
      showAlert(data.alert);
      break;
    case 'process_update':
      updateProcessList(data.processes);
      break;
  }
};
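The client above dispatches on a type field and reads the payload from a key that depends on the event (metrics, alert, or processes). The exact wire format is not documented beyond those field names, so the following sketch only shows how messages matching that client could be serialized on the Python side. The helper function names are hypothetical.

```python
import json
from typing import Any


def make_event(event_type: str, payload_key: str, payload: Any) -> str:
    """Serialize one event as a JSON text frame."""
    return json.dumps({"type": event_type, payload_key: payload})


def metrics_update(metrics: dict) -> str:
    # Read by the client as data.metrics
    return make_event("metrics_update", "metrics", metrics)


def alert_event(alert: dict) -> str:
    # Read by the client as data.alert
    return make_event("alert", "alert", alert)


def process_update(processes: list) -> str:
    # Read by the client as data.processes
    return make_event("process_update", "processes", processes)
```

Each returned string can be sent as a single text frame by whichever WebSocket server library is in use.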
Contributing
We welcome contributions! Please see our Contributing Guidelines for details.
Development Setup
# Fork and clone repository
git clone https://github.com/your-username/mWatcher.git
cd mWatcher
# Create development environment
python3 -m venv venv
source venv/bin/activate
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
pytest tests/
# Run linting
flake8 mwatcher/
black mwatcher/
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
- Documentation: https://harrythedevopsguy.github.io/docs/tools/mwatcher/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: HarrytheDevOpsGuy@gmail.com