Monitoring Redis with Prometheus


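redis_exporter is an open-source exporter built specifically for monitoring Redis. It is a community project (oliver006/redis_exporter on GitHub) rather than an official Prometheus component, and it can be used as-is.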


Install and configure redis_exporter

[root@devops ~]# wget https://github.com/oliver006/redis_exporter/releases/download/v1.61.0/redis_exporter-v1.61.0.linux-amd64.tar.gz
[root@devops ~]# tar zxf redis_exporter-v1.61.0.linux-amd64.tar.gz -C /usr/local/
[root@devops ~]# mv /usr/local/redis_exporter-v1.61.0.linux-amd64 /usr/local/redis_exporter
[root@devops ~]# cat > /etc/systemd/system/redis_exporter.service << "EOF"
[Unit]
Description=redis_exporter
After=local-fs.target network-online.target network.target
Wants=local-fs.target network-online.target network.target
 
[Service]
ExecStart=/usr/local/redis_exporter/redis_exporter -redis.addr 192.168.1.230:16379 -redis.password '123456677'
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
[root@devops ~]# systemctl daemon-reload
[root@devops ~]# systemctl start redis_exporter
[root@devops ~]# systemctl status redis_exporter
● redis_exporter.service - redis_exporter
   Loaded: loaded (/etc/systemd/system/redis_exporter.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2024-06-14 23:39:09 CST; 12s ago
 Main PID: 51720 (redis_exporter)
    Tasks: 6
   Memory: 5.4M
   CGroup: /system.slice/redis_exporter.service
           └─51720 /usr/local/redis_exporter/redis_exporter -redis.addr 192.168.1.230:16379 -redis.password 123456677

Jun 14 23:39:09 devops systemd[1]: Started redis_exporter.
Jun 14 23:39:09 devops redis_exporter[51720]: time="2024-06-14T23:39:09+08:00" level=info msg="Redis Metrics Exporter v1.61.0    build date: 2024-06-09-17:28:...CH: amd64"
Jun 14 23:39:09 devops redis_exporter[51720]: time="2024-06-14T23:39:09+08:00" level=info msg="Providing metrics at :9121/metrics"
Hint: Some lines were ellipsized, use -l to show in full.
[root@devops ~]# systemctl enable redis_exporter
Created symlink from /etc/systemd/system/multi-user.target.wants/redis_exporter.service to /etc/systemd/system/redis_exporter.service.
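
The status output above shows the Redis password on the exporter's command line, where anyone who can list processes can read it. A possible alternative, sketched here under assumptions, is to pass the address and password through the REDIS_ADDR and REDIS_PASSWORD environment variables that redis_exporter reads, loaded from a file via systemd's EnvironmentFile= directive. The function name write_exporter_env and the path /etc/redis_exporter.env are made up for illustration.

```shell
#!/bin/sh
# Sketch: write an environment file for redis_exporter, readable by its owner only.
# Arguments: <env file path> <redis address> <redis password>
# write_exporter_env is a hypothetical helper, not part of redis_exporter.
write_exporter_env() {
    env_file=$1
    (
        umask 077    # a newly created file gets mode 0600
        printf 'REDIS_ADDR=%s\nREDIS_PASSWORD=%s\n' "$2" "$3" > "$env_file"
    )
    chmod 600 "$env_file"    # also tighten a file that already existed
}

# Example with the values used in this article (the path is an assumption):
#   write_exporter_env /etc/redis_exporter.env 192.168.1.230:16379 123456677
# Then, in the [Service] section of the unit file:
#   EnvironmentFile=/etc/redis_exporter.env
#   ExecStart=/usr/local/redis_exporter/redis_exporter
```

After changing the unit file, systemctl daemon-reload and systemctl restart redis_exporter are still needed, as with the commands shown earlier.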

Add the exporter to Prometheus

  - job_name: redis_since
    static_configs:
    - targets: ['192.168.1.102:9121']
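
Before relying on the new scrape job, it can help to confirm from the Prometheus host that the exporter answers and can actually reach Redis. The sketch below pulls the redis_up sample out of the text-format metrics; the helper name redis_up_value is hypothetical, and the URL in the usage comment assumes the target address from the job above.

```shell
#!/bin/sh
# Sketch: print the value of the redis_up sample from Prometheus
# text-format metrics read on stdin. Exits non-zero when no redis_up
# sample is present. redis_up_value is a hypothetical helper name.
redis_up_value() {
    # Match "redis_up" exactly or "redis_up{...}", but not e.g. redis_uptime_in_seconds.
    awk '$1 ~ /^redis_up([{]|$)/ { print $NF; found = 1 } END { exit !found }'
}

# Usage (assumes the exporter address from the scrape job above):
#   curl -s http://192.168.1.102:9121/metrics | redis_up_value
# A value of 1 means the exporter could connect to Redis; 0 means it could not.
```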

Add a Grafana dashboard

  • Import dashboard ID 17507


Alerting rules

[root@devops prometheus]# vim first_rules.yml

- name: REDIS-Alert
  rules:
  - alert: RedisDown
    expr: redis_up == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Redis down (instance {{ $labels.instance }})
      description: "Redis instance is down\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisMissingMaster
    expr: (count(redis_instance_info{role="master"}) or vector(0)) < 1
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Redis missing master (instance {{ $labels.instance }})
      description: "Redis cluster has no node marked as master.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisTooManyMasters
    expr: count(redis_instance_info{role="master"}) > 1
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Redis too many masters (instance {{ $labels.instance }})
      description: "Redis cluster has too many nodes marked as master.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisDisconnectedSlaves
    expr: count without (instance, job) (redis_connected_slaves) - sum without (instance, job) (redis_connected_slaves) - 1 > 1
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Redis disconnected slaves (instance {{ $labels.instance }})
      description: "Redis not replicating for all slaves. Consider reviewing the redis replication status.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisReplicationBroken
    expr: delta(redis_connected_slaves[1m]) < 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Redis replication broken (instance {{ $labels.instance }})
      description: "Redis instance lost a slave\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisClusterFlapping
    expr: changes(redis_connected_slaves[1m]) > 1
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: Redis cluster flapping (instance {{ $labels.instance }})
      description: "Changes have been detected in Redis replica connection. This can occur when replica nodes lose connection to the master and reconnect (a.k.a flapping).\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisMissingBackup
    expr: time() - redis_rdb_last_save_timestamp_seconds > 60 * 60 * 24
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Redis missing backup (instance {{ $labels.instance }})
      description: "Redis has not been backed up for 24 hours\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisOutOfSystemMemory
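    # redis_total_system_memory_bytes is only exported when redis_exporter runs with -include-system-metrics
    expr: redis_memory_used_bytes / redis_total_system_memory_bytes * 100 > 90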
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: Redis out of system memory (instance {{ $labels.instance }})
      description: "Redis is running out of system memory (> 90%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisOutOfConfiguredMaxmemory
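    # Guard against maxmemory = 0 (no limit), where the ratio would be +Inf and always fire
    expr: redis_memory_used_bytes / redis_memory_max_bytes * 100 > 90 and redis_memory_max_bytes > 0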
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: Redis out of configured maxmemory (instance {{ $labels.instance }})
      description: "Redis is running out of configured maxmemory (> 90%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisTooManyConnections
    expr: redis_connected_clients > 100
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: Redis too many connections (instance {{ $labels.instance }})
      description: "Redis instance has too many connections\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisNotEnoughConnections
    expr: redis_connected_clients < 5
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: Redis not enough connections (instance {{ $labels.instance }})
      description: "Redis instance should have more connections (> 5)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

  - alert: RedisRejectedConnections
    expr: increase(redis_rejected_connections_total[1m]) > 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Redis rejected connections (instance {{ $labels.instance }})
      description: "Some connections to Redis have been rejected\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
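
One edge case behind the RedisOutOfConfiguredMaxmemory ratio is worth spelling out. When Redis has no maxmemory set, redis_memory_max_bytes is 0, and in PromQL a positive value divided by 0 is +Inf, which is greater than 90. The arithmetic can be sketched in plain shell as follows; maxmemory_used_pct is a hypothetical helper for illustration only and treats a limit of 0 as "no limit" instead of dividing.

```shell
#!/bin/sh
# Sketch: integer percentage of maxmemory in use, mirroring
# redis_memory_used_bytes / redis_memory_max_bytes * 100.
# Prints nothing and returns 1 when the limit is 0 (Redis reports
# maxmemory 0 when no limit is configured).
maxmemory_used_pct() {
    used=$1
    max=$2
    [ "$max" -gt 0 ] || return 1
    echo $(( used * 100 / max ))    # rounded down
}

# Usage:
#   maxmemory_used_pct 950 1000    # used 950 bytes of a 1000-byte limit
#   maxmemory_used_pct 950 0       # no limit configured: no output, status 1
```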


[root@devops prometheus]# ./promtool check config prometheus.yml
Checking prometheus.yml
  SUCCESS: 1 rule files found
 SUCCESS: prometheus.yml is valid prometheus config file syntax

Checking first_rules.yml
  SUCCESS: 42 rules found

[root@devops prometheus]# systemctl restart prometheus

 

  • Published on June 14, 2024 at 23:40:54