Wednesday, October 21, 2020

monit "check program" does not honor alert conditions

I am trying to use monit to run a script every cycle and report an error in syslog only after N consecutive failures.

For example:

check program TestScript with path "/usr/local/bin/test_script.py"
    if status != 0 for 5 times within 5 cycles then alert

I intentionally made the script exit with a non-zero status on every invocation, to simulate a consistent failure.

The above config raises an alert (that is, it writes an ERR message to syslog) on every cycle, instead of waiting for 5 cycles before raising it.
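
To make the expected behavior concrete, here is a rough Python model of how I read "for 5 times within 5 cycles": a sliding window over the most recent exit statuses. This is only a sketch of my interpretation, not monit's actual implementation; `alert_cycles` and its parameters are made-up names used for illustration.

```python
from collections import deque


def alert_cycles(statuses, times=5, cycles=5):
    """Return the 1-based cycle numbers at which the rule's condition holds.

    Hypothetical model: the condition holds when at least `times` of the
    last `cycles` exit statuses were non-zero.
    """
    window = deque(maxlen=cycles)  # keeps only the most recent `cycles` results
    matched = []
    for cycle, status in enumerate(statuses, start=1):
        window.append(status != 0)  # True counts as one failure
        if sum(window) >= times:
            matched.append(cycle)
    return matched


# A script that fails on every invocation, observed for 7 cycles.
print(alert_cycles([1] * 7))  # [5, 6, 7]
```

Under this reading, the first ERR message should appear at cycle 5, not at cycle 1.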

I also tried the following "if" clauses:

check program TestScript with path "/usr/local/bin/test_script.py"
    if status != 0 for 10 times within 20 cycles then alert

check program TestScript with path "/usr/local/bin/test_script.py"
    if status = 1 for 3 cycles then alert

check program TestScript with path "/usr/local/bin/test_script.py"
    if status != 0 for 5 times within 5 cycles then alert

check program TestScript with path "/usr/local/bin/test_script.py"
    if status != 0 for 5 times within 5 cycles then exec "/tmp/t.sh"

Interestingly, the last variation, which uses "exec" instead of "alert", still writes an ERR message to syslog on every cycle.

The only way to stop the alert is to either omit the "if" clause or use an "if" clause that never matches.

Any tips on fixing this would be a big help.

Thank you in advance!
