Error monitoring
When you run a FlexMeasures server, you want to stay on top of things going wrong. We added two ways of doing that:
You can connect to Sentry, so that all errors are sent to your Sentry account. Add the DSN (the token you got from Sentry) to the config setting SENTRY_DSN and you’re up and running!
Another source of crucial errors is things that did not even happen! For instance, a (bot) user who is supposed to send data regularly fails to connect to FlexMeasures. Or a task that imports prices from a day-ahead market, which you depend on later for scheduling, fails silently.
Let’s look at how to monitor for things not happening in more detail:
Monitoring the time users were last seen
The CLI task flexmeasures monitor last-seen alerts you when a user last contacted your FlexMeasures instance longer ago than you expect. This is most useful for bot users (a.k.a. scripts).
Here is an example for illustration:
$ flexmeasures monitor last-seen --account-role SubscriberToServiceXYZ --user-role bot --maximum-minutes-since-last-seen 100
As you see, users are filtered by roles. You might need to add roles before this works as you want. Use --recipient one or more times to send the monitoring alert to specific FlexMeasures user IDs or email addresses. If you do not use this option, FlexMeasures falls back to FLEXMEASURES_DEFAULT_MONITORING_MAIL_RECIPIENTS.
You can also narrow the check to users in one or more accounts with --account.
Use --account multiple times to include multiple accounts.
Use --consultancy to narrow the check to users in accounts that are clients of the given consultant account.
If you run distinct filters, such as separate checks per account, account group or consultant, use distinct --task-name values so the --only-newly-absent-users feature tracks each filter independently.
$ flexmeasures monitor last-seen --task-name monitor-last-seen-account-12 --account 12 --account-role SubscriberToServiceXYZ --user-role bot --maximum-minutes-since-last-seen 100 --recipient 42 --recipient alerts@example.com
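To make the idea behind this check concrete, here is a minimal sketch in plain Python. It does not use FlexMeasures code: the function names, the in-memory dictionaries and the per-task-name state are assumptions made for illustration, and the reading of --only-newly-absent-users (report only users who were not already reported in the previous run of the same task name) is one plausible interpretation, not a description of the actual implementation.

```python
from datetime import datetime, timedelta, timezone


def find_absent_users(last_seen_by_user, max_minutes, now):
    """Return the users whose last contact is older than max_minutes."""
    threshold = now - timedelta(minutes=max_minutes)
    return sorted(
        user for user, last_seen in last_seen_by_user.items() if last_seen < threshold
    )


def newly_absent(absent_now, previously_absent):
    """Keep only users that were not already reported in the previous run."""
    return [user for user in absent_now if user not in previously_absent]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# Hypothetical data: when each bot user was last seen
last_seen_by_user = {
    "bot-a": now - timedelta(minutes=30),
    "bot-b": now - timedelta(minutes=150),
    "bot-c": now - timedelta(minutes=500),
}

# Hypothetical state remembered per task name from the previous run
previously_absent_by_task = {"monitor-last-seen-account-12": {"bot-c"}}

absent = find_absent_users(last_seen_by_user, max_minutes=100, now=now)
new = newly_absent(absent, previously_absent_by_task["monitor-last-seen-account-12"])

print(absent)
print(new)
```

Keeping the remembered state keyed by task name is what lets two differently filtered checks avoid overwriting each other's list of already reported users.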
Todo
Adding roles and assigning them to accounts is not supported by the UI yet (user roles can be added in the UI). Account roles can be added with flexmeasures add account-role.
Monitoring task runs
The CLI task flexmeasures monitor latest-run alerts you when tasks have not run successfully within a given number of minutes.
The alerts will come in via Sentry, but you can also send them to specific FlexMeasures user IDs or email addresses with --recipient or to email addresses with the config setting FLEXMEASURES_DEFAULT_MONITORING_MAIL_RECIPIENTS.
For illustration, here is one example of how we monitor the latest run times of tasks on a server. The command below runs in a cron script every hour and checks whether each listed task ran within the last 60, 6 or 1440 minutes, respectively:
$ flexmeasures monitor latest-run --task get_weather_forecasts 60 --task get_recent_meter_data 6 --task import_epex_prices 1440
These tasks are defined in plugins we wrote. You will find the weather forecast task in the flexmeasures/flexmeasures-weather repository.
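The logic of such a check can be sketched in a few lines of plain Python. This is not the FlexMeasures implementation: the function name, the dictionary layout and the sample timestamps are assumptions for illustration. A task counts as overdue if it never reported, if its latest run failed, or if its latest run is older than the allowed number of minutes.

```python
from datetime import datetime, timedelta, timezone


def find_overdue_tasks(latest_runs, max_minutes_by_task, now):
    """Return names of tasks with no successful run within their allowed window."""
    overdue = []
    for task_name, max_minutes in max_minutes_by_task.items():
        run = latest_runs.get(task_name)  # None if the task never reported
        deadline = now - timedelta(minutes=max_minutes)
        if run is None or not run["status"] or run["datetime"] < deadline:
            overdue.append(task_name)
    return overdue


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# Hypothetical latest runs: time of the run and whether it succeeded
latest_runs = {
    "get_weather_forecasts": {"datetime": now - timedelta(minutes=20), "status": True},
    "get_recent_meter_data": {"datetime": now - timedelta(minutes=15), "status": True},
    "import_epex_prices": {"datetime": now - timedelta(minutes=100), "status": False},
}

# Same thresholds as in the command above
max_minutes_by_task = {
    "get_weather_forecasts": 60,
    "get_recent_meter_data": 6,
    "import_epex_prices": 1440,
}

overdue_tasks = find_overdue_tasks(latest_runs, max_minutes_by_task, now)
print(overdue_tasks)
```

In this sample data the meter data task is overdue because its last run is older than 6 minutes, and the price import is overdue because its latest run failed, even though it is recent enough.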
This task status monitoring is enabled by decorating the functions behind these tasks with:
@task_with_status_report
def my_function():
    ...
FlexMeasures then logs whether this task ran and whether it succeeded or failed. The result is stored in the table latest_task_runs, which is where flexmeasures monitor latest-run looks.
Note
The decorator should be placed right before the function (after all other decorators).
By default, the function name is used as the task name. If the number of tasks grows (e.g. by using multiple plugins that each define a task or two), it is useful to come up with more specific names. You can pass a custom name as an argument to the decorator:
@task_with_status_report("pluginA_myFunction")
def my_function():
    ...
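To illustrate how a decorator of this kind can work, here is a simplified stand-in written in plain Python. It is not the FlexMeasures implementation: the name task_with_status_report_sketch and the in-memory dictionary that replaces the latest_task_runs table are assumptions for illustration. The sketch supports both usages shown above, bare and with a custom task name.

```python
import functools
from datetime import datetime, timezone

# Stand-in for the latest_task_runs table: task name -> (time of run, success flag)
latest_task_runs = {}


def task_with_status_report_sketch(arg=None):
    """Record when the wrapped function last ran and whether it succeeded."""

    def decorate(func, task_name=None):
        name = task_name or func.__name__  # fall back to the function name

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            succeeded = False
            try:
                result = func(*args, **kwargs)
                succeeded = True
                return result
            finally:
                # Runs on success and on failure, so every run is recorded
                latest_task_runs[name] = (datetime.now(timezone.utc), succeeded)

        return wrapper

    if callable(arg):
        # Used bare: @task_with_status_report_sketch
        return decorate(arg)
    # Used with a name: @task_with_status_report_sketch("some_name")
    return lambda func: decorate(func, task_name=arg)


@task_with_status_report_sketch
def my_function():
    return "done"


@task_with_status_report_sketch("pluginA_myFunction")
def failing_function():
    raise ValueError("import failed")


my_function()
try:
    failing_function()
except ValueError:
    pass  # the exception still propagates to the caller

print({name: ok for name, (_, ok) in latest_task_runs.items()})
```

Recording the status in a finally block means a failing task still leaves a trace, which is what allows a monitor to tell "ran and failed" apart from "never ran".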