With preprocessing you can alter, fit and adjust your incoming data in any way, shape or form you wish. This serves many purposes: for example, your devices might not present data in the right format, or might not report numeric metrics to the right decimal place.
Excerpt from zabbix.com about item preprocessing:
Item value preprocessing allows to define and execute transformation rules for the received item values.
Preprocessing is managed by a preprocessing manager process, which was added in Zabbix 3.4, along with preprocessing workers that perform the preprocessing steps. All values (with or without preprocessing) from different data gatherers pass through the preprocessing manager before being added to the history cache. Socket-based IPC communication is used between data gatherers (pollers, trappers, etc) and the preprocessing process. Either Zabbix server or Zabbix proxy (for items monitored by the proxy) is performing preprocessing steps.
With this, you have the ability to use, among other things:
- Custom multipliers – Data retrieved in ms can be multiplied by 0.001; using the ‘s’ unit then causes Zabbix to present the data as human-readable time
- JSON Path – Dig into JSON data structures using JSONPath to grab and store the precise piece of data you need
- Simple Change or Change Per Second – Many counters are incremental, and this allows Zabbix to automatically calculate the change between the current and the previously gathered value
- Regular Expression – Use capture groups to search for and grab only the data you need, even with the possibility of adding or redacting portions of data
- Conversions – Hexadecimal, Octal, Boolean and decimal, you can convert your numbers in any which way
You might be measuring temperature data and your device might be throwing unreadable hexadecimal gibberish at you. Use preprocessing to easily convert it into decimal!
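To make the steps listed above a little more concrete, here is a minimal Python sketch of what a custom multiplier, change per second, a hexadecimal conversion and a very simple JSON path lookup do to a value. This is not Zabbix code: the function names, the `payload` sample and its field names are all made up for illustration, and the lookup only handles plain `$.a.b` paths, whereas real JSONPath supports far more.

```python
import json


def custom_multiplier(value: float, multiplier: float) -> float:
    """Scale a raw value, e.g. milliseconds * 0.001 gives seconds."""
    return value * multiplier


def change_per_second(prev_value: float, prev_clock: float,
                      value: float, clock: float) -> float:
    """Rate of change of an incrementing counter between two samples."""
    elapsed = clock - prev_clock
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    return (value - prev_value) / elapsed


def hex_to_decimal(raw: str) -> int:
    """Interpret a hexadecimal string such as '1A' as a decimal integer."""
    return int(raw.strip(), 16)


def json_lookup(document: str, path: str):
    """Follow a simple '$.a.b' path (a tiny subset of JSONPath)."""
    if path.startswith("$."):
        path = path[2:]
    node = json.loads(document)
    for key in path.split("."):
        node = node[key]
    return node


# Hypothetical device payload, invented for this example.
payload = '{"sensor": {"temp_hex": "1A", "latency_ms": 250}}'

# Dig out the hex reading and convert it to decimal.
temperature = hex_to_decimal(json_lookup(payload, "$.sensor.temp_hex"))
# Convert milliseconds to seconds with a multiplier of 0.001.
latency_s = custom_multiplier(json_lookup(payload, "$.sensor.latency_ms"), 0.001)
# A counter that went from 1000 to 1600 over 60 seconds.
rate = change_per_second(1000, 0, 1600, 60)

print(temperature, latency_s, rate)
```

In Zabbix itself you would not write any of this; you pick the matching preprocessing step on the item and fill in its parameters. The sketch only shows the arithmetic behind each step.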
Redacting Sensitive Data with Preprocessing
A monitoring system gathers an awful lot of data, most of it innocuous in nature. However, in among everything else you might also be gathering usernames, tokens or various types of personal data. Zabbix is meant to store data for a long time, so it’s a good idea to clean out any such data from the get-go.
As an example, you might be monitoring a JWT authentication service. You want to know if it functions correctly and replies as expected with a valid token. Using an HTTP agent item you regularly send a POST request to the service and the service responds with:
{"success":true,"token":"<SENSITIVE-TOKEN-DATA>","expiresOn":"2020-05-08T13:12:05.9834959Z","user":{"userName":"test@zabbix.tips","email":"test@zabbix.tips"}}
This data will be stored in the Zabbix database for a long time, and anyone with access to the item data or the database can grab the token and use it at will. This may or may not be an issue, depending on the permissions of the user and the system involved, so why not simply remove the sensitive data? Preprocessing with a Regular Expression can help you here. This part requires some previous knowledge of RegEx. You can test your RegEx using the awesome tools at regex101.com
Regular expression parameters:
- Pattern:
(.*\"token\":\")(.*?)(\"\,.*)
- Output:
\1<redacted>\3
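Let’s break it down: the pattern is divided into 3 groups, each starting and ending with a parenthesis. Group 1, (.*\"token\":\"), matches everything up to and including the opening quote of the token value. Group 2, (.*?), matches the token itself. Group 3, (\"\,.*), matches the closing quote, the comma and everything after it.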
In the Output parameter, we then simply use only groups 1 and 3. We leave out group 2 and write <redacted> in its place. The resulting data stored by Zabbix is:
{"success":true,"token":"<redacted>","expiresOn":"2020-05-08T13:12:05.9834959Z","user":{"userName":"test@zabbix.tips","email":"test@zabbix.tips"}}
Any token data gathered by Zabbix is now effectively thrown away before being stored to the database.
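If you want to try the pattern outside of Zabbix first, the same idea can be sketched in a few lines of Python. Keep in mind this is only an approximation: Zabbix uses its own regular expression engine, and Python's `re` module just happens to accept this pattern and the `\1`/`\3` output syntax unchanged. The `redact` function name is made up for this example, and raising an error when nothing matches is a choice made for this sketch.

```python
import json
import re

# Pattern and output template copied from the preprocessing step above.
PATTERN = r'(.*\"token\":\")(.*?)(\"\,.*)'
OUTPUT = r'\1<redacted>\3'


def redact(value: str) -> str:
    """Keep groups 1 and 3, replace the token (group 2) with <redacted>."""
    match = re.search(PATTERN, value)
    if match is None:
        # Choice made for this sketch: refuse to pass the value on unredacted.
        raise ValueError("pattern did not match, nothing was redacted")
    return match.expand(OUTPUT)


# The example response from the authentication service.
response = (
    '{"success":true,"token":"<SENSITIVE-TOKEN-DATA>",'
    '"expiresOn":"2020-05-08T13:12:05.9834959Z",'
    '"user":{"userName":"test@zabbix.tips","email":"test@zabbix.tips"}}'
)

redacted = redact(response)
print(redacted)

# The result is still valid JSON, so later steps can keep parsing it.
print(json.loads(redacted)["token"])
```

Because group 2 is lazy (`.*?`), it stops at the first `",` after the token, so only the token value is dropped and the rest of the JSON survives intact.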
Preprocessing offers many, many more ways to adjust and normalise your data to fit your exact needs before it is stored. Play around with it!