RRDtool has some limits that are difficult to work around, so I am working on converting groundwork to use graphite for performance graphs instead. I have graphite installed so far, and even added grafana as an alternative dashboard. Now I am working on making service_process_perfdata_file send data to carbon in addition to rrdtool, or even instead of it in situations where rrdtool isn’t suitable, such as ds names that are too long (rrdtool allows at most 19 characters).
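As a rough illustration of that limit, here is a minimal sketch of the check a script could make before deciding whether a label can go to rrdtool at all. The names `RRD_DS_NAME` and `fits_rrd` are made up for this example and are not part of service_process_perfdata_file; the rule itself (1 to 19 characters from letters, digits and underscore) is rrdtool's documented ds-name restriction.

```python
import re

# rrdtool ds-names must be 1 to 19 characters from [a-zA-Z0-9_].
# RRD_DS_NAME and fits_rrd are hypothetical names for this sketch.
RRD_DS_NAME = re.compile(r"[A-Za-z0-9_]{1,19}")


def fits_rrd(label: str) -> bool:
    """Return True if the perfdata label is usable as an rrdtool ds-name as-is."""
    return RRD_DS_NAME.fullmatch(label) is not None
```

A label that fails this check would be a candidate for going to carbon only.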
Then I will have to work on getting the performance views replaced with graphite graphs instead of rrdtool graphs.
service_process_perfdata_file massages the label name to fit rrdtool, and even filters out perfdata based on the performance configuration. I’m going to have it stuff all data into carbon, and let the performance graphs be created later from whatever you want. For example, the graph command could change from fetching a PNG from rrdtool to fetching it from graphite instead, using a different query.
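To give an idea of what "getting it from graphite instead" could look like, here is a small sketch that builds a URL for graphite's render API, which returns a PNG for a given target. The function name, the base URL and the metric path in the usage line are placeholders I picked for the example; `target`, `from`, `format`, `width` and `height` are standard render parameters.

```python
from urllib.parse import urlencode


def graphite_png_url(base_url, target, hours=24, width=600, height=300):
    """Build a graphite render URL that returns a PNG for one target.

    base_url and target are supplied by the caller; nothing here is
    specific to groundwork.
    """
    query = urlencode({
        "target": target,        # metric path or graphite function expression
        "from": f"-{hours}h",    # relative start time, e.g. -24h
        "format": "png",
        "width": width,
        "height": height,
    })
    return f"{base_url.rstrip('/')}/render?{query}"


# Hypothetical host and metric path, for illustration only.
url = graphite_png_url("http://graphite.example.com", "groundwork.web01.http.time")
```

The graph command would then only need to fetch that URL rather than call rrdtool.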
I’m making the script create the carbon labels basically the same as the foundation labels; the rrd labels will be created afterwards, if the script decides rrd’s are needed. I’m thinking of maybe creating a second script that reads from a pipe, to which the first script writes the perfdata destined for carbon. This way I can take advantage of multiprocessing and let the first script continue on its way. This might be good because I’m going to be using the plaintext protocol, which carries only one data point per line, so passing the data on to another script might speed things up a bit.
I might be able to use the pickle protocol instead and write the script in python, which would be more efficient, because carbon’s pickle protocol packs many data points into a single message, while plaintext needs a separate line for each one. That would reduce the number of socket connects and forks (if it was done in shell). I would prefer ruby, but ruby doesn’t have pickle, and groundwork doesn’t come with ruby, so there wouldn’t be any guarantee the patch would be accepted.
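For comparison, here is a minimal sketch of how the two carbon protocols could be fed from python. The function names and `CARBON_HOST` are my own placeholders; the ports are carbon's defaults (2003 for plaintext, 2004 for pickle), and the pickle framing (a 4-byte big-endian length header followed by a pickled list of `(path, (timestamp, value))` tuples) follows the graphite documentation. Note that several plaintext lines can also be written over one open connection, so the main saving from batching is in connects and forks.

```python
import pickle
import socket
import struct

CARBON_HOST = "127.0.0.1"   # assumption: carbon runs on the local host
PLAINTEXT_PORT = 2003       # carbon's default plaintext receiver port
PICKLE_PORT = 2004          # carbon's default pickle receiver port


def plaintext_payload(points):
    """Format (path, value, timestamp) tuples as one 'path value timestamp' line each."""
    lines = (f"{path} {value} {ts}\n" for path, value, ts in points)
    return "".join(lines).encode("ascii")


def pickle_payload(points):
    """Pack (path, value, timestamp) tuples into one length-prefixed pickle message."""
    data = [(path, (ts, value)) for path, value, ts in points]
    body = pickle.dumps(data, protocol=2)
    return struct.pack("!L", len(body)) + body


def send(payload, port, host=CARBON_HOST):
    """Open one connection, write the whole payload, and close it."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
```

A second script reading from the pipe could collect points for a short interval and then call `send(pickle_payload(points), PICKLE_PORT)` once per batch.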
The foundation portion would be more involved though, likely requiring at least one more table and possibly alterations to other tables. It would probably also require some code changes to get the ui portals to recognize the new graphs, although I would probably just require users to go to an alternate page, such as the graphite or grafana dashboards, to get the graphs.