-
-
Notifications
You must be signed in to change notification settings - Fork 436
Increased bulk insert size to avoid partial inserts and potential data loss. #4808
Description
Describe the bug
I have just upgraded from 1.2.15 to 1.2.21 on CentOS 7 using the EPEL 7 RPM.
2022/06/07 15:40:01 - POLLER: Poller[1] PID[6208] WARNING: Poller Output Table not Empty. Issues: 8, DS[24539, 24539, 24539, 24539, 24798, 24798, 24798, 24798]
I now see this error for two data sources of two different devices on every poller run every 5 minute.s
If I delete the graph & data source and recreate it, the error just moves to the next following data source id on the same device.
The problem looks very much like issue #4687 but I still have the issue with 1.2.21.
I run the command suggested in #4687 and it showed something interesting:
[root@cacti cacti]# sudo -u apache php -q /usr/share/cacti/cmd.php --first=79 --last=79 --debug --mibs --poller=1
Total[0.1503] Device[79] STATUS: Device 'sw4cr.example.com' is UP.
Total[1.1050] Device[79] RECACHE: Processing 1 items in the auto reindex cache for 'sw4cr.example.com'.
Total[1.1623] Device[79] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert:10326104400 < output:10326118300)
Total[1.1692] Device[79] DS[20692] TT[6.44] SNMP: v2: sw4cr.example.com, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.2, output: 967480684385
Total[1.1828] Device[79] DS[20692] TT[13.22] SNMP: v2: sw4cr.example.com, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.2, output: 197844427214367
Total[1.1882] Device[79] DS[20693] TT[5.34] SNMP: v2: sw4cr.example.com, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.3, output: 15612290891279
Total[1.1942] Device[79] DS[20693] TT[5.82] SNMP: v2: sw4cr.example.com, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.3, output: 1128329598664172
Total[1.2023] Device[79] DS[20694] TT[8.08] SNMP: v2: sw4cr.example.com, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.4, output: 61918210986497
Total[1.2079] Device[79] DS[20694] TT[5.5] SNMP: v2: sw4cr.example.com, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.4, output: 67025955929767
Total[1.2139] Device[79] DS[20695] TT[5.93] SNMP: v2: sw4cr.example.com, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.5, output: 84245199078354
Total[1.2197] Device[79] DS[20695] TT[5.64] SNMP: v2: sw4cr.example.com, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.5, output: 55206823810825
Total[1.2277] Device[79] DS[20703] TT[7.9] SNMP: v2: sw4cr.example.com, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.41, output: 1158166100030
Total[1.2361] Device[79] DS[20703] TT[8.2] SNMP: v2: sw4cr.example.com, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.41, output: 661501084783
Total[1.2440] Device[79] DS[20704] TT[7.7] SNMP: v2: sw4cr.example.com, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.46, output: 580914728523
Total[1.2646] Device[79] DS[20704] TT[20.49] SNMP: v2: sw4cr.example.com, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.46, output: 683445271450
Total[1.2727] Device[79] DS[20706] TT[8.01] SNMP: v2: sw4cr.example.com, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.56, output: 251804378104363
Total[1.2802] Device[79] DS[20706] TT[7.35] SNMP: v2: sw4cr.example.com, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.56, output: 337430807284819
Total[1.2877] Device[79] DS[20707] TT[7.41] SNMP: v2: sw4cr.example.com, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.61, output: 4288729550307
...
Total[11.3192] Device[79] DS[24538] TT[7.24] SNMP: v2: sw4cr.example.com, dsname: discards_in, oid: .1.3.6.1.2.1.2.2.1.13.336, output: 0
Total[11.3288] Device[79] DS[24538] TT[9.49] SNMP: v2: sw4cr.example.com, dsname: discards_out, oid: .1.3.6.1.2.1.2.2.1.19.336, output: 92
Total[11.3340] Device[79] DS[24538] TT[5.07] SNMP: v2: sw4cr.example.com, dsname: errors_in, oid: .1.3.6.1.2.1.2.2.1.14.336, output: 0
Total[11.3389] Device[79] DS[24538] TT[4.85] SNMP: v2: sw4cr.example.com, dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.336, output: 0
Total[11.3452] Device[79] DS[24539] TT[6.2] SNMP: v2: sw4cr.example.com, dsname: discards_in, oid: .1.3.6.1.2.1.2.2.1.13.341, output: 0
Total[11.3452] Device[79] Writing 1001 items to Poller Output Table
Total[11.3628] Device[79] DS[24539] TT[8.29] SNMP: v2: sw4cr.example.com, dsname: discards_out, oid: .1.3.6.1.2.1.2.2.1.19.341, output: 100
Total[11.3978] Device[79] DS[24539] TT[34.81] SNMP: v2: sw4cr.example.com, dsname: errors_in, oid: .1.3.6.1.2.1.2.2.1.14.341, output: 0
Total[11.4045] Device[79] DS[24539] TT[6.59] SNMP: v2: sw4cr.example.com, dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.341, output: 0
...
Total[16.3092] Device[79] DS[24650] TT[10.53] SNMP: v2: sw4cr.example.com, dsname: discards_in, oid: .1.3.6.1.2.1.2.2.1.13.2013, output: 0
Total[16.3221] Device[79] DS[24650] TT[12.88] SNMP: v2: sw4cr.example.com, dsname: discards_out, oid: .1.3.6.1.2.1.2.2.1.19.2013, output: 0
Total[16.3352] Device[79] DS[24650] TT[12.94] SNMP: v2: sw4cr.example.com, dsname: errors_in, oid: .1.3.6.1.2.1.2.2.1.14.2013, output: 0
Total[16.3492] Device[79] DS[24650] TT[13.95] SNMP: v2: sw4cr.example.com, dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.2013, output: 0
Total[16.3493] Device[79] Writing 447 items to Poller Output Table
Total[16.3549] Device[79] Time[16.21] Items[1449] Errors[2]
Total[16.3639] Time: 16.3634 s, Poller: 1, Threads: N/A, Devices: 1, Items: 1449, Errors: 2
It looks the same on the other device:
Total[3.3275] Device[78] DS[24798] TT[1.35] SNMP: v2: bd1sw0.example.com, dsname: discards_in, oid: .1.3.6.1.2.1.2.2.1.13.4196996, output: 0
Total[3.3290] Device[78] DS[24798] TT[1.43] SNMP: v2: bd1sw0.example.com, dsname: discards_out, oid: .1.3.6.1.2.1.2.2.1.19.4196996, output: 0
Total[3.3290] Device[78] Writing 1001 items to Poller Output Table
Total[3.3401] Device[78] DS[24798] TT[1.93] SNMP: v2: bd1sw0.example.com, dsname: errors_in, oid: .1.3.6.1.2.1.2.2.1.14.4196996, output: 0
Total[3.3437] Device[78] DS[24798] TT[3.42] SNMP: v2: bd1sw0.example.com, dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.4196996, output: 0
So it seems that exactly for the data source id during which the Writing 1001 items to Poller Output Table happens it won't clear out the poller output table correctly.
I have two other devices which very many data sources. However, when I run the command on those two it looks like this:
Total[1.1513] Device[92] DS[23878] TT[0.64] SNMP: v2: firecr.example.com, dsname: discards_in, oid: .1.3.6.1.2.1.2.2.1.13.709, output: 0
Total[1.1521] Device[92] DS[23878] TT[0.57] SNMP: v2: firecr.example.com, dsname: discards_out, oid: .1.3.6.1.2.1.2.2.1.19.709, output: 0
Total[1.1527] Device[92] DS[23878] TT[0.53] SNMP: v2: firecr.example.com, dsname: errors_in, oid: .1.3.6.1.2.1.2.2.1.14.709, output: 0
Total[1.1533] Device[92] DS[23878] TT[0.5] SNMP: v2: firecr.example.com, dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.709, output: 0
Total[1.1533] Device[92] Writing 1001 items to Poller Output Table
Total[1.1635] Device[92] DS[23879] TT[0.71] SNMP: v2: firecr.example.com, dsname: discards_in, oid: .1.3.6.1.2.1.2.2.1.13.710, output: 0
Total[1.1642] Device[92] DS[23879] TT[0.57] SNMP: v2: firecr.example.com, dsname: discards_out, oid: .1.3.6.1.2.1.2.2.1.19.710, output: 0
Total[1.1648] Device[92] DS[23879] TT[0.54] SNMP: v2: firecr.example.com, dsname: errors_in, oid: .1.3.6.1.2.1.2.2.1.14.710, output: 0
Total[1.1654] Device[92] DS[23879] TT[0.57] SNMP: v2: firecr.example.com, dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.710, output: 0
Here, the "Writing 1001 items to Poller Output Table" happens exactly between the data sources and not in the middle of it.