Interface Traffic and Interface Traffic Distribution not populating #7

Open
sjabby opened this issue Jun 4, 2017 · 16 comments

Comments

@sjabby

sjabby commented Jun 4, 2017

I've followed all of the instructions and the dashboard is almost 100% working, but I can't get the graphs for Interface Traffic and Interface Traffic Distribution to populate.

Any input? @WaterByWind

@WaterByWind
Owner

Which dashboard are you using? (UAP, EdgeRouter, ??)

If you are using the EdgeRouter dashboard, make sure you are running Telegraf 1.3.0 or later, as earlier versions lack the support needed to collect some of the data. (This does not apply to the UAP dashboard.)

@uhede

uhede commented Jun 8, 2017

I have the same problem with the EdgeRouter dashboard. I am running Telegraf version 1.3.1. None of the Interface or IP charts are populated. My UAP dashboard works fine.

The required data seems to have been captured, because a query like the following does return a nice time series of octet counts:
select ifName,agent_host,ifHCInOctets, ifHCOutOctets from "snmp.EdgeOS" where time > now() - 1d and ifName = 'eth0'

I can provide a backup of my db if that would help.

@uhede

uhede commented Jun 11, 2017

Found my problem. I had a name_override setting in my telegraf.conf file that placed all observations in the same measurement. Removing that setting made the dashboard work fine.
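For anyone who hits the same symptom, here is a minimal sketch of where such a setting would sit. The agent address, community string, and measurement name are placeholders rather than the actual config, and it assumes the table does not set its own name, in which case Telegraf names the measurement after the table OID (for example ifXTable).

```toml
# Sketch only: agent address, community and measurement name are placeholders.
[[inputs.snmp]]
  agents = ["192.168.1.1:161"]
  version = 2
  community = "public"

  # name_override applies to every metric this input emits, so rows from
  # ifTable, ifXTable and the other tables all end up in one measurement.
  # Leaving it commented out lets each table keep its own measurement name.
  # name_override = "snmp.EdgeOS"

  [[inputs.snmp.table]]
    oid = "IF-MIB::ifXTable"
    [[inputs.snmp.table.field]]
      oid = "IF-MIB::ifName"
      is_tag = true
```

If the dashboard panels stay empty while a manual query against a single combined measurement returns data, this setting is worth checking first.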

@WaterByWind
Owner

@uhede - Glad you found your issue and thanks for noting it here.
@sjabby - Are you still having a problem? I'll keep this open just in case.

@sjabby
Author

sjabby commented Jun 24, 2017

@WaterByWind, sorry for the delay.

I'm using the EdgeRouter ERPOE-5 (FW 1.9.0) and running Grafana, Telegraf, etc. on an RPi3.

Telegraf version is 1.3.1

@sjabby
Author

sjabby commented Aug 3, 2017

Tried running telegraf and capturing more log data:

2017-08-03T20:25:24Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: performing get on field ssCpuRawNice: Request timeout (after 3 retries)
2017-08-03T20:25:50Z E! Error in plugin [inputs.snmp]: took longer to collect than collection interval (1m0s)
2017-08-03T20:25:53Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: gathering table ifXTable: performing bulk walk for field ifHighSpeed: Request timeout (after 3 retries)
2017-08-03T20:26:03Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: gathering table ipSystemStatsTable: performing bulk walk for field ipSystemStatsInReceives: Request timeout (after 3 retries)
2017-08-03T20:26:36Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: gathering table ifTable: performing bulk walk for field ifInUcastPkts: Request timeout (after 3 retries)
2017-08-03T20:27:01Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: gathering table ifXTable: performing bulk walk for field ifLinkUpDownTrapEnable: Request timeout (after 3 retries)
2017-08-03T20:27:07Z E! Error in plugin [inputs.snmp]: took longer to collect than collection interval (1m0s)
2017-08-03T20:27:35Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: gathering table ifTable: performing bulk walk for field ifInOctets: Request timeout (after 3 retries)
2017-08-03T20:27:59Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: gathering table ifXTable: performing bulk walk for field ifLinkUpDownTrapEnable: Request timeout (after 3 retries)
2017-08-03T20:28:33Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: gathering table ifXTable: performing bulk walk for field ifHCOutUcastPkts: Request timeout (after 3 retries)
2017-08-03T20:29:11Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: performing get on field sysUpTime: Request timeout (after 3 retries)
2017-08-03T20:29:30Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: gathering table ifXTable: performing bulk walk for field ifLinkUpDownTrapEnable: Request timeout (after 3 retries)
2017-08-03T20:29:40Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: gathering table ipSystemStatsTable: performing bulk walk for field ipSystemStatsInReceives: Request timeout (after 3 retries)
2017-08-03T20:30:10Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: performing get on field sysUpTime: Request timeout (after 3 retries)
2017-08-03T20:30:29Z E! Error in plugin [inputs.snmp]: agent 192.168.2.1: gathering table ifXTable: performing bulk walk for field ifLinkUpDownTrapEnable: Request timeout (after 3 retries)

Now running ERPOE-5 (FW 1.9.7) and Telegraf 1.3.5

@WaterByWind
Owner

@sjabby are you able to do an snmpwalk by hand to get any data? It looks like the ER may not be responding to SNMP requests at all.

For instance, does this work:
snmpwalk -v 2c -c <your_community_string> <your_edgerouter> sysORID

@jjlawren

IF-MIB::ifXTable is the only SNMP request that regularly times out for me. I had to increase the timeout in the example config from 5s to 10s.
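As a rough sketch of that change (the agent address and community string are placeholders, and the rest of the example config is omitted), the relevant settings in the inputs.snmp section would look something like this:

```toml
# Sketch only: agent address and community are placeholders.
[[inputs.snmp]]
  agents = ["192.168.1.1:161"]
  version = 2
  community = "public"

  # Per-request timeout; the example config used "5s".
  timeout = "10s"
  # Number of retries after a request times out (the logs above show 3).
  retries = 3
```

Telegraf needs a restart to pick up the new value. If timeouts continue, a larger value may be worth trying, since the router can be slow to answer the ifXTable walk.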

@woody3000

I have an EdgeRouter Lite 3 running EdgeOS v1.9.7. I had this same problem and fixed it by commenting out the high-capacity (HC) counters section:

   ##
   ## Interface metrics
   ##
   #  Per-interface traffic, errors, drops
   [[inputs.snmp.table]]
     oid = "IF-MIB::ifTable"
     [[inputs.snmp.table.field]]
       oid = "IF-MIB::ifName"
       is_tag = true
   #  Per-interface high-capacity (HC) counters
   #[[inputs.snmp.table]]
   #  oid = "IF-MIB::ifXTable"
   #  [[inputs.snmp.table.field]]
   #    oid = "IF-MIB::ifName"
   #    is_tag = true

IF-MIB::ifXTable doesn't seem to be a valid OID for my setup, which is why that failed.

Awesome work though, thanks!

@sjabby
Author

sjabby commented Sep 5, 2017

I solved my problem by setting the timeout to 10s as suggested by @jjlawren

@WaterByWind
Owner

@woody3000 - What behavior did you see to make you think IF-MIB::ifXTable is not valid? What errors did you see?

This is a standard object that has been populated by EdgeOS for a long time. This OID is still valid even with the latest hot fix releases and betas so it would be odd that your instance would not have this.

If you did see a message such as 'no such object' or 'no such instance' then a little more investigation may be helpful.

It is possible that the ER is just taking too long to respond and increasing the timeout as suggested earlier may address that.

If you remove the ifXTable from collection then you'd also need to update the graphs to use the standard counters instead of the HC counters.

@woody3000

From my Grafana host I tried running:
snmpwalk -v 2c -c EDGEOS <ip> "IF-MIB::ifXName"
and I get:
IF-MIB::ifXName: Unknown Object Identifier

It's entirely possible I don't have a MIB or something. It's running on Ubuntu 16.04 and to get the rest of the MIBs working, I only ran:
sudo apt-get install snmp-mibs-downloader
sudo download-mibs

@WaterByWind
Owner

No, there is indeed no IF-MIB::ifXName, but that is not used in the config above.

The table is IF-MIB::ifXTable, and a tag is added (IF-MIB::ifName, with no 'X') to provide a direct correlation with IF-MIB::ifTable.

@mvanbaak

mvanbaak commented Feb 7, 2018

I just dropped by to add a 'me too' for the timeout setting.
When I added the configs to my Telegraf, I got constant timeout errors in my logs for IF-MIB::ifXTable. Once I bumped the timeout from 5s to 15s, things started working.

EdgeRouter Lite with FW 1.9, Telegraf 1.4.4 running in a FreeBSD 11.1-RELEASE jail

Error I got in the logs:
2018-02-07T13:27:36Z E! Error in plugin [inputs.snmp]: agent 192.168.10.250: gathering table ifXTable: performing bulk walk for field ifAlias: Request timeout (after 3 retries)

@pixelmagic66

I also get no data from ifXTable on an EdgeRouter X running software version 2.0.9-hotfix.

For Interface Traffic I do get data after changing ifXTable to ifTable, but Interface Traffic Distribution still shows nothing.
Changing the SNMP timeout from 5s to 15s did nothing for me.

Too bad it does not work for all panels; otherwise it is a perfect dashboard!

@WaterByWind
Owner

ifTable has only 32-bit counters, which can roll over fairly quickly on a busy interface. ifXTable uses 64-bit counters instead.
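As a rough illustration of how quickly that can happen, assume a fully saturated link (the link rates here are only examples). A 32-bit octet counter wraps after 2^32 bytes, so:

```latex
\[
t_{\text{wrap}}
  = \frac{2^{32}\ \text{bytes} \times 8\ \text{bit/byte}}{10^{9}\ \text{bit/s}}
  \approx 34.4\ \text{s}
\qquad
\left(\text{at } 10^{8}\ \text{bit/s: } \approx 344\ \text{s} \approx 5.7\ \text{min}\right)
\]
```

With a 1-minute collection interval, a saturated gigabit interface could wrap a 32-bit counter between two samples, which would make the computed rate unreliable. The same calculation for a 64-bit counter gives several thousand years at 1 Gbit/s.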

Which 2.0.9-hotfix version specifically (there were multiple updates)? For Cavium-based ERs there is no issue with tables with 2.0.9-hotfix.4. I don't have an ER-X available at the moment to test with, but this would be an issue with the SNMP on the ER if this does not work. It is not possible to collect and display data that is not provided by the ER-X itself, but if the tables (which are standard) are not available then something would appear broken.

The timeout setting can be removed with more recent versions of telegraf as the defaults work well, unlike earlier versions.

7 participants