Broken Pipe #128

Open
UntestedEngineer opened this issue Jan 21, 2025 · 20 comments

Comments

@UntestedEngineer

UntestedEngineer commented Jan 21, 2025

I decided to update the application running in my container, which dated from last year. After recreating the Kubernetes ConfigMap that is mounted into the container, I now receive the following:

  1. Failed: cannot extract value from json by path "$.port_table[ ?(@.port_idx=='10')].rx_bytes.first()": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
    { "at":"06:02:36", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars

I am using Zabbix 7.0.8 and have jq and expect installed. I also deleted the old templates and imported fresh ones from the master branch.

Config map that I imported directly from the file:

#!/usr/bin/env bash
set -uo pipefail

declare HE_RSA_SSH_KEY_OPTIONS='-o PubkeyAcceptedKeyTypes=+ssh-rsa -o HostKeyAlgorithms=+ssh-rsa'

#AP|SWITCH|SWITCH_FEATURE_DISCOVERY|SWITCH_DISCOVERY|UDMP|USG
declare -A VALIDATOR_BY_TYPE
VALIDATOR_BY_TYPE["AP"]=".vap_table? != null and .radio_table != null"
VALIDATOR_BY_TYPE["UDMP"]=".network_table? != null"
VALIDATOR_BY_TYPE["USG-LITE"]=".network_table? != null"
VALIDATOR_BY_TYPE["USG"]="( .network_table? != null ) and ( .network_table | map(select(.mac!=null)) | length>0 )"
declare -A OPTIONAL_VALIDATOR_BY_TYPE
declare -A OPTION_MESSAGE
#OPTIONAL_VALIDATOR_BY_TYPE["USG"]=" ( ( .[\"system-stats\"].temps | length ) == 4 ) "
#OPTION_MESSAGE["USG"]="missingTemperatures"

declare RETRIABLE_ERROR=250
declare SSH_CONNECT_TIMEOUT=5

#---------------------------------------------------------------------------------------
# Utilities

function runWithTimeout () { 
	local timeout=$1
	shift
	"$@" &
	local child=$!
	# Avoid default notification in non-interactive shell for SIGTERM
	trap -- "" SIGTERM
	local now; now=$(date +%s%N); now="${now:0:-6}"
	local endDate; endDate=$(( now + timeout*1000 ))
	local running=true
	( 	while (( now < endDate )) && [[ -n "${running}" ]];  
	  	do 
	  		sleep 0.1
			if ! ps -p ${child} > /dev/null; then
				running=
			fi
			now=$(date +%s%N); now="${now:0:-6}"
		done
		if [[ -n "${running}" ]]; then kill ${child} 2> /dev/null; fi
	) &
	wait ${child}
}


function errorJsonWithReason() {
	local reason; reason=$(echo "$1" | tr -d "\"'\n\r" )
	local t; t=$(date +"%T")
	echo '{ "at":"'"${t}"'", "r":"'"${reason}"'", "device":"'"${TARGET_DEVICE}"'", "mcaDumpError":"Error" }' 
}

function validationErrorJsonWithReason() {
	local reason; reason=$(echo "$1" | tr -d "\"'\n\r" )
	local t; t=$(date +"%T")
	echo '{ "at":"'"${t}"'", "r":"'"${reason}"'", "device":"'"${TARGET_DEVICE}"'", "mcaDumpValidationError":"Error" }' 
}

function timeoutJsonWithReason() {
	local reason; reason=$(echo "$1" | tr -d "\"'\n\r" )
	local t; t=$(date +"%T")
	echo '{ "at":"'"${t}"'", "r":"'"${reason}"'", "device":"'"${TARGET_DEVICE}"'", "mcaDumpTimeout":"Error" }' 
}


function insertWarningIntoJsonOutput() {
	local warning=$1
	local output=$2
	echo "${output}" | jq ". + { mcaDumpWarning: { \"${warning}\": true } }"
	echoErr "warning: $warning"
}

function echoErr() {
	local error=$1
	{
		echo "----------------------------------"
		echo "$(date) $TARGET_DEVICE"
		echo "  ${error}"
	} >> "${errFile}"
	if [[ -f "/./.dockerenv" ]]; then   # also echo the error to docker logs if running inside a container
	{
		echo "  ${error}" 
	} >> /proc/1/fd/1
	fi
}

function issueSSHCommand() {
	local command=$*
 	if [[ -n "${VERBOSE:-}" ]]; then
 		#shellcheck disable=SC2086
 		echo ${SSHPASS_OPTIONS} ssh ${SSH_PORT} ${VERBOSE_SSH} ${HE_RSA_SSH_KEY_OPTIONS} ${BATCH_MODE} -o LogLevel=Error -o ConnectTimeout=${SSH_CONNECT_TIMEOUT} -o StrictHostKeyChecking=accept-new ${PRIVKEY_OPTION} "${USER}@${TARGET_DEVICE}" "$command"
 	fi
 	#shellcheck disable=SC2086
	${SSHPASS_OPTIONS} ssh ${SSH_PORT} ${VERBOSE_SSH} ${HE_RSA_SSH_KEY_OPTIONS} ${BATCH_MODE} -o LogLevel=Error -o ConnectTimeout=${SSH_CONNECT_TIMEOUT} -o StrictHostKeyChecking=accept-new ${PRIVKEY_OPTION} "${USER}@${TARGET_DEVICE}" "$command"
}

declare TRUNCATE_SIZE=1000000 # 1M
declare TRUNCATE_FREQUENCY=86400 #1D
function truncateFileOnceADay() {
	local file=$1
	if [[ -f "$file" ]]; then
		local size
		if ! size=$(wc -c < "$file"); then return; fi
		if (( size > TRUNCATE_SIZE )); then
			local haveToTrunc=1
			local truncMarker="$file.truncMarker"
			if [[ -f "$truncMarker" ]]; then
				local trunkMarkerDate; 
				if ! trunkMarkerDate=$(date -r "$truncMarker" +%s); then return; fi	
				local now; now=$(date +%s)
				if (( now - trunkMarkerDate < TRUNCATE_FREQUENCY )); then  
					haveToTrunc=0
				fi
			fi
			if (( haveToTrunc )); then
				local tmpFile="$file.tmpTrunc"
				tail -c "$TRUNCATE_SIZE" "$file" > "$tmpFile"
				mv "$tmpFile" "$file"
				touch "$truncMarker"
			fi
		fi 
	fi
}

#---------------------------------------------------------------------------------------
# Fan Discovery

function fanDiscovery() {
	local -n exitCode=$1
	exitCode=0
	shift
	local sensors; sensors=$(issueSSHCommand sensors | grep -E "^fan[0-9]:" | cut -d' ' -f1)
	exitCode=$?
	if (( exitCode == 0 )); then
		local first=true
		echo -n "[ "
		for fan in $sensors; do
			if [[ -n "$fan" ]]; then 
				if [[ -z "$first" ]]; then echo -n ","; else first=; fi				
				echo -n "{ \"name\": \"${fan::-1}\" }"
			fi
		done
		echo -n " ]"
	fi
}

#---------------------------------------------------------------------------------------
# Switch Discovery


# thanks @zpolisensky for this contribution
#shellcheck disable=SC2016
PORT_NAMES_AWK='
BEGIN { IFS=" \n"; first=1; countedPortId=0 }
match($0, "^interface 0/[0-9]+$") { 
	portId=substr($2,3)
}
match($0, "^interface [A-z0-9]+$") { 
	countedPortId=countedPortId+1
	portId=countedPortId
}
/description / {
		desc=""
		defaultDesc="Port " portId
		for (i=2; i<=NF; i++) {
			f=$i
			if (i==2) f=substr(f,2)
			if (i==NF) 
				f=substr(f,1,length(f)-1)
			else
				f=f " "
			desc=desc f
		}
		if (first != 1) printf "| "
		first=0
		if ( desc == defaultDesc) 
			desc="-"
		else
			desc="(" desc ")"
		printf ".port_table[" portId-1 "] += { \"port_desc\": \"" desc "\" }"
	}'





declare SWITCH_DISCOVERY_DIR="/tmp/unifiSwitchDiscovery"
function startSwitchDiscovery() {
	local jqProgram=$1
	local exp; exp=$(command -v expect)
	if [[ -z "${exp}" ]]; then exp=$(ls /usr/bin/expect); fi
	if [[ -z "${exp}" ]]; then 
		OUTPUT=$(errorJsonWithReason "please install 'expect' to run SWITCH_DISCOVERY")
		return 1
	else
		mkdir -p "${SWITCH_DISCOVERY_DIR}"
		#shellcheck disable=SC2034 
		# o=$(runWithTimeout 60 retrievePortNamesInto "${jqProgram}") &
		#	nohup needs a cmd-line utility
		#	nohup runWithTimeout 60 retrievePortNamesInto "${jqProgram}" &
		#(set -m; runWithTimeout 60 retrievePortNamesInto "${jqProgram}" &) &
		#runWithTimeout 60 retrievePortNamesInto "${jqProgram}" &
		runWithTimeout 60 retrievePortNamesInto "${jqProgram}" > /dev/null 2> /dev/null < /dev/null & disown
	fi
	return 0
}


function retrievePortNamesInto() {
	local logFile="$1-$RANDOM.log"
	local jqFile=$1
	local outStream="/dev/null"
	local options=
	#sleep $(( TIMEOUT + 1 )) # This ensures we leave the switch alone while mca-dump proper is processed;  the next invocation will find the result
 	if [[ -n "${VERBOSE:-}" ]]; then
 		#shellcheck disable=SC2086
 		echo ${SSHPASS_OPTIONS} spawn ssh  ${SSH_PORT} ${VERBOSE_SSH} ${HE_RSA_SSH_KEY_OPTIONS} -o LogLevel=Error -o ConnectTimeout=${SSH_CONNECT_TIMEOUT} -o LogLevel=Error -o StrictHostKeyChecking=accept-new "${PRIVKEY_OPTION}" "${USER}@${TARGET_DEVICE}"  >&2
 	fi
 	if [[ -n "${VERBOSE_PORT_DISCOVERY:-}" ]]; then
 		options="-d"
 		outStream="/dev/stdout"
 	fi

	#shellcheck disable=SC2086
	/usr/bin/expect ${options} > "${outStream}" <<EOD
      set timeout 30

      spawn ${SSHPASS_OPTIONS} ssh  ${SSH_PORT} ${HE_RSA_SSH_KEY_OPTIONS}  -o ConnectTimeout=${SSH_CONNECT_TIMEOUT} -o LogLevel=Error -o StrictHostKeyChecking=accept-new ${PRIVKEY_OPTION} ${USER}@${TARGET_DEVICE}
      
	  send -- "\r"

      expect ".*#"
	  send -- "cat /etc/board.info | grep board.name | cut -d'=' -f2\r"
      expect ".*\r\n"
	  expect {

	  	-re "(USW-Aggregation|USW-Flex-XG|USW-Enterprise-8-PoE)\r\n" {
		  expect -re ".*#"

		  send -- "cli\r"
		  expect -re ".*#"
		  
		  send -- "terminal length 0\r"
		  expect -re ".*#"

		  send -- "terminal datadump\r"
 		  expect -re ".*#"
		  
		  send -- "show run\r"
		  log_file -noappend ${logFile};
		  expect -re ".*#"
		  
		  send -- "exit\r"

	  	}	  	
	  	
	  	"USW-Flex\r\n" {
		  log_file -noappend ${logFile};
		  send_log "interface 0/1\r\n"
		  send_log "description 'Port 1'\r\n"
		  send_log "interface 0/2\r\n"
	  send_log "description 'Port 2'\r\n"
		  send_log "interface 0/3\r\n"
		  send_log "description 'Port 3'\r\n"
		  send_log "interface 0/4\r\n"
		  send_log "description 'Port 4'\r\n"
		  log_file;
	  	 }

		-re ".*\r\n" { 
			send -- "telnet 127.0.0.1\r"
			expect { 
				"(UBNT) >" { 
					send -- "enable\r"
					expect "(UBNT) #" 

					send -- "terminal length 0\r"
					expect "(UBNT) #"

					send -- "show run\r" 
					log_file -noappend ${logFile};

					expect "(UBNT) #" 
					send -- "exit\r"

				} 
				"telnet: not found\r\n" { 
					send -- "cli\r"
					expect -re ".*#" 

					send -- "terminal length 0\r"
					expect -re ".*#" 

					send -- "show run\r" 
					log_file -noappend ${logFile};
					expect -re ".*#" 
				
					send "exit\r" 
				}
			}
		}
EOD
	local exitCode=$?
	if (( exitCode )); then
		{ 	echo "$(date) $TARGET_DEVICE"; 
			echo "  retrievePortNamesInto failed with code $exitCode";
			echo "Full command was mca-dump-short.sh $FULL_ARGS" 
			if [[ -f "$logFile" ]]; then 
				cat "$logFile"
			fi
		} >> "${errFile}"
		exit "${exitCode}"
	fi

	if [[ -f "$logFile" ]]; then 
		#shellcheck disable=SC2002
		cat "$logFile" | tr -d '\r' | awk "$PORT_NAMES_AWK" > "${jqFile}"
		rm -f "$logFile" 2>/dev/null
	else
		if [[ -n "${VERBOSE:-}" ]]; then
			echo "** No Show Run output"
		fi	
	fi

}

function insertPortNamesIntoJson() {
	local -n out=$1
	local jqProgramFile=$2
	local json=$3
	if [[ -f "${jqProgramFile}" ]]; then	
		if [[ -n "${VERBOSE:-}" ]]; then
			echo "jqProgramFile: "
			cat "${jqProgramFile}"
			echo; echo
		fi
		#shellcheck disable=SC2034
		out=$(echo "${json}" | jq -f "${jqProgramFile}" -r)
		#rm "$jqProgramFile" 2>/dev/null # we now leave it for the next guy
	else
		exit 2
	fi
}

#---------------------------------------------------------------------------------------------------------------------
# mca-dump invocation


function invokeUpToNTimesWithDelay() {
	local count=$1
	local delay=$2
	shift 2
	local returnCode=0
	local invocations
	for (( invocations=0; invocations < count; invocations++ )); do
		"$@"
		returnCode=$?
		if (( returnCode==0 || returnCode != RETRIABLE_ERROR )); then
			invocations=$count
		else
			echoErr " Warning: Retrying $1 request"
			sleep "$delay"
		fi
	done
	return $returnCode
}

function invokeMcaDump() {
	local deviceType=$1
	local jqProgram=$2
	local -n exitCode=$3; exitCode=0
	local -n output=$4; output=
	local -n jsonOutput=$5; jsonOutput=

	local indentOption="--indent 0"


	local delay=1 # the CPU is very wimpy on the USG-lite, ssh into it affects the usage.  Sleeping 2s gets a better CPU read
	case "${deviceType:-}" in 

		AP) 							JQ_OPTIONS='del (.port_table) |
													del(.radio_table[]?.scan_table) | del(.scan_radio_table) |
												    del(.radio_table[]?.spectrum_table) |
												    ( .vap_table[]|= ( .clientCount = ( .sta_table|length ) ) ) | del (.vap_table[]?.sta_table)' ;;
		SWITCH | SWITCH_DISCOVERY)		JQ_OPTIONS='del (.port_table[]?.mac_table)' ;;
		SWITCH_FEATURE_DISCOVERY)		JQ_OPTIONS="[ { power:  .port_table |  any (  .poe_power >= 0 ) ,\
												total_power_consumed_key_name: \"total_power_consumed\",\
												max_power_key_name: \"max_power\",\
												max_power: .total_max_power,\
												percent_power_consumed_key_name: \"percent_power_consumed\",\
												has_eth1: .has_eth1,\
												has_temperature: .has_temperature,\
												temperature_key_name: \"temperature\",\
													overheating_key_name: \"overheating\",\
												has_fan: .has_fan,\
												fan_level_key_name: \"fan_level\"
												} ]" ;;
		UDMP| USG)						JQ_OPTIONS='del (.dpi_stats) | del(.fingerprints) | del( .network_table[]? |  select ( .address == null ))' ;;
		USG-LITE)						JQ_OPTIONS='del (.dpi_stats) | del(.fingerprints) | del( .network_table[]? |  select ( .address == null ))'
										delay=4 ;;  # the CPU is very wimpy on the USG-lite, ssh into it affects the usage.  Sleeping 2s gets a better CPU read
		*)								echo "Unknown device Type: '${DEVICE_TYPE:-}'"; usage ;;
	esac
	
	#shellcheck disable=SC2086
	output=$(timeout --signal=HUP --kill-after=5 "${TIMEOUT}" \
		${SSHPASS_OPTIONS} ssh ${SSH_PORT} ${VERBOSE_SSH} ${HE_RSA_SSH_KEY_OPTIONS} ${BATCH_MODE} -o LogLevel=Error -o ConnectTimeout=${SSH_CONNECT_TIMEOUT} -o StrictHostKeyChecking=accept-new ${PRIVKEY_OPTION} "${USER}@${TARGET_DEVICE}" ${delay:+sleep ${delay}\;} mca-dump 2>&1	)
	exitCode=$?
	#shellcheck disable=SC2034
	jsonOutput="${output}"

	if (( exitCode == 124  )); then
		output=$(timeoutJsonWithReason "timeout ($exitCode)")
	elif (( exitCode )) || [[ -z "${output}" ]]; then
		output=$(errorJsonWithReason "$(echo "Remote pb: "; echo "${output}" )")
		exitCode=1
	else
		if [[ -n "${JQ_VALIDATOR:-}" ]]; then
			local validation; validation=$(echo "${output}" | jq "${JQ_VALIDATOR}")
			exitCode=$?
			if [[ -z "${validation}"  || "${validation}" == "false" ]] || (( exitCode )); then
				output=$(validationErrorJsonWithReason "validationError: ${JQ_VALIDATOR}")
				exitCode=$RETRIABLE_ERROR
			fi
		fi
		if (( ! exitCode )) && [[ -n "${JQ_OPTION_VALIDATOR:-}" ]]; then
			local optionValidation; optionValidation=$(echo "${output}" | jq "${JQ_OPTION_VALIDATOR}")
			exitCode=$?
			if [[ -z "${optionValidation}" ]] || [[ "${optionValidation}" == "false" ]] || (( exitCode != 0 )); then				
				local message=${OPTION_MESSAGE["${DEVICE_TYPE}"]:-"unknownWarning"}
				output=$(insertWarningIntoJsonOutput "$message" "$output")
			fi			
		fi		
		if (( ! exitCode )); then
			errorFile="/tmp/jq$RANDOM$RANDOM.err"
			jqInput=${output}
			output=
			#shellcheck disable=SC2086
			output=$(echo  "${jqInput}" | jq ${indentOption} "${JQ_OPTIONS}" 2> "${errorFile}")
			exitCode=$?
			if (( exitCode != 0 )) || [[ -z "${output}" ]]; then
				output=$(errorJsonWithReason "jq ${indentOption} ${JQ_OPTIONS} returned status $exitCode; $(cat "$errorFile")")
				exitCode=1
			fi
			rm -f "${errorFile}" 2>/dev/null
		fi
	fi

	if (( ! exitCode )) && [[ "${DEVICE_TYPE:-}" == 'SWITCH_DISCOVERY' ]]; then
		# do not wait anymore for retrievePortNamesInto
		# this will ensure we don't time out, but sometimes we will use an older file
		# wait 
		errorFile="/tmp/jq${RANDOM}${RANDOM}.err"
		jqInput="${output}"
		output=
		insertPortNamesIntoJson output "${jqProgram}" "${jqInput}"  2> "${errorFile}"
		local code=$?
		if (( code != 0 )) || [[ -z "${output}" ]]; then
			output=$(errorJsonWithReason "insertPortNamesIntoJson failed with error code $code; $(cat "$errorFile")")
			exitCode=1
		fi
		rm "${errorFile}" 2>/dev/null
	fi
	return "$exitCode"
}


#------------------------------------------------------------------------------------------------


function usage() {

	local error="${1:-}"
	if [[ -n "${error}" ]]; then
		echo "${error}"
		echo
	fi
	
	cat <<- EOF
	Usage ${0}  -i privateKeyPath -p <passwordFilePath> -u user -v -d targetDevice [-t AP|SWITCH|SWITCH_FEATURE_DISCOVERY|SWITCH_DISCOVERY|UDMP|UDMP_FAN_DISCOVERY|UDMP_TEMP_DISCOVERY|USG|USG-LITE]
	  -i specify private public key pair path
	  -p specify password file path to be passed to sshpass -f. Note if both -i and -p are provided, the password file will be used
	  -u SSH user name for Unifi device
	  -d IP or FQDN for Unifi device
	  -P alternate port for SSH connection
	  -t Unifi device type
	  -v verbose and non compressed output
	  -w verbose output for port discovery
	  -x extreme debugging
	  -o <timeout> max timeout (3s minimum)
	  -O echoes debug and timing info to /tmp/mcaDumpShort.log; errors are always echoed to /tmp/mcaDumpShort.err
	  -V <jqExpression> Provide a JQ expression that must return a non empty output to validate the results. A json error is returned otherwise
	  -b run SSH in batch mode (do not ask for passwords)
	EOF
	exit 1
}

function checkOptForMissingMacro() {
	local v=$1
	local t=$2
	if [[ "$v" == "{\$$t}" ]]; then
		echo "Please set the {\$$t} macro in zabbix > Administration"
	fi
}

#------------------------------------------------------------------------------------------------

declare SSHPASS_OPTIONS=
declare PRIVKEY_OPTION=
declare PASSWORD_FILE_PATH=
declare VERBOSE_OPTION=
declare TIMEOUT=15
declare VERBOSE_SSH=
declare SSH_PORT=
declare TARGET_DEVICE_PORT=
declare logFile="/tmp/mcaDumpShort.log"
declare errFile="/tmp/mcaDumpShort.err"
declare ECHO_OUTPUT=
declare VERBOSE=
declare FULL_ARGS="$*"
declare BATCH_MODE=

while getopts 'i:u:t:hd:vp:wm:o:OV:U:P:ebx' OPT
do
  case $OPT in
    i) 	checkOptForMissingMacro "${OPTARG}" "UNIFI_SSH_PRIV_KEY_PATH"
    	PRIVKEY_OPTION="-i "${OPTARG} ;;
    u) 	checkOptForMissingMacro "${OPTARG}" "USER"
    	USER=${OPTARG} ;;
    t) 	DEVICE_TYPE=${OPTARG} ;;
    d) 	TARGET_DEVICE=${OPTARG} ;;
    P) 	TARGET_DEVICE_PORT=${OPTARG} ;;
    v) 	export VERBOSE=true ;;
    p) 	PASSWORD_FILE_PATH=${OPTARG} ;;
    w) 	VERBOSE_PORT_DISCOVERY=true ;;
    m) 	logFile=${OPTARG} ;;
    o) 	TIMEOUT=$(( OPTARG-1 )) ;;
    O) 	ECHO_OUTPUT=true ;;
    V) 	JQ_VALIDATOR=${OPTARG} ;;
    x)	set -x ;;
    b) 	BATCH_MODE="-o BatchMode=yes" ;;
    e) 	echo -n "$(errorJsonWithReason "simulated error")"; exit 1 ;;
    U)  if [[ -n "${OPTARG}" ]] &&  [[ "${OPTARG}" != "{\$UNIFI_VERBOSE_SSH}" ]]; then
    		export VERBOSE_SSH="${OPTARG}"
    	fi ;;
    *) usage ;;
  esac
done

declare EXIT_CODE=0
declare OUTPUT=
declare JSON_OUTPUT=



if [[ -n "${ECHO_OUTPUT:-}" ]]; then
	START_TIME=$(date +%s)
fi

if [[ -n "${VERBOSE:-}" ]]; then
        export VERBOSE_OPTION="-v"
fi

if [[ -z "${TARGET_DEVICE:-}" ]]; then
	usage "Please specify a target device with -d"
fi

if [[ -z "${DEVICE_TYPE:-}" ]]; then
	usage "Please specify a device type with -t"
fi

if [[ "${TARGET_DEVICE_PORT}" == "{\$UNIFI_SSH_PORT}" ]]; then
	TARGET_DEVICE_PORT=""
fi
if [[ -n "${TARGET_DEVICE_PORT}" ]]; then
	if (( TARGET_DEVICE_PORT == 0 )) || (( TARGET_DEVICE_PORT < 0 )) || (( TARGET_DEVICE_PORT > 65535 )); then
		echo "Please specify a valid port with -P ($TARGET_DEVICE_PORT was specified)" >&2
		usage
	fi
	if (( TARGET_DEVICE_PORT != 10050 )); then
		SSH_PORT="-p ${TARGET_DEVICE_PORT}"
	fi
fi


if [[ -z "${USER:-}" ]]; then
	echo "Please specify a username with -u" >&2
	usage
fi


if [[ -z "${JQ_VALIDATOR:-}" ]]; then
	JQ_VALIDATOR=${VALIDATOR_BY_TYPE["${DEVICE_TYPE}"]:-}
fi
declare JQ_OPTION_VALIDATOR=${OPTIONAL_VALIDATOR_BY_TYPE["${DEVICE_TYPE}"]:-}


# {$UNIFI_SSHPASS_PASSWORD_PATH} means the macro didn't resolve in Zabbix
if [[ -n "${PASSWORD_FILE_PATH}" ]] && ! [[ "${PASSWORD_FILE_PATH}" == "{\$UNIFI_SSHPASS_PASSWORD_PATH}" ]]; then 
	if ! [[ -f "${PASSWORD_FILE_PATH}" ]]; then
		echo "Password file not found '$PASSWORD_FILE_PATH'"
		exit 1
	fi
	SSHPASS_OPTIONS="sshpass -f ${PASSWORD_FILE_PATH} ${VERBOSE_OPTION}"
	PRIVKEY_OPTION=
fi

declare JQ_PROGRAM="${SWITCH_DISCOVERY_DIR}/switchPorts-${TARGET_DEVICE}.jq"
if [[ ${DEVICE_TYPE:-} == 'SWITCH_DISCOVERY' ]]; then
	startSwitchDiscovery "$JQ_PROGRAM"  # asynchronously discover port names
	EXIT_CODE=$?
fi

if (( EXIT_CODE == 0 )); then
	case "${DEVICE_TYPE}" in
		UDMP_FAN_DISCOVERY)	fanDiscovery EXIT_CODE OUTPUT JSON_OUTPUT ;;
		*)					invokeUpToNTimesWithDelay 2 0 invokeMcaDump "$DEVICE_TYPE" "$JQ_PROGRAM" EXIT_CODE OUTPUT JSON_OUTPUT ;;
	esac
fi


if [[ -n "${ECHO_OUTPUT:-}" ]]; then
	END_TIME=$(date +%s)
	DURATION=$((  END_TIME - START_TIME   ))
	echo "$(date): ${TARGET_DEVICE}:${TARGET_DEVICE_PORT:-} ${DEVICE_TYPE} ${JQ_VALIDATOR:-} : ${DURATION}s - $EXIT_CODE" >> "${logFile}" 
	if [[ -n "${ECHO_OUTPUT:-}" ]]; then
		echo -n "${OUTPUT}" >> "${logFile}" 
		echo >> "${logFile}"
	fi
fi

if (( EXIT_CODE )); then
	echoErr "${OUTPUT}" 
	echoErr "${JSON_OUTPUT}" 
fi

echo "${OUTPUT}"

truncateFileOnceADay "$errFile"
truncateFileOnceADay "$logFile"

exit $EXIT_CODE
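
Counting from the shebang, and assuming the pasted file matches the deployed script line for line, line 438 in the error message is the `output=$(echo "${jqInput}" | jq ...)` statement inside `invokeMcaDump`. The `echo` builtin there is the writer side of a pipe, so it is the part that can report `Broken pipe` when the reader stops early. Below is a minimal sketch of one alternative, feeding the reader through a here-string so that no separate pipe writer exists. `feedViaHereString` is a hypothetical helper, and `head -c 1` is only a stand-in for a reader that quits before consuming all of its input; it is not the script's jq call.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pass TEXT to a command on stdin through a here-string
# instead of `echo "$text" | command`.
feedViaHereString() {
	local text=$1
	shift
	"$@" <<< "${text}"
}

# Ignore SIGPIPE so that a failed pipe write would show up as an error message.
trap '' PIPE

# About 200 kB of filler, larger than a typical pipe buffer.
payload=$(head -c 200000 /dev/zero | tr '\0' 'x')

# head -c 1 is a stand-in reader that exits after the first byte.
errLog=$(mktemp)
firstByte=$(feedViaHereString "${payload}" head -c 1 2> "${errLog}")
writerErrors=$(< "${errLog}")
rm -f "${errLog}"

echo "first byte read: ${firstByte}"
echo "writer errors:   ${writerErrors:-none}"
```

This would only remove the pipe writer and its message; a jq parse error on the same input would still need to be addressed separately.
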
@patricegautier
Owner

It would be great to know the rest of this error message: '{ "at":"06:02:36", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars..'

If you look in /tmp/mcaDumpShort.err on your Zabbix server, can you find the rest?
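
For reference, the pasted script declares its error log as `errFile="/tmp/mcaDumpShort.err"` and appends a record to it on each failure. Below is a minimal sketch for pulling the most recent lines from that file. `showRecentErrors` is a hypothetical helper, and the default of 40 lines is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last N lines of the script's error log.
showRecentErrors() {
	local file=${1:-/tmp/mcaDumpShort.err}
	local lines=${2:-40}
	if [[ -f "${file}" ]]; then
		tail -n "${lines}" "${file}"
	else
		echo "no error file at ${file}" >&2
		return 1
	fi
}

# Example call on the Zabbix server or inside its container:
#   showRecentErrors /tmp/mcaDumpShort.err 40
```
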

@UntestedEngineer
Author

UntestedEngineer commented Jan 22, 2025

Apologies, see below.

I use the following switches, and they all seem to have the same problem with the current mca-dump script from the master branch. If I roll back to an older version of the script from earlier last year/2023, it seems to work without issue.

  • USW Enterprise 8 PoE
  • US XG 16
  • US XG 6 PoE
  • USW Flex XG

I replaced all my host IPs with "ip_host".

----------------------------------
Wed Jan 22 21:33:34 UTC 2025 ip_host
  { "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: parse error: Invalid numeric literal at line 1, column 14", "device":"ip_host", "mcaDumpError":"Error" }
----------------------------------

----------------------------------
Wed Jan 22 21:33:34 UTC 2025 ip_host
  { "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: parse error: Invalid numeric literal at line 1, column 14", "device":"ip_host", "mcaDumpError":"Error" }
----------------------------------

----------------------------------
Wed Jan 22 21:33:35 UTC 2025 ip_host
  { "at":"21:33:35", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: parse error: Invalid numeric literal at line 1, column 14", "device":"ip_host", "mcaDumpError":"Error" }
----------------------------------

----------------------------------
Wed Jan 22 21:33:37 UTC 2025 ip_host
  { "at":"21:33:37", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: parse error: Invalid numeric literal at line 1, column 14", "device":"ip_host", "mcaDumpError":"Error" }
----------------------------------

----------------------------------
Wed Jan 22 21:33:38 UTC 2025 ip_host
  { "at":"21:33:38", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: parse error: Invalid numeric literal at line 1, column 14", "device":"ip_host", "mcaDumpError":"Error" }
----------------------------------

----------------------------------
Wed Jan 22 21:33:38 UTC 2025 ip_host
  { "at":"21:33:38", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: parse error: Invalid numeric literal at line 1, column 14", "device":"ip_host", "mcaDumpError":"Error" }
----------------------------------

----------------------------------
Wed Jan 22 21:35:32 UTC 2025 ip_host
  { "at":"21:35:32", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: parse error: Invalid numeric literal at line 1, column 14", "device":"ip_host", "mcaDumpError":"Error" }
----------------------------------
----------------------------------
Wed Jan 22 21:35:33 UTC 2025 ip_host
  { "at":"21:35:33", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: parse error: Invalid numeric literal at line 1, column 14", "device":"ip_host", "mcaDumpError":"Error" }
----------------------------------

----------------------------------
Wed Jan 22 21:35:34 UTC 2025 ip_host
  { "at":"21:35:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: parse error: Invalid numeric literal at line 1, column 14", "device":"ip_host", "mcaDumpError":"Error" }
----------------------------------
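
jq reports the failure at line 1, column 14, which suggests that the first line handed to jq was not JSON. Below is a minimal sketch for seeing what precedes the JSON in a captured output string. `nonJsonPrefix` is a hypothetical helper and the sample string is invented for illustration. The function only looks for the first `{` or `[` and does not validate anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print whatever comes before the first { or [ in TEXT.
nonJsonPrefix() {
	local text=$1
	local opener='[{[]'   # glob bracket expression matching { or [
	# Remove the longest suffix that starts at a { or [, leaving only the prefix.
	printf '%s' "${text%%${opener}*}"
}

# Invented sample: some non-JSON text followed by a JSON object.
sample='some diagnostic text {"uptime":123}'
prefix=$(nonJsonPrefix "${sample}")
echo "text before the JSON: '${prefix}'"
```

An empty result means the string already starts with `{` or `[`.
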
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.['system-stats'].cpu": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:load_avg_1mn" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.sys_stats.loadavg_1": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:load_avg_5mn" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.sys_stats.loadavg_5": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:load_avg_15mn" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.sys_stats.loadavg_15": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:mac_address" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.mac": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:mem_total" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.sys_stats.mem_total": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:mem_used" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.sys_stats.mem_used": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:model" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.model_display": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:total_rx_bandwidth" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.port_table[*].rx_bytes.sum()": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:total_tx_bandwidth" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.port_table[*].tx_bytes.sum()": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:uptime" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.uptime": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:_[overheating]" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.overheating": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:_[temperature]" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.general_temperature": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
   302:20250122:213334.676 item "Basement Infrastructure Switch 1:_[fan_level]" became not supported: Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
1. Result: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe.{ "at...
2. Failed: cannot extract value from json by path "$.fan_level": invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"21:33:34", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars

@patricegautier
Owner

patricegautier commented Jan 23, 2025 via email

@UntestedEngineer
Author

See below. I redacted any IPs/MACs, RSA keys and usernames:

----------------------------------
Fri Jan 24 16:29:35 UTC 2025 switch_ip
-d switch_ip -u username -i /var/lib/zabbix/ssh_keys/zb_id_rsa -t SWITCH -p {$UNIFI_SSHPASS_PASSWORD_PATH} -U -vvv -o 30 -b

  { "at":"16:29:35", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: parse error: Invalid numeric literal at line 1, column 14", "device":"switch_ip", "mcaDumpError":"Error" }
----------------------------------
Fri Jan 24 16:29:35 UTC 2025 switch_ip
-d switch_ip -u username -i /var/lib/zabbix/ssh_keys/zb_id_rsa -t SWITCH -p {$UNIFI_SSHPASS_PASSWORD_PATH} -U -vvv -o 30 -b

  OpenSSH_9.6p1 Ubuntu-3ubuntu13.5, OpenSSL 3.0.13 30 Jan 2024
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: include /etc/ssh/ssh_config.d/*.conf matched no files
debug1: /etc/ssh/ssh_config line 21: Applying options for *
debug2: resolve_canonicalize: hostname switch_ip is address
debug3: expanded UserKnownHostsFile '~/.ssh/known_hosts' -> '/var/lib/zabbix/.ssh/known_hosts'
debug3: expanded UserKnownHostsFile '~/.ssh/known_hosts2' -> '/var/lib/zabbix/.ssh/known_hosts2'
debug3: channel_clear_timeouts: clearing
debug3: ssh_connect_direct: entering
debug1: Connecting to switch_ip [switch_ip] port 22.
debug3: set_sock_tos: set socket 3 IP_TOS 0x10
debug2: fd 3 setting O_NONBLOCK
debug1: fd 3 clearing O_NONBLOCK
debug1: Connection established.
debug3: timeout: 5000 ms remain after connect
debug1: identity file /var/lib/zabbix/ssh_keys/zb_id_rsa type 0
debug1: identity file /var/lib/zabbix/ssh_keys/zb_id_rsa-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.5
debug1: Remote protocol version 2.0, remote software version dropbear_2022.83
debug1: compat_banner: no match: dropbear_2022.83
debug2: fd 3 setting O_NONBLOCK
debug1: Authenticating to switch_ip:22 as 'username'
debug3: record_hostkey: found key type RSA in file /var/lib/zabbix/.ssh/known_hosts:4
debug3: load_hostkeys_file: loaded 1 keys from switch_ip
debug1: load_hostkeys: fopen /var/lib/zabbix/.ssh/known_hosts2: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts2: No such file or directory
debug3: order_hostkeyalgs: prefer hostkeyalgs: [email protected],[email protected],rsa-sha2-512,rsa-sha2-256,ssh-rsa
debug3: send packet: type 20
debug1: SSH2_MSG_KEXINIT sent
debug3: receive packet: type 20
debug1: SSH2_MSG_KEXINIT received
debug2: local client KEXINIT proposal
debug2: KEX algorithms: [email protected],curve25519-sha256,[email protected],ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256,ext-info-c,[email protected]
debug2: host key algorithms: [email protected],[email protected],rsa-sha2-512,rsa-sha2-256,ssh-rsa,[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],ssh-ed25519,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,[email protected],[email protected]
debug2: ciphers ctos: [email protected],aes128-ctr,aes192-ctr,aes256-ctr,[email protected],[email protected]
debug2: ciphers stoc: [email protected],aes128-ctr,aes192-ctr,aes256-ctr,[email protected],[email protected]
debug2: MACs ctos: [email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: MACs stoc: [email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: compression ctos: none,[email protected],zlib
debug2: compression stoc: none,[email protected],zlib
debug2: languages ctos: 
debug2: languages stoc: 
debug2: first_kex_follows 0 
debug2: reserved 0 
debug2: peer server KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,[email protected],ecdh-sha2-nistp521,ecdh-sha2-nistp384,ecdh-sha2-nistp256,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1,[email protected],[email protected]
debug2: host key algorithms: rsa-sha2-256,ssh-rsa
debug2: ciphers ctos: [email protected],aes128-ctr,aes256-ctr
debug2: ciphers stoc: [email protected],aes128-ctr,aes256-ctr
debug2: MACs ctos: hmac-sha1,hmac-sha2-256
debug2: MACs stoc: hmac-sha1,hmac-sha2-256
debug2: compression ctos: none
debug2: compression stoc: none
debug2: languages ctos: 
debug2: languages stoc: 
debug2: first_kex_follows 0 
debug2: reserved 0 
debug3: kex_choose_conf: will use strict KEX ordering
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: rsa-sha2-256
debug1: kex: server->client cipher: [email protected] MAC: <implicit> compression: none
debug1: kex: client->server cipher: [email protected] MAC: <implicit> compression: none
debug3: send packet: type 30
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug3: receive packet: type 31
debug1: SSH2_MSG_KEX_ECDH_REPLY received
debug1: Server host key: key
debug3: record_hostkey: found key type RSA in file /var/lib/zabbix/.ssh/known_hosts:4
debug3: load_hostkeys_file: loaded 1 keys from switch_ip
debug1: load_hostkeys: fopen /var/lib/zabbix/.ssh/known_hosts2: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts2: No such file or directory
debug1: Host 'switch_ip' is known and matches the RSA host key.
debug1: Found key in /var/lib/zabbix/.ssh/known_hosts:4
debug3: send packet: type 21
debug1: ssh_packet_send2_wrapped: resetting send seqnr 3
debug2: ssh_set_newkeys: mode 1
debug1: rekey out after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug3: receive packet: type 21
debug1: ssh_packet_read_poll2: resetting read seqnr 3
debug1: SSH2_MSG_NEWKEYS received
debug2: ssh_set_newkeys: mode 0
debug1: rekey in after 134217728 blocks
debug3: send packet: type 5
debug3: receive packet: type 7
debug1: SSH2_MSG_EXT_INFO received
debug3: kex_input_ext_info: extension server-sig-algs
debug1: kex_ext_info_client_parse: server-sig-algs=<ssh-ed25519,[email protected],ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,[email protected],rsa-sha2-256,ssh-rsa,ssh-dss>
debug3: receive packet: type 6
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug3: send packet: type 50
debug3: receive packet: type 51
debug1: Authentications that can continue: publickey,password
debug3: start over, passed a different list publickey,password
debug3: preferred gssapi-with-mic,publickey
debug3: authmethod_lookup publickey
debug3: remaining preferred: ,publickey
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Will attempt key: /var/lib/zabbix/ssh_keys/zb_id_rsa key explicit
debug2: pubkey_prepare: done
debug1: Offering public key: /var/lib/zabbix/ssh_keys/zb_id_rsa key explicit
debug3: send packet: type 50
debug2: we sent a publickey packet, wait for reply
debug3: receive packet: type 60
debug1: Server accepts key: /var/lib/zabbix/ssh_keys/zb_id_rsa key explicit
debug3: sign_and_send_pubkey: using publickey with key
debug3: sign_and_send_pubkey: signing using rsa-sha2-256 key
debug3: send packet: type 50
debug3: receive packet: type 52
Authenticated to switch_ip ([switch_ip]:22) using "publickey".
debug2: fd 4 setting O_NONBLOCK
debug2: fd 5 setting O_NONBLOCK
debug1: channel 0: new session [client-session] (inactive timeout: 0)
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug3: send packet: type 90
debug1: Entering interactive session.
debug1: pledge: filesystem
debug3: client_repledge: enter
debug3: receive packet: type 91
debug2: channel_input_open_confirmation: channel 0: callback start
debug2: fd 3 setting TCP_NODELAY
debug3: set_sock_tos: set socket 3 IP_TOS 0x08
debug2: client_session2_setup: id 0
debug1: Sending environment.
debug3: Ignored env KUBERNETES_SERVICE_PORT_HTTPS
debug3: Ignored env KUBE_STATE_METRICS_PORT_8080_TCP_ADDR
debug3: Ignored env ZABBIX_USER_HOME_DIR
debug3: Ignored env DEBUG_MODE
debug3: Ignored env KUBERNETES_SERVICE_PORT
debug3: Ignored env ZABBIX_WEB_PORT_8080_TCP_PORT
debug3: Ignored env GRAFANA_SVC_TCP_PORT
debug3: Ignored env ZABBIX_WEB_PORT
debug3: Ignored env HOSTNAME
debug3: Ignored env MIBS
debug3: Ignored env GRAFANA_SVC_TCP_PORT_3000_TCP
debug3: Ignored env PROMETHEUS_SVC_TCP_SERVICE_HOST
debug3: Ignored env ZABBIX_WEB_SERVICE_HOST
debug3: Ignored env ZABBIX_SERVER_SERVICE_PORT
debug3: Ignored env NMAP_PRIVILEGED
debug3: Ignored env ZABBIX_SERVER_PORT_10051_TCP_PORT
debug3: Ignored env ZABBIX_WEB_PORT_8080_TCP_PROTO
debug3: Ignored env ZABBIX_SERVER_PORT_10051_TCP_ADDR
debug3: Ignored env PWD
debug3: Ignored env KUBE_STATE_METRICS_PORT_8080_TCP_PORT
debug3: Ignored env ZABBIX_WEB_SERVICE_PORT
debug3: Ignored env PROMETHEUS_SVC_TCP_PORT_80_TCP
debug3: Ignored env GRAFANA_SVC_TCP_SERVICE_HOST
debug3: Ignored env ZABBIX_SERVER_PORT_10051_TCP_PROTO
debug3: Ignored env KUBE_STATE_METRICS_PORT
debug3: Ignored env HOME
debug3: Ignored env KUBERNETES_PORT_443_TCP
debug3: Ignored env PROMETHEUS_SVC_TCP_SERVICE_PORT_WEB
debug3: Ignored env KUBE_STATE_METRICS_PORT_8080_TCP_PROTO
debug3: Ignored env PROMETHEUS_SVC_TCP_PORT_80_TCP_ADDR
debug3: Ignored env ZABBIX_SERVER_SERVICE_PORT_ZABBIX_TRAPPER
debug3: Ignored env KUBE_STATE_METRICS_SERVICE_PORT_METRICS
debug3: Ignored env GRAFANA_SVC_TCP_PORT_3000_TCP_ADDR
debug3: Ignored env PROMETHEUS_SVC_TCP_PORT
debug3: Ignored env GRAFANA_SVC_TCP_PORT_3000_TCP_PROTO
debug3: Ignored env TERM
debug3: Ignored env PROMETHEUS_SVC_TCP_PORT_80_TCP_PROTO
debug3: Ignored env ZABBIX_SERVER_SERVICE_HOST
debug3: Ignored env ZABBIX_SERVER_PORT_10051_TCP
debug3: Ignored env MIBDIRS
debug3: Ignored env SHLVL
debug3: Ignored env KUBE_STATE_METRICS_SERVICE_HOST
debug3: Ignored env KUBERNETES_PORT_443_TCP_PROTO
debug3: Ignored env KUBE_STATE_METRICS_SERVICE_PORT
debug3: Ignored env KUBERNETES_PORT_443_TCP_ADDR
debug3: Ignored env ZABBIX_WEB_PORT_8080_TCP
debug3: Ignored env KUBE_STATE_METRICS_PORT_8080_TCP
debug3: Ignored env ZABBIX_CONF_DIR
debug3: Ignored env ZABBIX_WEB_SERVICE_PORT_WEB_HTTP
debug3: Ignored env ZABBIX_WEB_PORT_8080_TCP_ADDR
debug3: Ignored env KUBERNETES_SERVICE_HOST
debug3: Ignored env PROMETHEUS_SVC_TCP_SERVICE_PORT
debug3: Ignored env KUBERNETES_PORT
debug3: Ignored env KUBERNETES_PORT_443_TCP_PORT
debug3: Ignored env ZABBIX_SERVER_PORT
debug3: Ignored env PATH
debug3: Ignored env GRAFANA_SVC_TCP_PORT_3000_TCP_PORT
debug3: Ignored env VERBOSE_SSH
debug3: Ignored env PROMETHEUS_SVC_TCP_PORT_80_TCP_PORT
debug3: Ignored env GRAFANA_SVC_TCP_SERVICE_PORT
debug3: Ignored env _
debug1: Sending command: sleep 1; mca-dump
debug2: channel 0: request exec confirm 1
debug3: send packet: type 98
debug3: client_repledge: enter
debug2: channel_input_open_confirmation: channel 0: callback done
debug2: channel 0: open confirm rwindow 24576 rmax 32759
debug2: channel 0: read failed rfd 4 maxlen 24576: Broken pipe
debug2: channel 0: read failed
debug2: chan_shutdown_read: channel 0: (i0 o0 sock -1 wfd 4 efd 6 [write])
debug2: channel 0: input open -> drain
debug2: channel 0: ibuf empty
debug2: channel 0: send eof
debug3: send packet: type 96
debug2: channel 0: input drain -> closed
debug3: receive packet: type 99
debug2: channel_input_status_confirm: type 99 id 0
debug2: exec request accepted on channel 0
debug3: receive packet: type 96
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug3: receive packet: type 97
debug2: channel 0: rcvd close
debug3: channel 0: will not send data after close
{
        "anon_id": "09bff330-134f-4046-8488-c7c893cc51f6",
        "architecture": "armv7l",
        "ble_caps": 0,
        "board_rev": 22,
        "bomrev": "113-03166-22",
        "bomrev_id": "000c5e16",
        "boot": {
                "id": "66a5377b-2826-47d2-8364-1d50b679a1d1"
        },
        "bootid": 1,
        "bootrom_version": "unknown",
        "cfgversion": "0acfe8bfa809ddab",
        "connect_request_ip": "controller_ip",
        "connect_request_port": "54876",
        "default": false,
        "dhcp_server_table": [
                {
                        "blocked": false,
                        "ip": "ip",
                        "last_seen": 0,
                        "mac": "mac",
                        "port_idx": 5,
                        "vlan": 4003
                },
                {
                        "blocked": false,
                        "ip": "ip",
                        "last_seen": 0,
                        "mac": "mac",
                        "port_idx": 5,
                        "vlan": 4000
                },
                {
                        "blocked": false,
                        "ip": "ip",
                        "last_seen": 0,
                        "mac": "mac",
                        "port_idx": 5,
                        "vlan": 2001
                }
        ],
        "discovery_response": false,
        "dualboot": true,
        "ever_crash": false,
        "fw2_caps": 65536,
        "fw_caps": 2806560293,
        "gateway_ip": "gateway_ip",
        "gateway_mac": "mac",
        "guest_kicks": 0,
        "guest_token": "token",
        "has_eth1": false,
        "has_fan": false,
        "has_speaker": false,
        "has_temperature": false,
        "hash_id": "hash",
        "hostname": "BasementDesktopSwitch3",
        "hw_caps": 0,
        "if_table": [
                {
                        "full_duplex": true,
                        "ip": "switch_ip",
                        "mac": "mac",
                        "name": "eth0",
                        "netmask": "255.255.255.0",
                        "num_port": 5,
                        "rx_bytes": 65176193,
                        "rx_dropped": 0,
                        "rx_errors": 0,
                        "rx_multicast": 0,
                        "rx_packets": 682347,
                        "speed": 10,
                        "tx_bytes": 143582261,
                        "tx_dropped": 0,
                        "tx_errors": 0,
                        "tx_packets": 471306,
                        "up": true
                }
        ],
        "inform_min_interval": 30,
        "inform_url": "http://inform_ip:8080/inform",
        "internet": true,
        "ip": "switch_ip",
        "ipv6": [],
        "isolated": false,
        "kernel_version": "4.4.52",
        "last_error_conns": [
                {
                        "error_reason": 11,
                        "last_error_str": "Waiting for sshd (http://inform_ip:8080/inform)",
                        "last_managed_e_time": 0,
                        "last_managed_s_time": 0,
                        "timestamp": "2024-09-26T15:36:24"
                },
                {
                        "error_reason": 4,
                        "last_error_str": "Timeout (http://inform_ip:8080/inform)",
                        "last_managed_e_time": 0,
                        "last_managed_s_time": 161,
                        "timestamp": "2025-01-22T11:13:11"
                },
                {
                        "error_reason": 7,
                        "last_error_str": "Server Busy (http://inform_ip:8080/inform)",
                        "last_managed_e_time": 0,
                        "last_managed_s_time": 161,
                        "timestamp": "2025-01-22T11:13:16"
                }
        ],
        "lldp_table": [
                {
                        "chassis_id": "id",
                        "is_wired": true,
                        "local_port_idx": 5,
                        "local_port_name": "Port 5",
                        "port_id": "te4"
                }
        ],
        "locating": false,
        "mac": "mac",
        "manufacturer_id": 2,
        "model": "USFXG",
        "model_display": "USW-Flex-XG",
        "netmask": "255.255.255.0",
        "overheating": false,
        "port_table": [
                {
                        "autoneg": false,
                        "dot1x_mode": "force_auth",
                        "dot1x_status": "authorized",
                        "enable": true,
                        "flowctrl_rx": false,
                        "flowctrl_tx": false,
                        "full_duplex": false,
                        "is_uplink": false,
                        "jumbo": true,
                        "mac_table": [],
                        "mac_table_count": 0,
                        "media": "GE",
                        "poe_caps": 0,
                        "port_idx": 1,
                        "port_poe": false,
                        "rx_broadcast": 0,
                        "rx_bytes": 0,
                        "rx_dropped": 0,
                        "rx_errors": 0,
                        "rx_multicast": 0,
                        "rx_packets": 0,
                        "satisfaction": 100,
                        "satisfaction_reason": 0,
                        "speed": 0,
                        "speed_caps": 1048623,
                        "stp_pathcost": 2000000,
                        "stp_state": "disabled",
                        "tx_broadcast": 0,
                        "tx_bytes": 0,
                        "tx_dropped": 0,
                        "tx_errors": 0,
                        "tx_multicast": 0,
                        "tx_packets": 0,
                        "up": false
                },
                {
                        "autoneg": false,
                        "dot1x_mode": "force_auth",
                        "dot1x_status": "authorized",
                        "enable": true,
                        "flowctrl_rx": false,
                        "flowctrl_tx": false,
                        "full_duplex": false,
                        "is_uplink": false,
                        "jumbo": true,
                        "mac_table": [],
                        "mac_table_count": 0,
                        "media": "10GE",
                        "poe_caps": 0,
                        "port_idx": 2,
                        "port_poe": false,
                        "rx_broadcast": 592,
                        "rx_bytes": 460809,
                        "rx_dropped": 0,
                        "rx_errors": 0,
                        "rx_multicast": 203,
                        "rx_packets": 2327,
                        "satisfaction": 100,
                        "satisfaction_reason": 0,
                        "speed": 0,
                        "speed_caps": 1049068,
                        "stp_pathcost": 2000000,
                        "stp_state": "disabled",
                        "tx_broadcast": 15,
                        "tx_bytes": 20699,
                        "tx_dropped": 0,
                        "tx_errors": 0,
                        "tx_multicast": 22,
                        "tx_packets": 173,
                        "up": false
                },
                {
                        "autoneg": false,
                        "dot1x_mode": "force_auth",
                        "dot1x_status": "authorized",
                        "enable": true,
                        "flowctrl_rx": false,
                        "flowctrl_tx": false,
                        "full_duplex": false,
                        "is_uplink": false,
                        "jumbo": true,
                        "mac_table": [],
                        "mac_table_count": 0,
                        "media": "10GE",
                        "poe_caps": 0,
                        "port_idx": 3,
                        "port_poe": false,
                        "rx_broadcast": 0,
                        "rx_bytes": 0,
                        "rx_dropped": 0,
                        "rx_errors": 0,
                        "rx_multicast": 0,
                        "rx_packets": 0,
                        "satisfaction": 100,
                        "satisfaction_reason": 0,
                        "speed": 0,
                        "speed_caps": 1049068,
                        "stp_pathcost": 2000000,
                        "stp_state": "disabled",
                        "tx_broadcast": 0,
                        "tx_bytes": 0,
                        "tx_dropped": 0,
                        "tx_errors": 0,
                        "tx_multicast": 0,
                        "tx_packets": 0,
                        "up": false
                },
                {
                        "autoneg": false,
                        "dot1x_mode": "force_auth",
                        "dot1x_status": "authorized",
                        "enable": true,
                        "flowctrl_rx": false,
                        "flowctrl_tx": false,
                        "full_duplex": false,
                        "is_uplink": false,
                        "jumbo": true,
                        "mac_table": [],
                        "mac_table_count": 0,
                        "media": "10GE",
                        "poe_caps": 0,
                        "port_idx": 4,
                        "port_poe": false,
                        "rx_broadcast": 0,
                        "rx_bytes": 0,
                        "rx_dropped": 0,
                        "rx_errors": 0,
                        "rx_multicast": 0,
                        "rx_packets": 0,
                        "satisfaction": 100,
                        "satisfaction_reason": 0,
                        "speed": 0,
                        "speed_caps": 1049068,
                        "stp_pathcost": 2000000,
                        "stp_state": "disabled",
                        "tx_broadcast": 0,
                        "tx_bytes": 0,
                        "tx_dropped": 0,
                        "tx_errors": 0,
                        "tx_multicast": 0,
                        "tx_packets": 0,
                        "up": false
                },
                {
                        "autoneg": true,
                        "dot1x_mode": "force_auth",
                        "dot1x_status": "authorized",
                        "enable": true,
                        "flowctrl_rx": true,
                        "flowctrl_tx": true,
                        "full_duplex": true,
                        "is_uplink": true,
                        "jumbo": true,
                        "mac_table": [
                                {
                                        "age": 9,
                                        "ip": "ip",
                                        "mac": "mac",
                                        "static": false,
                                        "uptime": 473256,
                                        "vlan": 3996
                                },
                                {
                                        "age": 141,
                                        "mac": "mac",
                                        "static": false,
                                        "uptime": 141,
                                        "vlan": 4000
                                },
                                {
                                        "age": 9,
                                        "ip": "ip",
                                        "mac": "mac",
                                        "static": false,
                                        "uptime": 472844,
                                        "vlan": 3996
                                }
                        ],
                        "mac_table_count": 3,
                        "media": "10GE",
                        "poe_caps": 0,
                        "port_idx": 5,
                        "port_poe": false,
                        "rx_broadcast": 2873832,
                        "rx_bytes": 474219747,
                        "rx_dropped": 0,
                        "rx_errors": 0,
                        "rx_multicast": 1218469,
                        "rx_packets": 4549595,
                        "satisfaction": 100,
                        "satisfaction_reason": 0,
                        "speed": 10000,
                        "speed_caps": 1049068,
                        "stp_pathcost": 2000,
                        "stp_state": "forwarding",
                        "tx_broadcast": 70462,
                        "tx_bytes": 155767663,
                        "tx_dropped": 0,
                        "tx_errors": 0,
                        "tx_multicast": 16757,
                        "tx_packets": 513234,
                        "up": true
                }
        ],
        "power_source": "262144",
        "reboot_duration": 135,
        "required_version": "5.46.0",
        "root_switch": "mac",
        "satisfaction": 100,
        "satisfaction_reason": 0,
        "selfrun_beacon": true,
        "serial": "serial",
        "service_mac": "mac",
        "ssh_session_table": [],
        "state": 2,
        "stats_inform_interval": 1,
        "stp_priority": 12288,
        "stream_token": "",
        "switch_caps": {
                "etherlight_caps": 0,
                "feature_caps": 68725758,
                "lag_group": [
                        {
                                "port_range": "1-1"
                        },
                        {
                                "port_range": "2-5"
                        }
                ],
                "max_acl_port_range": 4,
                "max_aggregate_sessions": 2,
                "max_class_maps": 0,
                "max_custom_ip_acls": 12,
                "max_custom_mac_acls": 10,
                "max_global_acls": 128,
                "max_l3_intf": 0,
                "max_mac_based_vlan_count": 256,
                "max_mirror_sessions": 1,
                "max_qos_profiles": 0,
                "max_reserved_routes": 0,
                "max_static_routes": 0,
                "max_vlan_count": 1000
        },
        "sys_error_caps": 0,
        "sys_stats": {
                "loadavg_1": "0.00",
                "loadavg_15": "0.03",
                "loadavg_5": "0.06",
                "mem_buffer": 0,
                "mem_total": 460693504,
                "mem_used": 252215296
        },
        "sysid": 60736,
        "system-stats": {
                "cpu": "1.7",
                "mem": "54.7",
                "uptime": "473358"
        },
        "time": 1737736175,
        "time_ms": 470,
        "timestamp": "2025-01-24T11:29:35",
        "tm_ready": true,
        "total_mac_in_used": 3,
        "total_max_power": 0,
        "upgrade_duration": 320,
        "uplink": "eth0",
        "uptime": 473380,
        "uptime_str": "5d11h29m40s",
        "version": "7.1.26.15869"
}
debug3: channel 0: will not send data after close
debug2: channel 0: obuf empty
debug2: chan_shutdown_write: channel 0: (i3 o1 sock -1 wfd 5 efd 6 [write])
debug2: channel 0: output drain -> closed
debug2: channel 0: almost dead
debug2: channel 0: gc: notify user
debug2: channel 0: gc: user detached
debug2: channel 0: send close
debug3: send packet: type 97
debug2: channel 0: is dead
debug2: channel 0: garbage collecting
debug1: channel 0: free: client-session, nchannels 1
debug3: channel 0: status: The following connections are open:
  #0 client-session (t4 [session] r0 i3/0 o3/0 e[write]/0 fd -1/-1/6 sock -1 cc -1 io 0x00/0x00)

debug3: send packet: type 1
Transferred: sent 3228, received 17568 bytes, in 1.1 seconds
Bytes per second: sent 2911.7, received 15846.3
debug1: Exit status 0
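
The paste above interleaves the `ssh -vvv` debug lines with the JSON printed by `mca-dump`. The thread does not establish whether the script hands that mixed text to jq. If it does, one way to test the idea by hand is to keep only the JSON object before parsing. A minimal sketch, assuming the object begins with a line that is exactly `{` and ends with a line that is exactly `}`, as in the dump above (`extract_json_block` is a hypothetical helper, not part of mca-dump-short.sh):

```shell
# Keep only the top-level, pretty-printed JSON object from a stream that
# also contains other text (for example ssh debug lines).
# Assumption: the object starts with a line that is exactly "{" and ends
# with a line that is exactly "}". Nested braces are indented, so they do
# not match either pattern.
extract_json_block() {
    awk '/^[{]$/,/^[}]$/'
}

# Example: the debug lines before and after the object are dropped.
printf '%s\n' \
    'debug1: Sending command: sleep 1; mca-dump' \
    '{' \
    '        "uptime": 473380' \
    '}' \
    'debug1: Exit status 0' | extract_json_block
```

The filtered text can then be piped to `jq .` to see whether it parses on its own. This would not work on compact single-line JSON, which is why the assumption about the layout matters.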

@patricegautier
Owner

patricegautier commented Jan 24, 2025 via email

@UntestedEngineer
Author

Yes,

I just downloaded the zip file from the current master branch and created a Kubernetes ConfigMap from the command line:

kubectl create configmap zabbix-externalscripts-conf --from-file=mca-dump-short.sh --from-file=ssh-run.sh -n monitoring

@patricegautier
Owner

OK, one more time, if you wouldn't mind. Please update mca-dump-short and reproduce the problem again.

@UntestedEngineer
Author

UntestedEngineer commented Jan 25, 2025

Here is the output from the Kubernetes pod logs (first snippet), followed by the mcaDumpShort.err snippet. This time I am providing the entire contents of mcaDumpShort.err instead of just the few lines that looked of interest.

As with the last update, I redacted all IPs (replaced with "ip"), MACs (replaced with "mac"), and RSA keys (replaced with "redacted"). I also redacted all hostnames (replaced with "hostname").

Files are too long to paste so I uploaded them.

all-pods.log

mcaDumpShort.log

For comparison, the file below is an older version of mca-dump-short.sh that still appears to work without issue but does not have the code you have added over the past several commits.

old_mca-dump-short.txt

When I look at the tmp directory on the Zabbix server (pod), I can see the jq files for the port printouts when using the old mca-dump-short.sh file:

.port_table[0] += { "port_desc": "(Basement Desktop Switch 2 Port 2)" }| .port_table[1] += { "port_desc": "(ccloudpve01 Onboard NIC)" }| .port_table[2] += { "port_desc": "(Basement Infrastructure Switch 1)" }| .port_table[3] += { "port_desc": "(Basement Infrastructure Switch 2)" }| .port_table[4] += { "port_desc": "(Desktop_Laptop QNAP Thunderbolt )" }| .port_table[5] += { "port_desc": "(ccloudpve01 QNAP Thunderbolt Ada)" }

It is not there when I use the new mca-dump-short.sh file; instead, there is only mcaDumpShort.err.
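
The filter quoted above chains one `+=` assignment per `port_table` entry, joined with jq's pipe operator. How mca-dump-short.sh actually generates it is not shown in this thread. As a rough sketch of the same shape, a hypothetical helper (`build_port_desc_filter` is an invented name) could assemble the filter from a list of port descriptions:

```shell
# Hypothetical helper, not taken from mca-dump-short.sh: print a jq filter
# that attaches a "port_desc" to each port_table entry, in argument order.
# Descriptions containing double quotes or backslashes are not escaped here.
build_port_desc_filter() (
    filter=""
    idx=0
    for desc in "$@"; do
        # Join successive assignments with the jq pipe operator.
        if [ -n "$filter" ]; then
            filter="${filter}| "
        fi
        filter="${filter}.port_table[${idx}] += { \"port_desc\": \"(${desc})\" }"
        idx=$((idx + 1))
    done
    printf '%s\n' "$filter"
)

# Example with two made-up descriptions.
build_port_desc_filter "Uplink" "NAS"
```

The printed text can be passed to jq as its filter argument, for example `jq "$(build_port_desc_filter "Uplink" "NAS")" dump.json`, where `dump.json` stands for a saved `mca-dump` output.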

@patricegautier
Owner

patricegautier commented Jan 26, 2025

OK, one more try to get better output. Could you please update mca-dump-short again and then, from the command line on your Zabbix server:

1- check SSH connectivity to the switch
ssh -i /var/lib/zabbix/ssh_keys/zb_id_rsa username@yourswitch

2- run
mca-dump-short.sh -d yourswitch -u username -i /var/lib/zabbix/ssh_keys/zb_id_rsa -t SWITCH -o 30 -b

Thanks
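
When running step 2 by hand, it can help to keep stdout and stderr in separate files so the JSON and any diagnostics can be inspected independently. A minimal sketch, with an invented function name and summary format (nothing here comes from mca-dump-short.sh):

```shell
# Run a command with stdout and stderr saved to separate files, then print
# a one-line summary. Name and summary format are illustrative only.
run_and_capture() {
    rc_out_file=$1
    rc_err_file=$2
    shift 2
    "$@" >"$rc_out_file" 2>"$rc_err_file"
    rc_status=$?
    # tr strips the padding some wc implementations add around the count.
    printf 'exit=%s stdout_bytes=%s stderr_bytes=%s\n' \
        "$rc_status" \
        "$(wc -c <"$rc_out_file" | tr -d ' ')" \
        "$(wc -c <"$rc_err_file" | tr -d ' ')"
    return "$rc_status"
}

# Example with a stand-in command that writes to both streams and exits 3.
rc_tmp=$(mktemp -d)
run_and_capture "$rc_tmp/out" "$rc_tmp/err" \
    sh -c 'echo hi; echo oops >&2; exit 3' || true
```

To use it for this issue, replace the stand-in `sh -c ...` command with the `mca-dump-short.sh` invocation from step 2. The stdout file can then be checked with `jq . "$rc_tmp/out"`, and the stderr file read separately.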

@UntestedEngineer
Author

UntestedEngineer commented Jan 26, 2025

See attached and snippets.

mcaDumpShort.log
all-pods.log

zabbix@zabbix-server-6f6fd579dd-c67tj:~$ ssh -i /var/lib/zabbix/ssh_keys/zb_id_rsa username@ip


BusyBox v1.25.1 () built-in shell (ash)


  ___ ___      .__________.__
 |   |   |____ |__\_  ____/__|
 |   |   /    \|  ||  __) |  |   (c) 2010-2024
 |   |  |   |  \  ||  \   |  |   Ubiquiti Inc.
 |______|___|  /__||__/   |__|
            |_/                  https://www.ui.com

      Welcome to UniFi USW-Flex-XG!

********************************* NOTICE **********************************
* By logging in to, accessing, or using any Ubiquiti product, you are     *
* signifying that you have read our Terms of Service (ToS) and End User   *
* License Agreement (EULA), understand their terms, and agree to be       *
* fully bound to them. The use of SSH (Secure Shell) can potentially      *
* harm Ubiquiti devices and result in lost access to them and their data. *
* By proceeding, you acknowledge that the use of SSH to modify device(s)  *
* outside of their normal operational scope, or in any manner             *
* inconsistent with the ToS or EULA, will permanently and irrevocably     *
* void any applicable warranty.                                           *
***************************************************************************

BasementDesktopSwitch1-US.7.1.26#
zabbix@zabbix-server-6f6fd579dd-c67tj:/etc/zabbix$ /usr/lib/zabbix/externalscripts/mca-dump-short.sh -d ip -u username -i /var/lib/zabbix/ssh_keys/zb_id_rsa -t SWITCH -o 30 -b
{"anon_id":"80d0f596-cf6d-40f5-8a90-ef50a7150c12","architecture":"armv7l","ble_caps":0,"board_rev":22,"bomrev":"113-03166-22","bomrev_id":"000c5e16","boot":{"id":"d9a0f1f1-1f8a-40ca-b850-6906ae1196b3"},"bootid":0,"bootrom_version":"unknown","cfgversion":"51bf5979dd11c2f8","connect_request_ip":"10.42.9
.1","connect_request_port":"34773","default":false,"dhcp_server_table":[{"blocked":false,"ip":"ip","last_seen":0,"mac":"mac","port_idx":5,"vlan":2001}],"discovery_response":false,"dualboot":true,"ever_crash":false,"fw2_caps":65536,"fw_caps":2806560293,"gateway_ip":"ip","gateway_mac":"mac","guest_kicks":0,"guest_token":"CA185CEF8927E4C339CD00A1FB4CF4F8","has_eth1":false,"has_fan":false,"has_speaker":false,"has_temperature":false,"hash_id":"5a90ef50a7150c12","hostname":"BasementDesktopSwitch1","hw_caps":0,"if_table":[{"full_duplex":true,"ip":"ip","mac":"mac","name":"eth0","netmask":"ip","num_port":5,"rx_bytes":9211116,"rx_dropped":0,"rx_errors":0,"rx_multicast":0,"rx_packets":88464,"speed":10,"tx_bytes":22180574,"tx_dropped":0,"tx_errors":0,"tx_packets":55625,"up":true}],"inform_min_interval":30,"inform_url":
"http://ip:8080/inform","internet":true,"ip":"ip","ipv6":[],"isolated":false,"kernel_version":"4.4.52","last_error_conns":[{"error_reason":12,"last_error_str":"Unknown[12] (http://ip:8080/inform)","last_managed_e_time":0,"last_managed_s_time":99,"timestamp":"2025-0
1-25T22:00:08"},{"error_reason":7,"last_error_str":"Server Busy (http://ip:8080/inform)","last_managed_e_time":0,"last_managed_s_time":99,"timestamp":"2025-01-25T22:00:45"}],"lldp_table":[{"chassis_id":"CURTIS-DESKTOP","is_wired":true,"local_port_idx":2,"local_port_name":"Curtis Deskop  O
nboard NIC","port_id":"10:ff:e0:3"},{"chassis_id":"mac","is_wired":true,"local_port_idx":4,"local_port_name":"Basement Desktop Switch 2 Port 4","port_id":"te3"},{"chassis_id":"mac","is_wired":true,"local_port_idx":5,"local_port_name":"Basement Infrastructure Switch 1","port
_id":"Port 9"}],"locating":false,"mac":"mac","manufacturer_id":2,"model":"USFXG","model_display":"USW-Flex-XG","netmask":"ip","overheating":false,"port_table":[{"autoneg":true,"dot1x_mode":"force_auth","dot1x_status":"authorized","enable":true,"flowctrl_rx":true,"flowctrl_tx":
true,"full_duplex":true,"is_uplink":false,"jumbo":true,"mac_table_count":1,"media":"GE","poe_caps":0,"port_idx":1,"port_poe":false,"rx_broadcast":1,"rx_bytes":3628029,"rx_dropped":0,"rx_errors":0,"rx_multicast":26,"rx_packets":9842,"satisfaction":100,"satisfaction_reason":0,"speed":1000,"speed_caps":1
048623,"stp_pathcost":20000,"stp_state":"forwarding","tx_broadcast":3381,"tx_bytes":12205765,"tx_dropped":0,"tx_errors":0,"tx_multicast":59246,"tx_packets":74334,"up":true},{"autoneg":true,"dot1x_mode":"force_auth","dot1x_status":"authorized","enable":true,"flowctrl_rx":true,"flowctrl_tx":true,"full_d
uplex":true,"is_uplink":false,"jumbo":true,"mac_table_count":2,"media":"10GE","poe_caps":0,"port_idx":2,"port_poe":false,"rx_broadcast":9361,"rx_bytes":5929850682,"rx_dropped":0,"rx_errors":0,"rx_multicast":15353,"rx_packets":17591139,"satisfaction":100,"satisfaction_reason":0,"speed":2500,"speed_caps
":1049068,"stp_pathcost":8000,"stp_state":"forwarding","tx_broadcast":1872,"tx_bytes":7203128051,"tx_dropped":0,"tx_errors":1,"tx_multicast":39437,"tx_packets":20177234,"up":true},{"autoneg":true,"dot1x_mode":"force_auth","dot1x_status":"authorized","enable":true,"flowctrl_rx":true,"flowctrl_tx":true,
"full_duplex":true,"is_uplink":false,"jumbo":true,"mac_table_count":4,"media":"10GE","poe_caps":0,"port_idx":3,"port_poe":false,"rx_broadcast":2466,"rx_bytes":27713791965,"rx_dropped":0,"rx_errors":0,"rx_multicast":417,"rx_packets":42633963,"satisfaction":100,"satisfaction_reason":0,"speed":10000,"spe
ed_caps":1049068,"stp_pathcost":2000,"stp_state":"forwarding","tx_broadcast":452954,"tx_bytes":181671488768,"tx_dropped":0,"tx_errors":1,"tx_multicast":169572,"tx_packets":144847976,"up":true},{"autoneg":true,"dot1x_mode":"force_auth","dot1x_status":"authorized","enable":true,"flowctrl_rx":true,"flowc
trl_tx":true,"full_duplex":true,"is_uplink":false,"jumbo":true,"mac_table_count":1,"media":"10GE","poe_caps":0,"port_idx":4,"port_poe":false,"rx_broadcast":43,"rx_bytes":57048702,"rx_dropped":0,"rx_errors":0,"rx_multicast":2433,"rx_packets":159666,"satisfaction":100,"satisfaction_reason":0,"speed":100
00,"speed_caps":1049068,"stp_pathcost":2000,"stp_state":"forwarding","tx_broadcast":467633,"tx_bytes":214229224,"tx_dropped":0,"tx_errors":0,"tx_multicast":172509,"tx_packets":846882,"up":true},{"autoneg":true,"dot1x_mode":"force_auth","dot1x_status":"authorized","enable":true,"flowctrl_rx":true,"flow
ctrl_tx":true,"full_duplex":true,"is_uplink":true,"jumbo":true,"mac_table_count":2,"media":"10GE","poe_caps":0,"port_idx":5,"port_poe":false,"rx_broadcast":452674,"rx_bytes":188850986689,"rx_dropped":0,"rx_errors":0,"rx_multicast":157268,"rx_packets":164862550,"satisfaction":100,"satisfaction_reason":
0,"speed":10000,"speed_caps":1049068,"stp_pathcost":2000,"stp_state":"forwarding","tx_broadcast":12724,"tx_bytes":33567626491,"tx_dropped":0,"tx_errors":0,"tx_multicast":17678,"tx_packets":60079101,"up":true}],"power_source":"262144","reboot_duration":135,"required_version":"5.46.0","root_switch":"mac","satisfaction":100,"satisfaction_reason":0,"selfrun_beacon":true,"serial":"78455866E827","service_mac":"mac","ssh_session_table":[],"state":2,"stats_inform_interval":1,"stp_priority":12288,"stream_token":"","switch_caps":{"etherlight_caps":0,"feature_caps":68725758,"lag_g
roup":[{"port_range":"1-1"},{"port_range":"2-5"}],"max_acl_port_range":4,"max_aggregate_sessions":2,"max_class_maps":0,"max_custom_ip_acls":12,"max_custom_mac_acls":10,"max_global_acls":128,"max_l3_intf":0,"max_mac_based_vlan_count":256,"max_mirror_sessions":1,"max_qos_profiles":0,"max_reserved_routes
":0,"max_static_routes":0,"max_vlan_count":1000},"sys_error_caps":0,"sys_stats":{"loadavg_1":"0.22","loadavg_15":"0.02","loadavg_5":"0.06","mem_buffer":0,"mem_total":460693504,"mem_used":251506688},"sysid":60736,"system-stats":{"cpu":"2.0","mem":"54.6","uptime":"73536"},"time":1737908157,"time_ms":520
,"timestamp":"2025-01-26T11:15:57","tm_ready":true,"total_mac_in_used":10,"total_max_power":0,"upgrade_duration":320,"uplink":"eth0","uptime":73589,"uptime_str":"20h26m29s","version":"ip"}

@patricegautier
Owner

The plot thickens.. It looks to me like everything is OK when invoked on the command line (except that the output's line length was capped, but I am going to chalk that up to GitHub somewhere). I also have a Flex-XG here and it works fine. It's on firmware 7.1.26, for what it's worth..

@UntestedEngineer
Author

How do you run your zabbix server? In a container or monolithic?

Every time you update the shell script I download it, delete the configmap, then run:

kubectl create configmap zabbix-externalscripts-conf --from-file=mca-dump-short.sh --from-file=ssh-run.sh -n monitoring

Afterwards I cycle the pod/deployment.

I even tried removing the YAML templates and re-adding them.

@patricegautier
Owner

patricegautier commented Jan 27, 2025 via email

@UntestedEngineer
Author

Each time you ask me to test, I remove the configmap and create it with the new mca file. I cycle the pod, then provide the logs. Afterwards I remove the configmap and recreate it with the older known-working script referenced in a prior update, so I don't have constant failures.

https://github.com/user-attachments/files/18545053/old_mca-dump-short.txt

@patricegautier
Owner

Understood.

To be clear, when you are reproducing the problem with the latest mca-dump-short, you are still seeing that same error in mcaDump.err:

/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 438: echo: write error: Broken pipe
{ "at":"06:02:36", "r":"jq --indent 0 del (.port_table[]?.mac_table) returned status 5; jq: pars
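For readers following along, a minimal generic sketch (not taken from mca-dump-short.sh, and not a diagnosis of this issue) of how the two log lines above can go together: when the reading end of a pipe exits early, as a filter such as jq would after failing on its input, the next write from the upstream `echo` fails with a broken pipe. Here `true` stands in for the reader, and `broken_pipe_demo` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical demo; broken_pipe_demo is not part of mca-dump-short.sh.
broken_pipe_demo() {
    local errfile
    errfile=$(mktemp)
    (
        # Ignore SIGPIPE so a failed write is reported as an error instead of
        # silently terminating the writer.
        trap '' PIPE
        # Keep writing until a write fails, which happens once the reader is gone.
        while :; do
            echo '{"port_table":[]}' || {
                echo "writer: echo failed (status $?)" >&2
                break
            }
        done
    ) 2>"$errfile" | true   # `true` exits without reading anything
    # Show what the writer reported on stderr.
    cat "$errfile"
    rm -f "$errfile"
}

broken_pipe_demo
```

Under bash, the captured stderr typically also contains the shell's own `echo: write error: Broken pipe` message, which matches the shape of the first log line.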

@UntestedEngineer
Author

UntestedEngineer commented Jan 28, 2025

Yes,

all-pods.log

Unless Kubernetes is messing up the configmap upon creation with spacing issues. However, I have done this in the past and I don't see any issues with the configmap.

kubectl create configmap zabbix-externalscripts-conf --from-file=mca-dump-short.sh --from-file=ssh-run.sh -n monitoring
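One way to rule the configmap in or out is to pull the mounted script back out of the pod and compare it with the local file. The sketch below is only an illustration under assumptions: `compare_script_copies` is a hypothetical helper, the pod name is a placeholder, and the `kubectl exec` line in the comment is one possible way to fetch the mounted copy.

```bash
#!/usr/bin/env bash
# Hypothetical check, not part of the project.
# One possible way to fetch the mounted copy (pod name is a placeholder):
#   kubectl exec -n monitoring <zabbix-server-pod> -- \
#       cat /usr/lib/zabbix/externalscripts/mca-dump-short.sh > mounted.sh
#
# Prints IDENTICAL, DIFFERENT, or CRLF (identical, but with carriage returns).
compare_script_copies() {
    local local_copy=$1 mounted_copy=$2
    # Byte-for-byte comparison; -s suppresses cmp's own output.
    if ! cmp -s "$local_copy" "$mounted_copy"; then
        echo "DIFFERENT"
        return 1
    fi
    # Carriage returns (Windows line endings) can break a bash script.
    if grep -q "$(printf '\r')" "$mounted_copy"; then
        echo "CRLF"
        return 1
    fi
    echo "IDENTICAL"
}

# Usage: compare_script_copies mca-dump-short.sh mounted.sh
```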

When I run the following command manually from the container shell:

zabbix@zabbix-server-6f6fd579dd-mq6kg:/tmp$ /usr/lib/zabbix/externalscripts/mca-dump-short.sh -d ip -u username-i /var/lib/zabbix/ssh_keys/zb_id_rsa -t SWITCH_DISCOVERY -o 30 -b -vvv
jqProgramFile:


{ "at":"15:51:35", "r":"insertPortNamesIntoJson failed with error code 3; jq: error: Top-level program not given (try .)jq: 1 compile error", "device":"ip", "mcaDumpError":"Error" }

When I roll back to the older mca-dump-short.sh file that I referenced in this issue, invoking this command manually produces the list of ports and related info.

I also manually invoked SWITCH and SWITCH_FEATURE_DISCOVERY from the container shell and those return the expected jq output, but SWITCH_DISCOVERY has this weird issue.
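The `-vvv` output above prints nothing after `jqProgramFile:`, and jq then reports a compile error because no program was given. A minimal sketch of a guard that would catch this case before jq is invoked; `check_jq_program_file` is a hypothetical helper, not something the script is known to contain.

```bash
#!/usr/bin/env bash
# Hypothetical guard, not taken from mca-dump-short.sh.
# Succeeds only if the jq program file exists and holds more than whitespace.
check_jq_program_file() {
    local program_file=$1
    # -s is true only for an existing file with a size greater than zero.
    if [ ! -s "$program_file" ]; then
        echo "jq program file '$program_file' is missing or empty" >&2
        return 1
    fi
    # A file holding only blank lines is just as unusable as an empty one.
    if ! grep -q '[^[:space:]]' "$program_file"; then
        echo "jq program file '$program_file' contains only whitespace" >&2
        return 1
    fi
    return 0
}

# Usage: check_jq_program_file "$jqProgramFile" && jq -f "$jqProgramFile" input.json
```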

@patricegautier
Owner

to be honest I am fishing a bit here, but you might try the latest mca-dump and see if it changes anything..

@patricegautier
Owner

btw is there anything notable about the configuration of those switches? special firmware? lots of clients?...

@UntestedEngineer
Author

UntestedEngineer commented Jan 29, 2025

The issue is still present when using the code from commit:
77011b5

all-pods.log

There is nothing fancy about my switches. I have:

  • (2) US-16-XG
  • (2) USW-Enterprise-8POE
  • (2) USW-Flex-10G
  • (4) U6-Enterprise

All six switches are running 7.1.26.

The US-16-XG are the STP root of the network, with one having a lower STP cost than the other. The USW-Enterprise-8POE sit downstream of the US-16-XG and are multihomed to each US-16-XG (one port is STP blocking). The USW-Flex-10G sit downstream of the US-16-XG, where each Flex is single-homed to a US-16-XG and the Flex are chained together with an STP block on one side. The (4) U6-Enterprise are split 2 and 2 between the USW-Enterprise-8POE.

I have 20 VLANs defined and all of my switch-to-switch links are Trunk ports allowing all VLANs. Ports facing the APs on the USW-Enterprise-8POE allow only the specific SSID VLANs. Firewall (Fortigate IP Gateway) uplinks on the US-16-XG are set to Trunk allowing all VLANs. All the rest of the switch ports with clients connected are assigned the relevant Access VLAN, so no tagging is needed at the host level.

Client Statistics:
WiFi: 32
Wired: 32

The Unifi Switch templates I am using have the following Preprocessing and LLD Macros. To note, as soon as I roll back to an older mca-dump-short.sh file I have no issues. The jq script appears to be failing only on SWITCH_DISCOVERY, and for all switches I listed above, not just the USW-Flex-10G. "SWITCH" and "SWITCH_FEATURE_DISCOVERY" are successful.

Image

Image

Image

@patricegautier
Owner

patricegautier commented Jan 30, 2025

Ok, the SWITCH_DISCOVERY bit is, I think, the smoking gun.. I fixed a couple of bugs in this area. They don't exactly explain the symptoms you are seeing, but are definitely in the right area.. Can you try the latest again?

Please run mca-dump-short.sh -t SWITCH_DISCOVERY from the command line against each of the switches to see if it behaves correctly now.

It's also likely I need to update the list of switch models in the script, so if you could send me the contents of /etc/board.info from the switches that were misbehaving, that would be great.

Thanks!
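A minimal sketch of how the per-switch check could be scripted, assuming that failures keep the `"mcaDumpError"` field seen in the error output earlier in this thread. `discovery_output_ok` is a hypothetical helper, and the switch names in the usage comment are placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of mca-dump-short.sh.
# Succeeds if the captured output is non-empty and carries no mcaDumpError field.
discovery_output_ok() {
    local output=$1
    # No output at all counts as a failure.
    [ -n "$output" ] || return 1
    case $output in
        *'"mcaDumpError"'*) return 1 ;;
    esac
    return 0
}

# Usage sketch (switch names are placeholders; flags as used earlier in the thread):
# for switch in switch1 switch2; do
#     out=$(/usr/lib/zabbix/externalscripts/mca-dump-short.sh -d "$switch" -u username \
#         -i /var/lib/zabbix/ssh_keys/zb_id_rsa -t SWITCH_DISCOVERY -o 30 -b)
#     if discovery_output_ok "$out"; then echo "$switch: ok"; else echo "$switch: FAILED"; fi
# done
```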
