
SNMP Traps not sending timely and/or locking up, appears to be ARP problem


Ray:
Hi Mark,
I really thought I had the SNMP traps working, but somehow they are failing again. To troubleshoot, I temporarily added 7 MIB entries for the ARP table as a debug aid because the CLI isn't available. In the included filtered Wireshark capture, the first entry is the ARP to the SNMP manager, and my 2 test traps flow freely. Then, approximately 75 seconds in, there appears to be an ARP refresh,
and traps are locked out. 5 minutes later there is what looks like a gateway ARP refresh, then 5 minutes after that traps begin again for a short time, then ARP blocks them again. Eventually the system locks up like this, but my Modbus polling is good and SNMP GET communication is unaffected. Only traps are affected.

Since this is very odd behavior, I'm guessing that my method for sending traps is the problem. 
However, my ARP table is full of unwanted entries despite having ARP_IGNORE_FOREIGN_ENTRIES defined.



So I think I've got 2 questions:
1)  Is ARP_IGNORE_FOREIGN_ENTRIES supposed to ignore all random ARP requests on the network?  It doesn't appear to do this, based on the IP addresses in the ARP table.

2)  Do SNMP traps need to be implemented in a specific way so that they get an ARP notification?  Do they need to be sent from within the SNMP task, void fnSNMP(TTASKTABLE *ptrTaskTable)?


Thanks
Ray




In my MIB handler I have extern unsigned char fnInitialiseSNMP(void), which is called from Application.c, in my fnPIT_timerTask, every 5 seconds:

        if (ulCount_5000ms) --ulCount_5000ms;                            // count down to the next 5 s event
        else
        {
          ulCount_5000ms = 500;                                          // reload the 5 s countdown
          if (SnmpStarted == 0)
          {
            if (fnInitialiseSNMP() == 1) SnmpStarted = 1;                // retry every 5 s until SNMP has started
          }
          else
          {
            if (SendColdStartTrap == 0)
            {
                fnSendSNMPTrap(SNMP_COLDSTART, 0, 1);//ALL_SNMP_MANAGERS);
                SendColdStartTrap = 1;
            }
            else// cold start has been sent
            {
              fnSendSNMPTrap(SNMP_COLDSTART, 0, 1);//  test with just the easy to send trap
            }
          }
        }

Ray:
I should add that I appear to be getting the same result in the simulator, which is where my debugging effort is currently focused.

mark:
Hi Ray

In fnHandleARP_response() you could try removing the registration of received ARP requests that were not destined for your IP address, as follows. This may help reduce the ARP entries that you are not interested in.

    else {                                                               // it was not an ARP to our IP address but we can still add it to our table or refresh the entry
    #if !defined ARP_IGNORE_FOREIGN_ENTRIES
        if (uMemcmp(ucRequestingIP, cucBroadcast, IPV4_LENGTH) != 0) {   // ignore broadcasts
            fnAddARP(ucRequestingIP, ucRequestingMAC, &arp_details);     // {14}
        }
    #endif
    #if defined USE_IP_STATS
        fnIncrementEthernetStats(SEEN_FOREIGN_ARP_FRAMES, _NETWORK_ID);  // update statistics for foreign addresses
    #endif
    }

ARP entries will probably not be an issue though.


Looking at the Wireshark recording I see that the traps are initially sent to 172.22.1.94.

15mins later I see that there are ARPs being sent out to resolve the address 172.22.0.1, which are not answered. These may be due to the traps that you are trying to send but may be due to other causes.

In any case you need to find out why the ARP (assuming associated with the trap after a certain time) is presumably being sent to a different address, potentially on a different sub-net, since this is possibly the issue.
Since you are working with the simulator this should be quite easy to do:
- set a break point in the ARP transmission when it starts to see how the destination address is being defined.
- let it run until the problem starts
- enable the break point again to see what is causing the ARP to be sent (possibly the trap wanting to send data) and compare how the destination address is being defined. I expect you'll find a difference that will explain why it stops working after a certain time.
- Traps can be sent from anywhere and don't need to be called in the SNMP task.
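To illustrate why an ARP for 172.22.0.1 could appear even though the traps go to 172.22.1.94, here is a minimal sketch of the usual IPv4 next-hop decision: a destination on the local subnet is resolved directly, anything else is resolved via the gateway. This is not the µTasker code; the function names, the own IP address 172.22.1.50 and the subnet mask 255.255.0.0 mentioned below are assumptions made up for the example.

```c
#include <stdint.h>

#define IPV4_LENGTH 4

/* Return 1 when dest lies in the same subnet as own_ip under the given mask
   (hypothetical helper, not a uTasker function). */
static int is_on_local_subnet(const uint8_t own_ip[IPV4_LENGTH],
                              const uint8_t mask[IPV4_LENGTH],
                              const uint8_t dest[IPV4_LENGTH])
{
    for (int i = 0; i < IPV4_LENGTH; i++) {
        if ((own_ip[i] & mask[i]) != (dest[i] & mask[i])) {
            return 0;                          /* network parts differ */
        }
    }
    return 1;
}

/* Return the address that ARP has to resolve before a frame to dest can
   be sent: dest itself when it is local, otherwise the gateway. */
static const uint8_t *arp_target(const uint8_t own_ip[IPV4_LENGTH],
                                 const uint8_t mask[IPV4_LENGTH],
                                 const uint8_t gateway[IPV4_LENGTH],
                                 const uint8_t dest[IPV4_LENGTH])
{
    return is_on_local_subnet(own_ip, mask, dest) ? dest : gateway;
}
```

With an assumed own address of 172.22.1.50 and mask 255.255.0.0, arp_target() returns the manager address for a trap to 172.22.1.94, whereas a destination such as 10.0.0.5 returns the gateway 172.22.0.1. Comparing the value seen at the break point against this rule should show whether the trap or some other traffic is asking for the gateway.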

Regards

Mark

Ray:
Hi Mark,
Yes, I believe ARP is not the root cause; rather, it is possible that the ARP process is just a little overwhelmed in my noisy network environment.
The ARP request for 172.22.0.1 is for the gateway, probably for NTP - I've disabled this for now.


I mistakenly changed one of the ARP_IGNORE_FOREIGN_ENTRIES conditionals in static void fnSendARP_response(ARP_INPUT *ptrArpInput).
On line 682, if we received our own ARP request it wasn't being added; I have fixed this.
What I meant to comment out was your suggestion in fnHandleARP_response().
That is now disabled with the preprocessor; however, it didn't fix the problem of extra ARP entries.

Additionally, there is a call to fnAddARP() in ip.c, line 688.
This was adding all the miscellaneous ARPs it received. I have disabled it with the preprocessor, and now my ARP table has 3 and only 3 entries (+ broadcast).

This didn't resolve my trap problem. The symptom was that trap manager 1 worked but 2 and 3 didn't.
As always, your amazing debugger came to the rescue: I discovered that in the static int fnSendTrap() function, line 1099, the fnSendUDP() call has extra information OR'd into the SocketHandle, which makes it fail the first check.

Commenting this out allows traps beyond manager 1 to be sent: /* | ptrSNMP_manager_details[iManagerRef].snmp_manager_details | ((iManagerRef & USER_INFO_MASK) << USER_INFO_SHIFT)*/
I have no idea what those values are for, but commenting them out allows my 3 managers to receive traps.

if (fnSendUDP((USOCKET)(SNMPSocketNr /* | ptrSNMP_manager_details[iManagerRef].snmp_manager_details | ((iManagerRef & USER_INFO_MASK) << USER_INFO_SHIFT)*/),
              (unsigned char*)ptrSNMP_manager_details[iManagerRef].snmp_manager_ip_address,
              SNMP_MANAGER_PORT,
              (unsigned char*)&UDP_Message.tUDP_Header,
              (unsigned short)iNewLength,
              OWN_TASK) == NO_ARP_ENTRY)

With the extra bits OR'd in, the call fails the first check in fnSendUDP(USOCKET SocketHandle, unsigned char *dest_IP, unsigned short usRemotePort, unsigned char *ptrBuf, unsigned short usDataLen, UTASK_TASK OwnerTask):

    if (_UDP_SOCKET_MASK(SocketHandle) > UDP_SOCKETS) {                  // {7}
        return INVALID_SOCKET_HANDLE;
    }
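One possible reading of this symptom, not confirmed anywhere in this thread, is that the range check only passes when the extra bits are stripped from the handle before the comparison. The small model below is not the µTasker source: the handle type, the MODEL_ constants and their values are all invented for illustration. It shows why manager index 0 would pass regardless (its extra bits are zero) while higher indexes are rejected if the bits are not masked off.

```c
#include <stdint.h>

typedef uint16_t socket_handle_t;        /* hypothetical stand-in for USOCKET */

#define MODEL_UDP_SOCKETS     4          /* assumed number of UDP sockets */
#define MODEL_USER_INFO_SHIFT 8          /* assumed position of the user info bits */
#define MODEL_USER_INFO_MASK  0x03       /* assumed width of the user info field */
#define MODEL_SOCKET_MASK     0x00ff     /* assumed bits holding the socket number */

/* Build a handle the way the trap code does: socket number with the
   manager reference OR'd into the upper bits. */
static socket_handle_t make_handle(socket_handle_t socket_nr, int manager_ref)
{
    return (socket_handle_t)(socket_nr |
           ((manager_ref & MODEL_USER_INFO_MASK) << MODEL_USER_INFO_SHIFT));
}

/* Range check on the raw handle: any extra bit makes the value too large. */
static int accepted_without_mask(socket_handle_t handle)
{
    return handle <= MODEL_UDP_SOCKETS;
}

/* Range check after stripping the extra bits: only the socket number counts. */
static int accepted_with_mask(socket_handle_t handle)
{
    return (handle & MODEL_SOCKET_MASK) <= MODEL_UDP_SOCKETS;
}
```

In this model, make_handle(2, 0) is accepted by both checks, but make_handle(2, 1) is only accepted by accepted_with_mask(). If the real project behaves the same way, the definition of _UDP_SOCKET_MASK in the build would be the first thing to inspect, since removing the OR'd bits works around the check but also discards whatever they were meant to carry.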
   
Cautiously, all is well. I'm running a 5-day blast of coldstart traps to make sure we don't bog down.

Thank You
Ray

mark:
Hi Ray

In ip.c I see that only ARPs addressed directly to your IP address are entered, so I am not sure that it is appropriate to remove that when ARP_IGNORE_FOREIGN_ENTRIES is enabled.
However, the ARP table should not be critical, since the worst thing that can happen is that a resolution first needs to be performed when the destination is not yet known.
You can also increase the size of the ARP table so that entries don't need to be deleted when new ones are entered and there is not space for all of them.
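As a rough picture of why a small table in a noisy network keeps losing entries, the following sketch models a fixed-size ARP cache that overwrites its least recently refreshed entry when it is full. It is a stand-alone model, not the µTasker implementation: the table size of 3, the replacement policy and all names are assumptions for the example.

```c
#include <stdint.h>
#include <string.h>

#define IPV4_LENGTH    4
#define MAC_LENGTH     6
#define ARP_TABLE_SIZE 3                 /* assumed size, small on purpose */

typedef struct {
    uint8_t  ip[IPV4_LENGTH];
    uint8_t  mac[MAC_LENGTH];
    uint32_t last_used;                  /* tick of the last add or refresh */
    int      in_use;
} arp_entry_t;

static arp_entry_t arp_table[ARP_TABLE_SIZE];

/* Return the index of the entry for ip, or -1 when it is not cached. */
static int arp_find(const uint8_t ip[IPV4_LENGTH])
{
    for (int i = 0; i < ARP_TABLE_SIZE; i++) {
        if (arp_table[i].in_use && memcmp(arp_table[i].ip, ip, IPV4_LENGTH) == 0) {
            return i;
        }
    }
    return -1;
}

/* Add or refresh an entry; when the table is full the oldest entry is
   overwritten. Returns the index that was used. */
static int arp_add(const uint8_t ip[IPV4_LENGTH],
                   const uint8_t mac[MAC_LENGTH], uint32_t now)
{
    int slot = arp_find(ip);
    if (slot < 0) {                      /* not cached yet: pick a slot */
        slot = 0;
        for (int i = 0; i < ARP_TABLE_SIZE; i++) {
            if (!arp_table[i].in_use) {  /* a free slot always wins */
                slot = i;
                break;
            }
            if (arp_table[i].last_used < arp_table[slot].last_used) {
                slot = i;                /* remember the oldest entry so far */
            }
        }
    }
    memcpy(arp_table[slot].ip, ip, IPV4_LENGTH);
    memcpy(arp_table[slot].mac, mac, MAC_LENGTH);
    arp_table[slot].last_used = now;
    arp_table[slot].in_use = 1;
    return slot;
}
```

With three slots, a fourth distinct address pushes out whichever entry was refreshed longest ago, so the next frame to that address has to wait for a new resolution. A larger table, or filtering out foreign entries, simply makes that happen less often.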

Regards

Mark
