Topic: DHCP behaviour

MarcV

  • Guest
DHCP behaviour
« on: February 03, 2009, 09:21:12 PM »
Hello Mark,

I'm currently trying out a small application on the ColdFire EVB. I have the Ethernet communication set up to use DHCP.

Normally, when the Ethernet connection is present at power-up, the board is first reachable at the fixed, hard-coded IP address and then switches to the DHCP-assigned address.
But when there's *no* physical network connection present when the ColdFire is powered up, the IP address is set to the fixed, hard-coded address once the connection is made, but after a while the connection is lost (no address is assigned via DHCP).

I wanted to know if this is normal behaviour or maybe a bug?

Offline mark

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3234
Re: DHCP behaviour
« Reply #1 on: February 03, 2009, 10:32:49 PM »
Hi Marc

'Normally' I would expect the board to try to get its settings via DHCP when it starts (whether the Ethernet connection is available or not). If this is not successful I would expect that it would finally use its fixed IP address setting (as long as 0.0.0.0 is set).

Have you looked at the Wireshark recordings to check that it is doing as described in http://www.utasker.com/docs/uTasker/uTaskerDHCP.PDF ?

It is surprising that there is a difference when the network is not connected. DHCP doesn't actually know whether the cable is connected or not.
Can you show your IP settings and network option?

Regards

Mark

MarcV

  • Guest
Re: DHCP behaviour
« Reply #2 on: February 04, 2009, 08:17:28 AM »
Hello Mark,

I did not do the test with Wireshark yet. I will do that later today when I'm back home.

These are my current IP settings (I hope this is what you asked for...):

Code:
// The application is responsible for defining the IP configuration - here are the default settings
static const NETWORK_PARAMETERS network_default = {
    (LAN_100M /*| FULL_DUPLEX*/ | RX_FLOW_CONTROL),                    // usNetworkOptions - see driver.h for other possibilities
    {0x00, 0xcf, 0x4d, 0x56, 0x30, 0xff},                                // ucOurMAC - when no other value can be read from parameters this will be used
    { 192, 168, 1, 101 },                                                  // ucOurIP - our default IP address
    { 255, 255, 255, 0 },                                                // ucNetMask - Our default network mask
    { 192, 168, 1, 1 },                                                  // ucDefGW - Our default gateway
    { 192, 168, 1, 1 },                                                  // ucDNS_server - Our default DNS server
};

// The default user settings (factory settings)
const PARS cParameters = {
    PARAMETER_BLOCK_VERSION,                                             // version number for parameter validity control
    (unsigned short)(2*60),                                              // default telnet_timeout - 2 minutes
    (CHAR_8 + NO_PARITY + ONE_STOP + USE_XON_OFF + CHAR_MODE + DCE_MODE),// serial interface settings
    23,                                                                  // TELNET port number
    (ACTIVE_DHCP + ACTIVE_LOGIN + ACTIVE_FTP_SERVER /*+ ACTIVE_FTP_LOGIN*/ + ACTIVE_WEB_SERVER + ACTIVE_TELNET_SERVER + SMTP_LOGIN), // active servers (ACTIVE_FTP_LOGIN disabled)
    SERIAL_BAUD_19200,                                                   // baud rate of serial interface
    {0,0,0,0},                                                           // trusted dial out IP address (null IP means no checking)
    {'A', 'D', 'M', 'I', 'N', '&', ' ', ' '},                            // default user name - & closes sequence
    {'u', 'T', 'a', 's', 'k', 'e', 'r', '&'},                            // default user password - & closes sequence

The connection always starts at IP address 192.168.1.101 at power up when the LAN cable is connected, and then switches to the DHCP assigned address. When the cable is not connected at power up, I get a connection at IP 192.168.1.101 when the LAN cable is plugged in, but after a while the connection is lost...
« Last Edit: February 04, 2009, 08:20:58 AM by MarcV »

Offline alager

  • Jr. Member
  • **
  • Posts: 92
Re: DHCP behaviour
« Reply #3 on: October 14, 2009, 06:24:32 PM »
I'm dealing with this same behavior. What was the resolution?


Thanks,
Aaron

Offline alager

  • Jr. Member
  • **
  • Posts: 92
Re: DHCP behaviour
« Reply #4 on: October 14, 2009, 07:57:53 PM »
Well here is what I found, and did.

In application.c, inside the USE_DHCP section, change:
fnStartDHCP((UTASK_TASK)(FORCE_INIT | OWN_TASK));            // activate DHCP
to
fnStartDHCP((UTASK_TASK)(OWN_TASK));

I found that the FORCE_INIT would clear out the default IP address.

The next step was to handle the case of no network cable plugged in for a while (longer than the 2 minute timeout) after power is applied.
In dhcp.c:
Code:
extern int fnStartDHCP(UTASK_TASK Task)
{
    if (usConnectedToBB & LINK_STATUS) {                                 // only do this if the link is up {AL}
        if ((DHCPSocketNr >= 0) || ((DHCPSocketNr = fnGetUDP_socket(TOS_MINIMISE_DELAY, fnDHCPListner, (UDP_OPT_SEND_CS | UDP_OPT_CHECK_CS))) >= 0)) {
            fnBindSocket(DHCPSocketNr, DHCP_CLIENT_PORT);
            MasterTask = (Task & ~FORCE_INIT);
            if (!(Task & FORCE_INIT) && (uMemcmp(&network.ucOurIP[0], cucNullMACIP, IPV4_LENGTH))) { // if we have a non-zero IP address we will try to re-obtain it
                uMemcpy(ucDHCP_IP, &network.ucOurIP[0], IPV4_LENGTH);    // copy our IP address to the DHCP preferred address
                uMemset(&network.ucOurIP[0], 0, IPV4_LENGTH);            // remove the local IP since it may only be used after being validated
                ucDHCP_state = DHCP_STATE_INIT_REBOOT;                   // we already have a previous value - we will try to obtain it again
            }
            else {
                ucDHCP_state = DHCP_STATE_INIT;                          // we have none so we must start fresh
            }
            fnRandomise((4*SEC), E_START_DHCP);                          // perform DHCP state/event after short delay
            return 0;                                                    // OK
        }
        return NO_UDP_SOCKET_FOR_DHCP;                                   // error
    } else {
        uTaskerGlobalMonoTimer( OWN_TASK, (DELAY_LIMIT)(2*SEC), E_RETRY_DHCP); // try again in 2 seconds {AL}
        return LINK_NOT_READY;                                           // return this if the link is not up yet
    }
}

// DHCP task
//
extern void fnDHCP(TTASKTABLE *ptrTaskTable)  
{    
    QUEUE_HANDLE PortIDInternal = ptrTaskTable->TaskID;                  // queue ID for task input
    unsigned char ucInputMessage[SMALL_QUEUE];                           // reserve space for receiving messages

    if ( fnRead( PortIDInternal, ucInputMessage, HEADER_LENGTH )) {      // check input queue
        //if ( ucInputMessage[ MSG_SOURCE_TASK ] == TASK_ARP) {
        switch (ucInputMessage[ MSG_SOURCE_TASK ]) {                     // check which type of message we have {AL}
        case TASK_ARP:
            // Note that we receive ARP messages only on our attempt to send a test message to a node with our allocated IP address.
            // Since DHCP uses broadcast messages until this point there can be no ARP errors
            fnRead( PortIDInternal, ucInputMessage, ucInputMessage[MSG_CONTENT_LENGTH]); // read the contents
            if (ARP_RESOLUTION_SUCCESS == ucInputMessage[ 0 ]) {
                if (ucResendAfterArp) {
                    fnSendDHCP(ucResendAfterArp);
                }
                else {
                    fnStateEventDHCP(E_DHCP_COLLISION);                  // this is bad news. We have received IP data from DHCP server but found that someone is already using the address. We have to try again
                }
            }
            else if (ARP_RESOLUTION_FAILED == ucInputMessage[ 0 ]) {
                // we have probed to ensure that no one else has the IP address which we have received
                // the probing failed which means we can use it - inform application...
                fnStateEventDHCP(E_DHCP_BIND_NOW);
            }
            ucResendAfterArp = 0;
            break;
        case INTERRUPT_EVENT:                                            // here is where the DHCP failure is reported {AL}
            switch (ucInputMessage[MSG_INTERRUPT_EVENT]) {
            case DHCP_SUCCESSFUL:                                        // we can now use the network connection
                break;

            case DHCP_MISSING_SERVER:
                fnStopDHCP();                                            // DHCP server is missing so stop and continue with backup address (if available)
                break;
            }
            break;
        case TIMER_EVENT:
            if (ucInputMessage[ MSG_TIMER_EVENT ] == E_RETRY_DHCP) {     // {AL} retry DHCP if LINK was down
                fnStartDHCP((UTASK_TASK)(OWN_TASK));
            } else {
                fnStateEventDHCP(ucInputMessage[ MSG_TIMER_EVENT ]);     // timer event
            }
            break;
        }
    }
}


Then in NetworkIndicators.c I set or clear the bit LINK_STATUS, for link up or link down.

So the operation is now:
If the link is down, check again in 2 seconds.
If the link is up and DHCP responds, use that IP (the IP is 0.0.0.0 while waiting for DHCP).
If the link is up and DHCP isn't responding (the IP is 0.0.0.0 while waiting for DHCP), time out and use the default IP.
If the link is up and a static IP is assigned, use that.
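
The four rules above can be sketched as one small decision function. This is only an illustration: every name in it (net_action_t, dhcp_result_t, next_action) is hypothetical and not part of uTasker, and it assumes that "static IP is assigned" means DHCP is disabled in the parameters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; none of these exist in uTasker. */
typedef enum {
    ACTION_RETRY_LINK_IN_2S,   /* link down: check the link again in 2 seconds */
    ACTION_WAIT_FOR_DHCP,      /* negotiating: the IP stays 0.0.0.0 meanwhile  */
    ACTION_USE_DHCP_ADDRESS,   /* server answered: use the leased address      */
    ACTION_USE_DEFAULT_IP,     /* DHCP timed out: fall back to the default IP  */
    ACTION_USE_STATIC_IP       /* DHCP not in use: keep the static address     */
} net_action_t;

typedef enum { DHCP_PENDING, DHCP_BOUND, DHCP_TIMED_OUT } dhcp_result_t;

/* Maps the current link and DHCP status to the action listed in the post. */
net_action_t next_action(bool link_up, bool dhcp_enabled, dhcp_result_t result)
{
    if (!link_up) {
        return ACTION_RETRY_LINK_IN_2S;    /* nothing else is attempted without a link */
    }
    if (!dhcp_enabled) {
        return ACTION_USE_STATIC_IP;       /* assumption: static IP means DHCP is off */
    }
    switch (result) {
    case DHCP_BOUND:
        return ACTION_USE_DHCP_ADDRESS;
    case DHCP_TIMED_OUT:
        return ACTION_USE_DEFAULT_IP;
    case DHCP_PENDING:
    default:
        assert(result == DHCP_PENDING);    /* no other value is expected here */
        return ACTION_WAIT_FOR_DHCP;
    }
}
```

The link check comes first on purpose: it mirrors the modified fnStartDHCP(), which does nothing but re-arm the 2 second timer while the link is down.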


I hope this helps someone.

Aaron

UPDATE: Since everything waits for the link to be stable, the system now takes longer to start: about 30 seconds, as opposed to the original 10s.
« Last Edit: October 16, 2009, 06:09:15 PM by alager »

Offline mark

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3234
Re: DHCP behaviour
« Reply #5 on: October 16, 2009, 07:58:04 PM »
Hi Aaron

Yes, the force option is intended to speed things up.

According to the DHCP spec, a client that already holds an address (one that is not 0.0.0.0) may try to obtain that specific address again [known as the reboot case]. The DHCP client tries several times to obtain it before falling back to the same procedure as when it has no specific address. DHCP servers usually don't allow this specific address to be obtained, so the process, taking about 15s in all, can be a bit of a waste of time.

The force option skips the reboot case phase even when a specific address is held (not 0.0.0.0).
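
A minimal sketch of that choice, modelled on the test in the fnStartDHCP() listing earlier in the thread. The function and enum names are hypothetical; only the logic (reboot case when an address is held and the force option is not set) comes from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define IPV4_LENGTH 4

/* Hypothetical stand-ins for the two possible starting states. */
typedef enum { START_INIT, START_INIT_REBOOT } dhcp_start_t;

/* Returns the state a client would start in: the reboot case is only
 * entered when an address is already held AND the force option is off. */
dhcp_start_t select_start_state(const unsigned char ip[IPV4_LENGTH], bool force_init)
{
    static const unsigned char null_ip[IPV4_LENGTH] = { 0, 0, 0, 0 };

    assert(ip != NULL);
    if (!force_init && memcmp(ip, null_ip, IPV4_LENGTH) != 0) {
        return START_INIT_REBOOT;   /* try to re-obtain the held address first */
    }
    return START_INIT;              /* start fresh, skipping the reboot phase  */
}
```

With a held address of 192.168.1.101, the force option selects START_INIT and its absence selects START_INIT_REBOOT; with 0.0.0.0 the result is START_INIT either way.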

Regards

Mark

Offline alager

  • Jr. Member
  • **
  • Posts: 92
Re: DHCP behaviour
« Reply #6 on: October 28, 2009, 07:19:25 PM »
Mark,

The problem is that the FORCE_INIT clears out the "default IP".  So when dhcp fails (the network has no dhcp server), then the default IP is set to 0.0.0.0, which doesn't allow the unit to work.

My goal was to allow the default IP to stay intact, so when there is no dhcp server our factory programmed default IP can be utilized.  It would be nice to have it operate faster, like FORCE_INIT, but keep the default IP on dhcp failure.


Aaron

Offline mark

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3234
Re: DHCP behaviour
« Reply #7 on: February 17, 2010, 04:07:03 PM »
Hi Aaron

I am not sure whether we discussed this offline or not so here is a belated response:

- when DHCP is in operation the client is in fact not allowed to use any IP address (this applies during the DHCP phase, after a lease timeout without a lease renewal, or when no IP address is allowed). Setting the address to 0.0.0.0 ensures that nothing can be sent that would violate this requirement.
- if DHCP fails one should theoretically disconnect and not use the IP connection - therefore the behavior in the failure case also respects this.

However we all know that there are good reasons not to follow every specification to the letter. In this case it is up to the user to make that decision, so the DHCP module doesn't do it automatically.

The solution is to react to the DHCP_MISSING_SERVER interrupt event, which is received by the owner task on failure, and set the IP address back from the user's parameters. IP operation then continues with a fixed address. (The demo application stops DHCP on this event by default, so IP is effectively dead; it could instead restart it again, and again, ...)
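
As a rough sketch of that fallback (not the project's actual code): the struct, the event codes and handle_dhcp_event() below are hypothetical stand-ins so that the example compiles on its own. In a real application the event would arrive in the owner task's INTERRUPT_EVENT handling and fnStopDHCP() would be called before the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IPV4_LENGTH 4

/* Stand-in for the live network settings (hypothetical; only the IP is modelled). */
typedef struct {
    unsigned char ucOurIP[IPV4_LENGTH];
} net_settings_t;

/* Illustrative event codes; the real values come from the uTasker headers. */
enum { EV_DHCP_SUCCESSFUL = 1, EV_DHCP_MISSING_SERVER = 2 };

/* Returns 1 when the fixed address was restored (DHCP should then be stopped
 * so that it does not clear the address again), otherwise 0. */
int handle_dhcp_event(int event, net_settings_t *live,
                      const unsigned char default_ip[IPV4_LENGTH])
{
    assert(live != NULL && default_ip != NULL);
    if (event != EV_DHCP_MISSING_SERVER) {
        return 0;                                   /* nothing to restore */
    }
    /* A real application would call fnStopDHCP() at this point. */
    memcpy(live->ucOurIP, default_ip, IPV4_LENGTH); /* back to the factory default */
    return 1;
}
```

Stopping DHCP before restoring the address matters because the module keeps the address at 0.0.0.0 for as long as it is still negotiating.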

Regards

Mark

Offline aaronlawrence

  • Jr. Member
  • **
  • Posts: 66
Re: DHCP behaviour
« Reply #8 on: March 04, 2010, 02:42:55 AM »
Continuing this discussion:

What ARE we supposed to do after the various DHCP events?
Do we have to stop it, restart it, or will it continue by itself?
DHCP_COLLISION suggests it will retry by itself?
DHCP_MISSING_SERVER, since your example shows stop, presumably will continue by itself... but your comment above suggests we need to restart it.


Offline mark

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3234
Re: DHCP behaviour
« Reply #9 on: March 04, 2010, 03:58:13 PM »
Hi Aaron

The following events are possible:

DHCP_SUCCESSFUL - the DHCP process has completed successfully. There is now a local IP address (and other networks settings) available and this can now be used. Note that before this event occurs no network operation will be possible since there is no IP address available.

DHCP_COLLISION - this means that the IP address given by the server was found to already be in use on the local network (before declaring DHCP_SUCCESSFUL the module first sends some PINGs out to ensure that the newly obtained address is not already being used - in this case there was an answer to this check). This event doesn't need to be acted on since it is just information. The DHCP module will decline the address that it just received and start the process again after a delay of 11..12s (this is specified).

DHCP_LEASE_TERMINATED - this means that the original lease time expired and attempts to extend the lease (or get a new address) failed. This points to the DHCP server having been deactivated or similar, since it means that the various attempts leading up to the lease timeout all failed. It also means that the original IP settings have now been lost and no network activity is allowed. The process automatically continues after the 11..12s delay to try to get new settings from any source.

DHCP_MISSING_SERVER - this means that all attempts to obtain IP settings have failed. This occurs when the specified exponential repetition timeout reaches its maximum value of 64s and is set back to 4s. The process doesn't actually stop but starts again from the beginning; if there is never a server, the event will be received every few minutes. The network will not be usable until DHCP is either eventually successful or the application decides to stop the process and use some other fixed settings. This event could follow an original DHCP_COLLISION or DHCP_LEASE_TERMINATED, indicating that the retries were also unsuccessful.
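
To give a feel for the timing, here is a small sketch of the retry schedule. It assumes plain doubling from 4s up to 64s, ignores the random jitter that the DHCP spec adds to each interval, and assumes the event is raised once the 64s interval has elapsed; the function names are hypothetical and the exact moment the module reports the event may differ.

```c
#include <assert.h>
#include <stddef.h>

#define RETRY_FIRST_S 4u    /* first repetition interval, in seconds */
#define RETRY_MAX_S   64u   /* maximum interval before starting over */

/* Advances the interval to the next value. Returns 1 when the maximum was
 * reached and the interval has been set back to the start (the point at
 * which DHCP_MISSING_SERVER is assumed to be reported), otherwise 0. */
int next_retry_interval(unsigned *interval_s)
{
    assert(interval_s != NULL);
    if (*interval_s >= RETRY_MAX_S) {
        *interval_s = RETRY_FIRST_S;
        return 1;
    }
    *interval_s *= 2u;
    return 0;
}

/* Total length of one full retry cycle: 4 + 8 + 16 + 32 + 64 seconds. */
unsigned cycle_length_s(void)
{
    unsigned interval = RETRY_FIRST_S;
    unsigned total = 0u;
    int wrapped = 0;

    while (!wrapped) {
        total += interval;                        /* wait out the current interval */
        wrapped = next_retry_interval(&interval); /* then move to the next one     */
    }
    return total;
}
```

Under these assumptions one cycle lasts 124 seconds, which is in line with the roughly 2 minute timeout mentioned earlier in the thread.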

Therefore it is in fact not necessary to restart on such an event (I may not have noted this correctly before), but the application will usually choose to stop the process (give up) if it becomes clear that there is no DHCP server available and network operation would otherwise be impossible - note that without successful DHCP, or after a lease timeout without new settings, a device is not allowed to use the network. Since this doesn't apply to a device with fixed settings, the changeover to fixed settings is really the only alternative in such a case.

Note that the details of the process, including the timers involved, are in the document http://www.utasker.com/docs/uTasker/uTaskerDHCP.PDF

Regards

Mark



Offline aaronlawrence

  • Jr. Member
  • **
  • Posts: 66
Re: DHCP behaviour
« Reply #10 on: March 06, 2010, 02:09:06 AM »
Quote
DHCP_SUCCESSFUL - ... Note that before this event occurs no network operation will be possible since there is no IP address available.

That's what I expected, but it doesn't seem to be totally true. My code was opening a UDP and a TCP socket anyway, both succeeded, and I was able to receive and transmit on the UDP socket. Transmitted packets went out with an address of 0.0.0.0. I suppose this is because DHCP itself uses UDP, and there is nothing to stop the application from using it too.

Quote
DHCP_MISSING_SERVER - this means that all attempts to obtain IP settings have failed. This occurs after the specified exponential repetition timeout reaches the maximum value of 64s and is set back to 4s.

Is this by the spec? It doesn't make much sense to me to jump back to frequent attempts; it would seem more logical to keep trying at 64s intervals. But I haven't read the RFC.

Quote
the application will usually choose to stop the process (give up) if it becomes clear that there is no DHCP server available and network operation would otherwise be impossible

This is where you would drop into zeroconf/automatic private IP protocol :)

Quote
Note that the details of the process, including the timers involved, are in the document http://www.utasker.com/docs/uTasker/uTaskerDHCP.PDF
Yes, this was a good document but does not cover the events, hence my question.


Offline mark

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3234
Re: DHCP behaviour
« Reply #11 on: March 06, 2010, 12:36:42 PM »
Hi Aaron

There is nothing actively stopping Ethernet frames from being sent by the application (on IP 0.0.0.0).
Normally, if this were an issue, the DHCP_SUCCESSFUL event would be used to start the application part so that it doesn't try to operate before then.
A DHCP_MISSING_SERVER would be used to stop all application activity.
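
One way this gating might look, as a sketch only: the event codes, app_state_t and both functions are hypothetical names chosen for this example, and treating DHCP_LEASE_TERMINATED like DHCP_MISSING_SERVER is an assumption based on the event descriptions above (the settings are lost in both cases).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative event codes; the real values come from the uTasker headers. */
enum {
    EV_DHCP_SUCCESSFUL = 1,
    EV_DHCP_COLLISION,
    EV_DHCP_LEASE_TERMINATED,
    EV_DHCP_MISSING_SERVER
};

/* Hypothetical application state: one flag that gates all network use. */
typedef struct {
    bool network_ready;
} app_state_t;

/* Updates the gate from a DHCP event delivered to the application task. */
void app_on_dhcp_event(app_state_t *app, int event)
{
    assert(app != NULL);
    switch (event) {
    case EV_DHCP_SUCCESSFUL:
        app->network_ready = true;    /* valid settings: start network activity */
        break;
    case EV_DHCP_LEASE_TERMINATED:    /* settings lost (assumed to close the gate) */
    case EV_DHCP_MISSING_SERVER:
        app->network_ready = false;   /* stop all application network activity */
        break;
    default:
        break;                        /* e.g. a collision: information only */
    }
}

/* Returns true if the application is allowed to transmit right now. */
bool app_may_send(const app_state_t *app)
{
    assert(app != NULL);
    return app->network_ready;
}
```

The application would check app_may_send() before opening sockets or transmitting, which avoids frames leaving with a source address of 0.0.0.0.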

Whether this is useful or practical is another question.

Regards

Mark