µTasker Forum

µTasker Forum => µTasker general => Topic started by: alager on December 08, 2010, 10:20:17 PM

Title: LAN_REPORT_ACTIVITY activity LED stops after a while
Post by: alager on December 08, 2010, 10:20:17 PM

We've noticed that when LAN_REPORT_ACTIVITY is set, the activity LED stops working after a while. The only way to get it back to normal is to reset the chip (MCF52233). The link LED still works and is updated normally when this occurs. And for completeness: yes, there is network traffic to the device.

Any ideas?

Title: Re: LAN_REPORT_ACTIVITY activity LED stops after a while
Post by: mark on December 09, 2010, 01:12:20 AM
Hi Aaron

I haven't heard of this before, but a possible explanation is that the queue to the task controlling the LED has a problem. It is set to be quite large (check in TaskConfig.h) because a message is sent each time there is activity on the interface, so there can be a lot of messages. If the queue overflows it could cause problems. I would also set the queue size to a multiple of the event message length (only fixed-size events are sent), because then an event can never be sent that only partly fits into the queue (which could cause corruption that effectively blocks it). The worst case is then a lost event, which is not really important since the next one will blink the LED again anyway. The Ethernet task's queue is set like this for the same reason [(HEADER_LENGTH * 12)]; in fact the Ethernet task can't be sent more messages than it has Ethernet input buffers (this is limited by the input buffer operation). The activity task may, however, also receive unrelated events (?).

If you are in this state and can debug, it would be interesting to know what happens when the interrupt tries to send an event to the task (whether the queue is stuck full, for example).

If you are not using V1.4, also check the following patch in V1.4 concerning the events:

   07.12.2009 Avoid blocking interrupt when sending LAN_REPORT_ACTIVITY event messages {113}

This stops the interrupt being blocked for longer than expected... I don't know whether it is related though.



P.S. I just noticed that you say the link LED is OK. This would probably exclude queue corruption, which makes the problem more difficult to explain, unless a timer can be lost at the task (the activity events set the LED and start a timer, which should cause a blinking effect).
Or maybe the LED output control is still working but the port has been reconfigured by another module (e.g. set to input...?).
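
The sizing rule suggested above can be sketched in C as follows. This is only a minimal illustration under stated assumptions, not µTasker code: EVENT_LENGTH stands in for HEADER_LENGTH and is assumed to be 5 bytes, ACTIVITY_QUEUE_SIZE is a made-up name, and whole_events_that_fit() and leftover_bytes() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: fixed-size interrupt events are 5 bytes long (stand-in for HEADER_LENGTH). */
#define EVENT_LENGTH 5u

/* Hypothetical queue size: a whole number of events, in the style of (HEADER_LENGTH * 12). */
#define ACTIVITY_QUEUE_SIZE (EVENT_LENGTH * 12u)

/* Compile-time check that the queue can be filled exactly by whole events. */
_Static_assert(ACTIVITY_QUEUE_SIZE % EVENT_LENGTH == 0,
               "queue size must be a multiple of the event length");

/* Number of complete events a queue of the given size can hold. */
size_t whole_events_that_fit(size_t queue_size)
{
    return queue_size / EVENT_LENGTH;
}

/* Bytes left over once the queue holds as many whole events as fit.
   A non-zero result means a further event could be stored only in part. */
size_t leftover_bytes(size_t queue_size)
{
    return queue_size % EVENT_LENGTH;
}
```

For example, leftover_bytes(32) is 2, so a 32 byte queue could end up holding 2 bytes of a truncated event, while leftover_bytes(ACTIVITY_QUEUE_SIZE) is 0. The guarantee only holds as long as every message placed in the queue has the same fixed length.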

Title: Re: LAN_REPORT_ACTIVITY activity LED stops after a while
Post by: alager on December 09, 2010, 06:07:13 PM
I think it may be the queue size: I had reduced the size of the queue for the fnNetworkIndicator from LARGE_QUE to SMALL_QUEUE. I'll change it back and see if things return to normal.

Title: Re: LAN_REPORT_ACTIVITY activity LED stops after a while
Post by: alager on December 23, 2010, 09:49:24 PM
Just following up on this topic, it was the queue size.  Once I returned the network indicator task to the large queue size, the activity LED issue disappeared.

I do have some questions regarding this though.
1) What gets corrupted when the queue overflows, and can this cause problems in other areas/tasks/timers etc.?
2) Is it possible to overflow the large queue?  What keeps it from overflowing?
3) Is there a way to keep the queue from overflowing, so that a smaller queue can be used? In this case it's okay to miss a message on occasion, since it just means a missed potential blink.


PS. I do not have the above mentioned bug fix {113} running.  I've just created a task here to go through all the diffs between our current project (1.3 based) and 1.4, to add those items that make sense.

Title: Re: LAN_REPORT_ACTIVITY activity LED stops after a while
Post by: mark on December 25, 2010, 02:58:45 PM
Hi Aaron

There is no simple answer because what happens depends on various factors. However, the following can be said:
1) A queue overflow is not an overflow in the sense that something else gets overwritten. It means that not all of a message could be put into the queue.
Queues have a finite size and all work with a standard queue driver. This queue driver will put as much of a message into the defined queue as fits; anything that doesn't fit is discarded. This means that only a part of a message may be committed to the queue when there is not enough space for a complete message.
2) The messages that we are interested in here are interrupt event messages. These are often sent from interrupt routines (as in your particular case) and have a fixed length of 5 bytes. If there is not enough queue space for these 5 bytes, anywhere from 0 to 4 bytes may be copied.
3) No other tasks, areas or timers are directly affected by a lost or corrupted message.
4) How the receiver reacts to a partial message depends on how it is written. It will generally read the header length and then treat the 5 byte content. Usually it is assumed that 5 bytes have been read (when zero is not returned by the read()), so it may then interpret random content if not all of the event was put into the queue.
5) The length of queue needed to ensure that the queue will never get full depends on the speed at which events are generated and the speed at which the queue is emptied. The size required to never get a problem in this case would probably be a queue size >= (5 bytes * maximum Ethernet buffers), but having one twice that size should ensure some reserve too.
6) In the case of the network indicator task, which receives only interrupt events, the following change could be useful. Replace

    while (fnRead(PortIDInternal, ucInputMessage, HEADER_LENGTH)) {      // check input queue

with

    while ((fnRead(PortIDInternal, ucInputMessage, HEADER_LENGTH)) == HEADER_LENGTH) {      // check input queue

This would cause incomplete events to be read but ignored (on the assumption that a short read must be a partial message).

However, setting the queue size to a multiple of HEADER_LENGTH (5) would also ensure that no partial events can be entered into the queue.

7) In cases where a message loss would cause a problem, the technique discussed in the following thread can be used to avoid trying to put messages into the queue when they won't fit: http://www.utasker.com/forum/index.php?topic=1203.0

8) In this particular case it is probably not a big issue if the LED doesn't flash once when a message is lost, so there would be no benefit in checking for space. A generous queue size (if resources allow it) is of course the simplest way of ensuring that it never occurs.

9) I don't actually understand how the LED flashing could be stopped by a lost or partial event, since it seems as though your link up/down LED was still responding. This probably excludes the queue having become blocked by reading bad data. That would only happen if the queue reads became de-synchronised with the content (e.g. it read 5 bytes and the first was never the start of a telegram but some part of it), which shouldn't be able to happen anyway with its loop always clearing the queue before terminating (?).
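
The behaviour described in points 1), 2), 6) and 7) can be illustrated with a small stand-alone simulation. This is only a sketch under stated assumptions: SimQueue and the sim_* functions are hypothetical and are not the µTasker queue driver or fnRead(); the event length of 5 bytes is taken from the description above; and sim_write_if_fits() merely illustrates the general idea of checking for space before writing, not necessarily the method in the linked thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EVENT_LENGTH 5u          /* assumed fixed event size, as stated above */
#define MAX_QUEUE    64u         /* storage limit of this simulation only */

/* Hypothetical byte queue, NOT the µTasker queue driver. */
typedef struct {
    unsigned char data[MAX_QUEUE];
    size_t size;                 /* configured capacity in bytes (<= MAX_QUEUE) */
    size_t used;                 /* bytes currently stored */
} SimQueue;

void sim_init(SimQueue *q, size_t size)
{
    assert(size <= MAX_QUEUE);
    q->size = size;
    q->used = 0;
}

/* Store as many bytes as fit and discard the rest; returns the bytes stored. */
size_t sim_write(SimQueue *q, const unsigned char *msg, size_t len)
{
    size_t space = q->size - q->used;
    size_t n = (len < space) ? len : space;
    memcpy(&q->data[q->used], msg, n);
    q->used += n;
    return n;
}

/* Alternative for messages that must not be truncated: write all or nothing. */
size_t sim_write_if_fits(SimQueue *q, const unsigned char *msg, size_t len)
{
    if (q->size - q->used < len) {
        return 0;                /* drop the whole message, queue stays aligned */
    }
    return sim_write(q, msg, len);
}

/* Remove up to len bytes from the front of the queue; returns the bytes read. */
size_t sim_read(SimQueue *q, unsigned char *buf, size_t len)
{
    size_t n = (len < q->used) ? len : q->used;
    memcpy(buf, q->data, n);
    memmove(q->data, &q->data[n], q->used - n);
    q->used -= n;
    return n;
}

/* Receiver loop that only handles complete events; returns how many it handled. */
size_t drain_complete_events(SimQueue *q)
{
    unsigned char event[EVENT_LENGTH];
    size_t handled = 0;
    while (sim_read(q, event, EVENT_LENGTH) == EVENT_LENGTH) {
        handled++;               /* a real task would set the LED and start its timer here */
    }
    return handled;
}

/* Send `count` events into a queue of `size` bytes;
   returns the number of bytes belonging to a truncated event at the end. */
size_t partial_bytes_after_burst(size_t size, size_t count)
{
    static const unsigned char event[EVENT_LENGTH] = { 0, 0, 0, 0, 1 };
    SimQueue q;
    sim_init(&q, size);
    for (size_t i = 0; i < count; i++) {
        (void)sim_write(&q, event, EVENT_LENGTH);
    }
    return q.used % EVENT_LENGTH;
}
```

In this model, a burst of 10 events into a 32 byte queue leaves 2 bytes of a truncated event at the end of the queue, whereas a 30 byte queue stores only whole events and simply loses the ones that arrive while it is full.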