Author Topic: A USB problem  (Read 16213 times)

Offline Chuck

  • Newbie
  • *
  • Posts: 9
    • View Profile
A USB problem
« on: February 05, 2010, 12:32:25 AM »
Mark,

While running a USB stability test for my project, I found a subtle but very consistent problem.
I have a task that sends a fixed-size block of data (4 to 300 bytes, depending on the setting) to a PC over USB every 0.1 s.

It runs OK for days if the data size is small (probably < 64 bytes) OR Ethernet is not connected.
When the data size is bigger AND Ethernet is connected to the LAN, USB communication hangs within a few hours.

It happens with no active Ethernet communication other than broadcasts from a switch.
There is UART communication going on all the time, but it does not seem to cause any problem.

My guess is that Ethernet interrupts taking place while the USB stack prepares the second or subsequent packets may somehow cause a problem.
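
To make the size threshold concrete, here is a minimal sketch of how a payload maps onto packets. It assumes the endpoint's maximum packet size is 64 bytes (the thread only says "probably < 64"), and usb_packets_for_payload() is a hypothetical helper, not a uTasker function.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the endpoint in use has a 64-byte maximum packet size. */
#define EP_MAX_PACKET_SIZE 64u

/* Hypothetical helper: number of data packets needed for payload_len bytes. */
size_t usb_packets_for_payload(size_t payload_len, size_t max_packet)
{
    assert(max_packet > 0);
    if (payload_len == 0) {
        return 0;                                        /* nothing to send in this sketch */
    }
    return (payload_len + max_packet - 1) / max_packet;  /* round up */
}
```

With these assumptions a 4-byte report fits in one packet, while a 300-byte report needs five, so only the larger settings ever involve a "second or subsequent" packet.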

How can I find a clue for this problem?

Chuck



Offline mark

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3236
    • View Profile
    • uTasker
Re: A USB problem
« Reply #1 on: February 05, 2010, 01:36:07 AM »
Hi Chuck

If you have Ethernet and USB at the same time I assume that you are working with an M5225X.
Check whether LAN_REPORT_ACTIVITY is active and verify that you have the patch in M5223x.c:
   07.12.2009 Avoid blocking interrupt when sending LAN_REPORT_ACTIVITY event messages {113}
In any case, remove LAN_REPORT_ACTIVITY: it is not useful with the M5225X because its PHY controls the LEDs well.

Ethernet interrupts should be much less disturbing than UART interrupts since they don't actually do much themselves and there won't be that many when just broadcasts are received.

Another possible explanation (just a thought) is that there may be a DMA interaction since Ethernet and USB are using DMA - but this is very unlikely.

If the first point proves to be a dead-end I will try repeating the test to capture it.

Regards

Mark


Offline Chuck

Re: A USB problem
« Reply #2 on: February 05, 2010, 06:52:26 PM »
Hi Mark,

I forgot to mention that I am using an M52259 with uTasker V1.4, not V1.42.
The patch you mentioned is not in the version that I am using.
However, LAN_REPORT_ACTIVITY is not used, because the LED indicators are connected to an external PHY.

Since you mention DMA, here is something unclear to me about DMA.
If two DMA controllers try to access the same memory block, does one have to wait until the other finishes, or do they run perfectly in parallel?

If it is not perfect, can Ethernet DMA throw off USB transmission timing?

If the USB stack sends a corrupted packet due to bad timing and it does not get ACKed, can it retransmit the packet later, or is the packet lost and USB hangs?

I know there are too many ifs, but I don't know anything about two DMAs and USB retransmission, and I am just curious. :-)
Thanks.

Chuck.

Offline mark

Re: A USB problem
« Reply #3 on: February 05, 2010, 08:30:15 PM »
Hi Chuck

I don't know the details of the DMA in this case because it belongs to the peripherals themselves; it is not the general purpose DMA controller that the user can program. I think we can assume, however, that the peripheral DMA engines generally work without errors when several of them attempt accesses at the same time, even if the accesses are to the same address.

What the peripheral DMA controllers cannot do is access FLASH directly. They also need to be given permission to access RAM, which is done. FLASH access takes place via a back-door address, which the uTasker USB driver also handles automatically so that the user can send data originating from const tables etc. Depending on how the USB is configured, it will either send data defined by the user with a length and a pointer (FLASH based data is taken care of automatically here if necessary), or it will take data from an output buffer (the user writes the data to the buffer, which is RAM based) and the interrupt driver does the rest.

There may be some priority settings to control the various peripheral DMA channels, but I don't know the details. Ethernet probably has the highest priority by default, and there may be some burst control setting, but again I don't know the details. My presumption is that the standard settings are well adjusted to typical use and so would not be a potential cause of a device stopping.

If I remember correctly, if the USB DMA controller tries to access memory that it is not allowed to access (eg. attempt to read directly from FLASH) it results in the USB controller sending frames filled with 0x00. It doesn't result in the USB stalling. Therefore it seems that this will not be a possible reason.

Since the exact transmission method may be a factor, it may be best if you send me your USB interface code (usually the file usb_application.c, but possibly with additional files that interact with it) so that I can repeat the tests here. I expect there is some reason why a USB interrupt is missed or wrongly handled (I don't see any relation to Ethernet yet), which then leaves the USB output buffer driver in a blocked state. This may or may not cause other endpoints or the opposite direction to fail as well; maybe you can identify which state these are in?

Regards

Mark


Offline Chuck

Re: A USB problem
« Reply #4 on: February 05, 2010, 11:20:25 PM »
Hi Mark,

I just ran uTasker demo program V1.4 on M52259 eval board and reproduced the problem in the following way.

Connect a USB cable and open the corresponding virtual COM port in a HyperTerminal session (HT1).
Get the uTasker main menu.
Select 6 Go to USB menu.
Select usb-serial menu.

Connect a serial cable and open the port in a second HyperTerminal session (HT2).
In the HyperTerminal menu, go to Transfer/Send Text File...
Pick a long text file (~100Kb) and start sending.
The text displays on HT1.

Go to a web browser and open the uTasker home page running on the eval board.
Or hit the refresh button.

Once the web browser starts loading, HT1 stops displaying.
I think this is the problem that I have.

Regards,
Chuck

Offline mark

Re: A USB problem
« Reply #5 on: February 06, 2010, 08:50:29 PM »
Hi Chuck

Thanks for the details. I repeated the test and could confirm the problem.

In fact the CPU lands in an unhandled interrupt when the test is run, sometimes quite quickly and sometimes only after a while. What I did was send 2 large txt files in both directions at the same time and then do lots of refreshes of the start page of the web server - this has quite a large picture in it, which causes a lot of TCP frames to be sent and helps the stress test.

At first I thought that the solution was quite simple, because I realised that both the USB-OTG interrupt and the Ethernet Rx interrupt are set to level 6, priority 7. This is because the USB driver was originally developed on the M5222X, which has USB but not Ethernet. The Kirin3 arrived later, and there the interrupt level/priority settings overlap, which is not actually allowed.

So I simply dropped the USB interrupt's priority to level 6, priority 0 (for example) and expected the occasional strange interrupt to be solved.

At first it did seem better, but it still eventually resulted in an unexpected interrupt. I realised that it was a spurious interrupt and that the interrupt level was being set to 3. I have never experienced a spurious interrupt on the Coldfire before, but I could see that it always happened when the Ethernet interrupt, the UART Rx interrupt and the UART Tx DMA interrupt were set. I am not sure which combination is critical, but both the UART Rx interrupt and the UART Tx DMA interrupt are level 3 interrupts (my feeling is that it is not necessarily USB related, since the USB interrupt was not always set when it happened).

I tried setting the two UART interrupts to different levels but this didn't help - I also checked carefully that no interrupt (level/priority) combination collides.
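
A check for colliding level/priority pairs can be sketched as below. This is only an illustration: irq_setting_t and find_irq_collision() are hypothetical names, and the table of sources has to be filled in by hand from the project's configuration rather than read from the interrupt controller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one enabled interrupt source. */
typedef struct {
    const char   *name;
    unsigned char level;     /* 1..7 */
    unsigned char priority;  /* 0..7 */
} irq_setting_t;

/* Returns the index of the first entry that shares its level/priority pair
   with an earlier entry, or -1 if every pair is unique. */
int find_irq_collision(const irq_setting_t *irqs, size_t count)
{
    unsigned char seen[8][8] = {{0}};
    size_t i;

    assert(irqs != NULL || count == 0);
    for (i = 0; i < count; i++) {
        unsigned char l = irqs[i].level & 7u;
        unsigned char p = irqs[i].priority & 7u;
        if (seen[l][p]) {
            return (int)i;          /* this pair is already in use */
        }
        seen[l][p] = 1;
    }
    return -1;
}
```

Feeding it the original settings (Ethernet Rx and USB-OTG both at level 6, priority 7) reports the USB-OTG entry; after moving USB-OTG to level 6, priority 0 it reports no collision.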

So I disabled the UART Tx DMA and let the UART transmitter run in interrupt mode. Then I could send files of about 10MByte in both directions and refresh the web server as much as I wanted without the problem. I counted about 20'000 TCP receptions and about 30'000 TCP transmissions during the intensive bi-directional USB<->UART transfers. This therefore seems to be a workaround for the spurious interrupt.

Therefore these two changes should make your system stable:

1) Change the USB-OTG interrupt setting (in app_hw_m5223x.h) from
#define USB_OTG_INTERRUPT_PRIORITY          (INTERRUPT_LEVEL_6 | INTERRUPT_PRIORITY_7)
to
#define USB_OTG_INTERRUPT_PRIORITY          (INTERRUPT_LEVEL_6 | INTERRUPT_PRIORITY_0)
[it can have any unused combination - CAN is not active so this is actually using the CAN11 interrupt as default in the project]

2) Disable UART DMA operation (same file)
  //#define SERIAL_SUPPORT_DMA // disabled so Tx uses interrupt on each character

As mentioned before, I have never experienced a spurious interrupt before, and I also don't know why running the UART Tx in DMA mode causes this to happen. In addition, I don't understand why it has only been noticed with USB and Ethernet operating at the same time (note, however, that we did mention that DMA may be involved, but what is its relation to the interrupts?).

To get to the bottom of this will require further investigation. How best to solve it also needs to be looked into. It may be that simply handling the spurious interrupt is adequate, but I also have a feeling that some USB or UART data corruption occurs along with it: just before one case I captured some data corruption, so the basic disturbance probably needs to be identified (see terminal emulator output below when it once stopped before disabling UART Tx DMA).

Tell me how you get on and I will try to get some ideas about why such a combination can result in spurious interrupts taking place.

Regards

Mark



1) End of terminal emulator data as the spurious interrupt took place (from Virtual COM)

1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvw67890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcde


2) Reception from UART0

1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234567890abcdefghijklmnrstuvwxyz
1234567890abcdefghijklmnopqrstuvwxyz
1234


Note that the normal pattern is corrupted in both directions and, shortly afterwards, the CPU lands in the spurious interrupt handler (which hangs).
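
Spotting this kind of corruption by eye is tedious in long captures, so a small host-side check can help. The following is a minimal sketch, assuming the capture is available as one text string; first_corrupt_line() is a hypothetical helper and not part of uTasker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The test pattern used in the captures above. */
static const char TEST_PATTERN[] = "1234567890abcdefghijklmnopqrstuvwxyz";

/* Returns the 1-based number of the first complete line that differs from the
   pattern, or 0 if all complete lines match. A trailing partial line (no
   newline) is ignored because a capture can legitimately stop mid-line. */
size_t first_corrupt_line(const char *capture)
{
    const size_t pattern_len = sizeof(TEST_PATTERN) - 1;
    size_t line_no = 1;
    const char *line = capture;
    const char *eol;

    assert(capture != NULL);
    while ((eol = strchr(line, '\n')) != NULL) {
        size_t len = (size_t)(eol - line);
        if (len > 0 && line[len - 1] == '\r') {
            len--;                       /* tolerate CR LF line endings */
        }
        if (len != pattern_len || memcmp(line, TEST_PATTERN, len) != 0) {
            return line_no;              /* duplicated or missing characters */
        }
        line = eol + 1;
        line_no++;
    }
    return 0;
}
```

Run against the two captures above, it would flag the lines with the repeated "67890..." and "def" fragments and the line with the missing "opq".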


Offline mark

Re: A USB problem
« Reply #6 on: February 07, 2010, 10:46:32 PM »
Hi All

Please note that the investigation into the underlying problem is being discussed here:

http://forums.freescale.com/freescale/board/message?board.id=CFCOMM&message.id=8524#M8524


Regards

Mark


Offline mark

Re: A USB problem
« Reply #7 on: February 08, 2010, 01:19:16 AM »
Hi Chuck

From the work documented at the Freescale forum (see link in previous post) a problem with the XON/XOFF operation in UART TX DMA mode has been identified. Depending on your present configuration you may find that only the first change (USB-OTG priority correction) is needed.

The identified problem can occur when:
1) UART is operating in TX DMA mode - AND
2) The mode is set to SW flow control - AND generally
3) The rx buffer size (RX_BUFFER_SIZE in app_hw_m5223x.h) is set to a fairly small value so that loading (eg. due to intensive web server use) causes the high-water mark to be reached - increasing the buffer size may also be adequate in normal use
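
The role of the high-water mark in condition 3 can be sketched as follows. This is an illustrative model only, not the uTasker driver: rx_flow_t, rx_flow_update() and the water mark values are assumptions, while XON (0x11) and XOFF (0x13) are the usual ASCII control characters.

```c
#include <assert.h>
#include <stddef.h>

#define XON_CHAR  0x11   /* DC1 */
#define XOFF_CHAR 0x13   /* DC3 */

/* Hypothetical receiver-side flow-control state. */
typedef struct {
    size_t buffer_size;   /* total rx buffer size in bytes */
    size_t high_water;    /* fill level at which XOFF is requested */
    size_t low_water;     /* fill level at which XON is requested */
    int    peer_stopped;  /* non-zero once XOFF has been sent */
} rx_flow_t;

/* Call whenever the rx fill level changes. Returns XOFF_CHAR or XON_CHAR if a
   flow-control character should be transmitted now, otherwise 0. */
int rx_flow_update(rx_flow_t *fc, size_t fill)
{
    assert(fc != NULL && fc->low_water < fc->high_water
           && fc->high_water <= fc->buffer_size);
    if (!fc->peer_stopped && fill >= fc->high_water) {
        fc->peer_stopped = 1;
        return XOFF_CHAR;             /* ask the other side to stop */
    }
    if (fc->peer_stopped && fill <= fc->low_water) {
        fc->peer_stopped = 0;
        return XON_CHAR;              /* ask the other side to resume */
    }
    return 0;
}
```

With a small buffer the high-water mark is reached as soon as the application falls a little behind, which is when an XOFF has to be squeezed into the outgoing stream; with a larger buffer the same backlog never triggers it.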

I don't think that the same problem exists in HW flow control since I did a fair amount of tests with this recently, but I will check.

Using interrupt driven mode avoids the problem in the meantime until the driver is corrected to ensure correct SW flow control operation in the Tx DMA mode.

Please confirm that one of the workarounds solves your problem for the moment.

Regards

Mark

Offline Chuck

Re: A USB problem
« Reply #8 on: February 08, 2010, 07:03:26 PM »
Mark,

Wow, you have come a long way while I was away. Thank you for fixing the problem so nicely.
Because I am running the UARTs with hardware flow control, I just adopted the first fix to resolve the interrupt conflict.
It looks to be running OK so far. I will let you know if the problem lingers.

Thanks again

Regards,

Chuck

Offline mark

Re: A USB problem
« Reply #9 on: February 08, 2010, 10:28:20 PM »
Hi Chuck

I am pleased to hear that there is an improvement. If I had been using HW flow control for the tests I would probably also have been finished in half an hour or so, but the XOFF bug was not that easy to identify. Still, I suppose it is better to have found it now than later. It is in fact a bit tricky: I think that the flow control development and testing concentrated on stopping correctly, whereas in this case the processor has to quickly tell the other side to stop while it is in the middle of sending a block itself (easy to do in interrupt driven mode). The DMA part certainly complicates this, so I didn't try a quick fix before considering how it should best be done. I think it will be suitable to abort any transmission in progress, switch to tx interrupt mode to 'insert' the XOFF (and later the XON) and, after the character has been sent, switch back to DMA operation for the waiting block; this might need a bit of practice to get right... [nothing to do with USB in the end so USB does still look to be well behaved ;-) ].
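
The abort / insert / resume sequence described above could look roughly like this as a pure software model. It is a sketch under stated assumptions, not the eventual driver fix: uart_tx_t and both functions are hypothetical, no hardware registers are touched, and the caller is assumed to know how many bytes the DMA had already moved when it was stopped.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TX_IDLE, TX_DMA, TX_IRQ_CHAR } tx_mode_t;

/* Hypothetical software model of the transmitter state. */
typedef struct {
    tx_mode_t mode;
    const unsigned char *dma_ptr;   /* next byte of the block the DMA would send */
    size_t dma_remaining;           /* bytes of that block still unsent */
    unsigned char pending_char;     /* XON/XOFF waiting to go out */
} uart_tx_t;

/* Stop the DMA block where it is and queue one character in interrupt mode. */
void tx_insert_flow_char(uart_tx_t *tx, unsigned char c, size_t dma_bytes_done)
{
    assert(tx != NULL);
    if (tx->mode == TX_DMA) {
        assert(dma_bytes_done <= tx->dma_remaining);
        tx->dma_ptr       += dma_bytes_done;   /* remember where the block stopped */
        tx->dma_remaining -= dma_bytes_done;
    }
    tx->pending_char = c;
    tx->mode = TX_IRQ_CHAR;
}

/* Called from the (modelled) tx interrupt once the character has gone out. */
void tx_flow_char_sent(uart_tx_t *tx)
{
    assert(tx != NULL && tx->mode == TX_IRQ_CHAR);
    /* Resume the waiting block by DMA, or go idle if nothing is left. */
    tx->mode = (tx->dma_remaining > 0) ? TX_DMA : TX_IDLE;
}
```

The point of keeping dma_ptr and dma_remaining is that the interrupted block is continued from where it stopped rather than restarted, so no characters are duplicated or dropped around the inserted XOFF.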

Regards

Mark