Author Topic: SD Task locking (Read 4291 times)

AlexS · « **on:** May 27, 2021, 03:00:48 PM »

Hi,

Our firmware's architecture involves two FreeRTOS tasks, one running uTasker and the other periodically calling utWriteFile() at a frequency of around 50Hz. What I saw was that the fnMassStorage task would simply lock trying to talk to the SD card, essentially blocking all the other uTasker tasks from running. Unfortunately due to other issues I wasn't able to dig deeper into the problem, but was curious as to whether this has been seen before. We are currently running uTasker with Git commit SHA1: 3c417a7002c9d9840e025bb445208070a62327df

Thanks!

mark · « **Reply #1 on:** May 28, 2021, 12:29:58 AM »

Hi Alex

Is you are running utFAT accesses form a higher priority FreeRTOS task these should have direct access without passing through the mass storage task.
However it may be that the mass storage task (in the uTasker FreeRTOS task) is "polling" the SD card to see whether it is still there/operational, which could be risky due to the fact that HW accesses to the SD card may be interrupted by file system accesses from the higher priority task - possibly this could result in corruption leaving the mass storage task operating in a loop of some sort (?)

I would disable the polling option since the FreeRTOS task is also regularly accessing the card and will recognise in case it is removed and therefore the mass storage task will never be active (after the card has mounted). Like this such a conflîct woudl be excluded as cause of such difficulties.

Regards

Mark

AlexS · « **Reply #2 on:** June 03, 2021, 02:45:55 PM »

Hi Mark,

I only got the chance to look closer into this, both the datalogging FreeRTOS task and the uTasker FreeRTOS Task are running with the same priority. The only thing that's done through the datalogging FreeRTOS task is the actual file writing. Now, the way our application is structured, there's no guarantee that the user has an SD card inserted, nor that he wants to log data to it. Is there any document which details the operation of the fnMAssStorage() task? As I understand its operation now, I have a few questions:

1. I noticed that it rarely runs once data is constantly being written to the filesystem. I haven't dug too deep, but seems odd to me, as I don't see where the task is suspended in the code and I think it should run for every iteration of the main uTasker FreeRTOS task.
2. When you mention "disable the polling option", do you mean disabling the entire fnMassStorage() task? I was thinking to suspend it once I start logging to file and resuming it once that's done.

Regards,
Alex

Further observations: when the utWriteFile() locks, the fnMassStorage() task comes alive much more often.

mark · « **Reply #3 on:** June 04, 2021, 09:40:00 PM »

Hi Alex

The thing to bear in mind is that the mass storage task only really runs when the SD card (or memory stick) is being mounted.

When files are read or written the generally don't require the mass storage task to be used - they are function interacting directly with the FAT layer.

The other circumstances when the mass storage task runs is when it is counting the number of free clusters (this can take some time and so is performed as a background task) when the user does a directory inquiry. Or when the drive is being re-formatted.
If there is no card insert switch the option
#define T_CHECK_CARD_REMOVAL ((DELAY_LIMIT)(SEC * 10)) // if the card has no detection switch it can be polled to detect removal
is usually used.

This will then read a card's register every 10s as a method of detecting whether the card is still inserted. This is the one that I am referring to due to the fact that in a multi-tasking system this and normal read/writes may pre-empt each other and cause a failure.

Should a problem be detected with the SD card that makes it no longer look mounted the detection polling will stop (since it knows it is not there) and a faster polling (3s?) for insertion/mounting started. This possibly explains the frequency you see. You possibly also see that it tries a few times quite rapidly and, when mounting was not possible, it then waits for a longer period before it performs a few faster attempts again.

If there is a conflict due to pre-emption you may find that disabling T_CHECK_CARD_REMOVAL (and thus the poll every 10s) ensures the mass storage task never operates in normal use.

Good luck

Regards

Mark

AlexS · « **Reply #4 on:** June 07, 2021, 03:19:06 PM »

Thanks, Mark! Unfortunately it's still doing it even after removing the T_CHECK_CARD_REMOVAL define. What I'm seeing is that the fnMassStorageTask() is activated by the fnSendSD_command() .

I've started going into the code behind the SD operations and I can get an idea of what is happening, but not why I'm afraid. In my datalogging FreeRTOS task I call utWriteFile(), as I mentioned earlier. As it writes data, I can see that after a short time, there are more and more calls to fnSendSD_command() required to perform the write. That also activates the MASS_STORAGE task which then gets blocked at the line below and locks the main processing task as well. I'm writing in 512 bytes chunks 50 times per second, so I don't expect there to be any issues with write speed.

Code: [Select]

while ((SDHC_PRSSTAT & (SDHC_PRSSTAT_DLA | SDHC_PRSSTAT_SDOFF)) == SDHC_PRSSTAT_DLA) { // ensure that the previous command has completed (in multiple block reads the clock may be gated off to pause and so this is accepted)

What I also noticed is that utWriteFile() returns success even if the underlying fnCommitFileInfo() call returned ERROR_CARD_TIMEOUT.

PS: The problem is much more likely to appear as the system is responding to relatively high frequency pin interrupts that have the highest level of priority in the system and thus can pre-empt any other operation, which makes me think the issue is somehow directly related to that. Furthermore, it comes only when the writes pass the sector size boundary of 4096 bytes (every 8 writes). Hope this information helps.

mark · « **Reply #5 on:** June 09, 2021, 04:46:09 PM »

Alex

The mass storage task reschedules itself in some circumstances. Look for things like (with OWN_TASK)

uTaskerStateChange(OWN_TASK, UTASKER_ACTIVATE) and uTaskerStateChange(OWN_TASK, UTASKER_GO)

It mainly does this when mounting an formatting cards and so I wouldn't normally expect it to do so in 'normal' operation.

However there are such in fnSendSD_command(), which can be used when the card reports that it is busy for a certain period of time. The idea is that the system can yield and then check again later, although I believe this is not really used since fnSendSD_command() is called in a loop anyway.

Assuming that the basic issue is conflicts between the task running and the writes (pre-emptive) you could check to see whether one of these is being triggered (maybe by the pre-emptive writes) and remove the re-scheduling commands (possibly redundant anyway).
In any case I would keep a track on the state on the mass storage task and and avoid calling SD card operations when it is not in the stopped state (like a mutex on it) - which may avoid conflicts leading to problems. If the mass storage task never is scheduled again after mounting this would also do it (better) - if the above is an explanation of why it is being set to that state.

Good luck

Regards

Mark

AlexS · « **Reply #6 on:** June 14, 2021, 12:09:32 PM »

Quote

Assuming that the basic issue is conflicts between the task running and the writes (pre-emptive) you could check to see whether one of these is being triggered (maybe by the pre-emptive writes) and remove the re-scheduling commands (possibly redundant anyway).

I made it such that I never attempt to write while the MASS_STORAGE task is active - indicating there're still operations to be finalized.

I'm debugging using four pins to signal various events across the application and can see that the writes start taking longer for a particular cluster and that the locks that actually bother me are coming from the MASS_STORAGE scheduling that sometimes occurs from this line

Code: [Select]

fnLoadPartialData(ptr_utDisk, ulClusterSector, (unsigned char *)&ulCluster, (unsigned short)((ptr_location->ulCluster & 0x7f) * sizeof(unsigned long)), sizeof(unsigned long))

in the

Code: [Select]

fnNextSectorCreate() function. To be honest, I'm not sure I can debug this further as it takes a rather long time to reproduce (around 30 minutes) and there's limited information that I can gather each time. My main concern is keeping the other uTasker tasks running and try to somehow recover the SD card operations afterwards.

Attached are two screenshots from my logic analyzer showing the events before a crash. "normal.png" shows what normal operation looks like, while "locking.png" shows the events just before everything locks.

D1 - low - MASS_STORAGE task active (when everything runs as expected, there's no activity here). Just before a crash I can see loads of postponed writes being done through the MASS_STORAGE_TASK
D2 - low - my datalogging task writing to file
D3 - toggling - each state is one full cycle of processing activity that needs to happen in the application
D4 - low - moving over a cluster boundary

My conclusions are that the SD cards starts to react much slower than usual (high activity on the MASS_STORAGE task), then, as it tries to move to the next cluster, it locks while trying to read/create the required filesystem data for the next cluster.

mark · « **Reply #7 on:** June 15, 2021, 03:39:14 PM »

Hi Alex

I have used SD card to log data and have seen that the sometimes become very slow for a short amount of time. I believe that this is when the card is performing house keeping functions (clearing up deleted sectors and such) and these periods of remaining busy can last up to around half a second.

Since you see the mass storage task become active at the point in time the problem starts I assume that such a period of internal SD card activity has kicked in and the access timeout kicks the mass storage tasks into a polling mode after which (I still expect due to conflicts caused by pre-emption) a mass storage task subroutine hangs rather than timing out.

I would remove the mass storage task polling triggers (see previous post) since then the mass storage task will never start polling and only the write access will be affected by the slow accesses (timeouts) which should exclude any risks of pre-empting the mass storage task's polling operation.

This will not make the accesses faster since if the card is reporting that it is busy for half a second the writing task will need to wait but it will not block the overall system operation.

Regards

Mark

AlexS · « **Reply #8 on:** June 16, 2021, 12:06:50 PM »

Hi Mark,

Good and bad news

The good news is that the problem seems to have been fixed. The bad news is that if I remove the code that switches the mass storage task to polling mode from the fnSendSDCommand() function, the SD card initialization doesn't work at all anymore. So what I've done as a quick test and hack was declare a global variable that doesn't allow the mass storage task to be switched to active state once I begin writing to file and this seems to have cured the problem. Saying that, I need a better permanent solution of synchronizing, I was thinking of keeping this global variable solution and just adding a mutex to synchronize access to it from the two FreeRTOS tasks. Can you see any potential issues with this approach?

PS: There are a couple of SD-related functions that allocate relatively large buffers (512 bytes) on the stack. Is that intentional?

Thanks!

mark · « **Reply #9 on:** June 17, 2021, 10:54:05 PM »

Alex

I am pleased to hear that you have found a workaround. Indeed, the mass storage task need to be able to perform polling during the mounting phase since this takes some time with the card reporting that is is busy at different stages.
As with any system that is shared by users that can pre-empt each other some form of protection is usually required and taking the mass-storage task's polling during normal operation out of the equation looks to have proven that one. Generally I think that some form of general mutex on any use of the file system would be the right way to do it.

There are some write operations that require a sector of the card to be first read, modified, and then written back. When these are performed in a single function a 512 byte buffer on the stack is used so what you have seen is normal. Any calling tasks require adequate stack size for this.

Regards

Mark¨

µTasker Forum

News:

Author Topic: SD Task locking (Read 4291 times)

AlexS

SD Task locking

mark

Re: SD Task locking

AlexS

Re: SD Task locking

mark

Re: SD Task locking

AlexS

Re: SD Task locking

mark

Re: SD Task locking

AlexS

Re: SD Task locking

mark

Re: SD Task locking

AlexS

Re: SD Task locking

mark

Re: SD Task locking