Assuming that the basic issue is a conflict between the running task and the (pre-emptive) writes, you could check whether one of these is being triggered (maybe by the pre-emptive writes) and remove the re-scheduling commands (which are possibly redundant anyway).
I made it so that I never attempt to write while the MASS_STORAGE task is active, since activity there indicates there are still operations to be finalized.
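To illustrate the gating (a minimal sketch, not my exact code): iMassStorageBusy stands in for whatever flag the application uses to track MASS_STORAGE activity, and utWriteFile() is the standard utFAT write call:

    #include "config.h"                           /* standard uTasker project header */

    extern volatile int iMassStorageBusy;         /* placeholder flag: non-zero while the
                                                     MASS_STORAGE task has pending work */

    /* Attempt a datalogging write, but only when the storage task is idle */
    static int fnTryLogWrite(UTFILE *ptr_utFile, unsigned char *ptrData, unsigned short usLength)
    {
        if (iMassStorageBusy != 0) {
            return -1;                            /* operations still to be finalized - defer */
        }
        return utWriteFile(ptr_utFile, ptrData, usLength); /* utFAT write */
    }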
I'm debugging using four pins to signal various events across the application (the pin legend, and a sketch of how the pins are driven, are below). I can see that the writes start taking longer for a particular cluster, and that the locks that actually bother me come from the MASS_STORAGE scheduling that sometimes occurs from this line:
fnLoadPartialData(ptr_utDisk, ulClusterSector, (unsigned char *)&ulCluster, (unsigned short)((ptr_location->ulCluster & 0x7f) * sizeof(unsigned long)), sizeof(unsigned long))
in the
fnNextSectorCreate()
function. To be honest, I'm not sure I can debug this much further: it takes rather a long time to reproduce (around 30 minutes) and there's limited information I can gather each time. My main concern is keeping the other uTasker tasks running and somehow recovering the SD card operations afterwards.
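One idea I have for recovery (just a sketch of the approach, not tested): time each write and, when writes start taking abnormally long - the precursor I see before the lock-up - stop logging and re-initialise the card before the hard lock happens. fnMillis() and fnReInitialiseSD() below are placeholders for whatever the port provides, and the threshold is a guess:

    #define SD_WRITE_SLOW_MS    500                /* writes normally finish well under this */

    extern unsigned long fnMillis(void);           /* placeholder: free-running millisecond tick */
    extern void fnReInitialiseSD(void);            /* placeholder: re-mount / power-cycle the card */

    static int fnGuardedWrite(UTFILE *ptr_utFile, unsigned char *ptrData, unsigned short usLength)
    {
        unsigned long ulStart = fnMillis();
        int iResult = utWriteFile(ptr_utFile, ptrData, usLength);
        if ((fnMillis() - ulStart) > SD_WRITE_SLOW_MS) { /* the slow-down precursor */
            fnReInitialiseSD();                    /* recover the card before it locks hard */
            return -1;                             /* tell the caller to re-open/retry */
        }
        return iResult;
    }

That obviously can't catch a call that never returns, but it should catch the slow-down that precedes the lock in my traces.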
Attached are two screenshots from my logic analyzer showing the events before a crash. "normal.png" shows what normal operation looks like, while "locking.png" shows the events just before everything locks.
D1 - low - MASS_STORAGE task active (when everything runs as expected, there's no activity here). Just before a crash I can see loads of postponed writes being performed by the MASS_STORAGE task
D2 - low - my datalogging task writing to file
D3 - toggling - each state is one full cycle of processing activity that needs to happen in the application
D4 - low - moving over a cluster boundary
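For completeness, this is roughly how the debug pins are driven (a sketch assuming Kinetis-style uTasker port macros and PORTD bits 1..4 - the real pin assignment depends on the target):

    #include "config.h"                            /* brings in the uTasker port macros */

    #define DEBUG_D1  PORTD_BIT1                   /* MASS_STORAGE task active (low) */
    #define DEBUG_D2  PORTD_BIT2                   /* datalogging write in progress (low) */
    #define DEBUG_D3  PORTD_BIT3                   /* toggles once per processing cycle */
    #define DEBUG_D4  PORTD_BIT4                   /* moving over a cluster boundary (low) */

    #define DEBUG_INIT()      _CONFIG_PORT_OUTPUT(D, (DEBUG_D1 | DEBUG_D2 | DEBUG_D3 | DEBUG_D4), (PORT_SRE_SLOW | PORT_DSE_HIGH))
    #define DEBUG_START(pin)  _CLEARBITS(D, (pin)) /* events are active low */
    #define DEBUG_END(pin)    _SETBITS(D, (pin))
    #define DEBUG_CYCLE()     _TOGGLE_PORT(D, DEBUG_D3)

    /* usage at a write site: */
    /*   DEBUG_START(DEBUG_D2);                    D2 low for the duration of the write */
    /*   utWriteFile(ptr_utFile, ptrData, usLength);                                    */
    /*   DEBUG_END(DEBUG_D2);                                                           */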
My conclusion is that the SD card starts to react much more slowly than usual (hence the high activity on the MASS_STORAGE task); then, as the application tries to move to the next cluster, it locks up while trying to read/create the required filesystem data for that cluster.