µTasker Forum

µTasker Forum => utFAT => Topic started by: AlexS on January 24, 2019, 04:36:21 PM

Title: SD write speed and processor load
Post by: AlexS on January 24, 2019, 04:36:21 PM

We run a Kinetis K66 with a microSD card slot connected in 4-bit mode. I did some basic benchmarks this morning with the write size we're going to be using most of the time (1.4KB) and noticed that the main task slowed down significantly. It used to be able to run one processing iteration every 1.7ms. After I added the 1.4KB writes to it, the duration of one iteration increased to 10-35ms. Is there any setting in utFAT that would improve things? The card was formatted as FAT32 with the default allocation unit size.

Title: Re: SD write speed and processor load
Post by: mark on January 24, 2019, 06:53:08 PM
Hi Alex

The problem with SD cards is that they are quite slow. Writing time is not predictable and can even be up to half a second for a sector write on a well used card that needs to do internal house work in the process.

utFAT is blocking, so the speed of the SD card determines the time that a write takes to complete and no file system settings can change this (the only rule for most efficient writing is to write in 512 byte sizes since that matches the card's HW).
When reading/writing large blocks of data there is the option to do block operations:
#define UTFAT_MULTIPLE_BLOCK_WRITE                                   // use multiple block writes where possible (write speed efficiency)
#define UTFAT_MULTIPLE_BLOCK_READ                                    // use multiple block reads where possible (read speed efficiency)

but these are only relevant with linear blocks and don't help with small, random blocks. They also restrict the operation to single files because there is a problem with the SDHC controller in the Kinetis parts that doesn't allow a linear transfer to be interrupted (eg. because another file needs to be read in the meantime).

For situations where the general operation needs to be well defined (eg. your 1.7ms iteration which should not greatly be impacted by SD card writes) I would consider the following (other techniques may be available based on interrupts and a file queue but would need to be created):
- activate the FreeRTOS mode of operation and add a task that is to be responsible for SD card writes
- when you have data to be written give it to this task to do
- the task will need to be configured to operate in a time-sliced mode (eg. allowed to run say 200us max, to limit max. jitter value)
- the task can then do the utFAT write (no other utFAT operation should be made in uTasker) and will not cause more than the defined jitter and signal when it has finished so that further writes can be made (or such writes can be generally queued to it).
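
As a rough illustration of the "file queue" idea mentioned above, here is a minimal plain-C sketch of a single-producer/single-consumer queue that the main loop could fill and a separate writer could drain. It is not utFAT or FreeRTOS code: the names (write_queue_t, queue_push, queue_peek, queue_pop), the chunk size and the queue depth are all assumptions made for the example, and in a real pre-emptive setup the RTOS's own queue primitives (or suitable memory barriers) would normally be used instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WRITE_CHUNK_SIZE 1536u  /* assumed request size: 3 x 512-byte sectors */
#define QUEUE_DEPTH      8u     /* assumed depth; must be a power of two */

/* Hypothetical queue of pending write requests */
typedef struct {
    uint8_t data[QUEUE_DEPTH][WRITE_CHUNK_SIZE];
    volatile uint32_t head;     /* advanced only by the producer (main loop) */
    volatile uint32_t tail;     /* advanced only by the consumer (writer) */
} write_queue_t;

static void queue_init(write_queue_t *q)
{
    q->head = 0;
    q->tail = 0;
}

/* Number of chunks currently waiting to be written */
static uint32_t queue_count(const write_queue_t *q)
{
    return q->head - q->tail;   /* unsigned wrap-around keeps this correct */
}

/* Producer side: copy one chunk in. Returns 1 on success, 0 if the queue is
   full (the caller decides whether to drop or retry). */
static int queue_push(write_queue_t *q, const uint8_t *chunk)
{
    if (queue_count(q) == QUEUE_DEPTH) {
        return 0;
    }
    memcpy(q->data[q->head & (QUEUE_DEPTH - 1u)], chunk, WRITE_CHUNK_SIZE);
    q->head++;                  /* publish only after the copy is complete */
    return 1;
}

/* Consumer side: pointer to the oldest chunk, or NULL if nothing is queued.
   The chunk stays in the queue until queue_pop() is called, so the (slow,
   blocking) card write can work directly from this buffer. */
static const uint8_t *queue_peek(const write_queue_t *q)
{
    if (queue_count(q) == 0) {
        return NULL;
    }
    return q->data[q->tail & (QUEUE_DEPTH - 1u)];
}

/* Consumer side: release the oldest chunk once it has been written */
static void queue_pop(write_queue_t *q)
{
    assert(queue_count(q) != 0);
    q->tail++;
}
```

The main loop would call queue_push() once per iteration and carry on, while the writer repeatedly does queue_peek(), writes the chunk to the card, then queue_pop(). The queue depth then decides how long a slow card write can be absorbed before data has to be dropped.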

I haven't actually done this but the FreeRTOS option is essentially available for such requirements and only increases code size by about 6k and RAM utilisation by about the same. You can enable it with RUN_IN_FREE_RTOS and FREE_RTOS_BLINKY for a first test (just check that the settings in FreeRTOSConfig.h [in the application folder] match the processor: #define ARM_MATH_CM4, #define CORE_CLOCK (120000000) etc.).
This should transform your existing work into a version working in a FreeRTOS task (an output will be toggled in another FreeRTOS task - see FreeRTOSapplication.c) to verify that the present operation is otherwise not influenced by the change.
Following this the required task operation can be added.

NOTE: The above is only relevant for the supported package (not the open source version)
Also I don't have great experience with FreeRTOS capabilities and control, but it was integrated to allow such things to be achieved while keeping general parts (that can be done more easily and without potential pre-emptive and memory issues) in the uTasker environment.
At the same time it allows FreeRTOS users to integrate uTasker solutions (such as peripheral libraries, protocols) into their projects by simply adding a 'uTasker' task.
Finally, when FreeRTOS is used the SUPPORT_LOW_POWER option cannot be used (it is disabled automatically) and also simulation is not possible (although simulation may be added at some point - but is a bit complicated...).

Does this sound like an approach for you?



Title: Re: SD write speed and processor load
Post by: AlexS on January 24, 2019, 08:03:28 PM

Thanks for your speedy reply. I think my particular use case should allow me to do some neat optimizations that should broaden the number of options available. Essentially, the card will never have to be written to and read from randomly. It will write to a file for a number of minutes, then the writing will stop and the file will be read and then sent to the PC through USB. Also, my writes are always sequential and of the same size, depending on the user's configuration. When you say it only helps with linear blocks, do you mean I should always write a multiple of the block size, or is the speed-up totally dependent on the SD card's filesystem organization?

I'm a bit reluctant to commit to a big switch such as using FreeRTOS so late in our project, so I was thinking I could obtain the same effect by dividing the data to be written in multiple chunks. I have quite a bit of leeway in terms of the variation of the main processing task (can easily go up to 3-4ms without affecting the functioning of the product). I'm quite surprised because I've seen benchmarks with the Kinetis that got way more than the measly ~20KBps I got with my first test. I'm thinking it could be the card, so I'll do some more testing over the next few hours and see where we end up.

Title: Re: SD write speed and processor load
Post by: mark on January 25, 2019, 01:21:08 AM

Using the FreeRTOS configuration will not be any effort nor change the behavior (when it doesn't introduce new activity) however, when I re-read your requirement, I notice that you would like to write this data in "each" cycle and not as a parallel task, which means that I don't think that changing the config will help.

1. 20kBytes/s writes are about what I expect for random FAT writing.
2. Some cards may be faster but they are mainly optimised for large linear transfers (single file videos, big photos, etc.) and are not necessarily that fast when just writing small FAT chunks.
3. Block mode reads/writes are faster (which I presume is how the MBytes/s writes can be achieved). I know that I can achieve about 20MB/s read speed on a K66 to USB-MSD (HS USB and not FS, since FS itself limits to about 1MB/s), including the USB part, when using block mode and it falls to about 600kB/s in sector read mode. I didn't get a huge increase in write mode, but that may also have been due to the more tricky USB direction, and I was not that interested in the write speed when I did these tests.
4. Also consider using the file open modes with UTFAT_WITH_DATA_CACHE and UTFAT_COMMIT_FILE_ON_CLOSE (see the utFAT option for details):
- the caching mode is useful to avoid small writes (allowing lots of small writes to cache and committing only when sector content is available)
- the ON_CLOSE option means that data can be written to the file (data content and FAT clusters) without updating the file object, which reduces the write content needed (beware that even writing large data content still needs some random write to the cluster area which reduces speed) and the file object is updated only when the file is closed. The downside is that if a file is not closed (crash, reset, power cycle) data content will be lost even though it was physically written to the card since the file object doesn't know about it (thinks the file is still empty). If you can accept such data loss it is however somewhat faster.
5. If you have a single file the block mode option could be used and may give a useful increase in performance (it effectively tells the card that it is going to receive a write to a number of sectors in a row and so it can already do preparation before subsequent writes are made, which saves the waits on each new sector write). I don't have a great deal of experience with this because I haven't generally used it much (due to multi-file restrictions) but it may be suitable for you.
6. Avoid writing non-512 byte chunks if possible (buffer in RAM until you can commit a chunk) to avoid writing a sector more than once with partial data, because the second write to a sector means the SD card needs to swap blocks around, which slows it down. If the file cache option is enabled the cache will in fact control this (but needs a close to commit any final cached data).
Practically it may also be an idea to increase your data write size from 1.4k to exactly (3 x 512) bytes so that you always write 3 sectors per cycle rather than it varying between 2 and 3 when caching.
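
The "buffer in RAM until you can commit a chunk" advice in point 6 can be sketched in plain C as follows. This is only an illustration under stated assumptions: sector_acc_t, acc_init, acc_write and the sink callback are hypothetical names, not part of utFAT, and the sink stands in for whatever actually writes one complete 512 byte sector to the card. "1.4KB" is taken to mean 1400 bytes here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u

/* Hypothetical sink: called once for every complete sector */
typedef void (*sector_sink_t)(const uint8_t *sector, void *ctx);

/* Accumulates arbitrary-sized writes and only passes on full sectors */
typedef struct {
    uint8_t buf[SECTOR_SIZE];
    size_t fill;                /* bytes currently held in buf */
    sector_sink_t sink;
    void *ctx;
} sector_acc_t;

static void acc_init(sector_acc_t *a, sector_sink_t sink, void *ctx)
{
    assert(sink != NULL);
    a->fill = 0;
    a->sink = sink;
    a->ctx = ctx;
}

/* Add len bytes; returns how many full sectors were handed to the sink.
   Any remainder stays in RAM until later data completes the sector. */
static unsigned acc_write(sector_acc_t *a, const uint8_t *data, size_t len)
{
    unsigned sectors = 0;
    while (len > 0) {
        size_t room = SECTOR_SIZE - a->fill;
        size_t n = (len < room) ? len : room;
        memcpy(a->buf + a->fill, data, n);
        a->fill += n;
        data += n;
        len -= n;
        if (a->fill == SECTOR_SIZE) {
            a->sink(a->buf, a->ctx);    /* each sector is written exactly once */
            a->fill = 0;
            sectors++;
        }
    }
    return sectors;
}

/* Example sink that only counts sectors (ctx points to an unsigned counter) */
static void count_sink(const uint8_t *sector, void *ctx)
{
    (void)sector;
    (*(unsigned *)ctx)++;
}
```

Feeding this 1400 byte records commits 2, 3, 3, 2, ... sectors per call, which is the cycle-to-cycle variation described above, whereas 1536 byte (3 x 512) records always commit exactly 3 sectors and leave nothing behind in the buffer. As with the file cache, whatever is still held in buf at the end needs a final flush, which is not shown here.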

With some of these options you can probably avoid a number of (unnecessary) writes and get the time down somewhat (at least to prove the effects).


Title: Re: SD write speed and processor load
Post by: AlexS on January 25, 2019, 03:47:44 PM

I managed to get 100 KB/s by using FAT32 instead of FAT as a filesystem on that particular card. You mentioned at #3 using block mode. Did you mean simply writing data in 512 byte chunks or were you referring to a setting of some sort?

Thanks again for your input. The current settings would yield a 4.8ms increase per loop iteration which, although not ideal, is very encouraging as it means the FreeRTOS solution could be used and not interfere with the actual operation. A 50Hz write frequency would only increase the loop time to around 2.2ms, which isn't a problem at all.
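
One way to arrive at roughly that 2.2ms figure, assuming each write blocks for the 4.8ms measured above and the loop otherwise takes the original 1.7ms per iteration, is to spread the blocked time over all iterations. The function name and the averaging model below are my own assumptions for illustration, not a measurement:

```c
#include <assert.h>

/* Average iteration time in ms when writes that block for write_ms each
   occur write_hz times per second and an undisturbed iteration takes
   base_ms. Assumed model: the loop only makes progress while no write
   is blocking it. */
static double avg_iteration_ms(double base_ms, double write_ms, double write_hz)
{
    double busy = write_ms * write_hz / 1000.0; /* fraction of time spent in writes */
    assert(busy < 1.0);                         /* otherwise the loop never runs */
    return base_ms / (1.0 - busy);
}
```

With base_ms = 1.7, write_ms = 4.8 and write_hz = 50 the writes take up 24% of the time, so the average works out to 1.7 / 0.76, or about 2.24ms. The individual iteration that contains a write still takes the full 1.7 + 4.8 = 6.5ms.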

Title: Re: SD write speed and processor load
Post by: mark on January 25, 2019, 04:24:50 PM

SD cards can be written using SECTOR write or BLOCK write commands (and equivalent for reads).
The SECTOR write allows writing to ANY random sector on the card and so is fully flexible.
BLOCK write tells the SD card that you are about to send a number of sectors that are simply one after another, which can only be used when you really are going to do so (eg. when writing data to an empty cluster and not randomly updating file objects and such). Since the SD card knows that it is going to get this data it has the chance to get itself ready for it and write time becomes a lot faster (it doesn't need to react randomly to a new address only when the next data is arriving).

The define UTFAT_MULTIPLE_BLOCK_WRITE enables the block mode support in utFAT. The user doesn't need to do anything differently when writing, BUT, it should not use more than one file at a time and an SD card detect based on a PIN must be used (and not the polling method) because once a BLOCK command of a certain length has been reported it is not possible to stop/abort it (due to a bug in the Kinetis SDHC) - resulting in failure if a second file needs to be accessed during a write sequence.

Practically I only use block mode for USB-MSD where it is very suitable due to the fact that the host reads and writes in blocks. This, however, only works when the application doesn't access the file system at the same time.