µTasker USB-MSD Loader Mac OS X Compatibility
USB-MSD is popular as a method of upgrading software in embedded devices. This is because USB-MSD doesn't require a driver to be installed (although this is getting more complicated - e.g. with Windows 8, which will only accept signed drivers without awkward workarounds) and also because the built-in file manager can be used to see existing software in the device (assuming this is supported by the device's software). Copying new firmware to what the PC sees as an external hard drive is quick and easy (drag-and-drop can be used as standard) and so no special software is required on the PC for this comfortable operation. The result is that if a file containing the new firmware can be distributed, anyone with a PC can load it to the device without needing to use any non-standard software, and on any platform.
Unfortunately PC operating systems continue to be developed and don't really care about the fact that what they see as a small external hard drive may in fact be an embedded device emulating one for firmware updating convenience. This has also been encountered with the Windows 8.1 update, which suddenly caused such solutions, for example the mbed loader, to no longer operate correctly.
Windows 8.1 is supported by the µTasker USB-MSD loader and it was time to look at the Mac OS X behavior, which is also known to do its own 'invisible' things that can cause difficulties.
Connecting to the Mac
When an external drive is connected to the Mac it initially behaves as all USB hosts do; it enumerates the device, checks its readiness and reads out its boot sector and FAT so that it has a local copy to work with. PCs may do a bit of checking that the content is all as expected and give a warning in case there looks to be some corruption - often with an offer to fix it in such a case.
As in the Windows 8.1 case, where the PC creates (when not yet existent) a hidden system directory called "System Volume Information" in which it places a file called "IndexerVolumeGuid" filled with some 76 or so bytes of what looks like Unicode characters, Mac OS X does something similar. The exact sequence turns out to be as follows:
- Creates a hidden file called "._.Trashes" with 4k content (mostly filled with zeroes but with a string "Mac OS X" near the beginning).
- Creates hidden directories called ".Trashes" and ".fseventsd"
- Creates a file in .fseventsd called "fseventsd-uuid" in which a long ASCII hex number is written (similar to a serial number).
In addition to the files and directories being created on attachment, there are also changes to the files at (possibly) random points in time - for example, about 25s after attachment a write to a file (a write of data that changes the file's object and the file's content) can occur. Generally, the embedded device must be aware that writes to these files may take place at any point in time and be able to handle this without it causing errors or writing to application space in the false belief that it is part of upload data.
Performing Firmware Updates
The software upload is a file copy from the PC host to the disk. It is observed that when a file is copied by Mac OS X it also writes an extra file. For example, if the file uTaskerV1.4_BM.bin is copied to the disk there is also a hidden 4k file written with content similar to ._.Trashes which is named ._uTaskerV1.4_BM.bin.
The meaning of the data and the reason for writing it was not investigated further and is also not really relevant for the USB-MSD boot loader operation as long as it doesn't cause problems. Essentially the loader has to be able to accept writes to the (hidden) files at any point in time (as well as immediately before accepting the actual firmware data).
After the copy of the file is completed it is also observed that this file is deleted.
Since the USB-MSD loader is emulating a FAT it doesn't usually need to have a great amount of knowledge about the actual file(s) - it is essentially interested in "catching" the data content and writing it linearly into the application space (usually internal processor Flash but could also be external parallel or serially connected memory). Since the USB-MSD loader also 'creates' the emulated FAT it ensures that it is defragmented so that writes are then simply in a linear order.
Writes to the FAT during connection are not a real problem since the PC keeps a cache of the FAT and so no information needs to be maintained about changes made by the PC host; it will not read anything back because it already has its own cache and, after a restart, the USB-MSD loader starts with an empty disk again (apart from an emulated image of the firmware that may already be loaded). Details about the actual emulation are included in the µTasker Serial Loader User's Guide. Since the initial writes take place immediately on connection, all of these writes can in fact basically be ignored. The complications start due to the fact that the actual firmware data is not written to the start of the cluster area, as would be the simplest case (and is also valid for Windows 7), but this is still only a minor complication since it means that the data starts at an offset which can be easily compensated for. In case the name, size and time/date stamps from the uploaded data are to be used, which is the case in the µTasker, they need to be filtered from the hidden files and directories that may have been written to the root directory. Again this is not a major task, knowing that the uploaded file is in fact the last one to be placed in the root directory.
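The offset compensation can be pictured with a minimal sketch. It assumes 512-byte sectors and uses hypothetical names (map_sector_to_flash, APP_FLASH_SIZE) that are not taken from the µTasker sources: the first data sector accepted as firmware defines offset 0 and all later sectors are placed relative to it.

```c
#include <stdint.h>
#include <stdbool.h>

#define SECTOR_SIZE     512u            /* assumed USB-MSD sector size */
#define APP_FLASH_SIZE  (256u * 1024u)  /* hypothetical size of the application area */

typedef struct {
    bool     base_valid;   /* set once the first firmware sector has been seen */
    uint32_t base_lba;     /* disk sector that holds firmware offset 0 */
} upload_map_t;

/* Translate a disk sector number into a byte offset in application space.
   Returns false if the sector lies before the base or beyond the flash area. */
static bool map_sector_to_flash(upload_map_t *map, uint32_t lba, uint32_t *offset)
{
    if (!map->base_valid) {             /* first accepted sector defines offset 0 */
        map->base_lba = lba;
        map->base_valid = true;
    }
    if (lba < map->base_lba) {
        return false;
    }
    uint32_t sectors = lba - map->base_lba;
    if (sectors >= (APP_FLASH_SIZE / SECTOR_SIZE)) {
        return false;
    }
    *offset = sectors * SECTOR_SIZE;
    return true;
}
```

Because the emulated FAT is kept defragmented, a plain subtraction is enough here; a fragmented disk would need a cluster chain lookup instead.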
The main difficulty is therefore to actually recognise whether data that is later being written is due to the firmware upload taking place or due to writes to hidden files. Initially one could imagine that this is quite easy to do because the writes to hidden files can be detected from the name of the file being written. Unfortunately this proves to be trickier than anticipated for two main reasons:
- The USB-MSD loader usually doesn't interpret the content of writes to the file objects and so doesn't usually have any code available to do it. If code were to be added it would increase its size and complexity, since it would need to handle both long and short file names.
- More serious is that Mac OS X has full control of what it is doing with the disk and doesn't necessarily write the file object until it has already written file data - this is seen during the creation of the initial files. In the worst case, all received data needs to be buffered in RAM until it can be determined whether it was intended for the application space (the upload file being written) or whether it is OS data (that can be ignored). The size of RAM buffers needed could easily be greater than that physically available in the processor, which will tend to have more program memory than RAM.
What is noticed with the write of a file is the way that the hidden file is written. It takes place in three steps:
- first the content is written to cluster space, which means that 4k is written to cluster sectors without any reference to what they belong to.
- then the FAT is updated to include these written clusters, making them belong to FAT allocated space (although not yet related to a file).
- as the third step, the directory content is updated with the name of this hidden file (._uTaskerV1.4_BM.bin), with its size and time stamps and now referencing the clusters that its file occupies; at the same time the name of the file that will be copied to the disk (uTaskerV1.4_BM.bin), presently with zero size, is also included before the name of the hidden file.
The copy now continues as the data content of the new file is written to the cluster space, after which the FAT is updated so that the newly written clusters belong to FAT allocated space. Finally the directory object is updated with the size of the newly copied file and a reference to where its content is located.
This seems to allow a strategy to be defined which ensures that the writes of hidden file data can be filtered:
- During a period of 5s after USB enumeration has completed writes to disk are generally ignored since it is known that some files and directories may be created. A copy of the root directory is however maintained so that its content can be later analysed.
- After this initial period, when no download will be started since it takes longer than this for a user to be able to start a firmware upload, any received data content is still ignored but all writes to the root directory are analysed for differences. Of interest is when new objects are written (the root directory content increases). Only when it is detected that a new visible file has been added to the root directory is data reception enabled.
- Once data reception is enabled, cluster content written to the disk is saved to the application space in internal Flash - beginning at the start and increasing linearly.
- When a further write takes place to the root directory which adds a cluster reference and size to this visible file further data reception is disabled again. This also causes the loader to disconnect as USB device and restart so that the newly loaded application can be started.
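The root directory analysis in this strategy can be sketched as below. This is only an illustration: it relies on the standard 32-byte FAT directory entry layout and assumes that the host marks its own files and directories with the FAT hidden attribute, which the observations above do not confirm in detail. The function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define DIR_ENTRY_SIZE   32u     /* size of one FAT directory entry */
#define ATTR_HIDDEN      0x02u
#define ATTR_SYSTEM      0x04u
#define ATTR_VOLUME_ID   0x08u
#define ATTR_DIRECTORY   0x10u
#define ATTR_LONG_NAME   0x0Fu   /* marks a long file name fragment */

/* Return the index of the last visible regular file in a root directory
   buffer, or -1 if there is none. */
static int find_last_visible_file(const uint8_t *dir, size_t dir_bytes)
{
    int found = -1;
    for (size_t i = 0; (i + 1u) * DIR_ENTRY_SIZE <= dir_bytes; i++) {
        const uint8_t *entry = dir + (i * DIR_ENTRY_SIZE);
        uint8_t attr = entry[11];
        if (entry[0] == 0x00u) {
            break;                       /* end of directory */
        }
        if (entry[0] == 0xE5u) {
            continue;                    /* deleted entry */
        }
        if ((attr & ATTR_LONG_NAME) == ATTR_LONG_NAME) {
            continue;                    /* long file name fragment */
        }
        if (attr & (ATTR_HIDDEN | ATTR_SYSTEM | ATTR_VOLUME_ID | ATTR_DIRECTORY)) {
            continue;                    /* not a visible regular file */
        }
        found = (int)i;
    }
    return found;
}

/* File size: 32-bit little-endian field at offset 28 of the entry. */
static uint32_t dir_entry_file_size(const uint8_t *entry)
{
    return (uint32_t)entry[28] | ((uint32_t)entry[29] << 8) |
           ((uint32_t)entry[30] << 16) | ((uint32_t)entry[31] << 24);
}
```

Comparing the returned index (and the size field) between the saved copy of the root directory and each new write is one way to notice that a visible file has been added or completed, without interpreting long file names.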
However it was found that the details are not always like this and there is a problem when the hidden file that is written is not in the same directory object sector as the one occupied by the name of the file that is to be written. When this unfortunate combination exists the operation takes place as follows:
- first the content of the hidden file is written to cluster space, which means that 4k is written to cluster sectors without any reference to what it belongs to.
- then the FAT is updated to include these written clusters, making them belong to FAT allocated space (although not yet related to a file).
- as the third step the directory content is updated with the name of this hidden file but, when the name straddles two sectors, only the sector containing the final part of the hidden file name is written. This leaves a hole in the directory that will be filled out later.
- then data content from the file being copied is written to cluster space (with no reference in the root directory yet that it belongs to it).
- Once the copy is complete, or at some point during transfer when the file is larger, there is a write to the root directory filling out the missing details concerning the file object. There can already be 32k of data that was written to cluster space before this happens.
- After further cluster data has been written the file object is again updated.
This shows that there is in fact no safe method of knowing which file cluster data that is being received actually belongs to until after the file object is updated - when this update takes place is not always identical!
In this example situation 4k of data was received and it could (possibly) be identified shortly afterwards that it belonged to a hidden file, but then a larger block (32k) was received, also with no indication as to what it belonged to until later. Since small embedded devices may have difficulty buffering blocks of data of the sizes involved (e.g. parts with 8k RAM in total) another technique was needed.
The next revised strategy is based on the fact that the content of the 4k block that is written is known to include a string "Mac OS X" close to the start of its content, as well as a string "ATTR;" a little later on, but still in the first sector received. Furthermore, the rest of the content - although not of string type - looks to be always the same or at least to have further values that can be recognised. The pattern of firmware to be loaded may be quite random in nature but the first bytes usually follow some rules, such as being a valid stack pointer address and a program start address. Therefore, by analysing the first sector content of a cluster block being received it should be possible to identify with a very high probability whether it is OS or file copy content. This results in the revised technique:
- During a period of 5s after USB enumeration has completed, writes to disk are generally ignored since it is known that some files and directories may be created. A copy of the root directory is however maintained so that its content can be later analysed for generating a file object corresponding to the firmware.
- After this initial period, when no programming will be started since it takes longer than this for a user to be able to start a firmware upload, any received data content is still ignored but the initial sector content of writes to the cluster area is analysed. If the sector can be matched to the content written to hidden OS files the write is ignored. If, however, the content of a block write doesn't correlate with the pattern it results in data reception being enabled.
- Once data reception is enabled, cluster content written to the disk is saved to the application space in internal Flash - starting at the beginning and increasing linearly.
- When a period of silence (no further writes) of 1.5 s is detected the file transfer is considered to have completed and its file object is committed - the application can then be executed.
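The sector analysis used in this revised technique can be sketched as follows. The function names are hypothetical, and requiring both strings to be present (rather than either one) is an assumption made for the sketch.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512u   /* assumed USB-MSD sector size */

/* Search a buffer for a byte pattern (memmem() is not part of standard C). */
static bool contains_pattern(const uint8_t *buf, size_t len, const char *pat)
{
    size_t pat_len = strlen(pat);
    if ((pat_len == 0u) || (pat_len > len)) {
        return false;
    }
    for (size_t i = 0; i + pat_len <= len; i++) {
        if (memcmp(buf + i, pat, pat_len) == 0) {
            return true;
        }
    }
    return false;
}

/* True if the first sector of a cluster write looks like hidden OS file
   content, based on the two strings observed in the 4k blocks. */
static bool is_os_metadata_sector(const uint8_t *sector)
{
    return contains_pattern(sector, SECTOR_SIZE, "Mac OS X") &&
           contains_pattern(sector, SECTOR_SIZE, "ATTR;");
}
```

A write whose first sector matches would be ignored; a write that does not match would enable data reception.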
This revised strategy was now successful in allowing an upload to operate as long as it was performed rapidly after connection. This is because it correctly recognises the temporary hidden file that is written along with the file copy but will incorrectly accept random writes to hidden files that were created at connection time - a write of this type typically takes place after about 25s.
This meant that a further revision was still required to be able to recognise the write to an existing hidden file, which has seemingly random data content. This was performed by keeping a copy of the FAT (only the first sector) that was being modified by Mac OS X. Based on this it is possible to work out whether the block write being performed is for a cluster in any hidden file's cluster chain. If so, the data write is ignored.
To keep the FAT cluster chain as simple as possible, and to need a backup of just one sector of the FAT, the number of clusters must be limited. A single FAT12 sector holds 341 cluster entries, meaning that a total of about 170k of content can be managed when the cluster size is 512 bytes - remembering that Mac OS X uses about 18k of it for its hidden files.
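A minimal sketch of such a cluster chain check is shown here. It uses the standard FAT12 packing (two 12-bit entries per three bytes) on a copy of the first FAT sector; the function names are hypothetical and the first cluster of the hidden file is assumed to be known from its directory entry.

```c
#include <stdint.h>
#include <stdbool.h>

#define FAT12_ENTRIES_PER_SECTOR 341u   /* whole 12-bit entries in 512 bytes */

/* Read one 12-bit entry from a copy of the first FAT sector.
   The caller must keep cluster below FAT12_ENTRIES_PER_SECTOR. */
static uint16_t fat12_entry(const uint8_t *fat, uint16_t cluster)
{
    uint32_t offset = (uint32_t)cluster + (cluster / 2u);   /* cluster * 1.5 */
    uint16_t pair = (uint16_t)(fat[offset] | ((uint16_t)fat[offset + 1u] << 8));
    return (cluster & 1u) ? (uint16_t)(pair >> 4) : (uint16_t)(pair & 0x0FFFu);
}

/* Follow the chain starting at 'first' and report whether 'target' is in it.
   End-of-chain markers (0xFF8..0xFFF) lie outside the sector range and so
   terminate the walk, as do free or reserved values. */
static bool cluster_in_chain(const uint8_t *fat, uint16_t first, uint16_t target)
{
    uint16_t cluster = first;
    for (unsigned steps = 0; steps < FAT12_ENTRIES_PER_SECTOR; steps++) {
        if ((cluster < 2u) || (cluster >= FAT12_ENTRIES_PER_SECTOR)) {
            return false;
        }
        if (cluster == target) {
            return true;
        }
        cluster = fat12_entry(fat, cluster);
    }
    return false;   /* step limit reached: the chain loops */
}
```

Calling cluster_in_chain() once per known hidden file for each incoming cluster write would allow writes that land in a hidden file's chain to be dropped without any buffering of the data.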
Assuming that a maximum upload size of 1MB is adequate, a cluster size increase to 4k (maximum for FAT12) allowed this to fit in a single FAT12 sector and achieve the objectives. Although the hidden data will occupy more space due to the smallest chunk used for an object being this size it is not of actual consequence since this is only virtual in nature; the efficiency of storage of the data in application space is not affected.
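The arithmetic behind these figures can be checked with a short sketch (function names are hypothetical; 512-byte sectors are assumed):

```c
#include <stdint.h>

#define SECTOR_SIZE          512u
#define FAT12_BITS_PER_ENTRY 12u

/* Whole FAT12 entries that fit in one sector: 4096 bits / 12 bits = 341. */
static uint32_t fat12_entries_per_sector(void)
{
    return (SECTOR_SIZE * 8u) / FAT12_BITS_PER_ENTRY;
}

/* Content that a single FAT sector can describe for a given cluster size. */
static uint32_t fat12_capacity_bytes(uint32_t cluster_size)
{
    return fat12_entries_per_sector() * cluster_size;
}
```

With 512-byte clusters this gives 341 x 512 = 174592 bytes (about 170k), and with 4k clusters 341 x 4096 = 1396736 bytes, which is above 1MB. In a real FAT the first two entries are reserved, so the usable figures are slightly lower.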
A final revision to the strategy was to also check the first two long words in the data being received. Assuming that the binary data is not encrypted in any way the first two long words of an ARM project are expected to be the initial stack pointer (located towards the top of internal SRAM) and the start address (located in internal Flash address space, usually after the end of the boot loader). By also requiring the content to contain valid values for the processor in question a further level of security was added which allows the initial 5s of silence after connection to be avoided.
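A check of this kind might look like the following sketch. The address ranges are purely hypothetical and would need to be replaced by the memory map of the processor in question; the function names are also not taken from the µTasker sources.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical memory map - real values depend on the processor used. */
#define SRAM_START  0x1FFF0000u
#define SRAM_END    0x20010000u   /* one past the last SRAM byte */
#define APP_START   0x00008000u   /* first Flash address after the boot loader */
#define FLASH_END   0x00080000u   /* one past the last Flash byte */

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* True if the first two long words look like an initial stack pointer in
   SRAM followed by a start address in application Flash. */
static bool looks_like_firmware(const uint8_t *sector)
{
    uint32_t stack_pointer = read_le32(sector);
    uint32_t start_address = read_le32(sector + 4);
    if ((stack_pointer <= SRAM_START) || (stack_pointer > SRAM_END)) {
        return false;
    }
    start_address &= ~1u;   /* on Cortex-M parts bit 0 is the Thumb flag */
    return (start_address >= APP_START) && (start_address < FLASH_END);
}
```

The narrower the accepted ranges, the lower the chance that OS data is mistaken for firmware, at the cost of tying the loader more closely to one memory layout.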
Once file writing behavior has been handled the next thing that needs to be dealt with is file deletion. Mac OS X also handles this using a number of steps that are not obvious until the operation has been studied in detail. Essentially it achieves it as follows:
- Data is written to a hidden file (this again contains the "Mac OS X" string and is 4k in size).
- A FAT cluster entry is added for this file, which doesn't yet belong to a file, as well as a small number of additional clusters.
- Data content is written to two cluster areas, although there is no file associated with them.
- A write to the directory space is performed which deletes the file in question.
- One of the clusters is written to again with information about the file that was deleted.
- The FAT is not (immediately) updated with the used clusters (although this may take place at a later point in time) - it is also noted that the clusters occupied by the deleted file are also not (yet) deleted from the FAT.
Again, the object was not to know exactly why the sequence is performed but to understand what is to be expected and how a USB-MSD loader can reliably recognise that a delete has taken place and filter particular cluster data rather than interpreting it as firmware upload data.
During the delete sequence it is seen that cluster data writes are performed, whereby one block contains the "Mac OS X" string. There are however additional cluster writes which have different content (essentially unrecognisable) which are made to clusters that have been added to the FAT before use, but have not been assigned to any files. This adds a further cluster write type that can't be recognised by the rules developed for the file copy and it is the write to the cluster immediately after the delete that causes difficulties due to the fact that it could be understood as a file transfer after the write.
Three possibilities were seen to filter this cluster write:
- don't accept a write until a delay after the delete has expired.
- recognise that the same cluster (not assigned to any file) is being written to as just before the file delete took place.
- recognise that the cluster write is to a cluster that is assigned in the FAT but not (yet) to a file.
The method chosen was to keep a backup of the last cluster that was written so that it can be compared with a write after a delete and so ignored when matching.
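The chosen method can be sketched as a small filter. The names are hypothetical, and clearing the delete flag after the first cluster write that follows the delete is an assumption of the sketch rather than something stated above.

```c
#include <stdint.h>
#include <stdbool.h>

#define NO_CLUSTER 0xFFFFu   /* marker for "nothing written yet" */

typedef struct {
    uint16_t last_written;   /* backup of the most recently written cluster */
    bool     delete_seen;    /* set when a directory write removed a file */
} delete_filter_t;

static void filter_init(delete_filter_t *filter)
{
    filter->last_written = NO_CLUSTER;
    filter->delete_seen = false;
}

/* Call when a root directory write is found to delete a file. */
static void filter_note_delete(delete_filter_t *filter)
{
    filter->delete_seen = true;
}

/* Call for each cluster write; returns true if the write should be ignored
   because it repeats the cluster written just before the delete. */
static bool filter_cluster_write(delete_filter_t *filter, uint16_t cluster)
{
    bool ignore = filter->delete_seen && (cluster == filter->last_written);
    filter->delete_seen = false;        /* only the first write after a delete is compared */
    filter->last_written = cluster;
    return ignore;
}
```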
A file overwrite is essentially a delete followed by a file copy. Fortunately this doesn't pose any further problems and is automatically handled by the previous methods.
The µTasker USB-MSD loader enables firmware saved in the processor to be read back to the PC host. This is optional and can also be password protected. If protected, reads return the file with a content of only zeros. To unprotect, a password file is copied to the disk which contains a secret password string which is checked by comparing it against received cluster data content. Once the password has been verified, reads cause the firmware content to be returned. The protection is then only set again when the device is disconnected.
Again, the methods used to allow firmware writing already allow the password file to be copied and checked without further difficulties. File reads also don't pose any complications.
The Mac OS X USB-MSD host operation adds complications to a USB-MSD based loader due to its writing of hidden directories and files. Especially tricky is the fact that it writes data to these files at seemingly random points in time and also while writing new file data, as well as even when deleting existing files. Furthermore, it doesn't keep to a certain write order (FAT, cluster, file object) and so it is not always easily possible to know which file the data being received belongs to until after the write has completed.
To avoid the need to buffer large amounts of data until the file details are also known, a strategy was developed based on correlating known string content in the files and also relating cluster writes to FAT clusters known to be allocated to hidden files.
Although it is not known whether the Mac OS X operation will again change over time a good basis has been created to deal with any issues that may arise in the future.
The USB operation is not particularly easy to follow and so the Beagle USB 480 from Total Phase proved very useful, especially since it decodes the USB-MSD class content to make interpretation much easier.
Code analysis was further greatly simplified by using USB simulation scripts to play the sequences through the code for development and debugging so that the correct operation could be achieved in the device in real-time. The two USB simulation scripts that were used for the work, and that can be played through the serial loader code in the µTasker, are listed below for interested readers and also as a reference to typical Mac OS X behavior.
- MAX_OS_X_connect_EP1.sim - MSD sequence on endpoint 1 when the Mac has enumerated and writes hidden files to an initially empty disk.
- MAX_OS_X_delete_EP1.sim - MSD sequence on endpoint 1 when the Mac connects to a disk with firmware loaded, plus its delete sequence.
For specific questions and feedback on this topic please use the following forum entry: USB-MSD firmware loading and Mac OS X compatibility
µTasker USB-MSD Loader Mac OS X Compatibility. Copyright (c) 2004..2018 M.J.Butcher Consulting