Author Topic: LCD demo on MCB2300 using GNU - volatile register...  (Read 10037 times)

Offline mark

LCD demo on MCB2300 using GNU - volatile register...
« on: January 31, 2008, 11:40:51 PM »
Hi All

The LPC23XX Beta version includes an LCD demo which is set up for use on the Keil MCB2300 board. This also runs when compiled with the IAR compiler (at least the version which I use..), but it was known that the LCD remains blank when compiled with Crossworks (GNU). Since the LCD is not the most important thing, this (small) detail didn't hold back the release, but a quick attempt to solve it beforehand didn't turn up a fix.

Today I did fix it, after some head scratching stooped over the logic analyser screen. The timing was a bit different but nothing seemed out of specification. At first it looked as though the display was simply not driving the bus when read. That turned out to be not exactly true: rather, the LPC23XX was not setting its bus lines to high impedance (which is not necessarily obvious on the analyser display).

In such cases the cause is almost invariably some optimisation, and here it turned out to be a register that was not declared as volatile, which caused the bus direction control to be removed completely.

In case any Crossworks (GNU) user would like to see it running correctly, changing the define in LPC23XX.h to the following line fixes it:
#define FIO1DIR                          *(volatile unsigned long*)(FAST_GPIO_BLOCK + 0x20)

But this throws up a few background questions which should perhaps be looked at in a little more detail:

Why is there this problem when compiled with GNU but not with IAR?
It all has to do with how the compiler optimiser handles the case and possibly also what level of optimisation is actually being used. The uTasker project always uses the highest level of optimisation for size and so it is also expected that the compiler does what it can to save unnecessary code space.

How and to what degree this takes place depends on the compiler's intelligence (or stupidity, depending on how you look at it). Generally a quality compiler (which one tends to pay a lot more money for) is expected to squeeze more code into a given amount of code space (this will often also translate into code which needs fewer instructions to run and so is not a bad choice).

In the example, the following lines of C-code are considered to be totally unnecessary and so are removed by the GNU compiler with full optimisation (stupid or clever?) and considered to be necessary by the IAR compiler (dumb or very clever??):

FIO1DIR &= ~LCD_BUS_MASK;               // set LCD bus to be input
..clock the display and read the data
FIO1DIR |= LCD_BUS_MASK;                 // set LCD bus back to driven state


Since FIO1DIR is not declared volatile, the optimiser may assume that nothing outside the program flow observes the intermediate value. The value of FIO1DIR is the same after the second operation as before the first operation was performed, so the optimiser may simply remove both of the lines and save a few instructions in the process, allowing the (non-operational) code to fit into less FLASH space and be executed faster!
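
To make the read sequence concrete, here is a minimal sketch that can be built on a PC. It is not the uTasker driver: the registers are replaced by ordinary variables (sim_fio1dir, sim_fio1pin), the LCD_BUS_MASK value is made up, and lcd_read_bus() and dir_during_read are hypothetical names used only for illustration. With the volatile cast in the define the compiler has to perform both direction changes in the order written.

```c
#include <stdint.h>

/* Assumption: host-side stand-ins for the real registers so that the
   sketch runs on a PC. On the target these would be fixed addresses. */
uint32_t sim_fio1dir = 0xFFFFFFFFu;   /* all lines start as outputs */
uint32_t sim_fio1pin = 0;

#define FIO1DIR      (*(volatile uint32_t *)&sim_fio1dir)
#define FIO1PIN      (*(volatile uint32_t *)&sim_fio1pin)
#define LCD_BUS_MASK 0x0F000000u      /* hypothetical bus position */

uint32_t dir_during_read;             /* demo only: direction seen while reading */

uint32_t lcd_read_bus(void)
{
    uint32_t data;

    FIO1DIR &= ~LCD_BUS_MASK;         /* release the bus: lines become inputs */
    dir_during_read = FIO1DIR;        /* demo only: record the intermediate state */
    data = FIO1PIN & LCD_BUS_MASK;    /* sample what the display is driving */
    FIO1DIR |= LCD_BUS_MASK;          /* take the bus back: lines are driven again */

    return data;
}
```

Because every access to FIO1DIR goes through a volatile lvalue, the first write can not be treated as a dead store even though the second write restores the original value.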

How does the use of volatile help control this?
The volatile type qualifier became part of the language with ANSI C (it is described in the second edition of K&R). It informs the compiler that it must not remove, merge or cache accesses to the variable, and it is typically used when the value stored in a variable can not be considered as remaining stable - e.g. a port input is an obvious case since it can change at any time. Another example would be a variable which can be accessed by a DMA controller and thus changed at almost any time without this being visible in the program flow.
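
The same applies to a variable that is changed by an interrupt routine. A minimal sketch, assuming a hypothetical receive interrupt (uart_rx_isr(), rx_ready, rx_byte and poll_rx() are invented names, not part of uTasker); the handler takes the received byte as a parameter only so that it can be called by hand on a PC:

```c
#include <stdint.h>

/* Written by the interrupt handler, read by the polling loop. */
volatile uint8_t rx_ready = 0;
volatile uint8_t rx_byte  = 0;

/* Hypothetical interrupt handler. */
void uart_rx_isr(uint8_t received)
{
    rx_byte  = received;
    rx_ready = 1;
}

/* Poll up to max_polls times; returns 1 and stores the byte in *out
   if one arrived, otherwise returns 0. */
int poll_rx(uint32_t max_polls, uint8_t *out)
{
    while (max_polls-- != 0) {
        if (rx_ready) {               /* re-read from memory on every pass */
            rx_ready = 0;
            *out = rx_byte;
            return 1;
        }
    }
    return 0;
}
```

Without volatile on rx_ready the compiler would be entitled to read the flag once before the loop and never look at it again, since nothing in the loop itself changes it.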
 
Finally, why are all registers not simply declared volatile, so that such problems are automatically avoided?
There are some headers for ports which contain only volatile defines. These can not suffer from optimiser operation and so are obviously very safe.

However consider the following:
#define FIO1DIR                          *(unsigned long*)(FAST_GPIO_BLOCK + 0x20)
#define FIO1PIN                          *(volatile unsigned long*)(FAST_GPIO_BLOCK + 0x34)
#define FIO1SET                          *(volatile unsigned long*)(FAST_GPIO_BLOCK + 0x38)


This is a small extract from the uTasker header for the fast ports in the LPC23XX.
Looking at it, it is clear that the FIO1PIN register value (which reflects the state of inputs) can change without this change being a consequence of a CPU access. The optimiser can not assume any value just because the code flow has previously written a value to it (in the case of this register it will in fact never be written to).

Consider this example
if (FIO1PIN == FIO1PIN ) {
}
A variable without volatile type attribute will always result in the comparison being true and so the compiler can simply always execute the true case. In the case of a volatile type the compiler should never assume that the result is true since the content could well change between the two accesses to the same register!!

FIO1SET is also defined as volatile because of the fact that it can be written with bits to set outputs but reading it back makes no sense - code will probably never attempt to read it, but it is clear from the define that its content will not necessarily be the same value when read back as when it was written to.

if (FIO1SET == FIO1SET ) {
}
Although this is a very silly piece of code it would usually result in the check being unequal. Without the volatile type attribute the optimiser would almost always cause the wrong path to be taken!

FIO1DIR is a read/write register and there is no possibility of its content changing between accesses, so why should it be defined as volatile? Its content is purely code dependent and this is clear from the header file. There is no need to consult the user manual to be sure of this fact, and its definition is thus just as valid as the volatile definition for FIO1SET, where it is also perfectly clear that the content of the register when read back is NOT necessarily the same as the value written (the exact reason for the difference in content is not the same but the basic characteristic is nevertheless clear).

Consider the case where FIO1DIR is defined as volatile and the following code is written:

FIO1DIR = 0x0000000;
..
FIO1DIR |= 0x0000004;
..
FIO1DIR |= 0x0005010;
..

Here the optimiser has no choice but to read the present value at each step and perform an OR function before writing the new value.

Without the volatile attribute the optimiser could do the following:

FIO1DIR = 0x0000000;
..
FIO1DIR = 0x0000004;
..
FIO1DIR = 0x0005014;

This results in faster and more efficient code, even if the programmer didn't specifically notice that this is, in this case, a more efficient (and non-dangerous) way to achieve the same result.
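
For comparison, a programmer working with a volatile define can get the same three plain writes by hand, by keeping the accumulated value in a local variable. This is only a sketch of that idea, not code from the project: sim_fio1dir is a host-side stand-in for the register and configure_port_direction() is a hypothetical name.

```c
#include <stdint.h>

/* Assumption: ordinary variable standing in for the real register. */
uint32_t sim_fio1dir = 0xFFFFFFFFu;

#define FIO1DIR (*(volatile uint32_t *)&sim_fio1dir)

void configure_port_direction(void)
{
    uint32_t dir = 0x0000000;     /* local shadow of the register content */

    FIO1DIR = dir;                /* write only, no read-back */
    /* .. */
    dir |= 0x0000004;
    FIO1DIR = dir;                /* write only */
    /* .. */
    dir |= 0x0005010;
    FIO1DIR = dir;                /* final content is 0x0005014 */
}
```

The register is still written three times, so any intermediate state that matters to the hardware is preserved, but it is never read, and the compiler is free to fold the constants held in the local variable.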

So my conviction is that there are some advantages of using the attribute in a selective way rather than simply declaring everything as volatile.


So how do we solve the problem with the GNU code not working correctly with the LCD demo?

The first solution has already been shown: declaring FIO1DIR with the volatile attribute cures it. But then the define no longer tells us anything about the register's real characteristic, and automatic compiler optimisation in other time-critical code segments may be lost.

I see two methods which I will be considering for the next SP.

1.
#ifdef _GNU
    #define _VOLATILE volatile
#else
    #define _VOLATILE
#endif

#define FIO1DIR                          *(_VOLATILE unsigned long*)(FAST_GPIO_BLOCK + 0x20)
#define FIO1PIN                          *(volatile unsigned long*)(FAST_GPIO_BLOCK + 0x34)

Here it is clear that the volatile attribute assigned to the register is to ensure the GNU compiler doesn't (over)optimise but the programmer understands that the register's content is really purely code dependent. Non-GNU compilers (assuming they don't optimise - which may of course also change between compiler releases!!!) will still be free to optimise where seen to be advantageous.

2.
#define VOL_FIO1DIR                   *(volatile unsigned long*)(FAST_GPIO_BLOCK + 0x20)
#define FIO1DIR                          *(unsigned long*)(FAST_GPIO_BLOCK + 0x20)

Here there are simply 2 different register definitions and the programmer can decide when a particular access 'should really' be volatile in a critical code section (the programmer should of course know such things when programming and should be able to make the final decision...).
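
A rough sketch of how the second method could look in use, again with the registers simulated by ordinary variables so that it can be tried on a PC. The LCD_BUS_MASK value and the function names lcd_port_init() and lcd_read_bus() are assumptions for illustration only.

```c
#include <stdint.h>

/* Assumption: host-side stand-ins for the real registers. */
uint32_t sim_fio1dir = 0;
uint32_t sim_fio1pin = 0;

/* Two views of the same register, as in method 2. */
#define VOL_FIO1DIR  (*(volatile uint32_t *)&sim_fio1dir)
#define FIO1DIR      (*(uint32_t *)&sim_fio1dir)
#define FIO1PIN      (*(volatile uint32_t *)&sim_fio1pin)
#define LCD_BUS_MASK 0x0F000000u      /* hypothetical bus position */

/* Initialisation: intermediate values don't matter, so the plain
   define is used and the compiler may merge the three statements. */
void lcd_port_init(void)
{
    FIO1DIR = 0;
    FIO1DIR |= 0x00000004u;
    FIO1DIR |= LCD_BUS_MASK;
}

/* Bus read: the intermediate direction matters to the hardware, so
   the volatile define is used for both accesses. */
uint32_t lcd_read_bus(void)
{
    uint32_t data;

    VOL_FIO1DIR &= ~LCD_BUS_MASK;     /* bus lines to input */
    data = FIO1PIN & LCD_BUS_MASK;    /* sample the display's data */
    VOL_FIO1DIR |= LCD_BUS_MASK;      /* bus lines driven again */

    return data;
}
```

The price of this method is that every access has to be judged individually; a plain FIO1DIR used by mistake in a sequence like lcd_read_bus() would bring the original problem straight back.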

Does anyone have any input to help decide which solution would be best?

Regards

Mark