Embedding it better...



## **Table of Contents**

| Section                                              |
|------------------------------------------------------|
| 1. Introduction                                      |
| 2. Clocking                                          |
| 1.1. ARM Core Clock                                  |
| 1.2. IPG Clock – used by ADC and XBAR                |
| 1.3. PERCLK – used by PIT and GPT                    |
| 1.4. UART_CLK_ROOT – used by all LPUARTs             |
| 1.5. USDHC1_CLK_ROOT/USDHC2_CLK_ROOT                 |
| 1.6. SEMC_CLK_ROOT                                   |
| 1.7. FLEXSPI_CLK_ROOT                                |
| 1.8. LPSPI_CLK_ROOT                                  |
| 1.9. TRACE_CLK_ROOT                                  |
| 1.10. SAI1_CLK_ROOT/SAI2_CLK_ROOT/SAI3_CLK_ROOT      |
| 1.11. LPI2C_CLK_ROOT – used by all LPI2C controllers |
| 1.12. CAN_CLK_ROOT                                   |
| 1.13. SPDIF0_CLK_ROOT                                |
| 1.14. FLEXIO1_CLK_ROOT                               |
| 3. Real Time Clock                                   |
| 4. Internal Clock Monitoring                         |
| 5. LPUART                                            |
| 6. LPI2C                                             |
| 7. FLEXCAN                                           |
| 8. PIT                                               |
| 9. DMA                                               |
| 10. GPIO                                             |
| 11. RAM and Cache                                    |
| 12. Boot Mode                                        |
| 13. Ethernet                                         |
| 14. Conclusion                                       |
| Appendix A – Hardware Dependencies                   |
| a) Space for first Appendix                          |

## 1. Introduction

See the i.MX RT 1021 document; only the changes are described here.

The **MIMXRT1015-EVK** is used as the test vehicle throughout the document, but any board using the part can easily be configured based on the details given.

## 2. Clocking

#### 1.1. ARM Core Clock

The ARM core is clocked by an internal clock signal called **AHB\_CLK\_ROOT**, which can be up to 500MHz. It may be automatically gated off in certain low power modes, but these modes are of no concern to the discussion of the clocking capabilities and configuration. The following diagrams show the complete set of possibilities together with the project defines that control them, as well as the divider settings and their ranges. The AHB\_CLK\_ROOT speed is probably the most important clock setting and the one that is usually defined first.

It is seen that there are essentially 7 possible settings. There are in fact some additional permutations, but they make no sense to use in the i.MX RT 1021 because they merely duplicate what is already possible, with no added advantage; for example, there are further paths that could select OSC\_CLK as source, which are not shown to avoid unnecessary complication.

The 7 possibilities are now shown, whereby in each case the user can define an optional output divider between 1 and 8. Three PLLs are possible as source for this clock. These PLLs are all powered down by default and bypassed, so that their input clock (OSC\_CLK in every case) is available at their outputs. If a PLL is used it is powered up, its lock is waited for and then the bypass is removed. Each of these PLLs is a fixed frequency PLL, meaning that the VCO generates a fixed frequency from the fixed 24MHz OSC\_CLK input.

Some of the PLLs also have PFDs (Phase Fractional Dividers) which can be tapped as outputs. For example, the System PLL (PLL2), which has its main output at 528MHz, also has 4 PFD outputs (PFD0..PFD3) with default frequencies of 352MHz, 594MHz, 396MHz and 594MHz respectively. Each of the PFD output frequencies can be individually programmed or disabled, whereby the formula for the respective PFD frequency is ((PLL fixed frequency \* 18) / fraction), where fraction can be any integer value between 12 and 35. The following table shows the complete list of frequencies that can be selected for PLL2 and PLL3, whereby the PFD3 of each is an optional core clock reference as detailed further below.

| Fraction | System PLL (528MHz PLL2)<br>((528MHz * 18) / fraction) | USB1 PLL (480MHz PLL3)<br>((480MHz * 18) / fraction) |
|----------|--------------------------------------------------------|------------------------------------------------------|
| 12       | 792MHz                                                 | 720MHz - default PFD0                                |
| 13       | 731.0769231MHz                                         | 664.6153846MHz - default PFD1                        |
| 14       | 678.8571429MHz                                         | 617.1428571MHz                                       |
| 15       | 633.6MHz                                               | 576MHz                                               |
| 16       | 594MHz - default PFD1 and PFD3                         | 540MHz                                               |
| 17       | 559.0588235MHz                                         | 508.2352941MHz - default PFD2                        |
| 18       | 528MHz                                                 | 480MHz                                               |
| 19       | 500.2105263MHz                                         | 454.7368421MHz - default PFD3                        |
| 20       | 475.2MHz                                               | 432MHz                                               |
| 21       | 452.5714286MHz                                         | 411.4285714MHz                                       |
| 22       | 432MHz                                                 | 392.7272727MHz                                       |
| 23       | 413.2173913MHz                                         | 375.6521739MHz                                       |
| 24       | 396MHz - default PFD2                                  | 360MHz                                               |
| 25       | 380.16MHz                                              | 345.6MHz                                             |
| 26       | 365.5384615MHz                                         | 332.3076923MHz                                       |
| 27       | 352MHz - default PFD0                                  | 320MHz                                               |
| 28 | 339.4285714MHz        | 308.5714286MHz |
| 29 | 327.7241379MHz        | 297.9310345MHz |
| 30 | 316.8MHz              | 288MHz         |
| 31 | 306.5806452MHz        | 278.7096774MHz |
| 32 | 297MHz                | 270MHz         |
| 33 | 288MHz                | 261.8181818MHz |
| 34 | 279.5294118MHz        | 254.1176471MHz |
| 35 | 271.5428571MHz        | 246.8571429MHz |
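The table values follow directly from the PFD formula, which can be checked with a few lines of C. This is only a minimal sketch: the function name `pfd_frequency_hz` and the two frequency macros are hypothetical and are not taken from the µTasker project.

```c
#include <stdint.h>

#define PLL2_FREQUENCY_HZ 528000000UL   /* System PLL, fixed frequency */
#define PLL3_FREQUENCY_HZ 480000000UL   /* USB1 PLL, fixed frequency */

/* Returns the PFD output frequency in Hz (rounded down),
   or 0 if the fraction is outside the permitted range 12..35. */
uint32_t pfd_frequency_hz(uint32_t pll_hz, uint32_t fraction)
{
    if ((fraction < 12u) || (fraction > 35u)) {
        return 0;
    }
    /* A 64 bit intermediate is used because pll_hz * 18 does not fit in 32 bits */
    return (uint32_t)(((uint64_t)pll_hz * 18u) / fraction);
}
```

For example, `pfd_frequency_hz(PLL2_FREQUENCY_HZ, 32)` gives 297000000, which matches the 297MHz entry in the table.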

#define RUN\_FROM\_DEFAULT\_CLOCK



Choose this setting to use the default clock configuration, which is in effect the OSC\_CLK (bypassed at the system PLL and switched through as shown to give 24MHz). The user can optionally reduce this frequency by a factor of /1 to /8 by setting the define AHB\_CLK\_ROOT\_DIVIDE to the desired value (defaults to 1 if not defined).

#define PERIPH\_CLK\_SOURCE\_OSC



This is a variation of the same theme as the default clock setting but instead of routing the OSC\_CLK from the bypassed system PLL it is routed via a different path. In this configuration there is a second optional pre-scaler, PERIPH\_CLK\_SOURCE\_OSC\_DIVIDE, which allows further reduction of the core frequency if needed. If not defined the divider defaults to 1.

#define PERIPH\_CLK\_SOURCE\_PLL6\_500M



This configuration is useful for simply obtaining the maximum operating frequency of the i.MX RT 1021 by using the output of the fixed 500MHz ENET PLL. When used, the ENET PLL is powered up, lock is waited for and then its bypass is removed so that the clock can be routed to the core [when Ethernet is used this PLL is also required for its operation]. In addition to the AHB\_CLK\_ROOT divider PERIPH\_CLK\_SOURCE\_PLL6\_DIVIDE can optionally be defined to pre-scale the PLL output by 1..8 (when not defined the default is 1).

#define PERIPH\_CLK\_SOURCE\_PLL3\_SW\_CLK

This configuration uses the USB1 PLL (PLL3) as reference. This is a 480MHz PLL that can be additionally pre-scaled by 1..8 by using the define PERIPH\_CLK\_SOURCE\_PLL6\_DIVIDE. If not defined the pre-scaler defaults to 1. As in the case of the other PLLs this configuration causes the PLL to be powered, lock waited for and then its bypass removed so that the signal can be routed to the core.

#define PERIPH\_CLK\_SOURCE\_PLL2\_528M



This configuration uses the System PLL (PLL2) as reference. This is a fixed 528MHz PLL and, as in the case of the other PLLs, this configuration causes the PLL to be powered, lock waited for and then its bypass removed so that the signal can be routed to the core.

Note that 528MHz is beyond the specification of AHB\_CLK\_ROOT and so an AHB\_CLK\_ROOT pre-scaler divide of at least 2 is needed!
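A configuration header can reject an out-of-range combination at build time. The following is only a sketch of how such a check could be expressed, and it is not the µTasker project's own check: `PERIPH_CLK_SOURCE_PLL2_528M` and `AHB_CLK_ROOT_DIVIDE` are the defines described in the text, while `PLL2_FREQUENCY_HZ`, `MAX_AHB_CLK_ROOT_HZ` and `AHB_CLK_ROOT_HZ` are hypothetical names introduced for the illustration.

```c
#define PERIPH_CLK_SOURCE_PLL2_528M           /* select the System PLL as source */
#define AHB_CLK_ROOT_DIVIDE   2               /* 1..8; at least 2 is needed for this source */

#define PLL2_FREQUENCY_HZ     528000000UL     /* hypothetical name: fixed System PLL output */
#define MAX_AHB_CLK_ROOT_HZ   500000000UL     /* hypothetical name: 500MHz core clock limit */

#ifndef AHB_CLK_ROOT_DIVIDE
    #define AHB_CLK_ROOT_DIVIDE 1             /* default when the user does not define it */
#endif

#define AHB_CLK_ROOT_HZ (PLL2_FREQUENCY_HZ / AHB_CLK_ROOT_DIVIDE)

#if (AHB_CLK_ROOT_DIVIDE < 1) || (AHB_CLK_ROOT_DIVIDE > 8)
    #error "AHB_CLK_ROOT_DIVIDE must be between 1 and 8"
#endif
#if AHB_CLK_ROOT_HZ > MAX_AHB_CLK_ROOT_HZ
    #error "AHB_CLK_ROOT exceeds 500MHz - increase AHB_CLK_ROOT_DIVIDE"
#endif
```

With the divide value of 2 shown, the resulting core clock is 264MHz; changing it to 1 would stop the build with the second error message.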

#define PERIPH\_CLK\_SOURCE\_PLL2\_PFD3



This configuration uses the System PLL (PLL2) as reference. This is a fixed 528MHz PLL but its PFD3 output is sourced instead.

The PFD3 frequency is calculated by ((528MHz \* 18) / PLL2\_PFD3\_FRACTION) and so the illustrated value of 32 results in 297MHz. The list of possible PLL2-PFD3 frequencies can be found in the introduction to this chapter.

As in the case of the other PLLs, this configuration causes the PLL to be powered, lock waited for and then its bypass removed so that the signal can be routed to the core.

#define PERIPH\_CLK\_SOURCE\_PLL3\_PFD3



This configuration uses the USB1 PLL (PLL3) as reference. This is a fixed 480MHz PLL but its PFD3 output is sourced instead.

The PFD3 frequency is calculated by ((480MHz \* 18) / PLL3\_PFD3\_FRACTION) and so the illustrated value of 35 results in 246.857MHz. The list of possible PLL3-PFD3 frequencies can be found in the introduction to this chapter.

As in the case of the other PLLs, this configuration causes the PLL to be powered, lock waited for and then its bypass removed so that the signal can be routed to the core.
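When a PFD3 output is used as core clock source, the fraction has to be chosen so that the 500MHz limit of AHB\_CLK\_ROOT is respected. The search can be sketched as below, assuming that the AHB\_CLK\_ROOT divider is left at 1; the function name `pfd_fraction_for_limit` is hypothetical.

```c
#include <stdint.h>

/* Returns the smallest fraction in 12..35 whose PFD output does not exceed max_hz
   (smaller fractions give higher frequencies), or 0 if even fraction 35 is too fast. */
uint32_t pfd_fraction_for_limit(uint32_t pll_hz, uint32_t max_hz)
{
    uint32_t fraction;
    for (fraction = 12u; fraction <= 35u; fraction++) {
        /* 64 bit intermediate since pll_hz * 18 does not fit in 32 bits */
        uint64_t pfd_hz = ((uint64_t)pll_hz * 18u) / fraction;
        if (pfd_hz <= max_hz) {
            return fraction;
        }
    }
    return 0;
}
```

For PLL2 (528MHz) and a 500MHz limit this returns 20 (475.2MHz), since fraction 19 would give about 500.21MHz; for PLL3 (480MHz) it returns 18 (480MHz).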

The choice of the core clock represents the major work of setting up the clocks. The following details are then specific to peripherals used in the system. *Peripherals of no interest don't need to be specifically configured since they will use defaults and be gated off by the control code.*

#### 1.2. IPG Clock – used by ADC and XBAR

IPG\_CLK\_ROOT is the internal clock that feeds the ADC and XBAR. It is derived exclusively from the AHB\_CLK\_ROOT, which was configured in the initial step. IPG\_CLK\_ROOT is equal to AHB\_CLK\_ROOT divided by 1 to 4.

A project define is used to configure this ratio, whereby if the define is not used it defaults to 1 (that is, IPG\_CLK\_ROOT is equal to AHB\_CLK\_ROOT).

Should neither the ADC nor the XBAR be used in the project, this clock will automatically be disabled by the control code by disabling its clock gate.

The maximum IPG\_CLK\_ROOT frequency for the i.MX RT 1021 is 150MHz (the maximum for AHB\_CLK\_ROOT is 500MHz), so the divider must be set to ensure that this speed is not exceeded.

The µTasker project driver code will signal a build error if it detects that such a frequency has been exceeded. The µTasker simulator also checks run-time derived frequencies and will raise an exception if it detects such violations.
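The divider selection described in this section amounts to finding the smallest value between 1 and 4 that keeps IPG\_CLK\_ROOT at or below 150MHz. A minimal sketch, using the hypothetical names `ipg_divider_for` and `MAX_IPG_CLK_ROOT_HZ`:

```c
#include <stdint.h>

#define MAX_IPG_CLK_ROOT_HZ 150000000UL   /* hypothetical name for the 150MHz limit */

/* Returns the smallest divider (1..4) that keeps IPG_CLK_ROOT within its limit,
   or 0 if no permitted divider is sufficient. */
uint32_t ipg_divider_for(uint32_t ahb_clk_root_hz)
{
    uint32_t divide;
    for (divide = 1u; divide <= 4u; divide++) {
        /* Compare by multiplication to avoid rounding errors from integer division */
        if (ahb_clk_root_hz <= (MAX_IPG_CLK_ROOT_HZ * divide)) {
            return divide;
        }
    }
    return 0;
}
```

A 500MHz core clock therefore needs the maximum divider of 4 (giving 125MHz), whereas the 297MHz PLL2-PFD3 example needs a divider of 2 (148.5MHz).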

#### 1.3. PERCLK – used by PIT and GPT

PERCLK\_CLK\_ROOT is the internal clock that feeds the PIT and GPT. It is derived either from the IPG\_CLK\_ROOT frequency, which was configured in the previous step, or from the 24.0MHz OSC\_CLK. In each case it has an optional pre-scaler of divide by 1..64.

#define PERCLK\_CLK\_ROOT\_SOURCE\_IPG\_CLK

#define PERCLK\_CLK\_ROOT\_DIVIDE 1 // 1..64

is used to configure this ratio, whereby if the define is not used it defaults to 1.

#define PERCLK\_CLK\_ROOT\_SOURCE\_OSC\_CLK



Should neither the PIT nor the GPT be used in the project this clock will automatically be disabled by the control code by disabling its clock gate.

The maximum PERCLK\_CLK\_ROOT frequency for the i.MX RT 1021 is 75MHz, so the divider must be set to ensure that this speed is not exceeded.

The µTasker project driver code will signal a build error if it detects that such a frequency has been exceeded. The µTasker simulator also checks run-time derived frequencies and will raise an exception if it detects such violations.
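The two source options and the 75MHz limit can be combined in a small calculation. This is a sketch under stated assumptions rather than project code: the function `perclk_clk_root_hz` and the macro names are hypothetical, and an invalid configuration is simply reported as 0.

```c
#include <stdint.h>

#define OSC_CLK_HZ             24000000UL   /* fixed 24MHz oscillator */
#define MAX_PERCLK_CLK_ROOT_HZ 75000000UL   /* hypothetical name for the 75MHz limit */

/* Returns PERCLK_CLK_ROOT in Hz, or 0 if the divider is outside 1..64
   or the resulting frequency would exceed the 75MHz limit. */
uint32_t perclk_clk_root_hz(int from_osc_clk, uint32_t ipg_clk_root_hz, uint32_t divide)
{
    uint32_t source_hz = from_osc_clk ? (uint32_t)OSC_CLK_HZ : ipg_clk_root_hz;
    uint32_t result_hz;
    if ((divide < 1u) || (divide > 64u)) {
        return 0;
    }
    result_hz = source_hz / divide;
    return (result_hz <= MAX_PERCLK_CLK_ROOT_HZ) ? result_hz : 0;
}
```

With an IPG\_CLK\_ROOT of 125MHz a divider of 1 is rejected, whereas a divider of 2 gives 62.5MHz; sourcing from OSC\_CLK with a divider of 1 gives 24MHz.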

#### 1.4. UART\_CLK\_ROOT – used by all LPUARTs

The LPUARTs in the i.MX RT 1021 have a common clock that can be derived from either OSC\_CLK or from pll\_sw\_clk/6. A common pre-scaler allows the frequency to be divided by 1..64.

#define UART\_CLK\_ROOT\_FROM\_PLL3\_SW\_CLK\_6

In this configuration the USB1 PLL is used as reference (called pll\_sw\_clk) and divided by a fixed value of 6, resulting in 80MHz. An optional pre-scaler defined by UART\_CLK\_ROOT\_DIVIDER can divide this by 1 to 64 (when not defined, the default is 1).

#define UART\_CLK\_ROOT\_FROM\_OSC\_CLK

In this configuration the OSC\_CLK (24MHz) is used as reference, with the optional pre-scaler of 1 to 64.

Note that when PLL3 is not enabled it is left in its powered down, bypassed mode and the alternative clock would be 4MHz instead of 80MHz. This does in fact correspond to the default for the LPUART clock out of reset, but it is not offered as a configuration option since it has no advantage.

UART\_CLK\_ROOT supplies all LPUARTs but the clock is automatically disabled at each individual LPUART input when the corresponding LPUART is not used.
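The frequencies quoted in this section can be reproduced with the following sketch. The function name `uart_clk_root_hz` is hypothetical; the PLL3 output is passed as a parameter so that the bypassed case (24MHz at the PLL output) can also be evaluated.

```c
#include <stdint.h>

#define OSC_CLK_HZ 24000000UL   /* fixed 24MHz oscillator */

/* Returns UART_CLK_ROOT in Hz, or 0 if the divider is outside 1..64.
   pll3_hz is 480000000 when the USB1 PLL is running, or OSC_CLK_HZ when it is bypassed. */
uint32_t uart_clk_root_hz(int from_pll3, uint32_t pll3_hz, uint32_t divider)
{
    uint32_t source_hz = from_pll3 ? (pll3_hz / 6u) : (uint32_t)OSC_CLK_HZ;
    if ((divider < 1u) || (divider > 64u)) {
        return 0;
    }
    return source_hz / divider;
}
```

`uart_clk_root_hz(1, 480000000, 1)` gives the 80MHz of the PLL3 configuration and `uart_clk_root_hz(1, OSC_CLK_HZ, 1)` gives the 4MHz that results when PLL3 is left bypassed.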

#### 1.5. USDHC1\_CLK\_ROOT/USDHC2\_CLK\_ROOT

To add..

#### 1.6. SEMC\_CLK\_ROOT

By default the SEMC\_CLK\_ROOT is sourced from PERIPH\_CLK, which is the clock that supplies the core clock's pre-scaler. Out of reset it is divided by 3. This configuration is shown in the first image:



Other configuration possibilities are for PLL2-PFD2 or PLL3-PFD1 as shown in the further images.





The SEMC\_CLK\_ROOT frequency must not exceed 166MHz.

#### 1.7. FLEXSPI\_CLK\_ROOT

The default source of the <code>FLEXSPI\_CLK\_ROOT</code> is the <code>SEMC\_CLK\_ROOT</code> (see previous section) with a pre-scaler of 2. As discussed in the previous section, the <code>SEMC\_CLK\_ROOT</code> is by default the <code>PERIPH\_CLK</code> (the clock root supplying the core clock pre-scaler) divided by 3. The <code>FLEXSPI\_CLK\_ROOT</code> frequency must not exceed 322MHz.









#### 1.8. LPSPI\_CLK\_ROOT

To add..

#### 1.9. TRACE\_CLK\_ROOT

To add..

#### 1.10. SAI1\_CLK\_ROOT/SAI2\_CLK\_ROOT/SAI3\_CLK\_ROOT

To add..

#### 1.11. LPI2C\_CLK\_ROOT – used by all LPI2C controllers

The LPI2C controllers in the i.MX RT 1021 have a common clock that can be derived from either OSC\_CLK or from pll\_sw\_clk/8. A common pre-scaler allows the frequency to be divided by 1..64.

#define LPI2C\_CLK\_ROOT\_FROM\_PLL3\_SW\_CLK\_8

In this configuration the USB1 PLL is used as reference (called pll\_sw\_clk) and divided by a fixed value of 8, resulting in 60MHz. An optional pre-scaler defined by LPI2C\_CLK\_ROOT\_DIVIDER can divide this by 1 to 64 (when not defined, the default is 1).

#define LPI2C\_CLK\_ROOT\_FROM\_OSC\_CLK

In this configuration the OSC\_CLK (24MHz) is used as reference, with the optional pre-scaler of 1 to 64.

Note that when PLL3 is not enabled it is left in its powered down, bypassed mode and the alternative clock would be 3MHz instead of 60MHz. This does in fact correspond to the default for the LPI2C clock out of reset, but it is not offered as a configuration option since it has no advantage.

LPI2C\_CLK\_ROOT supplies all LPI2C controllers but the clock is automatically disabled at each individual LPI2C input when the corresponding LPI2C controller is not used.

#### 1.12. CAN\_CLK\_ROOT

The FLEXCAN controllers in the i.MX RT 1021 have a common clock that can be derived from either OSC\_CLK, pll\_sw\_clk/6 or pll\_sw\_clk/8. A common pre-scaler allows the frequency to be divided by 1..64.

#define CAN\_CLK\_ROOT\_FROM\_PLL3\_SW\_CLK\_6

In this configuration the USB1 PLL is used as reference (called <code>pll\_sw\_clk</code>) and divided by a fixed value of 6, resulting in 80MHz. An optional pre-scaler defined by CAN\_CLK\_ROOT\_DIVIDER can divide this by 1 to 64 (when not defined, the default is 2).

#define CAN\_CLK\_ROOT\_FROM\_PLL3\_SW\_CLK\_8

In this configuration the USB1 PLL is used as reference (called pll\_sw\_clk) and divided by a fixed value of 8, resulting in 60MHz. An optional pre-scaler defined by CAN\_CLK\_ROOT\_DIVIDER can divide this by 1 to 64 (when not defined, the default is 2).

#define CAN\_CLK\_ROOT\_FROM\_OSC\_CLK

In this configuration the OSC\_CLK (24MHz) is used as reference, with the optional pre-scaler of 1 to 64.

Note that when PLL3 is not enabled it is left in its powered down, bypassed mode and the alternative clocks would be 4MHz or 3MHz instead of 80MHz or 60MHz. This is not offered as a configuration option since it has no advantage.

CAN\_CLK\_ROOT supplies all FLEXCAN controllers but the clock is automatically disabled at each individual FLEXCAN input when the corresponding FLEXCAN controller is not used.
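The three source options and the default pre-scaler can be illustrated with preprocessor logic of the kind typically used in a configuration header. This is a sketch only: the source selection defines and `CAN_CLK_ROOT_DIVIDER` are those named in the text, whereas `CAN_CLK_SOURCE_HZ` and `CAN_CLK_ROOT_HZ` are hypothetical, and applying the default of 2 to the OSC\_CLK source is an assumption because the text states the default only for the PLL3 sources.

```c
#define CAN_CLK_ROOT_FROM_PLL3_SW_CLK_8        /* chosen source for this example */
/* CAN_CLK_ROOT_DIVIDER is deliberately not defined so that the default applies */

#if defined CAN_CLK_ROOT_FROM_PLL3_SW_CLK_6
    #define CAN_CLK_SOURCE_HZ (480000000UL / 6)    /* 80MHz */
#elif defined CAN_CLK_ROOT_FROM_PLL3_SW_CLK_8
    #define CAN_CLK_SOURCE_HZ (480000000UL / 8)    /* 60MHz */
#else
    #define CAN_CLK_SOURCE_HZ 24000000UL           /* OSC_CLK */
#endif

#ifndef CAN_CLK_ROOT_DIVIDER
    #define CAN_CLK_ROOT_DIVIDER 2                 /* default (assumed for OSC_CLK too) */
#endif
#if (CAN_CLK_ROOT_DIVIDER < 1) || (CAN_CLK_ROOT_DIVIDER > 64)
    #error "CAN_CLK_ROOT_DIVIDER must be between 1 and 64"
#endif

#define CAN_CLK_ROOT_HZ (CAN_CLK_SOURCE_HZ / CAN_CLK_ROOT_DIVIDER)
```

With the pll\_sw\_clk/8 source and the default pre-scaler, `CAN_CLK_ROOT_HZ` evaluates to 30MHz.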

#### 1.13. SPDIF0\_CLK\_ROOT

To add..

#### 1.14. FLEXIO1\_CLK\_ROOT

To add..

## 3. Real Time Clock

The i.MX RT 1021 has a low power RTC in its SNVS (Secure Non-Volatile Storage Module) which can be powered by a coin cell (2.4V .. 3.6V) on its VDD\_SNVS\_IN pin. Although an internal ring oscillator can be used to supply a rough 32kHz clock to the module, higher accuracy time keeping is achieved by using a 32kHz crystal connected to the RTC\_XTALI and RTC\_XTALO pins.

## 4. Internal Clock Monitoring

The i.MX RT 1021 has two peripheral outputs called CCM\_CLKO1 (on GPIO\_SD\_B1\_02, or GPIO3-22) and CCM\_CLKO2 (on GPIO\_SD\_B1\_03, or GPIO3-23) which can be attached to various internal clocks. This can be useful to verify that these clocks really have the frequencies that are expected, as well as generating signals for external usage. Fast internal signals can also be divided down by a pre-scaler with a value between 1 and 8.

These are the clocks that can be selected:

#### CCM\_CLKO1

- PLL3\_SW\_CLK\_DIV2
- PLL2\_DIV2
- ENET\_PLL\_DIV2
- SEMC\_CLK\_ROOT
- AHB\_CLK\_ROOT
- IPG\_CLK\_ROOT
- PERCLK\_ROOT
- PLL4\_MAIN\_CLK

#### CCM\_CLKO2

- USDHC1\_CLK\_ROOT
- LPI2C\_CLK\_ROOT
- OSC\_CLK\_ROOT
- LPSPI\_CLK\_ROOT
- USDHC2\_CLK\_ROOT
- SAI1\_CLK\_ROOT
- SAI2\_CLK\_ROOT
- SAI3\_CLK\_ROOT
- TRACE\_CLK\_ROOT
- CAN\_CLK\_ROOT
- FLEXSPI\_CLK\_ROOT
- UART\_CLK\_ROOT
- SPDIF0\_CLK\_ROOT

Two macros are made available to configure the pin and output the desired signals:

```
fnSetClock1Output(CLK, div)
fnSetClock2Output(CLK, div)
```

whereby examples of utilisation are:

## 5. LPUART

The i.MX RT 1021 LPUART driver is shared with the Kinetis LPUART driver and supports interrupt and DMA driven modes. Minor differences due to the i.MX RT 1021 hardware are controlled by the platform definition <code>\_iMX</code>. Sharing the driver is possible due to the high compatibility between the LPUART used in the Kinetis parts and the i.MX RT 1021. It improves maintenance, since only one source needs to be managed, and the i.MX RT 1021 inherits the features of the mature Kinetis driver.

See the UART user's manual for general details of usage: <a href="http://www.utasker.com/docs/uTasker/uTaskerUART.PDF">http://www.utasker.com/docs/uTasker/uTaskerUART.PDF</a>

It should be kept in mind that i.MX RT 1021 peripherals tend to be numbered 1..n and not 0..n-1, as is the case with Kinetis peripherals. To avoid confusion it is recommended to use the defines

```
iMX_LPUART_1
iMX_LPUART_2
iMX_LPUART_n
```

instead of channel numbers, whereby iMX\_LPUART\_1 is in fact 0.

It is to be noted that the µTasker project is often chosen due to its immediate support for free-running UART Rx DMA on all serial interfaces, which is something that is generally not found in other solutions. The i.MX RT 1021 thus could immediately inherit this operation.

## 6. LPI2C

The i.MX RT 1021 LPI2C driver is shared with the Kinetis LPI2C driver and supports interrupt and DMA driven modes. Minor differences due to the i.MX RT 1021 hardware are controlled by the platform definition <code>\_iMX</code>. Sharing the driver is possible due to the high compatibility between the LPI2C used in the Kinetis parts and the i.MX RT 1021. It improves maintenance, since only one source needs to be managed, and the i.MX RT 1021 inherits the features of the mature Kinetis driver.

See the I<sup>2</sup>C user's manual for general details of usage: http://www.utasker.com/docs/uTasker/uTasker_I2C.pdf

It should be kept in mind that i.MX RT 1021 peripherals tend to be numbered 1..n and not 0..n-1, as is the case with Kinetis peripherals. To avoid confusion it is recommended to use the defines

iMX\_LPI2C\_1
iMX\_LPI2C\_2
iMX\_LPI2C\_n

instead of channel numbers, whereby iMX\_LPI2C\_1 is in fact 0.

## 7. FLEXCAN

The i.MX RT 1021 CAN driver is shared with the Kinetis CAN driver. Minor differences due to the i.MX RT 1021 hardware are controlled by the platform definition \_iMX. Sharing the driver is possible due to the high compatibility between the FLEXCAN used in the Kinetis parts and the i.MX RT 1021. It improves maintenance, since only one source needs to be managed, and the i.MX RT 1021 inherits the features of the mature Kinetis driver.

The FLEXCAN in the i.MX RT 1021 supports 64 receive buffers as opposed to the 16 in the FLEXCAN in the Kinetis parts.

See the CAN user's manual for general details of usage: <a href="http://www.utasker.com/docs/uTasker/uTaskerCAN.PDF">http://www.utasker.com/docs/uTasker/uTaskerCAN.PDF</a>

It should be kept in mind that i.MX RT 1021 peripherals tend to be numbered 1..n and not 0..n-1, as is the case with Kinetis peripherals. To avoid confusion it is recommended to use the defines

iMX\_FLEXCAN\_1
iMX\_FLEXCAN\_2

instead of channel numbers, whereby iMX\_FLEXCAN\_1 is in fact 0.

## 8. PIT

The i.MX RT 1021 PIT driver is shared with the Kinetis PIT driver. Minor differences due to the i.MX RT 1021 hardware are controlled by the platform definition \_iMX. Sharing the driver is possible due to the high compatibility between the PIT used in the Kinetis parts and i.MX RT 1021 and improves maintenance since only one source needs to be managed and the i.MX RT 1021 inherits the features from the mature Kinetis driver.

See the HW timer user's manual for general details of usage: <a href="http://www.utasker.com/docs/uTasker/uTaskerHWTimers.PDF">http://www.utasker.com/docs/uTasker/uTaskerHWTimers.PDF</a>

See TEST\_PIT (TEST\_PIT\_SINGLE\_SHOT, TEST\_PIT\_PERIODIC and TEST\_PIT\_64\_BIT) in ADC\_Timers.h as reference for its use.

## 9. DMA

The i.MX RT 1021 DMA driver is shared with the Kinetis eDMA driver. Minor differences due to the i.MX RT 1021 hardware are controlled by the platform definition \_iMX. Sharing the driver is possible due to the high compatibility between the eDMA and DMA MUX used in the Kinetis parts and the i.MX RT 1021. It improves maintenance, since only one source needs to be managed, and the i.MX RT 1021 inherits the features of the mature Kinetis driver.

## 10. GPIO

The i.MX RT 1021 GPIO / peripheral concept is quite different to the Kinetis concept. See the following video for an overview and also details concerning how the project was solved to ensure compatibility between Kinetis and i.MX RT:

https://www.youtube.com/watch?v=SmFTi8hlba0&list=PLWKIVb_MqDQFZAulrUywU30v869JBYi9Q&index=29

GPIOs can also be used as interrupts (see IRQ\_TEST in Port\_Interrupts.h as reference for its use).

Each GPIO can be configured to generate an interrupt on low levels, high levels, falling edges or rising edges (or both falling and rising edges). The µTasker GPIO interrupt driver allows the user to assign an individual interrupt callback to each GPIO but it is useful to understand that the i.MX RT 1021 actually has the following interrupt vectors:

- PORT1-0 - individual vector (each of PORT1-0..PORT1-7 has its own vector)
- PORT1-1
- PORT1-2
- PORT1-3
- PORT1-4
- PORT1-5
- PORT1-6
- PORT1-7
- PORT1-15..PORT1-8 - these 8 pins share a single vector
- PORT1-31..PORT1-16 - these 16 pins share a single vector
- PORT2-15..PORT2-0 - these 16 pins share a single vector
- PORT2-31..PORT2-16 - these 16 pins share a single vector
- PORT3-15..PORT3-0 - these 16 pins share a single vector
- PORT3-31..PORT3-16 - these 16 pins share a single vector
- PORT5-15..PORT5-0 - these 16 pins share a single vector
- PORT5-31..PORT5-16 - these 16 pins share a single vector

PORT1-0..PORT1-7 are the most efficient interrupts since the handler doesn't need to identify which port bits caused the interrupt before dispatching the user interrupt callback. Ports with an interrupt vector shared by more than one pin are slightly less efficient due to the need to identify which source or sources caused the interrupt and then dispatch one or more call-backs. When multiple GPIO input interrupts are pending at the same time the callbacks are dispatched in the order of the lower pin number up to the highest pin number.
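The dispatch order for a shared vector can be sketched as follows. This is not the µTasker driver code: the callback type, the `group_callbacks` table, the `record_pin` example callback and the function `dispatch_shared_vector` are hypothetical names used only to illustrate the lowest-pin-first handling described above.

```c
#include <stdint.h>

typedef void (*gpio_callback_t)(unsigned pin);

/* Hypothetical callback table for one group of 16 pins sharing a vector */
gpio_callback_t group_callbacks[16];

/* Example callback that records the order in which pins were dispatched */
unsigned recorded_pins[16];
unsigned recorded_count;
void record_pin(unsigned pin)
{
    recorded_pins[recorded_count++] = pin;
}

/* Dispatches the callbacks of all pending and enabled pins, lowest pin first.
   Returns the mask of handled pins so that the caller can clear their flags. */
uint16_t dispatch_shared_vector(uint16_t pending, uint16_t enabled)
{
    uint16_t active = (uint16_t)(pending & enabled);
    uint16_t handled = 0;
    unsigned pin;
    for (pin = 0; pin < 16u; pin++) {
        uint16_t bit = (uint16_t)(1u << pin);
        if ((active & bit) != 0) {
            if (group_callbacks[pin] != 0) {
                group_callbacks[pin](pin);     /* user callback for this pin */
            }
            handled = (uint16_t)(handled | bit);
        }
    }
    return handled;
}
```

If pins 3 and 9 of the group are pending at the same time, the callback for pin 3 is called before the callback for pin 9. A pin with its own vector needs none of this, since its handler can call the callback directly.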

## 11. RAM and Cache

The i.MX RT 1021 contains 256k of internal RAM which is constructed of 8 banks of 32k each. These banks can each be assigned to one of three areas (FlexRAM controller):

- OCRAM – General RAM that operates at 1/4 of the core clock speed (32 bit wide). This is cacheable, meaning that if L1 cache is enabled data content that is already in cache is used to avoid needing to perform the OCRAM access.
- ITCM – Instruction Tightly Coupled Memory (64 bit wide) that is optimised for instruction execution at the maximum core speed. Non-cacheable (also since already optimally fast) and so no potential cache synchronisation problems.
- DTCM – Data Tightly Coupled Memory (64 bit wide) that is optimised for data access at the maximum core speed. Non-cacheable (also since already optimally fast) and so no potential cache synchronisation problems.

For full details concerning the FlexRAM and its optimal configuration to match an application's memory requirements NXP has prepared the application note AN12077, which can be found at <a href="https://www.nxp.com/docs/en/application-note/AN12077.pdf">https://www.nxp.com/docs/en/application-note/AN12077.pdf</a>

The i.MX RT 1021 has L1 cache with 16kBytes instruction cache and 16kBytes data cache. NXP has prepared the application note AN12042 which can be found at <a href="https://www.nxp.com/docs/en/application-note/AN12042.pdf">https://www.nxp.com/docs/en/application-note/AN12042.pdf</a>

Use of the cache can ensure high speed operation even when the source of code or data is in a slower memory, since content that has been loaded once to the cache does not need to be fetched again.

When the cache is enabled it caches from OCRAM and QSPI-Flash; it neither caches ITCM nor DTCM, which are already tightly coupled to the core.

The application can decide whether it uses data or instruction cache with the defines

#define ENABLE\_INSTRUCTION\_CACHE

and

#define ENABLE\_DATA\_CACHE

The FlexRAM controller configures the RAM banks at reset based on eFuse settings. The standard setting (when nothing else has been programmed) is for 128k OCRAM, 64k DTCM and 64k ITCM; the ROM loader may use the first 64k of the OCRAM when it operates.

This default setting is assumed in the µTasker project to avoid special configuration requirements and therefore out of reset the banks are configured to give this memory map and layout:



Assuming it is decided that an application is best configured with 5 banks for ITCM (so that the code can be completely located there – 160k), 3 banks for data (so that up to 96k of data can be accessed at optimal speed) and no OCRAM, the 8 banks could be configured as follows:



The  $\mu$ Tasker FlexRAM driver always uses the bank order from instruction use to data use (if OCRAM were used it would be inserted between the two). The most important thing to understand is that when the bank use is modified the address range of the bank is also changed (it is moved in the memory map). This means that data that was in OCR before the change (in the default configuration) still exists in the bank memory but is now addressed in the ITC or DTC memory space instead.

This behaviour makes it complicated to change the memory configuration at run time because any memory used before the change (e.g. the stack or initialised variables) is usually at a completely different location after the change. A program that simply changes the bank configuration, without respecting the fact that its memory moves during the process, will typically fail immediately. For this reason such changes are generally not performed during program operation; if the configuration is changed, this tends to be done before any variable initialisation, from code running from another source and without stack dependency.

The µTasker concept assumes that code and variables fit in the internal RAM and so OCRAM is avoided. The division between ITC and DTC is performed automatically at system initialisation, allocating ITC banks to the code space and DTC banks to the data space in such a way as to have as much DTC available as possible for heap and stack. If code of 130k were encountered it would thus assign 160k ITC and 96k DTC, as in the example. If less code were encountered additional banks would be assigned to DTC in order to maximise heap and stack availability. Code and data are automatically in the highest performance RAM areas and caching is not required to achieve optimal performance (without caching, no additional synchronisation of data is required).
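The bank division described above is a round-up of the code size to whole 32k banks, with the remaining banks given to DTC. A minimal sketch follows, assuming the 8 banks of 32k stated earlier in this section; the function and macro names are hypothetical and this is not the µTasker FlexRAM driver itself.

```c
#include <stdint.h>

#define FLEXRAM_BANK_COUNT 8
#define FLEXRAM_BANK_SIZE  (32u * 1024u)

/* Returns the number of ITC banks needed to hold code_size bytes,
   or -1 if the code does not fit in the available banks. */
int itc_banks_for_code(uint32_t code_size)
{
    uint32_t banks = (code_size + FLEXRAM_BANK_SIZE - 1u) / FLEXRAM_BANK_SIZE; /* round up */
    if (banks > FLEXRAM_BANK_COUNT) {
        return -1;
    }
    return (int)banks;
}

/* Returns the number of banks left for DTC (heap, stack and variables), or -1 on error. */
int dtc_banks_for_code(uint32_t code_size)
{
    int itc_banks = itc_banks_for_code(code_size);
    return (itc_banks < 0) ? -1 : (FLEXRAM_BANK_COUNT - itc_banks);
}
```

For 130k of code this yields 5 ITC banks (160k) and 3 DTC banks (96k), as in the example; for 32k of code it yields 1 ITC bank and 7 DTC banks.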

There is an important reason for choosing the bank ordering: in the default configuration bank 7 is assigned to OCR but will not be used by the ROM loader (the ROM loader may use up to 64k only). After the bank swap this bank is the last bank in DTC, whereby the stack pointer is located near the top, leaving some additional space above it for 'preserved' variables. The advantage of this is that an application can always write values to the preserved area before a reset and these values will not be modified by the ROM loader. The µTasker boot loader or another application can then read these values, even if the application uses a different RAM bank configuration; as long as its stack pointer is put near the end of the final bank it will automatically be referenced to the preserved data area! The preserved area is used in the µTasker project for communicating between applications and the µTasker boot loaders, but can also be used by custom applications for holding data that is guaranteed to be preserved across warm resets.

Due to the nature of the memory operation of the i.MX RT 1021, its configuration requirements to achieve optimal performance and the desire to allow µTasker users to benefit from these with no additional effort, the RAM bank management is an integral part of the µTasker boot strategy and the µTasker Boot Loader (see Boot Mode section) is an integral part of every project (apart from when a stand-alone application is loaded in a debug environment for test purposes).

## 12. Boot Mode

The i.MX RT 1021 doesn't have internal flash and needs to boot from an external program source. This is controlled by an internal ROM which offers various boot loader capabilities that are selected by two inputs (SRC\_BOOT\_MODE00/GPIO2-IO16 and SRC\_BOOT\_MODE01/GPIO2-IO17) as well as some eFUSEs (or further pins). The µTasker project avoids the use of eFUSEs and instead controls the operation in its code so that chips can be used in their default configuration.

The i.MX RT 1021 supports three main modes:

| SRC_BOOT_MODE01 / GPIO2-IO17 / GPIO_EMC_17 | SRC_BOOT_MODE00 / GPIO2-IO16 / GPIO_EMC_16 | Boot mode                              |
|--------------------------------------------|--------------------------------------------|----------------------------------------|
| 0                                          | 0                                          | Boot from fuses                        |
| 0                                          | 1                                          | Serial downloader (LPUART1 or USB1)    |
| 1                                          | 0                                          | Internal boot                          |
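The table can be expressed as a small decoding function. This is a minimal sketch: the enumeration and function names are hypothetical, and the combination 1/1, which is not listed in the table, is simply reported as unlisted.

```c
/* Boot modes from the table above (names are illustrative only) */
typedef enum {
    BOOT_FROM_FUSES,
    BOOT_SERIAL_DOWNLOADER,
    BOOT_INTERNAL,
    BOOT_MODE_UNLISTED          /* combination 1/1 is not listed in the table */
} boot_mode_t;

/* Decode the levels sampled on SRC_BOOT_MODE01 and SRC_BOOT_MODE00 */
boot_mode_t decode_boot_mode(unsigned mode01, unsigned mode00)
{
    unsigned pins = ((mode01 & 1u) << 1) | (mode00 & 1u);

    switch (pins) {
    case 0u:  return BOOT_FROM_FUSES;
    case 1u:  return BOOT_SERIAL_DOWNLOADER;
    case 2u:  return BOOT_INTERNAL;
    default:  return BOOT_MODE_UNLISTED;
    }
}
```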

For simplicity the μTasker project assumes that the internal boot option is used as standard, meaning that the ROM loader runs and the exact configuration is taken from eFUSEs or pin overrides, whereby the μTasker project assumes that serial (QSPI) NOR-Flash is used: the MIMXRT1020-EVK has an IS25LP064A-JBLE to this effect, which is an 8 Mbyte part. It is connected in QSPI mode on the primary FlexSPI interface of the i.MX RT 1021.

To ensure that the NOR-Flash mode is used the processor pins

- GPIO\_EMC\_25/GPIO2\_IO25
- GPIO\_EMC\_24/GPIO2\_IO24
- GPIO\_EMC\_23/GPIO2\_IO23
- GPIO\_EMC\_22/GPIO2\_IO22

should be pulled down at reset.

| SRC_BT_CFG07 / GPIO2-IO25 / GPIO_EMC_25 | SRC_BT_CFG06 / GPIO2-IO24 / GPIO_EMC_24 | SRC_BT_CFG05 / GPIO2-IO23 / GPIO_EMC_23 | SRC_BT_CFG04 / GPIO2-IO22 / GPIO_EMC_22 | Boot device            |
|-----------------------------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------|------------------------|
| 0                                       | 0                                       | 0                                       | 0                                       | FlexSPI1 (Serial NOR)  |
| 0                                       | 0                                       | 1                                       | x                                       | SD card                |
| 1                                       | 0                                       | x                                       | x                                       | MMC/eMMC               |
| 0                                       | 1                                       | x                                       | x                                       | SEMC (NAND)            |
| 0                                       | 0                                       | 0                                       | 1                                       | SEMC (NOR)             |
| 1                                       | 1                                       | x                                       | x                                       | FlexSPI1 (Serial NAND) |
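The don't-care entries in the table can be resolved by testing the higher configuration bits first. The following is a sketch only; the type and function names are hypothetical, and the input is assumed to hold SRC\_BT\_CFG07..04 in bits 3..0.

```c
/* Boot devices from the table above (names are illustrative only) */
typedef enum {
    DEVICE_FLEXSPI_NOR,
    DEVICE_SD_CARD,
    DEVICE_MMC_EMMC,
    DEVICE_SEMC_NAND,
    DEVICE_SEMC_NOR,
    DEVICE_FLEXSPI_NAND
} boot_device_t;

/* cfg: bit 3 = SRC_BT_CFG07, bit 2 = CFG06, bit 1 = CFG05, bit 0 = CFG04 */
boot_device_t decode_boot_device(unsigned cfg)
{
    unsigned cfg7 = (cfg >> 3) & 1u;
    unsigned cfg6 = (cfg >> 2) & 1u;
    unsigned cfg5 = (cfg >> 1) & 1u;
    unsigned cfg4 = cfg & 1u;

    if (cfg7 && cfg6) return DEVICE_FLEXSPI_NAND;   /* 1 1 x x */
    if (cfg7)         return DEVICE_MMC_EMMC;       /* 1 0 x x */
    if (cfg6)         return DEVICE_SEMC_NAND;      /* 0 1 x x */
    if (cfg5)         return DEVICE_SD_CARD;        /* 0 0 1 x */
    return cfg4 ? DEVICE_SEMC_NOR                   /* 0 0 0 1 */
                : DEVICE_FLEXSPI_NOR;               /* 0 0 0 0 */
}
```

With all four pins pulled down, the function returns `DEVICE_FLEXSPI_NOR`, which is the configuration assumed by the μTasker project.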

The MIMXRT1020-EVK has all configuration inputs pulled to ground by default and supplies a DIP switch with 4 switches to allow configuring the main boot mode and whether the FlexSPI or SD card is booted from.

The following setting is used:

#### SW8

| DIP-1: SRC_BT_CFG00 / GPIO2-IO18 / GPIO_EMC_18 | DIP-2: SRC_BT_CFG05 / GPIO2-IO23 / GPIO_EMC_23 | DIP-3: SRC_BOOT_MODE01 / GPIO2-IO17 / GPIO_EMC_17 | DIP-4: SRC_BOOT_MODE00 / GPIO2-IO16 / GPIO_EMC_16 | Boot configuration                                       |
|------------------------------------------------|------------------------------------------------|---------------------------------------------------|---------------------------------------------------|----------------------------------------------------------|
| OFF                                            | OFF                                            | ON                                                | OFF                                               | Internal boot from FlexSPI1 (serial NOR)                 |
| ON                                             | OFF                                            | ON                                                | OFF                                               | Internal boot from FlexSPI1 (serial NOR), encrypted XIP  |

The eFUSEs in a new part are set to supply the following configuration options in NOR Flash boot mode:

BOOT\_CFG1[0] = 0 = XIP is not encrypted

BOOT\_CFG2[2] = b00 = 500us hold time before read from device

BOOT\_CFG1[7..4] = b0000 = serial NOR device

BOOT\_CFG1[3..1] = b000 = NOR device supports 0x3b read by default (on primary pin-mux option)

The result is that when the processor starts, the very first thing that it does (its internal ROM loader controlling it) is read a block of data from the start of SPI Flash (address 0x60000000 in the memory map) in 2-line mode at 30MHz. This block is expected to contain details about the NOR Flash that is connected (how many lines are connected, what speed is to be used to communicate with it, including setup times, what instruction sequences it needs, plus various other such information).

When the data read is valid (there is also a header to help identify valid content) the ROM loader sets up the FlexSPI interface accordingly, configures the SPI Flash further and begins communicating at the final speed.
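A check of the kind that the header allows can be sketched as follows. This is not the ROM loader's code: the tag value (ASCII "FCFB" when read as a little-endian 32-bit word at the start of the block) is an assumption taken from NXP's description of the FlexSPI configuration block and should be verified against the reference manual; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed tag value: the bytes 'F','C','F','B' read as a little-endian word */
#define FLASH_CONFIG_TAG  0x42464346u

/* Read a 32-bit little-endian value from a byte buffer */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return true if the block read from the start of flash carries the expected tag */
bool flash_config_header_is_valid(const uint8_t *block)
{
    return read_le32(block) == FLASH_CONFIG_TAG;
}
```

An erased flash (all bytes 0xFF) fails this check, which is one way that an unprogrammed device can be told apart from one holding a configuration block.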

The following diagram shows the μTasker boot loader strategy which allows for running applications in internal SRAM (for maximum efficiency), stand-alone code updating (via USB, UART, Ethernet, and with existing protocols such as KBOOT), as well as OTA (Over The Air) loading by the working application via an Upload Image Area in QSPI with the support of the μTasker Bare-Minimum loader for fail safe re-flashing (of new applications or of an updated stand-alone Serial Loader).



#### The operation is as follows:

- 1. Each time the processor starts, the μTasker "BM" Loader is the first code to be run in QSPI-Flash. It checks to see whether there is a valid new Application or Serial loader in the Upload Image Area.
- 2. If there is no valid image found it jumps to the Serial Loader (the serial loader is always present)
- 3. If the serial loader doesn't detect that it should operate (usually via an input) and also detects a valid application in the application code area it calls the μTasker "BM" Loader again, signalling that it should now start this valid application.
- 4. The μTasker "BM" Loader copies the application code to internal RAM and allows this to start.

2a. If the μTasker "BM" Loader finds either a valid new Application or a valid new Serial loader in the Upload Image Area it deletes the original code and updates it with the new source before resetting so that the new versions can operate.

3a. If the serial loader operates (due to detecting that it should remain in its mode or because there is no valid application code available) it allows new code to be loaded directly to the application area (using the chosen μTasker Serial Loader strategy, such as USB-MSD) before resetting so that the new code can subsequently be executed.
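The decision flow described in the steps above can be summarised as a single function. This is a simplified sketch with hypothetical names; in the real system the decisions are split between the "BM" Loader and the Serial Loader, and the three inputs stand for the checks that each of them performs.

```c
#include <stdbool.h>

/* Possible outcomes of one pass through the boot flow (illustrative names) */
typedef enum {
    ACTION_INSTALL_UPLOAD_AND_RESET,   /* step 2a */
    ACTION_RUN_SERIAL_LOADER,          /* step 3a */
    ACTION_COPY_APP_TO_RAM_AND_START   /* steps 3 and 4 */
} boot_action_t;

boot_action_t boot_decision(bool upload_image_valid,
                            bool serial_loader_requested,
                            bool application_valid)
{
    /* "BM" Loader: a valid image in the Upload Image Area is installed first */
    if (upload_image_valid) {
        return ACTION_INSTALL_UPLOAD_AND_RESET;
    }
    /* Serial Loader: stays active if requested or if there is no valid application */
    if (serial_loader_requested || !application_valid) {
        return ACTION_RUN_SERIAL_LOADER;
    }
    /* Otherwise the "BM" Loader copies the application to RAM and starts it */
    return ACTION_COPY_APP_TO_RAM_AND_START;
}
```

After `ACTION_INSTALL_UPLOAD_AND_RESET` the processor resets and the flow runs again from the start, this time without a pending upload image.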

Typically the remaining QSPI-Flash space is used by the application for storing parameters (µParameterSystem) and files (µFileSystem).

The application is responsible for allowing OTA uploads to the Upload Image Area by whatever means is appropriate for the project/product. It can support OTA loading of both new applications (to replace itself) and a new serial loader, after which the physical swap of the code is performed by simply commanding a reset so that the μTasker "BM" Loader can complete the work.

Initially there is always a combined firmware consisting of the μTasker "BM" Loader and the μTasker Serial Loader programmed to the board so that the first application can be loaded via the serial loader. Alternatively an initial application can also be combined to immediately have a complete firmware set.

All internal operations performed by the µTasker "BM" Loader are fail safe to ensure that updating code from the Upload Image Area cannot fail and cause the product to subsequently not be operational. The application's OTA procedure must however ensure that non-operational images are never loaded as valid new images, so that the product cannot fail after such an upgrade.

The exact dimensions of the areas shown are configurable, as are the sizes of the OCR/ITC and DTC areas in RAM. Generally it is the "BM" loader that configures the RAM banks to optimally suit the application that is to be loaded and initialises the application's initial stack pointer suitably to the top of DTC memory (see the section on RAM and Cache for more details).

# 13. Ethernet

The i.MX RT 1021 has an internal 10/100M Ethernet controller but requires an external PHY for the connection to the cable.

There is a dedicated 500MHz fixed frequency PLL (PLL6) which supplies the Ethernet controller clocks. It also has a fixed 25MHz output referred to as ref\_enet\_pll2 and a configurable output called ref\_enet\_pll1, programmable to 25MHz, 50MHz, 100MHz or 125MHz.
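The four selectable frequencies of ref\_enet\_pll1 map naturally onto a 2-bit selection value. The sketch below assumes the encoding 0 = 25MHz, 1 = 50MHz, 2 = 100MHz and 3 = 125MHz, which should be checked against the PLL register description in the reference manual; the function name is hypothetical.

```c
/* Return the assumed 2-bit selection value for a requested ref_enet_pll1
   frequency in MHz, or -1 if the frequency is not one of the four options */
int enet_pll1_select(unsigned freq_mhz)
{
    switch (freq_mhz) {
    case 25u:   return 0;
    case 50u:   return 1;
    case 100u:  return 2;
    case 125u:  return 3;
    default:    return -1;   /* not a supported output frequency */
    }
}
```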

The MIMXRT1020-EVK uses a Micrel KSZ8081RNB as PHY, connected in RMII mode. This PHY accepts a 25MHz reference clock from the processor (Ethernet controller) which is supplied on GPIO\_AD\_B0\_08 (GPIO1-08), peripheral function ENET\_REF\_CLK1.

# 14. Conclusion

The present state of development is that the i.MX RT 1021 (MIMXRT1020-EVK) can be operated from SRAM (with debugger) or from QSPI Flash (stand-alone) in various clocking configurations up to maximum speed. GPIO, port interrupts, LPUARTs (interrupt and DMA driven), Ethernet, PIT, dynamic low power (WAIT) mode and DMA operations (memory-to-memory) can be demonstrated. Instruction and data cache can be optionally enabled/disabled. All such operation is simulated in Visual Studio.

#### Next immediate steps:

- Add QSPI loader to run exclusively in SRAM (for optimal speed efficiency)
- Demonstrate full Modbus serial and TCP operation.
- Verify dynamic low power operation

#### Subsequent steps:

- Add (new) USB device driver
- Add FreeRTOS configurations
- Verify on cheapest i.MX RT 1010 system (without ENET)

#### Present problems/investigation:

- Memory-to-memory DMA transfer is not giving any speed benefits over CPU copy: https://community.nxp.com/thread/518925
- Free-running LPUART DMA reception works correctly when cache is disabled but has disturbances when cache is enabled
- Ethernet works reliably after a power cycle but not always after a push-button or a commanded reset
- Push button resets are understood as power cycle resets by the reset monitor

Modifications:

V0.06 18.11.2019: Initial version in development

# Appendix A – Hardware Dependencies

a) Space for first Appendix