STM32 Microcontroller (key focus)
Introduction to Microcontrollers
① What is a microcontroller?
A microcontroller, also known as a single-chip microcomputer, is not a chip that performs a single logic function. It integrates an entire computer system onto a single chip. It is essentially a miniature computer; compared to a full computer, a microcontroller only lacks I/O devices. In short: one chip becomes a computer. Its small size, light weight, and low cost make it convenient to learn, apply, and develop with. Learning to use a microcontroller is also one of the best ways to understand computer principles and architecture.

② Applications of microcontrollers?
- Internet of Things (IoT)
- Medical equipment
- Industrial control
- Computer Network Communication (※)
- ... ...
③ Components of the STM32 Microcontroller
- CPU (Central Processing Unit)
- Chip block diagram

- Processor core (you should understand what a core is, because later you will choose which software version to install based on the core's architecture and the operating system).
- Introduction and function: all CPU computation, instruction reception and storage, and data processing are carried out by the core.
- Instruction set categories: ARM architecture, x86 architecture, LoongArch architecture, RISC-V architecture
- ARM architecture instruction set
- Application: widely used in mobile devices (phones, tablets, industrial PCs, etc.) and other scenarios that need high energy efficiency.
- Architecture classification: ARM32, AArch64 (ARM64), etc.
- Core classification: ARM Cortex-X (mobile), ARM Cortex-A (mobile), ARM Cortex-R (embedded), ARM Cortex-M (embedded)
- Understand the difference between a CPU SoC and a CPU core.
The Xiaomi XRING O1, Huawei Kirin 9000, and STMicroelectronics STM32F407VET6 are all based on the ARM architecture. The front-end design of their CPU cores is done by the British company ARM, and the ARM core determines the CPU's performance, bus, floating-point unit, and more.
In contrast, Xiaomi, Huawei, and STMicroelectronics only perform the back-end design of the CPU core, which has little impact on CPU performance. The vast majority of CPU characteristics are determined by ARM's front-end design.
So when we use the F407IG and the F407VE, both CPU cores are Cortex-M4, the CPU features are the same, and the code is almost identical.



- x86 architecture instruction set
- Application: widely used in scenarios requiring high-performance computing, such as computers, soft routers, and industrial PCs.
- Categories: x86, AMD64 (x86_64)
- LoongArch architecture
- Applications: government procurement, military procurement, personal computers, etc.
- Categories: LoongArch 32-bit, LoongArch 64-bit
- RISC-V architecture


- GPIO (General-Purpose Input/Output ports)
- Definition: ports through which the CPU exchanges information with the outside world (input, output).
- Understanding: the "gold fingers" of the CPU chip (note the distinction from the pins on the circuit board PCB).
- Location: the CPU is typically soldered onto the PCB. The GPIO ports are usually routed through circuits on the board and finally brought out as board pins.
- Quantity: the STM32F407 series has up to 140 GPIO pins, depending on the package. Because there are so many, they are divided into groups (A, B, C, ... up to I on the largest packages), with each group containing up to 16 I/O pins (0, 1, 2, 3...15).
- Naming: P + GPIO group + I/O pin number (e.g., PA2, PB6, etc.)
- Application
- General-purpose output pin: outputs a high level or a low level.
- General-purpose input pin: reads an external high or low logic level.
- Alternate-function I/O pins: can be configured as communication pins to interface with a computer, motor, Bluetooth module, etc. (timer PWM, UART serial, CAN communication, SWD debug communication, crystal oscillator pins).
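
The naming rule above (P + group + pin number) can be turned into a small parser. This is only a sketch that runs on a PC: `PinId` and `parse_pin_name` are made-up names, not part of any ST library, and the code assumes groups A through I exist, which is only true on the largest STM32F407 packages.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper type for this sketch (not part of the HAL).
struct PinId {
    char     port;    // GPIO group letter: 'A', 'B', ...
    uint8_t  number;  // pin number inside the group: 0..15
    uint16_t mask;    // one-hot bit for this pin, the same idea as GPIO_PIN_x
};

// Parse a name such as "PA2" or "PB13". Returns false on malformed input.
bool parse_pin_name(const char* name, PinId& out)
{
    const std::size_t len = std::strlen(name);
    if (len < 3 || len > 4 || name[0] != 'P') return false;

    const char port = name[1];
    if (port < 'A' || port > 'I') return false;  // assumption: groups A..I

    unsigned number = 0;
    for (std::size_t i = 2; i < len; ++i) {
        if (name[i] < '0' || name[i] > '9') return false;
        number = number * 10u + static_cast<unsigned>(name[i] - '0');
    }
    if (number > 15u) return false;              // each group has pins 0..15

    out.port   = port;
    out.number = static_cast<uint8_t>(number);
    out.mask   = static_cast<uint16_t>(1u << number);
    return true;
}
```

For example, "PA2" gives group A, pin 2, and mask 0x0004, while "PA16" is rejected because a group only has pins 0 to 15.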

④ Schematic
- Introduction: as the name suggests, it is a diagram that shows how the components on a circuit board are connected. (In the schematic, each component is drawn as a whole, and the schematic serves as a diagram for secondary wiring.)


- Components:
- Components (including component ports)
- Wires
- Net labels
- Power symbols
- etc.








⑤ Chip Datasheet
- Purpose: Query various chip information (such as CPU frequency, I/O definitions, clock tree, etc.)




Software Introduction
- STM32 library
- Various development approaches: registers, Standard Peripheral Library, HAL Library, LL Library.
- Simple understanding of registers: a register is a storage structure that sits closer to the CPU core, so it exchanges data with the core faster than memory (RAM) does. Each register has a different function, and storing different values in a register makes the hardware behave differently once the value is read.
- Library: a library consists of source files plus header files. The STM32 libraries are written in a mix of assembly language and C (the HAL and LL libraries are compatible with C++). Currently available are the Standard Peripheral Library, the HAL library, and the LL library (the Standard Peripheral Library is deprecated; our lab uses the HAL and LL libraries).
- Pros and cons of various development approaches:
- Registers: this approach offers high execution efficiency, but because the STM32 has a very large number of registers, code written purely with registers is hard to read and tedious to write. It is therefore not recommended to write everything with registers, though they are useful occasionally (such as directly shifting a register for a running light, or directly assigning a value to the CCR register when adjusting the PWM duty cycle).
- Standard library: it is old and has been phased out. It was developed in assembly and C, and the code is highly readable. However, because of some early design issues and certain patent-related problems, it can misbehave (for example with IIC communication). Clock configuration is also cumbersome and tedious, so starting from the class of 2021 we no longer use the standard library.
- HAL library, LL library (highly recommended): libraries promoted by ARM and STMicroelectronics and compliant with the ARM CMSIS standard, which embedded developers today are expected to follow. They are developed in assembly and C and are compatible with C++ (through `extern "C"` conditional compilation in the header files). Developing with C++ OOP (Object-Oriented Programming) is far more convenient. The HAL and LL libraries are still maintained by ST. They resolve various issues found in the Standard Peripheral Library, such as the inability to use hardware I2C properly, and they make clock configuration much easier. Introduction to the ARM CMSIS standard: https://www.arm.com/technologies/cmsis
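
The two register-level tricks mentioned above (shifting a register for a running light, writing CCR for the PWM duty cycle) look roughly like this. This is a sketch that runs on a PC, so `FakeRegs` is only a stand-in for the real memory-mapped registers (`GPIOx->ODR`, `TIMx->ARR`, `TIMx->CCR1`), and both function names are made up for illustration. The duty formula assumes PWM mode 1, where duty = CCR / (ARR + 1).

```cpp
#include <cstdint>

// Stand-in for peripheral registers. On the real chip these would be
// memory-mapped registers such as GPIOx->ODR and TIMx->ARR / TIMx->CCR1.
struct FakeRegs {
    volatile uint32_t ODR;   // GPIO output data register
    volatile uint32_t ARR;   // timer auto-reload register
    volatile uint32_t CCR1;  // timer capture/compare register 1
};

// Advance an 8-LED running light on ODR bits 0..7 by one position.
void running_light_step(FakeRegs& r)
{
    uint32_t leds = r.ODR & 0xFFu;
    // Start at bit 0, wrap around after bit 7, otherwise shift left by one.
    leds = (leds == 0u || leds == 0x80u) ? 0x01u : (leds << 1);
    r.ODR = (r.ODR & ~0xFFu) | leds;
}

// Set the PWM duty cycle by writing CCR directly (percent is clamped to 100).
void set_pwm_duty_percent(FakeRegs& r, uint32_t percent)
{
    if (percent > 100u) percent = 100u;
    r.CCR1 = (r.ARR + 1u) * percent / 100u;
}
```

With ARR = 999, a 25% duty cycle means CCR1 = 250. One register write replaces a whole library call, which is why this style is still handy in a few hot spots.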






- Software Development Introduction:
- Environment Setup Tutorial: STM32 Windows Development Environment Software Installation Tutorial
- ARM Keil MDK
- Introduction: Capable of developing microcontrollers based on ARM Cortex series cores (such as STM32), as well as other types of microcontrollers (e.g., 8051 microcontrollers).
- Purpose: Perform code editing, compiling, building, downloading, and debugging for microcontrollers.
- Version Selection
- MDK 5.3 and above: recommended, but the ARMCC compiler must be installed manually. It runs only on Windows, and its graphical interface is dated, with no dark mode, which is hard on the eyes when developing at night. However, because it uses the ARMCC and ARMCLANG compilers, the generated output is much smaller than that of the ARM-GCC compiler, and it has excellent compatibility with ARM Cortex cores, so MDK 5.3 or above is still the version we choose.
- MDK 6 and above: see VScode below for details.
- Compromise solution (Keil MDK5 + VSCode + Keil Assistant): See VSCode below for details.


- STM32 CubeMX
- Introduction: A software developed by ST for generating partial driver layer code for the STM32 HAL library using a graphical interface, supporting only the STM32 series microcontrollers.
- Purpose: For later development use, to perform basic configuration of the STM32 microcontroller's driver layer (such as clock tree, GPIO, various peripheral communications, interrupts, embedded real-time operating system configuration, etc.). Beginners are strictly prohibited from using STM32 CubeMX in the early stages, otherwise it's as if they haven't learned anything. Early-stage beginners may only use this software to generate clock functions; all other operations are absolutely not allowed. They can learn about it, but it must not be used as the primary development tool. (Once they have roughly mastered CAN communication, DMA, etc., they may use this software.)
- STM32 CubeIDE (optional, do not use it unless necessary)
- Introduction: A cross-platform STM32 microcontroller development platform that only supports the STM32 series microcontrollers and exclusively uses the ARM-GCC compiler (this compiler is far inferior to the ARM-Clang compiler on MDK5 and MDK6, and even falls short of the ARM-CC compiler in some performance aspects).
- VScode
- Introduction: an open-source editor developed by Microsoft, and one of the most versatile editors available.
- Purpose: It is just an editor (similar to Notepad) and does not come with a built-in compiler (such as GCC, MSVC, ARMCC (AC5), ARMCLANG (AC6), ARM-GCC). You need to configure the environment yourself to properly develop C/C++, CMake, Python, ROS2, microcontrollers, etc.
- Advantages: ① The graphical interface is very elegant, ② It is cross-platform and can be used on Windows, Linux, and MacOS, ③ It has a wide variety of useful plugins.
- Disadvantages: ① VSCode is developed using Electron, which essentially bundles a Google Chromium browser kernel, making it very memory-intensive. ② Additionally, the environment is difficult to configure, but this is something that must be learned.
- Plugin:
- Keil Studio Pack (Keil MDK 6; as of January 2, 2024, you should be proficient with Keil 5 before using it). MDK 6 is now largely complete and usable, but it is not recommended yet: it has a steep learning curve and is not beginner-friendly. Since MDK 5 is still updated and maintained, MDK 5.3 or higher is recommended. That said, MDK 6 is built on the MS VSCode editor, which enables cross-platform development on Windows, Linux, and macOS, and its interface is very polished, so it has a promising future. ARM Keil MDK6 Tutorial

- Keil Assistant (recommended for later development; it can replace Keil MDK 5.3 for code editing, but compiling, building, downloading, and debugging are still best done in Keil MDK 5.3). Use MDK5 on Windows together with the Keil Assistant plugin for VSCode. Tutorial: developing STM32 and 51 microcontrollers with VS Code, using VSCode to replace Keil (Bilibili): https://www.bilibili.com/video/BV18e4y1H7xX

Clock tree
Steps for configuring the clock using CubeMX:
- Clock Configuration Introduction: This is something every project needs to do — providing the CPU with a normal heartbeat.
- Purpose: Provides the CPU with a normal heartbeat, and also provides each peripheral with a heartbeat, enabling the CPU and its peripherals to function properly (for example, the accuracy of delay functions and the accuracy of timer PWM waveforms).
- STM32F1 Series CPU Clock Block Diagram:

- Configuration notes:
- Pay attention to the actual crystal oscillator frequency of the HSE on the circuit board. Setting it too high may cause overclocking, leading to serious issues.
- When configuring, it is recommended to use CubeMX to set up the clock functions, then copy them into the ALIENTEK template project (since writing clock functions from scratch is too difficult).
- CubeMX Reference Document: DJI Development Board Type C Embedded Software Tutorial Document.pdf
- Configuration steps (only a few key points are highlighted here; for detailed steps, please refer to the DJI C Board development documentation):
- Open the DJI C Board development documentation.

- Find the directory and click on 0.4.2.

- Follow the steps in 0.4.2 to begin (each step must be completed, especially selecting Serial Wire for Debug — failing to do so will cause the project code to soft-brick the board).
- Be careful about the board model. The DJI board is an STM32F407IGH6, so we need to select based on our actual board model.

- When configuring the clock tree, pay attention to the HSE clock frequency and configure it according to the actual HSE frequency shown on the schematic.


- The code path must be entirely in English, and there must not be two consecutive spaces. It is recommended to avoid spaces altogether and use underscores between words (do not place it on the desktop).

- Explanation


- Open the MDK 5 project generated by CubeMX.

- Make a copy of the ALIENTEK (Zhengdian Atom) template project and open it.


- Find the definition of the clock function `void SystemClock_Config(void)` in the `main.c` file of the CubeMX HAL library project.

- Copy the entire definition of the `void SystemClock_Config(void)` function.

- Then open the ALIENTEK project, locate the `sys_stm32_clock_init(RCC_PLL_MUL9)` call in the `main` function, right-click it, and choose Go To Definition Of "sys_stm32_clock_init" to find the definition of this function.

- If the issue below pops up, please follow the instructions in this box to resolve it — the explanation is very clear. (If you can't understand English, search it on Baidu to practice your search skills.)

- After going to the definition of `sys_stm32_clock_init`, delete the entire function and paste in the clock function copied from the CubeMX HAL project. Also delete `Error_Handler();` or replace it with `while(1);`
- Find the definition of the `sys_stm32_clock_init` function.

- Select it and delete it.

- Paste the clock function copied from the CubeMX HAL project here.

- Replace `Error_Handler();` with `while(1);` or delete it entirely.


- Find the header file `sys.h` that corresponds to the source file `sys.c`, where the `void SystemClock_Config(void)` function is now located.

- Find the declaration of `sys_stm32_clock_init(uint32_t plln)`, delete it, and replace it with the declaration of `void SystemClock_Config(void)`.


- Return to the main function, locate the `sys_stm32_clock_init(RCC_PLL_MUL9);` call, delete it, and call the new clock function instead.


- Modify HSE_VALUE
- Type `HSE_VALUE` anywhere, then use Go To Definition on it (afterwards you can delete this manually written `HSE_VALUE`).

- Modify the value of HSE_VALUE (write 8000000U if the crystal is 8 MHz, or 12000000U if it is 12 MHz).
- Open the DJI C Board development documentation.
According to the schematic, this board uses an 8 MHz crystal. (The exact value depends on your board's HSE schematic, at the OSC_IN and OSC_OUT pins.)


- Remove the temporary `HSE_VALUE` that was written only for Go To Definition.
- Remove the redundant code.

- What value should be passed as the argument of `delay_init` on line 9? First, check its definition.
- Looking at the definition of `delay_init`, we can see that its input parameter is `sysclk` (system clock).

- Looking at the CubeMX clock tree diagram, the SYSCLK value is 72 MHz.

- Change the argument of `delay_init` to the value of `SYSCLK` in the clock tree.

- Then compile all files.

- Zero errors and zero warnings means the configuration was successful. If there are errors or warnings, please search on Baidu or Google.
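
To see why the HSE frequency matters so much, it helps to do the clock tree arithmetic by hand. The sketch below runs on a PC and only reproduces the multiplication that CubeMX shows in its clock tree view; the function names are made up. It assumes the usual formulas: on the F1, SYSCLK = HSE × PLLMUL, and on the F4, SYSCLK = HSE / PLLM × PLLN / PLLP. It also assumes that `delay_init` takes the clock in MHz, as the 72 MHz example above suggests.

```cpp
#include <cstdint>

// STM32F1 with the PLL fed directly from HSE: SYSCLK = HSE * PLLMUL.
uint32_t f1_sysclk_hz(uint32_t hse_hz, uint32_t pll_mul)
{
    return hse_hz * pll_mul;
}

// STM32F4 main PLL: SYSCLK = HSE / PLLM * PLLN / PLLP.
uint32_t f4_sysclk_hz(uint32_t hse_hz, uint32_t pllm, uint32_t plln, uint32_t pllp)
{
    return hse_hz / pllm * plln / pllp;
}

// Value to pass to delay_init, assuming it expects the clock in MHz.
uint16_t delay_init_arg_mhz(uint32_t sysclk_hz)
{
    return static_cast<uint16_t>(sysclk_hz / 1000000u);
}
```

With an 8 MHz crystal and a multiplier of 9 (RCC_PLL_MUL9), the F1 runs at 72 MHz, so `delay_init` gets 72. If the board actually carries a 12 MHz crystal but the project is still configured for 8 MHz × 9, the chip really runs at 108 MHz, which is exactly the overclocking problem mentioned in the configuration notes.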

② How to query the clock frequency of a peripheral (using a timer as an example)
- Open tim.c

- Find the Msp initialization weak function (determine which TIMx it is from the TIM base handle).

- Find the definition of __HAL_RCC_XXX_CLK_ENABLE().

- Based on the function definition, it can be seen that TIM1 is mounted on APB2.

- Query the clock tree, find the APB2 Timer Clock, and you'll get that TIM1's TCLK is 168 MHz.

- Therefore, the TCLK frequency of TIM1 is 168MHz.
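
The 168 MHz figure can also be checked by calculation instead of reading it off the clock tree. This is a rough sketch that runs on a PC, with made-up function names. It assumes the usual STM32 rule: if the APB prescaler is 1, the timer clock equals the APB clock; otherwise the timer clock is twice the APB clock.

```cpp
#include <cstdint>

// Timer kernel clock (TCLK) derived from HCLK and the APB prescaler.
// Rule assumed here: prescaler 1 -> TCLK = PCLK, otherwise TCLK = 2 * PCLK.
uint32_t timer_clock_hz(uint32_t hclk_hz, uint32_t apb_prescaler)
{
    const uint32_t pclk_hz = hclk_hz / apb_prescaler;
    return (apb_prescaler == 1u) ? pclk_hz : pclk_hz * 2u;
}

// PWM / update frequency for given PSC and ARR register values.
uint32_t pwm_frequency_hz(uint32_t tclk_hz, uint32_t psc, uint32_t arr)
{
    return tclk_hz / ((psc + 1u) * (arr + 1u));
}
```

On an F407 running at HCLK = 168 MHz with APB2 divided by 2, PCLK2 is 84 MHz and TIM1's TCLK is 168 MHz, which matches the clock tree. With PSC = 167 and ARR = 999 the PWM frequency works out to 1 kHz.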
STM32 program composition
Basic Introduction (Main Function, etc.)
- Project structure: An STM32 project is composed of libraries written in C and assembly language, so it includes a main function and follows the structure of C/C++ languages.
- Program execution order: apart from the startup code that runs before it, the program starts from the main function and follows the normal C/C++ execution order. Starting in main, the code runs line by line and then enters an infinite loop.

- Essential components inside the main function: an infinite loop while(true) or for(;;), because the microcontroller must keep running continuously, so an infinite loop is required.
- HAL library and user-defined library
- Library:
- .h file declares functions
- .c/.cpp files define functions
- .c/.cpp files call functions



Introduction to Interrupt Service Functions
- Special function (interrupt service function): the interrupt service function is declared in the assembly startup file and is closely tied to the chip hardware. It is triggered by chip interrupt events and does not follow the conventional C/C++ calling sequence.

- Interrupt service functions are invoked by interrupt events. Once an interrupt event occurs, execution immediately switches from the current running location to the interrupt service function. After the interrupt service function finishes, it returns to the original location and resumes execution.
- Interrupt events: for example, the external interrupt event on line X, the SysTick timer interrupt (the implementation method for ordinary delay functions), the UART receive interrupt event, the UART transmit interrupt event, the TIM timer overflow update interrupt, the TIM timer input capture interrupt, the CAN communication transmit interrupt event, the CAN communication receive interrupt event, the RTOS PendSV interrupt, and so on. (The interrupt service functions corresponding to each event are generally different, but some interrupt events may share the same interrupt service function.)
- Interrupt service function handling process:
- The CPU detects the occurrence of an interrupt event.
- Save the context: push the current PC (Program Counter) address onto the stack.
- Jump to the interrupt service function and execute the interrupt service routine.
- Restore the context: pop the value at the top of the stack back into the PC.
- Return to the interrupted position and continue with the next instruction.



- Interrupt Priority and Grouping
- Priority: Preemption Priority and Subpriority
- Groups 0-4

- Change the grouping (change in HAL_Init)
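
How the grouping splits the priority bits is easiest to see as arithmetic. The sketch below runs on a PC and uses made-up names. It assumes the 4 priority bits that STM32 implements and the HAL convention that group N (`NVIC_PRIORITYGROUP_0` to `NVIC_PRIORITYGROUP_4`) gives N bits to the preemption priority and 4 - N bits to the subpriority.

```cpp
#include <cassert>
#include <cstdint>

// Number of distinct levels available in a given priority group.
struct PriorityLayout {
    uint8_t preempt_levels;  // how many preemption priorities exist
    uint8_t sub_levels;      // how many subpriorities exist
};

// group is 0..4: group N uses N bits for preemption and 4 - N bits for subpriority.
PriorityLayout layout_for_group(unsigned group)
{
    assert(group <= 4u);
    PriorityLayout layout;
    layout.preempt_levels = static_cast<uint8_t>(1u << group);
    layout.sub_levels     = static_cast<uint8_t>(1u << (4u - group));
    return layout;
}

// Only the preemption priority decides whether a new interrupt may interrupt
// a running one; a smaller number means a higher priority.
bool can_preempt(uint8_t pending_preempt, uint8_t running_preempt)
{
    return pending_preempt < running_preempt;
}
```

For example, group 2 gives 4 preemption levels and 4 subpriority levels, while group 4 gives 16 preemption levels and no subpriority choice. The subpriority only decides the order when two interrupts with the same preemption priority are pending at the same time.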




- Interrupt service routine content:
- First, check the interrupt flag to determine which interrupt event was triggered.
- Clear the corresponding flag bit to prevent the interrupt from being triggered continuously, allowing the next interrupt to operate normally.
- Receive data, etc. (optional)
- Logical business code implementation (optional, such as data processing, etc.)
- Characteristics of interrupt service functions:
- Interrupt service functions cannot accept parameters.
- Interrupt service functions cannot have a return value.
- Interrupt service routines should be kept short and efficient.
- Unless it is unavoidable, do not use delay functions in interrupt service routines. If a delay is necessary, make sure the priorities of the delay and the interrupt are properly configured; otherwise the program may freeze (delays used in external interrupts for software debouncing are the exception).
- Do not use the printf function inside interrupt service routines, as it can cause reentrancy and performance issues.
- For example:

- USART1_IRQHandler function
- Meaning: USART1 interrupt service routine
- Declaration and definition: declared in the assembly startup file; the user must define it (if the interrupt is enabled and the user does not define it, the program will get stuck in the assembly code).
- Call condition: invoked by a CPU interrupt event
- Purpose: called by the CPU to run an urgent interrupt routine (the interrupt routine is the content of the interrupt service function).
- HAL_UART_IRQHandler(handle)
- Meaning: UART common interrupt handler ("common" means that all serial ports, such as USART1, 2, 3, 4, 5, share this single function; which port is handled is determined by the handle passed in).
- Declaration and definition: declared and defined in the HAL library written by STMicroelectronics
- Call condition: called by the interrupt service function
- Purpose:
- First, check the interrupt flag to determine which interrupt event was triggered.
- Clear the corresponding flag bit to prevent the interrupt from being triggered continuously, allowing the next interrupt to operate normally.
- Receive data, etc. (optional)
- Call the interrupt callback function corresponding to the interrupt event.
- Other operations (for example, in some special cases the receive interrupt is disabled inside the handler).
- HAL_UART_RxCpltCallback(handle)
- Meaning: UART receive-complete callback (because it is called from the common interrupt handler, the handle is determined by the interrupt service routine that started the chain).
- Declaration: declared as a weak function in the HAL library written by ST; the user must define it.
- Call condition: called by the common interrupt handler
- Purpose: first determine which handle made the call, then implement the corresponding business logic (optional, such as data processing).
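
The three-level call chain above (interrupt service function, then common handler, then callback) can be imitated on a PC to make the order of operations concrete. Everything in this sketch is a simplified stand-in with made-up names ending in `_demo` or starting with `Fake`; it is not the real HAL code. The RXNE flag is modeled as bit 5 of a status word. On real hardware that flag is normally cleared by reading the data register, while the sketch clears it explicitly.

```cpp
#include <cstdint>

// Simulated UART peripheral: a status word and one data byte.
struct FakeUart {
    uint32_t status;
    uint8_t  data;
};
const uint32_t kRxNotEmpty = 1u << 5;  // models the "receive not empty" flag

// Simplified stand-in for a HAL handle.
struct UartHandle_demo {
    FakeUart* instance;
    uint8_t   last_byte;
    uint32_t  rx_count;
};

FakeUart        usart1_demo = {0u, 0u};
UartHandle_demo huart1_demo = {&usart1_demo, 0u, 0u};

// Level 3: callback. First check which handle called, then do the business logic.
void rx_complete_callback_demo(UartHandle_demo* h)
{
    if (h == &huart1_demo) {
        h->rx_count++;  // keep it short: just record that a byte arrived
    }
}

// Level 2: common handler shared by all ports; the handle selects the port.
void uart_irq_handler_demo(UartHandle_demo* h)
{
    if (h->instance->status & kRxNotEmpty) {   // 1. which event fired?
        h->instance->status &= ~kRxNotEmpty;   // 2. clear the flag
        h->last_byte = h->instance->data;      // 3. receive the data
        rx_complete_callback_demo(h);          // 4. call the matching callback
    }
}

// Level 1: the interrupt service function. No parameters, no return value.
extern "C" void USART1_IRQHandler_demo(void)
{
    uart_irq_handler_demo(&huart1_demo);
}
```

In a real project the hardware calls `USART1_IRQHandler` by itself. Here a test has to set the flag and call `USART1_IRQHandler_demo()` by hand to play the role of the hardware.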



A Simple Understanding of RTOS and ROS/ROS2
- Advanced (non-bare-metal development, based on RTOS system development)
- Common RTOS (Embedded Real-Time Operating Systems): FreeRTOS, NuttX, RT-Thread, μC/OS-II, Xiaomi VelaOS
- FreeRTOS official website: https://www.freertos.org/zh-cn-cmn-s/

- FreeRTOS Simplified: An Operating System with Multithreading Library Features and POSIX Standard Compatibility
- Multithreading: the system has multiple tasks (threads), each running independently and, from the program's point of view, at the same time (you can think of each task as its own main function, with all tasks executing concurrently; on a single core the scheduler switches between them very quickly. The specific implementation will be learned later; the principle involves the PendSV interrupt, etc.).
- Advanced (non-bare-metal development, based on RTOS and ROS2_MicroROS)

- Usage: ESP32 uses Arduino libraries + FreeRTOS + MicroROS and communicates with STM32 via serial port.
- One of its main benefits: it enables more reliable and stable communication with ROS2 on the host computer (PC, industrial PC) over WiFi using DDS (distributed), which is much better than using serial communication (rosserial) directly.
- Detailed explanation of MicroROS vs ROSserial link: https://mp.weixin.qq.com/s/1lQXAA3sV-4GpXAzHiGChQ
Registers
- Understanding: These are small storage areas inside the CPU used to temporarily hold data involved in computations and the results of those computations.

- Implemented functions:


- How do registers function within the C-based HAL library? (Or rather, what is the principle behind the C-language HAL library implementing STM32 microcontroller control?)
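
One way to picture the answer: the HAL describes each peripheral as a C struct of registers placed at a fixed address, and a HAL call ends up as a plain write to one of those struct members. The sketch below imitates this on a PC. `FakeGpio`, `hw_apply_bsrr`, and `write_pin_demo` are made-up names, and the "hardware" behavior of the BSRR register (low 16 bits set pins, high 16 bits reset pins, set wins if both are given) is modeled in software.

```cpp
#include <cstdint>

// Stand-in for a GPIO register block. On the real chip a struct like this
// is overlaid on a fixed peripheral address.
struct FakeGpio {
    uint32_t ODR;   // output data register: one bit per pin
    uint32_t BSRR;  // bit set/reset register: write-only on real hardware
};

// Models what the hardware does when BSRR is written.
void hw_apply_bsrr(FakeGpio& g)
{
    const uint32_t set_bits   = g.BSRR & 0xFFFFu;  // low half: pins to set
    const uint32_t reset_bits = g.BSRR >> 16;      // high half: pins to reset
    g.ODR  = (g.ODR & ~reset_bits) | set_bits;     // set has priority
    g.BSRR = 0u;
}

// What a "write pin" library call boils down to: one register write.
void write_pin_demo(FakeGpio& g, uint16_t pin_mask, bool high)
{
    g.BSRR = high ? static_cast<uint32_t>(pin_mask)
                  : (static_cast<uint32_t>(pin_mask) << 16);
    hw_apply_bsrr(g);  // on real hardware this step happens automatically
}
```

So writing pin 5 high is just "put bit 5 into the low half of BSRR", and writing it low is "put bit 5 into the high half". The library only hides these shifts behind readable names.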


Vinci Robot Team Standard Engineering Format
English
Must use English. File names, function names, and variable names must be in English! (Step out of your Chinese comfort zone — at the very least, you should know some professional English.)
ALIENTEK (Zhengdian Atom) HAL Library Project Standard:


Vinci Robot Team STM32 Project Standard (Cube + C Language):

- applications application layer

- BSP driver layer

- Middlewares

- Core (location of the main function, and where the HAL library header files are configured via conditional compilation)


Vinci Robot Team STM32 C/C++ Project Standard (similar to ALIENTEK, trial run, not recommended; use the Cube-like C++ standard in the next section instead):

- The application layer, driver layer, and others adopt a modular integrated approach, instead of using the Src and Inc separation method.


- Public Compatibility Layer:
- C++ sub-main compatibility library

- Purpose: Create a regular function in a .cpp file that calls C++ code, and is then called by the main function in a C language main.c file.


- Weak function / callback function library (the source file must apply `extern "C"` globally, because weak functions are a C language feature and C++ cannot recognize them properly.)




(Recommended) Vinci Robotics Team STM32Cube C/C++ Project Standard (Cube-like, trial run, recommended):
First, open CubeMX to configure the project. For example, here we are using bare-metal development to make an LED blink.
Then select OpenFolder to open the folder.

Clone the necessary files from GitHub.
Repository link:
https://github.com/tungchiahui/CubeMX_MDK5to6_Template
Or simply open the terminal and type
git clone https://github.com/tungchiahui/CubeMX_MDK5to6_Template.git
Open the cloned template and the project generated by CubeMX just now.


Open the 工程文件移植(创建新模板请看这里) folder in the template (the name means "project file porting; see here to create a new template"), then copy all the files inside it into the CubeMX project folder.

After moving:

Open the project and set up the project.
- Open the MDK5 project.


- Click Options for Target

- Change the compiler from `ARMCC [ARM Compiler 5 (AC5)]` to `ARMClang [ARM Compiler 6 (AC6)]`.

- Add the header file path (Include Path)


Add the Inc folder in applications.

Add the Inc folder in bsp/boards.

Click OK.

- Add source files .c/.cpp, etc.
Open Manage Project Items

Create two groups
The names of the groups are respectively called
applications
bsp/boards

Add the startup_main.cpp file in the Core/Src directory to the Application/User/Core group.

Add bsp_delay.cpp from the bsp/boards/Src directory to the bsp/boards group.

You can see that all the files in the project are ready.

Compile and configure some necessary code.


You can right-click the header file, then click Open Document "xxx.h" to open it and check whether the header file was imported successfully.

Find the main.c file and prepare to call the C++ entry function startup_main(); from within the main() function.

Between the two comment lines USER CODE BEGIN Includes and USER CODE END Includes, include startup_main.h (code not placed between BEGIN and END is lost when CubeMX regenerates the project).

Between the two comment lines USER CODE BEGIN and USER CODE END, call startup_main(); (code not placed between BEGIN and END is erased when CubeMX regenerates the project).


Open startup_main.h

Change the value of the isRTOS macro: set it to 0 for bare-metal development, or change it to 1 if FreeRTOS is used.

At this point, you can freely call code from the C/C++ library within the startup_main() function.
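
The trick that lets a C `main.c` call into C++ code is a function with C linkage that is defined in a .cpp file. The sketch below shows the idea in one file, with made-up names (`startup_main_demo`, `Blinker`); the real template's `startup_main.h` may look different. In a real project the entry function would normally contain the infinite loop, while the sketch runs a single step so it can be tried on a PC.

```cpp
// ---- what would live in a header shared by C and C++ files ----
#ifdef __cplusplus
extern "C" {
#endif

void startup_main_demo(void);        /* callable from main.c */
int  startup_main_demo_led_on(void); /* helper so C code can read the state */

#ifdef __cplusplus
}
#endif

// ---- what would live in the .cpp file ----
// An ordinary C++ class; the C side never sees it.
class Blinker {
public:
    void toggle() { on_ = !on_; }
    bool is_on() const { return on_; }
private:
    bool on_ = false;
};

static Blinker g_led;

// Defined in C++ but exported with C linkage, so the C main() can call it.
extern "C" void startup_main_demo(void)
{
    g_led.toggle();  // a real entry function would loop forever here
}

extern "C" int startup_main_demo_led_on(void)
{
    return g_led.is_on() ? 1 : 0;
}
```

The C file only needs the two plain function declarations. All classes and objects stay on the C++ side of the boundary.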

The header file format of a C++ library
Take bsp_delay.h as an example.
```cpp
#ifndef __BSP_DELAY_H_
#define __BSP_DELAY_H_
#ifdef __cplusplus
extern "C"
{
#endif
#include "startup_main.h"
class BSP_Delay
{
public:
    class F1
    {
    public:
        void Init(uint16_t sysclk);
        void us(uint32_t nus);
        void ms(uint16_t nms);
    } f1;
    class F4
    {
    public:
        void Init(uint16_t sysclk);
        void us(uint32_t nus);
        void ms(uint16_t nms);
    } f4;
    class FreeRTOS
    {
    public:
        void Init(void);
    } freertos;
};
extern BSP_Delay bsp_delay;
#ifdef __cplusplus
}
#endif
#endif
```
Conditional compilation is necessary for two reasons: the include guard prevents the header file from being included twice, and the `extern "C"` block links the C++ code in C form. (If you've forgotten, please refer to Vinci Robotics Team C/C++ Resources.)
Then include the startup_main.h header file.
Then create the class for this module, for example class BSP_LED, etc. Since this is a delay class, it is class BSP_Delay.
Then write the declarations inside the class.
Note: do not write any implementation in the .h file, meaning you cannot write any function definitions there.
Then the line `extern BSP_Delay bsp_delay;` declares the object (variable) that is defined in the source file, making it accessible to other source files.
In theory, you should be able to understand everything above. If you really can't, just follow the example and copy it; as you keep doing so, you'll gradually come to understand.

The source file format of a C++ library
Using bsp_delay.cpp as an example
```cpp
#include "bsp_delay.h"
#if isRTOS == 1
#include "cmsis_os.h"
#endif

static uint32_t g_fac_us = 0; /* multiplier for microsecond delays */
BSP_Delay bsp_delay;

/**
 * @brief  Initialize the delay functions
 * @param  sysclk: system clock frequency in MHz, i.e. the CPU frequency (HCLK)
 * @retval None
 */
void BSP_Delay::F1::Init(uint16_t sysclk)
{
    SysTick->CTRL = 0;                                        /* clear the SysTick state so it can be reconfigured; this also disables its interrupt if it was enabled */
    HAL_SYSTICK_CLKSourceConfig(SYSTICK_CLKSOURCE_HCLK_DIV8); /* SysTick uses the core clock divided by 8, because its counter only goes up to 2^24 */
    g_fac_us = sysclk / 8;                                    /* needed whether or not an OS is used; the basic time base for 1 us */
}

/**
 * @brief  Delay for nus microseconds
 * @param  nus: number of microseconds to delay
 * @note   nus must not exceed 1864135 us (the maximum is 2^24 / g_fac_us @g_fac_us = 9)
 * @retval None
 */
void BSP_Delay::F1::us(uint32_t nus)
{
    uint32_t temp;
    SysTick->LOAD = nus * g_fac_us; /* load the time */
    SysTick->VAL = 0x00;            /* clear the counter */
    SysTick->CTRL |= 1 << 0;        /* start counting down */
    do
    {
        temp = SysTick->CTRL;
    } while ((temp & 0x01) && !(temp & (1 << 16))); /* CTRL.ENABLE must be 1; wait until the time is up */
    SysTick->CTRL &= ~(1 << 0);     /* stop SysTick */
    SysTick->VAL = 0X00;            /* clear the counter */
}

/**
 * @brief  Delay for nms milliseconds
 * @param  nms: number of milliseconds to delay (0 < nms <= 65535)
 * @retval None
 */
void BSP_Delay::F1::ms(uint16_t nms)
{
    uint32_t repeat = nms / 1000; /* 1000 is used to allow for overclocking:
                                   * at 128 MHz, for example, delay_us can only delay about 1048576 us at most
                                   */
    uint32_t remain = nms % 1000;
    while (repeat)
    {
        us(1000 * 1000); /* use delay_us to implement a 1000 ms delay */
        repeat--;
    }
    if (remain)
    {
        us(remain * 1000); /* use delay_us to handle the remainder (remain ms) */
    }
}

/**
 * @brief  Initialize the delay functions
 * @param  sysclk: system clock frequency in MHz, i.e. the CPU frequency (rcc_c_ck), 168 MHz
 * @retval None
 */
void BSP_Delay::F4::Init(uint16_t sysclk)
{
    HAL_SYSTICK_CLKSourceConfig(SYSTICK_CLKSOURCE_HCLK); /* SysTick is clocked at the HCLK frequency */
    g_fac_us = sysclk;                                   /* needed whether or not an OS is used */
}

/**
 * @brief  Delay for nus microseconds
 * @param  nus: number of microseconds to delay
 * @note   nus range: 0~190887435 (the maximum is 2^32 / fac_us @fac_us = 21)
 * @retval None
 */
void BSP_Delay::F4::us(uint32_t nus)
{
    uint32_t ticks;
    uint32_t told, tnow, tcnt = 0;
    uint32_t reload = SysTick->LOAD; /* value of LOAD */
    ticks = nus * g_fac_us;          /* number of ticks needed */
    told = SysTick->VAL;             /* counter value on entry */
    while (1)
    {
        tnow = SysTick->VAL;
        if (tnow != told)
        {
            if (tnow < told)
            {
                tcnt += told - tnow; /* note that SysTick is a down-counter */
            }
            else
            {
                tcnt += reload - tnow + told;
            }
            told = tnow;
            if (tcnt >= ticks)
            {
                break; /* elapsed time is at least the requested delay, so exit */
            }
        }
    }
}

/**
 * @brief  Delay for nms milliseconds
 * @param  nms: number of milliseconds to delay (0 < nms <= 65535)
 * @retval None
 */
void BSP_Delay::F4::ms(uint16_t nms)
{
    uint32_t repeat = nms / 540; /* 540 is used to allow for overclocking: at 248 MHz, for example, delay_us can only delay about 541 ms at most */
    uint32_t remain = nms % 540;
    while (repeat)
    {
        us(540 * 1000); /* use delay_us to implement a 540 ms delay */
        repeat--;
    }
    if (remain)
    {
        us(remain * 1000); /* use delay_us to handle the remainder (remain ms) */
    }
}

void BSP_Delay::FreeRTOS::Init(void)
{
    // Just call the delays that FreeRTOS already provides:
    // osDelay
    // vTaskDelay
    // vTaskDelayUntil
}

/**
 * @brief  Delay used internally by HAL library functions.
 *         The HAL delay uses SysTick by default; if the SysTick interrupt is not
 *         enabled, a call to the default delay never returns.
 * @param  Delay: number of milliseconds to delay
 * @retval None
 */
void HAL_Delay(uint32_t Delay)
{
#if isRTOS == 0 // bare-metal development
#ifdef STM32F1  // bare-metal on F1
    bsp_delay.f1.ms(Delay);
#endif
#ifdef STM32F4  // bare-metal on F4
    bsp_delay.f4.ms(Delay);
#endif
#elif isRTOS == 1 // FreeRTOS development
    osDelay(Delay);
#endif
}
```
First, you'll definitely need to include your own header file.
You don't need to worry about this conditional compilation. Since there is a difference in delay between bare-metal development and RTOS development, I added a line of conditional compilation.

First, create the class object bsp_delay.

Then define all the functions within the class.

Function comment format:
The block above each function is its comment. Please write comments in this format from now on: when you call the function later (for example in MDK6), the editor shows the comment as a prompt, so the required input parameters and the return value are clear at a glance.

The fields are:
- @brief: brief summary of the function
- @param: input parameter
- @retval: return value
- @note or @attention: precautions
Note: write one @param entry for each input parameter.
For example
/**
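* @brief CAN1 transmit function
* @param motor1: relative current value for motor 1
* @param motor2: relative current value for motor 2
* @param motor3: relative current value for motor 3
* @param motor4: relative current value for motor 4
* @retval bool: whether the transmission succeeded
* @note No special precautions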
*/
bool CAN_BUS::CAN1::CMD1(int16_t motor1,int16_t motor2,int16_t motor3,int16_t motor4)
{
// ... ...
}
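Precautions
- In a .cpp source file, a definition that overrides a weak function must be prefixed with extern "C". The __weak default comes from C code (and interrupt handlers are referenced by name from the assembly vector table), so the override must use C linkage; otherwise C++ name mangling gives it a different symbol name and it never replaces the weak version.
- Code must be written between the Begin and End markers; otherwise, the code will disappear after the project is regenerated with CubeMX.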
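A minimal sketch of the extern "C" rule, assuming a HAL-style function that the library declares in a C header and defines as __weak. The declaration is repeated here only so the sketch is self-contained, and the counter variable g_requested_ms is purely illustrative:

```cpp
#include <cstdint>

// Declaration as a C header would provide it (assumed here); in a real
// project the library also supplies a __weak default definition.
extern "C" void HAL_Delay(uint32_t Delay);

static uint32_t g_requested_ms = 0;  // illustrative bookkeeping only

// Strong definition with C linkage: the symbol is named exactly "HAL_Delay",
// so the linker can choose it over the weak default.
extern "C" void HAL_Delay(uint32_t Delay)
{
    g_requested_ms += Delay;  // placeholder body; a real override would wait here
}
```

Without extern "C", a C++ compiler would emit a mangled symbol name for this definition, and the weak default would stay in use.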
Driver
- A driver (full name: device driver) is the software that lets a computer communicate with a piece of hardware. It is usually written by the hardware manufacturer for a particular operating system. In short, without its driver, the hardware in a computer cannot function.
- General module driver: GPIO initialization program + communication protocol program + data processing program
- For example: an LED only requires a GPIO initialization program; a Bluetooth module needs a GPIO initialization program, a communication protocol program, and a data processing program.


- GPIO program:

- Communication protocol program: The figure shows the serial communication protocol program and the GPIO program.

- Data parsing program: The figure shows the PS2 controller's data processing function (see the data parsing problem type in the C++ question bank, mainly using binary, hexadecimal, bitwise operators, etc.).

- Data unit conversion:
- 1 Mbps (bit rate) = 1,000,000 bit/s (bits per second)
- 1 byte = 8 bits = 8 binary digits (very important)
- 1 kbyte (kilobyte) = 1024 bytes
- 1 Mbyte (megabyte) = 1024 kbytes
- 1 Gbyte (gigabyte) = 1024 Mbytes
- 1 ASCII character = 1 byte
- 1 Arabic numeral = 1 character
- Under GBK encoding, 1 Chinese character = 2 bytes.
- Under UTF-8 encoding, 1 common Chinese character = 3 bytes.
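The byte counts above can be checked with a short sketch. The character "中" is written as explicit byte escapes so the result does not depend on the encoding of the source file; the helper names are hypothetical:

```cpp
#include <cstddef>
#include <cstring>

// The Chinese character "中" as raw bytes in two encodings.
const char kUtf8Zhong[] = "\xE4\xB8\xAD";  // UTF-8: 3 bytes
const char kGbkZhong[]  = "\xD6\xD0";      // GBK:   2 bytes

// Number of bytes a C string occupies (without the terminating zero).
std::size_t byte_length(const char *text)
{
    return std::strlen(text);
}

// 1 byte = 8 bits.
constexpr unsigned long bits_in(unsigned long bytes)
{
    return bytes * 8UL;
}
```

For example, bits_in(1024) gives the number of bits in 1 kbyte, and byte_length("7") shows that one Arabic numeral occupies one byte.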
3. Data naming format (see C++ documentation for details):




12. Regarding the parameter-filling method for HAL library peripheral APIs:

2. Check the corresponding data type.
3. Comments on the content of looking up data sheets and functions
DJI Motor Control (CAN)
①Introduction to CAN Communication
- What is CAN communication?
CAN bus is a type of serial communication that is generally considered superior to the RS485 serial bus. Unlike synchronous buses such as I2C and SPI, which rely on a shared clock signal, CAN does not use a clock signal for synchronization: it is an asynchronous, half-duplex bus that transmits data as a differential signal.

- Classification of Serial Communication Logic Level Representation Methods
- TTL (common voltage level for microcontroller pins, full-duplex serial communication; the chip's I/O pins are RX and TX at TTL levels; the signal lines are RX and TX at TTL levels)
- RS232 (a type of level with a higher voltage range than TTL, offering better noise immunity; full-duplex serial communication; the chip's I/O pins use TTL-level RX and TX; the signal lines use RS232-level RX and TX)
- RS485 (differential signal, excellent noise immunity, Modbus protocol, serial half-duplex; chip I/O pins are TTL-level RX and TX; signal lines are A and B)
- CAN communication signal line
- Differential signal, excellent anti-interference, half-duplex. The chip's I/O pins are CAN_RX and CAN_TX at TTL levels; the signal lines are CAN_H and CAN_L (similar to RS485).
② Motor Library Code Analysis (The content of this library should be understood as thoroughly as possible, ideally line by line)
- Code and its initialization
- This part is covered by Zhengdian Atom; you just need to change the parameters to DJI motors.
Code repository link: https://github.com/SDUTEMIS/SDUT_VinciRobot/tree/main/1.Embedded_STM32_Driver%2FC%2F4.Motor_Drivers%2F1.DJI_CAN_PID





- DJI Motor Library Open-Loop Code Analysis: The library was modified from the official DJI library code by senior students from previous years.
- CAN message sending function analysis




This function sends current values to the DJI motors on CAN1. Each byte of a CAN data field holds only 8 bits, while each current value is 16 bits, so every value is split into a high byte (the value shifted right by 8 bits) and a low byte before it is sent. Once a motor receives its current value, it starts rotating (the input parameters are the motor current values for ESC IDs 1-4).


This function works the same way for the DJI motors on CAN1 with ESC IDs 5-8: each 16-bit current value is split into a high byte and a low byte before it is sent, and the input parameters are the motor current values for ESC IDs 5-8.
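The byte splitting described above can be sketched as follows. This is not the library's code: the function name pack_currents and the use of std::array are illustrative, and the high-byte-first order is an assumption to be checked against the ESC manual.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: packs four 16-bit current commands into the 8-byte
// CAN data field, two bytes per motor, high byte first.
std::array<uint8_t, 8> pack_currents(int16_t m1, int16_t m2, int16_t m3, int16_t m4)
{
    const int16_t values[4] = {m1, m2, m3, m4};
    std::array<uint8_t, 8> data{};
    for (int i = 0; i < 4; ++i)
    {
        const uint16_t raw = static_cast<uint16_t>(values[i]);
        data[2 * i]     = static_cast<uint8_t>(raw >> 8);    // high byte
        data[2 * i + 1] = static_cast<uint8_t>(raw & 0xFF);  // low byte
    }
    return data;
}
```

A current of 1000 is 0x03E8, so it occupies the two bytes 0x03 and 0xE8; a negative value such as -1 is sent in two's complement as 0xFF, 0xFF.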
10. CAN message sending function call
```cpp
int16_t Current_Motor_Target[1];
void chassis_task(void const * argument)
{
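// idle for a while
vTaskDelay(20); // wait until all devices are ready
while(1) // this could also be done in a timer interrupt
{
Current_Motor_Target[0] = 1000; // test value for checking that the motor responds; comment this line out in real use
CAN1_CMD_1(Current_Motor_Target[0],0,0,0); // send a current of 1000 to the motor with ESC ID 1 so that it spins open-loop
// system delay
vTaskDelay(2); // equivalent to osDelay(2)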
}
}
```
6. DJI Motor Library Closed-Loop Code Analysis: The library is a modified version of DJI's official library code by previous senior students. (On top of the open-loop foundation, CAN message reception and a series of data parsing programs have been added.)
1. CAN communication receive interrupt callback function (The CAN_RX0 receive interrupt callback function is used to process data sent by the CAN communication motor, which includes **motor speed, angle, temperature**, and other data.)



6. Data parsing function
1. Motor encoders are divided into incremental encoders and absolute encoders.
1. Incremental encoder: the count is lost when power is removed, so the angle starts from 0 at every power-up.
2. Absolute encoder: the position is not lost after power loss.
3. The M3508 and M2006 encoders are absolute encoders, so their reading is retained after power loss. However, they record the rotor angle, and the rotor drives the output through a gearbox, so the raw reading cannot be used directly as the output angle. Therefore, we use code to convert the absolute encoder data into an incremental-style angle that starts from 0 at power-on. (In the struct, `total` represents the absolute encoder angle, while `total_angle` is the total angle we have processed and converted to incremental encoder format.)
2. Record the power-on angle

4. After code processing, the total angle at power-on is calculated as: total angle = number of revolutions (0) * 8192 + current absolute encoder angle (assumed to be A) - power-on angle captured at startup (since this is at power-on, it is also A) = 0.

6. Function not currently in use (this function is not being called)
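The power-on offset and the revolution counting from items 3 and 4 above can be sketched as follows. The struct and its field names are hypothetical stand-ins for the library's data structure, and the encoder is assumed to report 0 to 8191 per rotor revolution:

```cpp
#include <cstdint>

// Hypothetical per-motor state, modeled on the description above.
struct MotorMeasure
{
    uint16_t angle = 0;         // latest raw encoder reading, 0..8191
    uint16_t last_angle = 0;    // previous raw reading
    uint16_t offset_angle = 0;  // raw reading captured at power-on
    int32_t round_cnt = 0;      // completed rotor revolutions
    int32_t total_angle = 0;    // accumulated angle, 0 at power-on
    bool has_offset = false;
};

void update_total_angle(MotorMeasure &m, uint16_t raw)
{
    if (!m.has_offset)  // first frame after power-on: record the offset
    {
        m.offset_angle = raw;
        m.last_angle = raw;
        m.has_offset = true;
    }
    m.angle = raw;
    const int32_t delta = static_cast<int32_t>(m.angle) - static_cast<int32_t>(m.last_angle);
    if (delta > 4096)        // jumped from near 0 up to near 8191: one turn backwards
        m.round_cnt--;
    else if (delta < -4096)  // jumped from near 8191 down to near 0: one turn forwards
        m.round_cnt++;
    m.total_angle = m.round_cnt * 8192 + m.angle - m.offset_angle;
    m.last_angle = m.angle;
}
```

On the first call the result is 0 regardless of the raw reading, which matches the power-on calculation in item 4. The half-range threshold of 4096 assumes the rotor turns less than half a revolution between two frames.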

③PID Controller
- Introduction to the PID Algorithm
- MATLAB PID Controller Introduction: https://www.mathworks.com/help/control/pid-controller-design.html?s_tid=CRUX_lftnav
- The principle of the PID algorithm
The most classic example for understanding the PID control algorithm is a leaking water tank problem.
There is a leaking water tank, and the rate of leakage is not constant. We also have a bucket, and we can control whether to add water to the tank or scoop water out of it. Additionally, we can measure the water level. Our goal now is to stabilize the water level at any desired level we choose.
Note that we need to use PID within a closed-loop system. What is a closed-loop system? It means there is both input and feedback. Input refers to being able to input a quantity to influence and control our system, while feedback means we need to be able to know the state of the thing we are ultimately controlling. In this leaking water tank system, the input is the bucket—we can use the bucket to add water to the tank or scoop water out of it to affect the water level. As for feedback, we need to be able to measure the water level and know what it is.

a, Understanding Proportional Control
First is proportional control. Proportional control is like adding water to or scooping water from a tank using a bucket. Suppose we need to stabilize the water level at plane A, but the actual water level is at plane B. The water level difference is Err = A - B. In this case, the amount of water we need to add is Kp * Err, where Kp is our proportional control coefficient.
If A > B, Err is positive, so water is added to the tank; if A < B, Err is negative, so water is scooped out of the tank. As long as there is a difference between the expected water level and the actual water level, we will adjust the system by adding or removing water using the bucket. At the same time, the magnitude of Kp also affects system performance. If the value of Kp is large, the advantage is that the speed of reaching plane A from plane B is fast, but the disadvantage is that when plane B is close to plane A, the system tends to oscillate significantly. If the value of Kp is small, the advantage is that the system oscillates less when plane B approaches plane A, but the disadvantage is that the speed of reaching plane A from plane B is slow.
Some people might wonder: why not just set the proportional control coefficient Kp directly to 1, and then add water in the amount of Err = A - B? However, in practice, many systems cannot achieve this. For example, in a temperature control system, if the actual temperature is 10 degrees and I want to raise it to 40 degrees through heating, can we accurately add exactly 30 degrees to the system all at once? Obviously, that's impossible. So, the ultimate result of proportional control is that the value of Err tends toward 0.
b, Understanding Derivative Control
Now let's look at derivative control. Under proportional control, Err begins to decrease (assume the expected level A starts above the actual level B, so Err is positive). Err is therefore a curve whose slope over time is less than zero. In each control period, the faster Err changes, the larger the absolute value of its derivative, and the derivative term acts against that change, slowing the rate at which Err decreases. This continues until the slope reaches zero, at which point the derivative term stops acting.

Derivative control can reflect the trend of the input signal, so before the magnitude of the input signal changes too drastically, it can introduce an effective early correction signal to increase the system's damping level, thereby improving system stability. However, the high-pass characteristic of first-order derivatives makes this controller prone to amplifying high-frequency noise.
c, Understanding Integral Control
The main function of the integral control component is to eliminate steady-state error. So how does the integral term eliminate steady-state error?
Proportional control can only try to bring Err to 0, while the role of the derivative is to stop acting when the slope of the curve is controlled to 0. However, when the slope is 0, Err is not necessarily 0.
This is where the integral term comes into play. The integral of a curve corresponds to the area enclosed between the curve and the x-axis. As shown in the figure below, the purpose of the integral action is to make the sum of the red area and the blue area equal to zero. Even after the system has settled under proportional and derivative control, any remaining nonzero Err is a steady-state error, and as long as it exists the integral keeps accumulating and keeps acting on the system until Err becomes zero. In this way, PID control can in theory achieve very precise control.
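The steady-state error can be seen in a small simulation of the leaking tank. This is only a sketch under assumed numbers: a constant leak of 0.2 per second, a target level of 1.0, and a 10 ms step; the function name is hypothetical.

```cpp
// Simulates the leaking tank under P or PI control and returns the error
// (target minus level) that remains after `steps` control periods.
double simulate_tank(double kp, double ki, int steps)
{
    const double target = 1.0;  // desired water level
    const double leak = 0.2;    // assumed constant leak rate
    const double dt = 0.01;     // control period in seconds
    double level = 0.0;
    double integral = 0.0;
    for (int i = 0; i < steps; ++i)
    {
        const double err = target - level;
        integral += err * dt;                           // accumulated error
        const double flow = kp * err + ki * integral;   // water added per second
        level += (flow - leak) * dt;
    }
    return target - level;
}
```

With proportional control only (kp = 2, ki = 0) the level settles where kp * Err equals the leak, so an error of 0.1 remains. Adding the integral term (ki = 1) keeps increasing the inflow until the error is driven to zero.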

d, PID Algorithm Discretization
Assume the sampling time interval is T, then at time k:
The deviation is e(k);
The integral is approximated by T * (e(k) + e(k-1) + e(k-2) + … + e(0));
The differential is (e(k) - e(k-1)) / T;
Thus, after discretization, the formula is as follows:
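Putting the three terms together gives the positional form. The expression below is reconstructed from the definitions above, with u(k) denoting the controller output (a symbol assumed here) and T_i, T_d the integral and derivative time constants:

$$u(k) = K_p\left[\, e(k) + \frac{T}{T_i}\sum_{j=0}^{k} e(j) + \frac{T_d}{T}\bigl(e(k) - e(k-1)\bigr) \right]$$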

Proportional coefficient: Kp;
Integral coefficient: Kp * T / Ti, which can be written as Ki;
Derivative coefficient: Kp * Td / T, which can be written as Kd;
Then the formula can be written as follows:
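With the coefficients K_i = K_p T / T_i and K_d = K_p T_d / T, the expression (again a reconstruction, with u(k) assumed as the symbol for the controller output) reads:

$$u(k) = K_p\, e(k) + K_i \sum_{j=0}^{k} e(j) + K_d \bigl[\, e(k) - e(k-1) \,\bigr]$$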

The discrete form of the PID algorithm is exactly this, which is what we commonly refer to as positional PID.
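A minimal sketch of positional PID in code, assuming the output is Kp * e(k) + Ki * (sum of errors) + Kd * (e(k) - e(k-1)). The struct and member names are hypothetical, and the two limits are an illustrative way to bound the accumulated sum and the output, not the library's implementation:

```cpp
#include <algorithm>

// Illustrative positional PID controller; all names are hypothetical.
struct PositionalPid
{
    float kp, ki, kd;
    float integral_limit;  // bound on the accumulated error sum
    float output_limit;    // bound on the controller output
    float error_sum;
    float previous_error;

    PositionalPid(float p, float i, float d, float i_limit, float out_limit)
        : kp(p), ki(i), kd(d), integral_limit(i_limit), output_limit(out_limit),
          error_sum(0.0f), previous_error(0.0f) {}

    // One control step: returns the new output for the given target and feedback.
    float update(float target, float measured)
    {
        const float error = target - measured;
        error_sum = clamp(error_sum + error, integral_limit);
        const float output = kp * error + ki * error_sum + kd * (error - previous_error);
        previous_error = error;
        return clamp(output, output_limit);
    }

    // Limits value to the range [-limit, +limit].
    static float clamp(float value, float limit)
    {
        return std::max(-limit, std::min(value, limit));
    }
};
```

For example, with Kp = 2, Ki = 0.5, Kd = 1, a first step with target 10 and feedback 0 gives 2 * 10 + 0.5 * 10 + 1 * 10 = 35.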
But why still add an incremental calculation method?
First, the running sum can grow without bound and overflow the variable that holds it (a single 8-bit byte, for example, can only store values up to 255). Second, a power loss has a very significant impact, because all previously accumulated state is lost completely. Therefore, an incremental approach, which only needs the last few errors rather than the whole history, should be used.
Next, we continue deriving the incremental PID. Based on the formula above, we can obtain:
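A sketch of that derivation, using u(k) for the controller output (a symbol assumed here) and the coefficients K_p, K_i, K_d defined above. Writing the positional output at times k and k-1 and subtracting cancels the running sum:

$$
\begin{aligned}
u(k) &= K_p\, e(k) + K_i \sum_{j=0}^{k} e(j) + K_d \bigl[\, e(k) - e(k-1) \,\bigr] \\
u(k-1) &= K_p\, e(k-1) + K_i \sum_{j=0}^{k-1} e(j) + K_d \bigl[\, e(k-1) - e(k-2) \,\bigr] \\
\Delta u(k) &= u(k) - u(k-1) = K_p \bigl[\, e(k) - e(k-1) \,\bigr] + K_i\, e(k) + K_d \bigl[\, e(k) - 2e(k-1) + e(k-2) \,\bigr]
\end{aligned}
$$

Only the last three errors appear in the increment, so no unbounded sum has to be stored.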

e, PID dual-loop

f, PID feedforward
- PID Algorithm Library
- Core computational functions (very mature controllers, mathematical algorithms)

- Initialization code (assign the three parameters Kp, Ki, Kd, along with the output maximum and integral limit, to the PID handle pid_v_1 or another handle)

- Feedback loop code

- Closed-loop code call



④ C++ Library (Recommended)
Introduction
Code repository link: https://github.com/TungChiahuiMCURepos/CAN_PID_CPP

By analogy with C language libraries,
can.c is the CAN communication initialization driver file automatically generated by CubeMX.
bsp_can.cpp is a code file that needs to be written by yourself to enable CAN communication (the part that CubeMX does not automatically generate and requires manual invocation).
In can_receive.cpp, the implementation of the CANRX0 receive interrupt callback function is provided. This callback uses several motor information data processing functions, along with four CAN transmission functions.
pid.cpp is the core mathematical algorithm code for the PID control system.
The file pid_user.cpp contains code that calls the PID core functions and wraps them into initialization code for the PID controller, along with some closed-loop implementation code.
C++ DJI Motor Library
Structure and brief introduction of CLASS

Below the image is the CAN_BUS class, which contains three nested classes.
- CAN_BUS::BSP class, which contains two methods:
- CAN_Start is the function that enables CAN communication.
- Filter_Init is a function for CAN communication filtering.
- CAN_BUS::DJI_ENCODER class, which contains three methods (all functions within this class are called by the CAN_RX0 receive interrupt callback function):
- get_motor_measure processes the encoder data received from DJI motors via CAN communication and extracts the various motor information.
- get_moto_offset processes the encoder data received from DJI motors via CAN communication and calculates the initial angle value of the motor when it is first powered on.
- get_total_angle processes the encoder data received from DJI motors via CAN communication and calculates the motor angle value. (It is not currently being called.)
- CAN_BUS::CMD class, which contains four methods:
- CAN1_Front is a function that sends current to the front 4 motors of CAN1 (corresponding to ESC IDs: 1-4).
- CAN1_Behind is a function that sends current to the last 4 motors of CAN1 (corresponding to ESC IDs: 5-8).
- CAN2_Front is a function that sends current to the first 4 motors of CAN2 (corresponding to ESC IDs: 1-4).
- CAN2_Behind is a function that sends current to the last 4 motors of CAN2 (corresponding to ESC IDs: 5-8).

CAN_BUS::BSP class methods (functions) (in bsp_can.cpp)
####### CAN_Start: Function to Enable CAN Communication

####### Filter_Init: CAN Communication Filtering Function


Methods (functions) of the CAN_BUS::DJI_ENCODER class (in can_receive.cpp)
####### get_motor_measure: Processes the Encoder Data Received from DJI Motors via CAN and Extracts the Motor Information

####### get_moto_offset: Processes the Encoder Data Received from DJI Motors via CAN and Calculates the Initial Angle at Power-On

####### get_total_angle: Processes the Encoder Data Received from DJI Motors via CAN and Calculates the Motor Angle (Not Currently Called)

CAN_BUS::CMD class methods (functions) (in can_receive.cpp)
####### CAN1_Front: Function to Send Current to the First 4 Motors of CAN1

####### CAN1_Behind: Function to Send Current to the Last 4 Motors of CAN1

####### CAN2_Front: Function to Send Current to the First 4 Motors of CAN2

####### CAN2_Behind: Function to Send Current to the Last 4 Motors of CAN2

CAN_RX0 receive interrupt callback function (in can_receive.cpp)

C++ PID Library
Structure and brief introduction of CLASS

Below the image is the PID_Controller class, which contains three nested classes and one method.
- PID_Controller class:
- All_Device_Init initializes the PID controllers for all devices.
- The PID_Controller::CORE core class contains three methods:
- PID_Init: PID core initialization function;
- PID_Calc: PID core calculation function;
- PID_Clear: PID reset function.
- PID_Controller::CAN_MOTOR is a CAN motor class that contains six methods (since the three methods above and the three below differ only in CAN communication, only CAN1 will be explained):
- CAN1_Velocity_Realize CAN1 velocity loop implementation function;
- CAN1_Position_Realize CAN1 position loop implementation function;
- CAN1_VP_Dual_Loop_Realize CAN1 speed-position dual-loop implementation function;
- CAN2_Velocity_Realize CAN2 velocity loop implementation function;
- CAN2_Position_Realize CAN2 position loop implementation function;
- CAN2_VP_Dual_Loop_Realize CAN2 speed and position dual-loop implementation function;
- The PID_Controller::SENSORS sensor class contains three methods:
- Yaw_Realize: PID implementation function for the heading (yaw) angle of the gyroscope IMU;
- Pos_X_Realize: function to implement X-coordinate positioning using the encoder;
- Pos_Y_Realize: function to implement Y-coordinate positioning using the encoder.

Methods (functions) of the PID_Controller class (in pid_user.cpp)
####### All_Device_Init initializes the PID controllers for all devices.

Methods (functions) of the PID_Controller::CORE class (in pid.cpp)
####### PID_Init PID Core Initialization Function

####### PID_Calc PID Core Calculation Function

####### PID_Clear PID Reset Function

Methods of the PID_Controller::CAN_MOTOR class (in pid_user.cpp) (only the three closed-loop functions for CAN1 are covered here)
- Note:
- The current value variable is generally defined as an array, for example fp32 motor_current_target[8]; this defines the current values to be sent to eight motors.
- The same applies to speed and angular position: fp32 motor_speed_target[8]; and fp32 motor_position_target[8];.
- The C++ motor PID library differs from the C motor PID library in one respect. The ESC ID range is 1-8, while the array index range is 0-7, so to stay consistent with the array index, note the difference:
- In the C library, the value of i is the ESC (electronic speed controller) ID.
- In the C++ library, the value of i is the ESC ID minus 1.
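The ID-to-index rule for the C++ library can be sketched as follows. The helper names are hypothetical, and fp32 is assumed to be an alias for float:

```cpp
#include <cassert>

using fp32 = float;  // assumed definition of the fp32 alias used above

fp32 motor_current_target[8] = {0};

// Hypothetical helper: ESC IDs run from 1 to 8, array indices from 0 to 7.
inline int esc_id_to_index(int esc_id)
{
    assert(esc_id >= 1 && esc_id <= 8);
    return esc_id - 1;
}

// Stores the current target for the motor with the given ESC ID.
void set_current_target(int esc_id, fp32 current)
{
    motor_current_target[esc_id_to_index(esc_id)] = current;
}
```

Calling set_current_target(1, 1000.0f) therefore writes to motor_current_target[0], and ESC ID 8 maps to index 7.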
####### CAN1_Velocity_Realize CAN1 Velocity Loop Implementation Function

####### CAN1_Position_Realize CAN1 Position Loop Implementation Function

####### CAN1_VP_Dual_Loop_Realize CAN1 Speed-Position Dual Loop Implementation Function

PID_Controller::SENSORS sensor class methods (functions) (in pid_user.cpp)
####### Yaw_Realize: Yaw Angle PID Implementation Function for the Gyroscope IMU (to be supplemented once you complete the C++ library for the gyroscope IMU)

####### Pos_X_Realize: Encoder Positioning X-Coordinate Implementation Function (to be supplemented once you complete the C++ library for the OPS-9 encoder)

####### Pos_Y_Realize: Encoder Positioning Y-Coordinate Implementation Function (to be supplemented once you complete the C++ library for the OPS-9 encoder)

How to call it?
Here I choose to use a PID controller to perform a negative feedback loop control every 1ms and send a current value each time.
You can add a delay(1) inside the while(1) infinite loop to perform the transmission.
You can also implement this using a timer interrupt with a period of 1ms; using a timer interrupt is recommended.

⑤ Physical Connection. For details, please refer to the manual.




DMA (Direct Memory Access)
FreeRTOS
Theoretical Knowledge
https://www.bilibili.com/video/BV19g411p7UT
Below are only some common operations and points to note. For more detailed FreeRTOS configuration, please refer to: (study along with it)
DJI Development Board Type C Embedded Software Tutorial Document.pdf
STM32F1 FreeRTOS Development Manual_V1.1.pdf
STM32F4 FreeRTOS Development Manual_V1.1.pdf
Commonly used content (the tutorial below focuses on how to configure CubeMX; for theoretical knowledge, please refer to Zhengdian Atom).
System Configuration
- Select the Timebase Source (in the system configuration)
- Reason: Because FreeRTOS occupies the SysTick, the time base source needs to be changed.
- Selection rule: Prefer timers with fewer features (for example, TIM6 and TIM7 on the F407ZGT6 have fewer functions).

- How to choose? (As shown in the figure)

- Select the Interface
- Reason: FreeRTOS follows ARM's CMSIS standard.
- Selection principle: Prefer CMSIS v1, as CMSIS v2 still has some minor unresolved issues.
- How to choose?

- Configure Include Parameters
- Function: Same as hal_conf.h (used to enable certain features of the HAL library), this is used to enable some features of FreeRTOS.
- Configuration of Include Parameters
- CubeMX Configuration (Recommended)
- Just enable the corresponding function as needed. (The commonly used one is vTaskDelayUntil.)

- Manually edit header file configuration (not recommended)
Create task
- CubeMX creates tasks:
- Parameter Descriptions (see DJI manual for details):

- What parameters are generally chosen?
- Task Name: uppercase English letters (corresponding to the Entry Function)
- Priority: Generally, select normal priority (unless there is special logic).
- Stack Size: 128 Words is usually sufficient.
- Entry Function: lowercase English letters (corresponding to the Task Name)
- Code Generation Option: simply select "As weak" (generates the FreeRTOS task entry functions as weak functions).
- Parameter: Typically NULL is sufficient. If special features (such as semaphores) are needed, certain handles (e.g., the semaphore handle) must be provided.
- Allocation: Just choose Dynamic and let FreeRTOS handle dynamic allocation management.

- Notes:
- Creating too many tasks will exhaust memory, because every task needs its own stack.
Delay
- relative delay
- Function: osDelay() and vTaskDelay() have the same effect (when the tick rate is 1000 Hz, so that one tick is 1 ms).
- Time: starts counting from when the function is called, until the specified delay ends.
- Calling method: Same as the HAL_Delay() method.
extern "C" // add this line when the file is compiled as C++
void green_led_task(void const * argument)
{
for(;;) // equivalent to while(true)
{
HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_RESET);
osDelay(500);
HAL_GPIO_WritePin(LED_GPIO_Port, LED_Pin, GPIO_PIN_SET);
osDelay(500);
}
}
- absolute delay
- Function:
- Get current time: osKernelSysTick()
- Absolute delay function: osDelayUntil()
- Time: The delay is measured from the previous wake-up time rather than from the moment of the call, so the time spent in the task body is included in the period. This suits tasks that must run at a fixed frequency.

- Calling method:

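The difference between the two delay types can be illustrated with a small simulation that needs no RTOS. It only models the timing: the function names are hypothetical, time is counted in ticks, and the task body is assumed to take less than one period.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Wake-up times of a task that runs its body (work[i] ticks) and then
// calls a relative delay of `period` ticks, as with osDelay().
std::vector<uint32_t> wake_times_relative(uint32_t period, const std::vector<uint32_t> &work)
{
    std::vector<uint32_t> wakes;
    uint32_t now = 0;
    for (uint32_t w : work)
    {
        wakes.push_back(now);
        now += w;       // the task body runs
        now += period;  // the delay counts from the moment of the call
    }
    return wakes;
}

// Wake-up times of the same task with an absolute delay, as with
// osDelayUntil(): the next wake-up is previous wake-up + period.
std::vector<uint32_t> wake_times_absolute(uint32_t period, const std::vector<uint32_t> &work)
{
    std::vector<uint32_t> wakes;
    uint32_t previous_wake = 0;
    for (uint32_t w : work)
    {
        assert(w < period);       // the body must finish within one period
        wakes.push_back(previous_wake);
        previous_wake += period;  // body time is absorbed into the period
    }
    return wakes;
}
```

With a period of 10 ticks and body times of 1, 3, and 2 ticks, the relative version wakes at 0, 11, 24 and drifts, while the absolute version wakes at 0, 10, 20.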
Task State Transition
- FreeRTOS states (see DJI manual for details):

- Function Introduction:

- How to call:

queue
- Reason: Global variables are unsafe in multithreaded environments. When multiple tasks operate on such a variable, the data is prone to corruption.
- Queue: A queue is a mechanism for data exchange (message passing) between tasks, from tasks to interrupts, and from interrupts to tasks.
- For specific content, refer to the Zhengdian Atom video to learn theoretical knowledge.
- Usage
- CubeMX Configuration: (Queue Size is the number of items the queue can hold; Item Size is the data type, and therefore the size, of each item)

- Call (see the detailed explanation in Zhengdian Atom):




Semaphore (a special form of queue)
- Reason: Same as for queues.
- Semaphore: A special type of queue, it is a mechanism for solving synchronization problems, enabling orderly access to shared resources.
- Categories: Binary Semaphore, Counting Semaphore (see Zhengdian Atom for details)
- Synchronization problem: When A finishes a task and notifies B so that B can proceed, this is called a synchronization problem.

- Semaphore Overview (see Zhengdian Atom for details)

- Comparison of Queues and Semaphores

- Binary Semaphore Introduction:

- CubeMX Configuration



Pass the handle of the created binary semaphore into the task's parameter argument.
Actually, setting it to NULL also works.

- For detailed API function descriptions, please refer to the Zhengdian Atom documentation.
- Semaphore release (give) function: xSemaphoreGive();
- Semaphore acquisition function:






Memory management
Introduction
Stack area: Automatically allocated and released by the compiler, storing function parameter values, local variable values, etc. Its operation is similar to a stack in data structures.
Heap: Typically allocated and freed by the programmer. If the programmer does not free it, the operating system may reclaim it when the program ends. The allocation method is similar to a linked list in data structures.
(For details, see Vinci Robotics Team C/C++ Resources)
Modify the stack and heap sizes of the STM32.
####### Modifying the heap and stack sizes of the STM32 itself

As shown in the figure above,
The total RAM of a typical small STM32 microcontroller (for example, the STM32F103C8T6) is 20 KB.
Heap Size is the heap size, here 512 bytes = 0.5 KB.
Stack Size is the stack size, here 1024 bytes = 1 KB.
The rest of the memory goes to the remaining regions, with most of it assigned to the Static region.
After generating the project with CubeMX, you can see the heap and stack sizes in the startup file. (You can also modify them there, but it is recommended to make changes in CubeMX instead. Generally, there is no need to modify them unless necessary.)



####### Modifying the Heap Size in FreeRTOS (This heap is not the same as the standard heap; see the explanation below)

- TOTAL_HEAP_SIZE: If FreeRTOS is used, the size of the FreeRTOS heap area can be modified here.
- Memory management scheme: The algorithm for dynamic memory allocation can be modified, and the heap_4 algorithm is generally used.
- The FreeRTOS heap (FreeRTOS_HEAP) is not the ordinary heap area. With the heap_1, heap_2, heap_4, or heap_5 algorithms, FreeRTOS carves its heap out of the ZI region of the STM32's RAM (you can think of it as the FreeRTOS kernel reserving its own share of RAM) rather than allocating from the STM32's Heap region. With the heap_3 algorithm, FreeRTOS uses the C library's malloc() and free() functions, so it does use the STM32's Heap region, which is relatively small and therefore less effective than heap_4. Since we typically use heap_4, FreeRTOS_HEAP is allocated from the ZI region and can exceed the size of STM32_HEAP.
- Because we are using the heap_4 algorithm, we do not need to modify the heap and stack of the STM32. We only need to modify FreeRTOS_HEAP (that is, the memory in the ZI region of the STM32 that FreeRTOS controls).

Introduction to Memory Management API
####### C Language Library Memory Management API (Not Recommended)


####### ALIENTEK Block-Based Memory Management API


memx refers to memory blocks, including internal SRAM and external SRAM (external SRAM may or may not be present).


####### FreeRTOS Memory Management API (Recommended)
- Introduction


- FreeRTOS memory management algorithm (we generally choose heap_4)


heap_4's first-fit algorithm starts from the memory block at the beginning of the heap area and finds the first one that fits the required memory size.

- FreeRTOS Memory Management API Functions


The code above shows that after memory is allocated and then freed, the amount of free memory returns to its original value.
Note, however, that we requested 4 bytes of memory but the free memory dropped by 16 bytes. This comes from the block header that heap_4 stores in front of every allocation, plus byte alignment: FreeRTOS trades space for speed by rounding each block up to the alignment boundary.
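The 16-byte figure can be reproduced with a short calculation. This is a sketch under assumptions typical of a 32-bit Cortex-M port of heap_4 (8-byte alignment and an 8-byte block header); the constant and function names are hypothetical:

```cpp
#include <cstddef>

// Assumed values for a 32-bit Cortex-M port of heap_4.
constexpr std::size_t kAlignment = 8;   // byte alignment of every block
constexpr std::size_t kHeaderSize = 8;  // header: one pointer plus one size field

// Bytes actually consumed from the FreeRTOS heap for a request of `wanted` bytes:
// add the header, then round up to the next multiple of the alignment.
constexpr std::size_t heap4_block_size(std::size_t wanted)
{
    return (wanted + kHeaderSize + kAlignment - 1) / kAlignment * kAlignment;
}
```

A request of 4 bytes becomes 4 + 8 = 12 bytes, which is rounded up to 16.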
FPU floating-point computation acceleration
Because the STM32 runs at a relatively low clock frequency, floating-point arithmetic can be very slow. There are several ways to speed up expensive floating-point operations such as sin and cos.
Check if supported
| STM32 series | CPU core | DSP instructions | FPU type | arm_cos_f32() performance | Suitable applications | Suggested function |
|---|---|---|---|---|---|---|
| STM32H7 | Cortex-M7 | ✅ Supported | ✅ Double-Precision FPU (DP-FPU) | 🚀 Fastest (Hardware Accelerated) | High-precision computation, robotics, filtering, navigation | arm_cos_f32() |
| STM32H5 | Cortex-M33 | ✅ Supported | ✅ Single-Precision FPU (SP-FPU) | 🔥 Fast (hardware accelerated) | Filtering and AI computation | arm_cos_f32() |
| STM32F7 | Cortex-M7 | ✅ Supported | ✅ Single-Precision FPU (SP-FPU) | 🔥 Fast (hardware accelerated) | Robot control, navigation, filtering | arm_cos_f32() |
| STM32F4 | Cortex-M4 | ✅ Supported | ✅ Single-Precision FPU (SP-FPU) | 🔥 Fast (hardware accelerated) | Robot control, mathematical operations | arm_cos_f32() |
| STM32G4 | Cortex-M4 | ✅ Supported | ✅ Single-Precision FPU (SP-FPU) | 🔥 Fast (hardware accelerated) | Motor control, filtering | arm_cos_f32() |
| STM32L4 | Cortex-M4 | ✅ Supported | ✅ Single-Precision FPU (SP-FPU) | 🔥 Fast (hardware accelerated) | Low-power computing | arm_cos_f32() |
| STM32U5 | Cortex-M33 | ✅ Supported | ✅ Single-Precision FPU (SP-FPU) | 🔥 Fast (hardware accelerated) | Low-power AI computing | arm_cos_f32() |
| STM32F3 | Cortex-M4 | ✅ Supported | ✅ Single-Precision FPU (SP-FPU) | 🔥 Fast (hardware accelerated) | Motor control, signal processing | arm_cos_f32() |
| STM32G0 | Cortex-M0+ | ❌ Not supported | ❌ No FPU | 🚫 Slowest (purely software computation) | Basic control | arm_cos_q31() |
| STM32F1 | Cortex-M3 | ❌ Not supported | ❌ No FPU | 🚫 Slowest (purely software computation) | Floating-point calculations are not recommended. | arm_cos_q31() |
| STM32F0 | Cortex-M0 | ❌ Not supported | ❌ No FPU | 🚫 Slowest (purely software computation) | Floating-point calculations are not recommended. | arm_cos_q31() |
| STM32L0 | Cortex-M0+ | ❌ Not supported | ❌ No FPU | 🚫 Slowest (purely software computation) | Ultra-low power applications | arm_cos_q31() |
Enable FPU
A floating-point unit (FPU) is a hardware block, built into the chip, that performs floating-point operations. Among ARM's Cortex-M cores, the M4 and higher-end cores can include an FPU, which for STM32 means the F4 series and above. (The STM32F1 series does not have one.)
STM32F4/F7 microcontrollers typically have a single-precision FPU, while the STM32H7 features a double-precision FPU.
On the STM32F4, floating-point code often runs tens or even hundreds of times faster with the FPU enabled than without it.
Using STM32CubeMX to generate a project will enable the FPU by default, as shown below.


If you are using the F1, this option does not appear at all, because the M3 core has no FPU.

The image below shows that the FPU is enabled in the source code.
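Since the screenshot is not reproduced here, the following is a minimal sketch of what the FPU-enable step usually amounts to: CMSIS startup code typically grants full access to coprocessors CP10 and CP11 in the `CPACR` register inside `SystemInit()`. The `FakeSCB` struct and the `enable_fpu` / `demo_enable_fpu` names are hypothetical stand-ins so the snippet compiles on a PC; on the target, the real `SCB` comes from the CMSIS core header.

```cpp
#include <cassert>
#include <cstdint>

// Host-side stand-in for the Cortex-M System Control Block (assumption:
// on target, SCB is provided by the CMSIS core header instead).
struct FakeSCB {
    volatile uint32_t CPACR = 0;  // Coprocessor Access Control Register
};

// Grant full access to coprocessors CP10 and CP11, which together form the FPU.
// Each coprocessor n owns a 2-bit field at bit position 2 * n; 0b11 = full access.
inline void enable_fpu(FakeSCB& scb) {
    const uint32_t cp10 = uint32_t{3} << (10 * 2);
    const uint32_t cp11 = uint32_t{3} << (11 * 2);
    scb.CPACR = scb.CPACR | cp10 | cp11;
}

// Usage sketch: after the call, bits 20..23 are all set.
inline void demo_enable_fpu() {
    FakeSCB scb;
    enable_fpu(scb);
    assert(scb.CPACR == 0x00F00000u);
}
```

In a CubeMX-generated project this step is normally already present and guarded by the FPU-related preprocessor settings, so there is usually no need to write it by hand.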

DSP acceleration
DSP acceleration here means computing trigonometric functions with the optimized routines in the CMSIS-DSP library. This speeds up computation at the cost of a slightly larger error, typically on the order of 1e-6, which is acceptable for 99% of application scenarios.
The DSP library applies only to ARM's Cortex-A and Cortex-M cores, so it suits devices such as phones, ARM microcontrollers, and the Raspberry Pi.
Since every STM32 series is built on a Cortex-M core, the library can be used on all of them.
If you don't have an FPU, such as with STM32F1 series microcontrollers, you can still accelerate trigonometric function calculations using the DSP library. This DSP library is optimized through a mathematical approach combining table lookup and interpolation, making the computations relatively fast.
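To make the "table lookup plus interpolation" idea concrete, here is a minimal sketch of the technique. It is not the CMSIS-DSP source; the table size and the names `make_cos_table`, `lut_cos`, and `demo_lut_cos` are assumptions chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

constexpr std::size_t kTableSize = 512;  // hypothetical table size
constexpr double kPi = 3.14159265358979323846;
constexpr float kTwoPi = static_cast<float>(2.0 * kPi);

// Build one full period of cosine samples, plus one extra entry so that
// interpolating in the last interval needs no wrap-around check.
inline std::array<float, kTableSize + 1> make_cos_table() {
    std::array<float, kTableSize + 1> t{};
    for (std::size_t i = 0; i <= kTableSize; ++i) {
        t[i] = static_cast<float>(
            std::cos(2.0 * kPi * static_cast<double>(i) / kTableSize));
    }
    return t;
}

// Approximate cos(x): wrap the angle into one period, look up the two
// neighbouring samples, and interpolate linearly between them.
inline float lut_cos(float x) {
    static const std::array<float, kTableSize + 1> table = make_cos_table();
    float phase = x / kTwoPi;       // angle expressed in turns
    phase -= std::floor(phase);     // wrap into [0, 1]
    const float pos = phase * static_cast<float>(kTableSize);
    std::size_t idx = static_cast<std::size_t>(pos);
    if (idx >= kTableSize) idx = kTableSize - 1;  // guard: phase rounded to 1.0
    const float frac = pos - static_cast<float>(idx);
    return table[idx] + frac * (table[idx + 1] - table[idx]);
}

// Usage sketch: compare against std::cos over a range of angles.
inline void demo_lut_cos() {
    for (float x = -10.0f; x <= 10.0f; x += 0.01f) {
        assert(std::fabs(lut_cos(x) - std::cos(x)) < 1e-4f);
    }
}
```

The per-call cost is a handful of multiplications and two table reads, which is why this approach can help even when no FPU is available. A larger table or a higher-order interpolation would trade memory or cycles for a smaller error.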
| Platform | CMSIS-DSP: arm_cos_f32 | C++ cmath: std::cos | C math.h: cosf() / cos() |
|---|---|---|---|
| Cortex-M4/M7 (with FPU) | ✅ Fastest (table lookup + interpolation) | ✅ Faster (full computation) | ✅ Relatively fast (comparable to std::cos) |
| Cortex-M0/M3 (no FPU) | ⚠️ Relatively slow (table lookup + interpolation) | 🚫 Slowest (software floating-point) | 🚫 Slowest (comparable to std::cos) |
| Cortex-A (such as Raspberry Pi) | ✅ Potentially faster (lookup table method) | ✅ Faster (using SIMD/FPU) | ✅ Faster (glibc/libm, SIMD optimizations) |
| x86/x86-64 (PC side) | ❌ Unavailable | ✅ Fastest (hardware accelerated) | ✅ Fastest (using FPU or SIMD) |
Therefore, when running on the STM32, it is still recommended to use the DSP library functions.
Install and enable the DSP library:
- Method 1 (Recommended): Open using CubeMX


Then enable the DSP library.

After generating the project, open it in MDK5 or MDK6 and you can see the generated DSP library.



- Method 2 (not recommended): Open with MDK5
This approach will increase compilation time by at least 200%.

Function Introduction
ARM-core CPUs support the trigonometric functions of the CMSIS-DSP library, which are faster than the standard math.h and cmath functions.
- Common C/C++ trigonometric function library:
Below are the standard overloaded trigonometric functions. With the FPU enabled, they are already quite fast as long as the input is a 32-bit float (fp32), so the DSP library is not strictly necessary.

#include <cmath>
// Update the robot's position (assuming the robot moves along the x-axis)
this->x_position += this->vx * std::cos(this->yaw) * this->dt;
this->y_position += this->vy * std::sin(this->yaw) * this->dt;
this->y_position = -this->y_position;
this->yaw += this->vw * this->dt;
- DSP library functions



Pass in an fp32 (float) value.
#include "arm_math.h"
// Update the robot's position (assuming the robot moves along the x-axis)
this->x_position += this->vx * arm_cos_f32(this->yaw) * this->dt;
this->y_position += this->vy * arm_sin_f32(this->yaw) * this->dt;
this->y_position = -this->y_position;
this->yaw += this->vw * this->dt;
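The note above that fp32 inputs keep the standard functions fast depends on the float overload actually being selected. The sketch below checks this at compile time and shows how a stray double literal silently promotes the whole expression to double, which a single-precision FPU has to emulate in software. The helper names `advance_x` and `demo_float_overload` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// With a float argument, std::cos selects the float overload, whose
// arithmetic a single-precision FPU can run in hardware.
static_assert(std::is_same<decltype(std::cos(1.0f)), float>::value,
              "float in, float out");

// A double literal in the expression promotes the argument to double.
static_assert(std::is_same<decltype(std::cos(1.0f * 2.0)), double>::value,
              "double literal promotes to double");

// Keeping every literal as float (note the f suffix) avoids the promotion.
static_assert(std::is_same<decltype(std::cos(1.0f * 2.0f)), float>::value,
              "float literal keeps float");

// Hypothetical helper: one pose-update term computed entirely in float.
inline float advance_x(float x, float vx, float yaw, float dt) {
    return x + vx * std::cos(yaw) * dt;
}

// Usage sketch: with yaw = 0 the robot moves straight along x.
inline void demo_float_overload() {
    assert(advance_x(0.0f, 1.0f, 0.0f, 0.5f) == 0.5f);
}
```

Declaring the pose members as `float` and writing constants with the `f` suffix is usually enough to stay on the single-precision path.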
Performance Comparison
| ✅ For microcontrollers with FPU |
|---|
| Function |
| Overloaded function std::cos(x) (float) |
| cosf(x) |
| arm_cos_f32(x) |
| arm_cos_q31(x) |
| Lookup Table (LUT) |
Besides arm_cos_f32, the library also provides fixed-point variants such as arm_cos_q31, which may be better suited to low-end chips like the F103. Choose whichever fits your chip.
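One practical detail when switching to `arm_cos_q31` is its argument format: the input is not in radians but a Q31 value in [0, 0x7FFFFFFF] representing a fraction of one full turn, [0, 2π). The following is a small host-side sketch of that conversion; the helper names are hypothetical, and the use of `double` is only for clarity (on a chip without an FPU you would normally keep the angle in Q31 throughout rather than convert at run time).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert an angle in radians to a Q31 phase: a value in [0, 0x7FFFFFFF]
// that represents a fraction of one full turn [0, 2*pi).
inline int32_t radians_to_q31_phase(double radians) {
    const double two_pi = 2.0 * 3.14159265358979323846;
    double turns = std::fmod(radians, two_pi) / two_pi;  // in (-1, 1)
    if (turns < 0.0) turns += 1.0;                       // now in [0, 1]
    const double scaled = turns * 2147483648.0;          // scale by 2^31
    if (scaled >= 2147483647.0) return INT32_MAX;        // clamp the top end
    return static_cast<int32_t>(scaled);
}

// Convert a Q31 result (range [-1, 1)) back to floating point.
inline double q31_to_double(int32_t q) {
    return static_cast<double>(q) / 2147483648.0;
}

// Usage sketch: half a turn (pi radians) maps to 2^30.
// On target the phase would then be passed on as: arm_cos_q31(phase).
inline void demo_q31_phase() {
    assert(radians_to_q31_phase(0.0) == 0);
    assert(radians_to_q31_phase(3.14159265358979323846) == 1073741824);
}
```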
| ❌ For microcontrollers without an FPU |
|---|
| Function |
| arm_cos_f32(x) |
| arm_cos_q31(x) |
| Overloaded function std::cos(x) (float) |
| cosf(x) |
| Lookup Table (LUT) |
DMA + Multi-Channel ADC (Remote Control Joystick)

CubeMX configuration:
Multi-channel ADC sampling generally requires scan mode to be enabled.
Whether ADC continuous mode is enabled affects the code in the main function. If continuous mode is not enabled, the ADC must be triggered again on every pass through the while loop.
With continuous mode enabled (the delay of 500 can be removed):

With continuous mode disabled:

Continuous mode is the faster of the two, so it is recommended.
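Because the screenshots of the two main-loop variants are not reproduced here, the sketch below illustrates the difference in structure. To stay compilable on a PC it defines small stand-ins for `ADC_HandleTypeDef`, `HAL_StatusTypeDef`, and `HAL_ADC_Start_DMA`; these are assumptions that only mimic the shape of the HAL call, and the buffer layout and function names (`loop_continuous`, `loop_triggered`) are hypothetical. It also assumes the DMA channel is set to circular mode, which continuous conversion needs in order to keep refreshing the buffer.

```cpp
#include <cassert>
#include <cstdint>

// ---- Host-side stand-ins (assumptions) so the sketch compiles off-target. ----
// In a real project these come from the CubeMX-generated HAL headers.
enum HAL_StatusTypeDef { HAL_OK = 0 };
struct ADC_HandleTypeDef { int start_calls = 0; };

// Stub with the same shape as the HAL call: it counts invocations and fills
// the buffer with a mid-scale 12-bit reading.
HAL_StatusTypeDef HAL_ADC_Start_DMA(ADC_HandleTypeDef* hadc,
                                    uint32_t* pData, uint32_t Length) {
    hadc->start_calls++;
    for (uint32_t i = 0; i < Length; ++i) pData[i] = 2048;
    return HAL_OK;
}
// ---- End of stand-ins. ----

constexpr uint32_t kChannels = 2;   // hypothetical: joystick X and Y axes
static uint32_t adc_buf[kChannels]; // one slot per scanned channel

// Continuous mode: start the conversion once, then simply read the buffer.
uint32_t loop_continuous(ADC_HandleTypeDef& hadc, int iterations) {
    uint32_t last_x = 0;
    HAL_ADC_Start_DMA(&hadc, adc_buf, kChannels);
    for (int i = 0; i < iterations; ++i) {
        last_x = adc_buf[0];  // DMA keeps this fresh in the background
    }
    return last_x;
}

// Continuous mode disabled: every pass through the loop re-triggers the ADC.
uint32_t loop_triggered(ADC_HandleTypeDef& hadc, int iterations) {
    uint32_t last_x = 0;
    for (int i = 0; i < iterations; ++i) {
        HAL_ADC_Start_DMA(&hadc, adc_buf, kChannels);
        last_x = adc_buf[0];
    }
    return last_x;
}

// Usage sketch: count how often the ADC had to be started in each variant.
void demo_adc_modes() {
    ADC_HandleTypeDef a, b;
    loop_continuous(a, 5);
    loop_triggered(b, 5);
    assert(a.start_calls == 1);
    assert(b.start_calls == 5);
}
```

In the continuous variant the CPU never waits on the ADC, which matches the observation that continuous mode is faster.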
Common STM32 Issues
STM32 ST-link download issue
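- Cause: When configuring the project in CubeMX, the Debug option under SYS was left unset (for an ST-Link it is typically set to Serial Wire). The debug pins are then not reserved for the debugger once the generated program runs.

- Symptom: After the program is downloaded once, it fails to run and no further downloads are possible.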
- The STM32 has three boot modes:

- User Flash: The normal working mode. Programs downloaded via JTAG or SWD are stored in the STM32's built-in Flash, and the system boots from there after a reset.
- SRAM: The chip's built-in RAM. It loses its contents at power-off, so it cannot store a program permanently. This mode is typically used for debugging.
- System memory: A dedicated area inside the chip.
- ST has placed a Bootloader in this area. Selecting this boot mode allows programs to be downloaded via the serial port, because the factory Bootloader includes serial download firmware that writes the program into the on-chip Flash.
- Solution:
- Set BOOT0 to 1 and BOOT1 to 0, so that the chip boots from system memory instead of the faulty program in Flash.

- Connect the board to the computer, press the reset button, and use Keil5 to download a known-good program. The download now succeeds.
- Return the BOOT pins to their original state and download again. Everything works normally.
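The relationship between the BOOT pins and the three boot modes described above can be summarized as a small lookup. This is a sketch of the pin mapping as documented for F1/F4-class parts; newer STM32 families may select the boot area through option bytes instead. The `BootMode` enum and `boot_mode` function are hypothetical names used only for illustration.

```cpp
#include <cassert>

enum class BootMode { UserFlash, SystemMemory, Sram };

// Map the BOOT pin levels sampled at reset to the selected boot area.
inline BootMode boot_mode(bool boot1, bool boot0) {
    if (!boot0) return BootMode::UserFlash;  // BOOT0 = 0: BOOT1 is ignored
    return boot1 ? BootMode::Sram : BootMode::SystemMemory;
}

// Usage sketch: the recovery setting from the solution above
// (BOOT0 = 1, BOOT1 = 0) selects the factory Bootloader.
inline void demo_boot_mode() {
    assert(boot_mode(false, true) == BootMode::SystemMemory);
    assert(boot_mode(false, false) == BootMode::UserFlash);
}
```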