Using an absolute and incremental encoder per axis

Samuel · January 11, 2021, 3:42pm

With the proposed structure we’d only use the encoder classes to sample their respective data and send that raw data, combined with an indicator of the encoder’s type, to the estimation algorithms.

Yeah this separation is the most important one I was trying to get at that should be orthogonal to the class hierarchy.

So I think the diagram that @XJey5 sent is a good intermediate structure to work towards. You might find that a few more connections next to raw_count are needed, specifically I’m thinking of a bool flag that says “this encoder maintains absolute offset between reboots”. This is so that the estimator algorithms can properly decide when an encoder can be used for phase estimation. This could also be a config flag on the “Estimator Algorithms” object. It’s possible that the Estimator Algorithms needs to support both integer and float inputs. This will become clearer as you progress with the implementation.

The Encoder Manager I would implement just as a single function get_encoder (in board.cpp or main.cpp) that returns an Encoder* from an int.

Additionally the sensorless and ACIM modes need to keep working, so they would need to inherit the Encoder interface too.
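To make the suggestion concrete, here is a minimal sketch of an Encoder interface with a single get_encoder lookup function. Apart from get_encoder, Encoder and raw_count, every name in it (keeps_offset_across_reboots, the two subclasses, the static instances) is hypothetical and chosen for illustration; the real firmware classes would carry far more state.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical minimal interface: the encoder only samples raw data and
// reports what kind of data it is. Estimation happens elsewhere.
class Encoder {
public:
    virtual ~Encoder() = default;
    // Latest raw sample, handed unmodified to the estimator.
    virtual int32_t raw_count() const = 0;
    // True if the absolute offset survives a reboot (assumed flag name).
    virtual bool keeps_offset_across_reboots() const = 0;
};

class IncrementalEncoder : public Encoder {
public:
    int32_t raw_count() const override { return count_; }
    bool keeps_offset_across_reboots() const override { return false; }
private:
    int32_t count_ = 0;
};

class AbsoluteEncoder : public Encoder {
public:
    int32_t raw_count() const override { return count_; }
    bool keeps_offset_across_reboots() const override { return true; }
private:
    int32_t count_ = 0;
};

// Statically allocated instances, as is usual on a microcontroller.
static IncrementalEncoder incremental0;
static AbsoluteEncoder absolute0;
static std::array<Encoder*, 2> encoders = {&incremental0, &absolute0};

// The whole "Encoder Manager": map an index to an encoder,
// or return nullptr if the index is out of range.
Encoder* get_encoder(int index) {
    if (index < 0 || static_cast<std::size_t>(index) >= encoders.size()) {
        return nullptr;
    }
    return encoders[static_cast<std::size_t>(index)];
}
```

Under this sketch, a sensorless or ACIM estimator would simply be another subclass registered in the same array, so callers never need to know which kind they received.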

XJey5 · January 14, 2021, 9:48am

We are struggling a bit with the placement of the estimation algorithm class. In the diagram shown in a previous post, we put the class outside of the encoder parent class. However in order to maintain/implement as much composition as possible, we can also put the estimation algorithm class inside the encoder class, thereby still outputting the velocity estimate values via the encoder, but calculated by the estimation algorithm class inside the encoder.
This would result in fewer changes to the classes “neighboring” the encoder class, such as axis.
What do you think?

Samuel · January 15, 2021, 11:24am

With the future direction of the firmware in mind I would say putting the estimator algorithm class outside of the encoder parent class is more future-friendly.

Jorijs · February 5, 2021, 8:49am

Hi there, here is both an update on our progress as well as a problem we’ve run into.

We have currently implemented the encoder parent class, the inherited incremental encoder class, an encoder processor class (featuring the estimation algorithms, update functions etc.) and an encoder manager class, which is basically the interface between these classes and anything that wants to use encoder data. As we now have the general structure up and running for a single type of encoder, we can start testing whether everything functions as it should.

We have, however, run into the problem that adding a new case in the update function (for an undefined encoder state, which is necessary for the initial encoder parent class) breaks the communication between the ODrive and odrivetool. The code snippet is below.

bool EncoderProcessor::update() {
    // update internal encoder state.
    int32_t delta_enc = 0;
    int32_t latched_count = raw_count.any().value(); // LATCH value to prevent override

    switch (encoder->config_.mode) {
        case Encoder::MODE_UNDEFINED: {

        } break;

        case Encoder::MODE_INCREMENTAL: {
            // TODO: use count_in_cpr_ instead as shadow_count_ can overflow
            // or use 64 bit
            int16_t delta_enc_16 = (int16_t)latched_count - (int16_t)shadow_count_;
            delta_enc = (int32_t)delta_enc_16; // sign extend
        } break;

If we comment out the case Encoder::MODE_UNDEFINED block and flash the ODrive with the firmware, no issues are present (besides the impact of that part of the logic missing).

Do you have any clue why it might screw up the communication between the odrive and odrivetool?

Jorijs · February 8, 2021, 9:25am

So I’m not entirely sure what the problem was, but it has been fixed by simply adding a return statement to the MODE_UNDEFINED case. This is allowed because the undefined encoder has nothing to update, and it has the added benefit of skipping the rest of the update function’s body, which is unnecessary in that case.
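The shape of that fix can be sketched as below. This is a standalone illustration, not the actual firmware code: EncoderProcessorSketch, the Mode enum and the meaning of the return value (true for "no error") are assumptions. Only the early return in the undefined case and the 16-bit delta trick come from the snippet above.

```cpp
#include <cstdint>

enum class Mode { Undefined, Incremental };

// Hypothetical stand-in for the encoder processor discussed above.
struct EncoderProcessorSketch {
    Mode mode = Mode::Undefined;
    int32_t shadow_count = 0;

    // Returns true when the update completed without error (assumed meaning).
    bool update(int32_t latched_count) {
        int32_t delta_enc = 0;

        switch (mode) {
            case Mode::Undefined:
                // Nothing to update: leave before the shared code below runs.
                return true;

            case Mode::Incremental: {
                // Difference taken in 16 bits so that counter wrap-around
                // still yields the correct signed step.
                int16_t delta_enc_16 = static_cast<int16_t>(
                    static_cast<int16_t>(latched_count) -
                    static_cast<int16_t>(shadow_count));
                delta_enc = delta_enc_16;  // sign extend
            } break;
        }

        // Shared post-processing that only makes sense for a real encoder.
        shadow_count += delta_enc;
        return true;
    }
};
```

With the early return, an undefined encoder never reaches the shared post-processing, so any state that code relies on cannot be touched before a real encoder type is configured.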

Jorijs · February 9, 2021, 1:29pm

We are currently running into the problem that the firmware crashes whenever we try to connect input ports to output ports at runtime. I assume this is because their member functions are not thread safe, as documented. We’ve been trying to find a runtime-friendly solution. We’ve tried wrapping the problematic part of the code in a critical section, to no avail. Similarly, reading them directly using pointers has proven unsuccessful.

Do you guys have any suggestions?

Cheers

Jorijs · March 1, 2021, 3:51pm

Hi fellas,

It’s been some time since our last update so here goes.
We’ve managed to successfully implement our intended structure with functioning incremental and absolute encoders, which is great news.
We’ve run into new problems, though, while trying to increase the number of configurable encoders from 2 to 6. The problem seems to be in the use of the spi_arbiter’s asynchronous transfer by more than 2 absolute encoders.

Debugging is still rather difficult because our only feedback is odrivetool no longer responding, which we take as a sign that the firmware got stuck in undefined behaviour or something similar.

As usual, any suggestions on the SPI difficulties or on ways to make debugging a bit easier are very welcome.

Cheers

Wetmelon · March 1, 2021, 6:27pm

You’re not using an STLinkV2 for debugging with GDB?

Jorijs · March 3, 2021, 10:18am

We have not used an STLink yet, though I’m currently trying to get it up and running. I’m using VSCode, but when I set breakpoints it reports that no source file can be found with the name of the file in which the breakpoint is set.

Riewert · March 3, 2021, 12:16pm

In VSCode, when you click on Terminal->Run Task, are the options Build and flash - ST-Link available, and have you been able to get them to complete successfully?

Your life will be a lot easier with debugging set up properly.

Jorijs · March 3, 2021, 1:10pm

Yes they are able to complete successfully.
I’ve been able to run the debugger, but breakpoints added in the source files are ignored. Whenever it is paused, it shows assembly instructions rather than source code. It seems like it’s unable to link the source files. When run using make gdb, the TUI displays assembly instructions, and ‘layout src’ reports that there are no source files available.

Riewert · March 3, 2021, 3:31pm

Maybe @Samuel can help you out.

Samuel · March 4, 2021, 9:05am

It sounds like your .elf file doesn’t contain debug info. Can you check if the compile commands include the -g flag? You can put in a deliberate syntax error so that the build system shows an error and dumps the compile command.

You can also check arm-none-eabi-readelf --debug-dump=info build/ODriveFirmware.elf | wc -l. If I build with -g I get 1277898 (a huge amount of debug info) and without -g I get 1002 (almost no debug info).

Regarding the SPI transfers: currently we start SPI at the beginning of a control loop iteration, then update a few unrelated components and then by the time we get to the encoder’s update() call we expect the SPI transfers to be done. That means there’s not much time for the transfers. If this becomes a problem then maybe you can use the SPI results one control iteration later. That way the SPI transfers can span the whole 125µs iteration.
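The "use the result one iteration later" idea can be sketched roughly as follows. This is not the spi_arbiter API; PipelinedSpiEncoder, begin_iteration and on_transfer_complete are hypothetical names, and the actual start of the asynchronous transfer is left as a comment because it depends on the firmware.

```cpp
#include <atomic>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for an SPI-attached absolute encoder whose result
// is consumed one control iteration after the transfer was started.
class PipelinedSpiEncoder {
public:
    // Called at the start of each control iteration. Returns the result of
    // the transfer started in the previous iteration, if it finished.
    std::optional<uint16_t> begin_iteration() {
        std::optional<uint16_t> previous;
        if (transfer_done_.exchange(false)) {
            previous = rx_value_;
        }
        // Real firmware would start the next asynchronous SPI transfer here,
        // giving it the whole iteration to complete.
        return previous;
    }

    // Called from the SPI completion callback (interrupt context in firmware).
    void on_transfer_complete(uint16_t value) {
        rx_value_ = value;
        transfer_done_.store(true);
    }

private:
    uint16_t rx_value_ = 0;
    std::atomic<bool> transfer_done_{false};
};
```

An empty result tells the caller that the previous transfer did not finish within a full iteration, which it can treat as an encoder error instead of silently reusing a stale value. The cost of this scheme is one iteration of extra latency on the position measurement.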

Jorijs · March 4, 2021, 9:31am

The command does seem to use the -g flag:
arm-none-eabi-gcc -x assembler-with-cpp -c Board/v3/startup_stm32f405xx.s -DUSB_PROTOCOL_NATIVE -DUART_PROTOCOL_ASCII -DSTM32F405xx -DARM_MATH_CM4 -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -DFPU_FPV4 -DHW_VERSION_MAJOR=3 -DHW_VERSION_MINOR=6 -DHW_VERSION_VOLTAGE=56 -D__weak="__attribute__((weak))" -D__packed="__attribute__((__packed__))" -DUSE_HAL_DRIVER -mthumb -mfloat-abi=hard -Wno-psabi -Wall -Wdouble-promotion -Wfloat-conversion -fdata-sections -ffunction-sections -g -gdwarf-2 --g -Og -flto -ffast-math -IBoard/v3/Middlewares/Third_Party/FreeRTOS/Source/portable/GCC/ARM_CM4F -IBoard/v3/Middlewares/Third_Party/FreeRTOS/Source/include -IBoard/v3/Middlewares/Third_Party/FreeRTOS/Source/CMSIS_RTOS -IBoard/v3/Middlewares/ST/STM32_USB_Device_Library/Core/Inc -IBoard/v3/Middlewares/ST/STM32_USB_Device_Library/Class/CDC/Inc -IBoard/v3/Drivers/STM32F4xx_HAL_Driver/Inc -IBoard/v3/Drivers/STM32F4xx_HAL_Driver/Inc/Legacy -IBoard/v3/Drivers/CMSIS/Device/ST/STM32F4xx/Include -IBoard/v3/Drivers/CMSIS/Include -IBoard/v3/Inc -I. -o build/obj/Board_v3_startup_stm32f405xx.s.o
(with the --g being the wrong addition to cause a syntax error).

The arm-none-eabi-readelf --debug-dump=info build/ODriveFirmware.elf | wc -l command does return 1006, though, which is not in line with what you indicated.

Regarding the SPI transfers: currently we start SPI at the beginning of a control loop iteration, then update a few unrelated components and then by the time we get to the encoder’s update() call we expect the SPI transfers to be done. That means there’s not much time for the transfers. If this becomes a problem then maybe you can use the SPI results one control iteration later. That way the SPI transfers can span the whole 125µs iteration.

This sounds like a very plausible root of our problem. Thanks for the added suggestion on how to fix this. We’ll try to implement it.

Thanks for the help!

Jorijs · March 9, 2021, 3:41pm

The debugger now works. The crux of the problem was LTO still being enabled in the tup.config.

Jorijs · March 19, 2021, 2:34pm

So we got a basic implementation running. But when running current control with 2 motors at the same time, we get a control_deadline_missed error. Looking at the ControlLoop_IRQHandler function, it seems like an exact number of clock cycles is supposed to have elapsed for this error not to occur.

// If we did everything right, the TIM8 update handler should have been
// called exactly once between the start of this function and now.

if (timestamp_ != timestamp + TIM_1_8_PERIOD_CLOCKS * (TIM_1_8_RCR + 1)) {
    motors[0].disarm_with_error(Motor::ERROR_CONTROL_DEADLINE_MISSED);
    motors[1].disarm_with_error(Motor::ERROR_CONTROL_DEADLINE_MISSED);
}

So assuming that our changes have altered the number of clock cycles, would we need to alter this? And if so, how would we find the new value?

Riewert · March 20, 2021, 8:23am

Do you get the error intermittently, or every cycle/always?

Because that error can also occur when the controller is unstable. Does it persist when you lower controller effort?

Jorijs · March 21, 2021, 10:20am

We get the error whenever we put the second motor in closed loop control. So no actual actuation of either of the motors has occurred yet.

Wetmelon · March 21, 2021, 5:23pm

It’s not an exact number, just an upper limit. How much extra processing time do you figure you added? Make sure you’re compiling with -Ofast and ideally LTO, but that might be broken.
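One way to read the check quoted earlier is sketched below: the comparison passes only if the timer update handler ran exactly once while the control loop body executed, so a body that runs too long lets a second update slip in and trips the error. ControlLoopSketch, kClocksPerUpdate and its value are illustrative assumptions, not firmware constants.

```cpp
#include <cstdint>

// Illustrative value only; it stands in for
// TIM_1_8_PERIOD_CLOCKS * (TIM_1_8_RCR + 1) in the quoted check.
constexpr uint32_t kClocksPerUpdate = 10500;

struct ControlLoopSketch {
    uint32_t timestamp = 0;  // advanced by the timer update handler

    void on_timer_update() { timestamp += kClocksPerUpdate; }

    // updates_during_body: how many timer updates fired while the control
    // loop body ran. Returns true if the deadline was met.
    bool run_iteration(int updates_during_body) {
        const uint32_t start = timestamp;
        for (int i = 0; i < updates_during_body; ++i) {
            on_timer_update();
        }
        // Same shape as the quoted check: exactly one update is expected.
        return timestamp == start + kClocksPerUpdate;
    }
};
```

Under this reading the constant does not need retuning after adding code. What matters is that the added processing still fits inside one update period, which is why the compiler optimization level has such a direct effect.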

Riewert · March 21, 2021, 7:49pm

Does the error persist when you turn off debugging?