Error in Firmware - shadow_count

Hello there,
I just dusted off that old odrive controller board and started building a little something…

I updated the board firmware to 0.5.2 and suddenly Motor 0 stopped working. I spent half a day debugging hardware, because I thought I had messed something up…

But well… after running out of options, I downgraded the firmware, not to whatever was on there before, but to 0.5.1 … and that solved the issue.

I use hoverboard motors with hall sensors.

Turns out that in firmware 0.5.2, despite hall_state showing correctly on both axis0 and axis1,
axis0.encoder.shadow_count remains at 0, while the axis1 shadow_count reacts as usual.

This does not happen in 0.5.1, and it is the reason motor 0 no longer works.
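
For anyone who wants to reproduce the check, here is a minimal sketch that samples hall_state and shadow_count on both axes while the wheels are turned by hand. The attribute names are the ones mentioned above; the helper names sample_encoders and stuck_axes and the sampling parameters are made up for illustration. The hardware connection is left in a comment and assumes the odrive Python package is installed.

```python
import time


def sample_encoders(odrv, samples=5, delay=0.0):
    """Read hall_state and shadow_count from axis0 and axis1 several times.

    `odrv` is any object exposing axis0/axis1, each with an `encoder` that has
    `hall_state` and `shadow_count` (e.g. what odrivetool calls odrv0).
    Returns a list of dicts, one per sample.
    """
    readings = []
    for _ in range(samples):
        row = {}
        for name in ("axis0", "axis1"):
            enc = getattr(odrv, name).encoder
            row[name] = (enc.hall_state, enc.shadow_count)
        readings.append(row)
        if delay:
            time.sleep(delay)
    return readings


def stuck_axes(readings):
    """Return the axes whose hall_state changed while shadow_count never did."""
    stuck = []
    for name in ("axis0", "axis1"):
        halls = {r[name][0] for r in readings}
        counts = {r[name][1] for r in readings}
        if len(halls) > 1 and len(counts) == 1:
            stuck.append(name)
    return stuck


# With real hardware (assumption: odrive package installed, board connected):
#   import odrive
#   odrv0 = odrive.find_any()
#   print(stuck_axes(sample_encoders(odrv0, samples=50, delay=0.1)))
```

On a board showing the behaviour described here, turning both wheels during sampling would be expected to report axis0 only.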

Hope someone can fix this :wink: - or maybe someone has, then please kindly point me to the solution.

Kind regards from Switzerland,
Tim

:thinking: weird. Did you try 0.5.2 and odrv0.erase_configuration()?

Hi there,
sorry for the late answer.

To give you a more complete view (sorry for the wall of text), I did the following as of now:

Long wall of text describing my efforts
  • Dusted off the old board, which still had connections on M0 from a hoverboard motor test a long time ago. That test, with some early firmware (not sure which; I purchased the board many months ago), did work, so I was sure (later) that M0 had worked.
  • Removed that old wiring (board without connectors). Soldered the wires I extracted from an original hoverboard controller board to both M0 and M1
  • Installed odrivetool on Windows and tried the website's tutorial
  • Commands did not work out - well, of course, firmware was old. Did not check which version it was, but updated with odrivetool dfu
  • Firmware update aborted on Windows 10; there seems to be another thread here about this
  • Searched frantically for that st-link, found it, and revived the board.
  • Now the tutorial worked. I copied the commands and edited them for M1, executed them and all was fine.
  • Using the commands for M0 failed. Strange. Did I do something wrong?
  • Checked the wires, found out that one connector for the hall sensors had swapped connections, fixed that, still M0 was not going through encoder calibration…
  • I spent several hours hardware-debugging (swapping motors, wires, known good vs fresh off the parts shelf etc etc etc) … it seemed that my M0 port was defective.
  • I tried to find a way to display input values, like maybe I blew a port on the controller and only a partial signal is present… when I noticed the hall_state was changing on M0, but no change was there on the shadow_count.
  • I compared M0 to M1 signals, they matched → my conclusion was that input signal was ok, but maybe the firmware was not ok
  • Checked the firmware, updated again, did the erase_configuration() and I think I got the board from testing with 0.5.1-dev to a 0.5.2 - which had the same issues.
  • All right, I know it worked on M0 … let’s test 0.5.1 software.
  • Flashed the 0.5.1 release, erase_configuration, and on the first attempt, even before doing the encoder calibration, shadow_count worked as expected.
  • That evening I posted the entry. I later (next day) experimented a bit further and got some parts working as I wanted them to work.
  • There remain some issues with 0.5.1, some strange behaviour overall, so I tried to reflash 0.5.2 to see if it was just some sort of glitch - well, again, M0 shadow_count remained 0, M1 working fine.
  • As a next step I will try the 0.5.3 branch, or get the thing running with 0.5.1 for now.

(When I write “flash the firmware”, I always used odrivetool dfu on a Linux laptop, not the workshop Windows computer. I only had to use the st-link to revive the board; all later flashing was done with dfu on Linux. I regularly used erase_configuration to be sure no old config artifacts were left.)

Hope this paints a better picture of my scenario. Just a few words about me, as I am new here: I have been designing electronic circuits for a long time, have held a ham radio license for over 25 years, and work in software development as a day job… so I know how offensive it sounds when I say “there may be a bug in the firmware”. I also see that either that bug has sat there unnoticed for over 2 months, or only a small part of the odrive community really uses hoverboard motors (and the bug is in the parts specific to the hoverboard hall encoder), or it is still a hardware issue with my board. Given options 1 and 2, I totally agree that it could be a hardware issue unless I can replicate it on other hardware.

I only have one odrive (56V 3.6 btw, this may be important too), so I cannot compare one board to another, but I am thinking about buying one or two more soon…

I can keep you posted what happened after testing 0.5.3, I think I’ll have some spare time tomorrow,
and if you have suggestions, custom firmware for debugging or whatever it takes to reproduce or resolve this, I am willing to test these things out. As I am from Switzerland, please keep in mind that the timezone here is UTC+2.

Thanks for reading,
Greetings from Switzerland,
Tim

Hehe well I’m sure you know we’re not offended by bug reports, especially ones so detailed! Only when people try nothing and complain - you’ve clearly done a lot of debugging.

So one big thing to note… in 0.5.2 we added a “Hall effect sensor polarity” calibration. I’m not entirely sure how much testing this had; it’s certainly possible there are edge cases we missed. So the new process is:

  1. Motor calibration
  2. Hall effect sensor polarity calibration
  3. Encoder offset calibration

Then everything should work. However, it’s possible that the polarity calibration is failing? Can you check encoder.config.hall_polarity_calibrated and encoder.config.hall_polarity after full calibration?
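
A minimal sketch of that sequence, followed by the check of the two config values, might look like the code below. The helper run_calibration and its parameters are hypothetical, and the state names in CALIBRATION_STEPS are assumed to match members of odrive.enums in 0.5.2 or later, so verify them against your installed version. The attributes requested_state, current_state and error are used the way odrivetool exposes them.

```python
import time

# Assumption: these names exist in odrive.enums for firmware 0.5.2 or later.
CALIBRATION_STEPS = (
    "AXIS_STATE_MOTOR_CALIBRATION",
    "AXIS_STATE_ENCODER_HALL_POLARITY_CALIBRATION",
    "AXIS_STATE_ENCODER_OFFSET_CALIBRATION",
)


def run_calibration(axis, steps, idle_state, poll=0.1, timeout=60.0):
    """Request each (name, state) in order and wait for the axis to go idle.

    Stops at the first step that times out or leaves an error set, then
    returns a report including the hall polarity config values.
    """
    report = {"failed_step": None}
    for name, state in steps:
        axis.requested_state = state
        time.sleep(poll)  # give the axis a moment to leave idle
        deadline = time.monotonic() + timeout
        while axis.current_state != idle_state and time.monotonic() < deadline:
            time.sleep(poll)
        timed_out = axis.current_state != idle_state
        if timed_out or axis.error or axis.motor.error or axis.encoder.error:
            report["failed_step"] = name
            break
    cfg = axis.encoder.config
    report["axis_error"] = axis.error
    report["encoder_error"] = axis.encoder.error
    report["hall_polarity_calibrated"] = cfg.hall_polarity_calibrated
    report["hall_polarity"] = cfg.hall_polarity
    return report


# With real hardware (assumption: odrive package installed, board connected):
#   import odrive
#   import odrive.enums as enums
#   odrv0 = odrive.find_any()
#   steps = [(n, getattr(enums, n)) for n in CALIBRATION_STEPS]
#   print(run_calibration(odrv0.axis0, steps, enums.AXIS_STATE_IDLE))
```

Running it once per axis and comparing the two reports would show at which step axis0 diverges from axis1.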

The code that handles the “hall polarity check” is here: ODrive/encoder.cpp at fw-v0.5.3 · odriverobotics/ODrive · GitHub

The code that handles the “update” phase of the encoder (aka the shadow count part) is here: ODrive/encoder.cpp at fw-v0.5.3 · odriverobotics/ODrive · GitHub
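
To illustrate what that update phase does conceptually, here is a simplified Python sketch, not the firmware's actual code. It assumes a common 6-step hall sequence (the HALL_TO_COUNT table), that the polarity value is XORed onto the raw hall state before decoding, and that illegal states leave the count untouched. The function name update_shadow_count is made up for this example.

```python
# Assumed mapping from a 3-bit hall state to a step within one electrical
# revolution. The states 0b000 and 0b111 are treated as illegal.
HALL_TO_COUNT = {0b001: 0, 0b011: 1, 0b010: 2, 0b110: 3, 0b100: 4, 0b101: 5}


def update_shadow_count(shadow_count, hall_state, hall_polarity=0):
    """Return (new_shadow_count, ok) for one hall reading.

    hall_polarity is XORed onto the raw state first, which is how an
    inverted sensor line could be corrected (assumption).
    """
    corrected = (hall_state ^ hall_polarity) & 0b111
    if corrected not in HALL_TO_COUNT:
        return shadow_count, False  # illegal state: count is left untouched
    target = HALL_TO_COUNT[corrected]
    delta = (target - shadow_count) % 6  # forward steps within one revolution
    if delta > 3:
        delta -= 6  # shorter to go backwards
    return shadow_count + delta, True


count = 0
for hall in (0b001, 0b011, 0b010, 0b110, 0b100, 0b101, 0b001):
    count, ok = update_shadow_count(count, hall)
print(count)  # 6: one full electrical revolution forward
```

Under these assumptions, a wrong or missing polarity value can turn valid readings into illegal states, in which case the count would never move even though the raw hall_state keeps changing.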

There’s also a “phase” check which I’m not even sure is being used at the moment.

We have a developer in Switzerland! If this is for work/business we can maybe have them reach out to you.

Thanks for the reply :wink: That was quick ^^

Erm, yes, the whole debugging effort started because the encoder calibration fails, in the polarity calibration step. The offset calibration did not work after that. At some point I even tried to just write the values from M1 into the available config values of M0, to kind of simulate a successful polarity calibration - that did not work. As I have not looked at the code at all (yet), I assume that not everything is exposed in the available variables, so it could never have worked the way I tried.

From memory, after the polarity step, I got error 0x16 (if I remember the number correctly, it resolved to the ERROR_ILLEGAL_HALL_STATE), and I think I got something like …_NOT_CALIBRATED_YET as second error when trying to do the offset calibration without polarity calibration.
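
Since ODrive error fields are bit masks, a value such as 0x16 can combine several flags. The sketch below splits a code into its individual bits. The names in ENCODER_ERROR_NAMES are recalled from the 0.5.x encoder error enum and are an assumption, so check them against odrive.enums for the firmware in use. The helpers set_bits and describe are made up for this example.

```python
def set_bits(error_code):
    """Return the individual bit flags contained in error_code, lowest first."""
    flags = []
    bit = 1
    while bit <= error_code:
        if error_code & bit:
            flags.append(bit)
        bit <<= 1
    return flags


def describe(error_code, names):
    """Map each set bit to a name, falling back to its hex value."""
    return [names.get(flag, hex(flag)) for flag in set_bits(error_code)]


# Assumption: values recalled from memory, verify against odrive.enums.
ENCODER_ERROR_NAMES = {
    0x10: "ILLEGAL_HALL_STATE",
    0x200: "HALL_NOT_CALIBRATED_YET",
}

print(describe(0x16, ENCODER_ERROR_NAMES))
# ['0x2', '0x4', 'ILLEGAL_HALL_STATE']
```

If the remembered number is right, 0x16 would contain two more flags besides the hall state one, so it is worth reading the exact value again.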

The motor calibration, just to mention it, never failed. Also, the motor on M0 spins during the encoder calibration before failing.

If I find the time to dive into the code, which may be somewhere in the middle of the week at best, I will certainly do that. The project I am using the odrive for is the classic moving platform - I found a new hobby where it is sometimes required to move some equipment from the parking lot to the starter zone… and back after the event; it may also transport me sitting on the equipment box :wink: So you can imagine I want this project to be done, even if it requires debugging some code before even starting on the software that connects to the odrive to control it.

The project will be seen by others in the hobby, at every event that I take it to. Some of them will also want this type of vehicle, and beyond a remote controlled platform, with a camera and some other additions it could even become an active element in that hobby. There are not many “makers” in this hobby, most only “buy, use, tune a bit” … so I cannot call it a business thing yet :wink:

As already said, I will try to figure out what happens on 0.5.3, will most likely take a look at the code, and will try to find a way to reproduce it on different hardware to be sure it is not a hardware fault.

At this point in time, I am doing that project as a hobby, but if you or your Swiss developer is interested in having a look at my setup or further details to try to reproduce it, I would be happy to do a video call or even meet that developer (I live in the St. Gallen area).

Kind regards,
Tim