Really weird problem - spi_error_rate has become self-aware

I am having serious problems getting an AS5047P to work over SPI, and am at my wits’ end with it. I have been using incremental mode for a while now with no problems at all, so I’m sure it’s specific to SPI and the SPI noise issues that have been talked about before on this forum.

What I am seeing is that, whenever I try a physical fix to reduce noise levels, the spi_error_rate value improves for a short time, then mysteriously “undoes” itself and goes back to how it behaved before.

But the physical fixes are still there. I haven’t taken them out. It’s as if the background noise level is taking them into account and increasing itself to compensate, in some sort of cosmic closed loop control.

I got the impression this was happening as I was going along, and dismissed it because it felt insane. But on trying the ferrite ring, the spi_error_rate dropped completely to zero, and I cheered success.

Which made it undeniable when, a moment later, the error rate changed its mind and went back to doing what it was doing before.

Does anyone at all have any ideas here?

Detail:

  • I am trying, but currently unable, to run the AXIS_STATE_FULL_CALIBRATION_SEQUENCE.
  • Currently you can’t complete the cycle all the way through because spi_error_rate jumps too high.
  • So I have followed all the suggestions I could find on this board to try to reduce noise.

To be specific, I have tried

  • using 3.3V instead of 5V, making sure to use the same header as the SPI wires use.
  • putting resistors in series with the SCK line. (I tried 10R and 100R, only having one of each, and not having anything between 20R-50R.) (I have now taken these out again because they made no difference.)
  • disabling the error bit check in the firmware as suggested here.
  • increasing the spi_error_rate threshold to 0.5 as @towen did here.
  • replacing the five SPI wires with fatter cables, and braiding them together.
  • improvising a ferrite ring and wrapping the motor cables around it.
  • tying the metal chassis to ground (Took this off again because it made no difference and I didn’t like it being there.)

In software I have been simply running
odrv0.axis0.requested_state = AXIS_STATE_FULL_CALIBRATION_SEQUENCE
followed by
dump_errors(odrv0,True)
over and again.
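For anyone wanting to script that loop rather than retype it, here is a minimal sketch that polls the two values I am plotting while the calibration runs. It assumes `spi_error_rate` and `pos_estimate` can be read from `odrv0.axis0.encoder`; `sample_encoder`, `peak_error_rate` and the fake axis are hypothetical names, included only so the snippet runs without hardware.

```python
import time
from types import SimpleNamespace


def sample_encoder(axis, n_samples, interval_s=0.01, sleep=time.sleep):
    """Poll the encoder n_samples times, returning (spi_error_rate, pos_estimate) pairs."""
    samples = []
    for _ in range(n_samples):
        enc = axis.encoder
        samples.append((enc.spi_error_rate, enc.pos_estimate))
        sleep(interval_s)
    return samples


def peak_error_rate(samples):
    """Highest spi_error_rate seen in a list of samples."""
    return max(rate for rate, _ in samples)


# Stand-in for odrv0.axis0 (assumption: the real object exposes the same two attributes).
fake_axis = SimpleNamespace(
    encoder=SimpleNamespace(spi_error_rate=0.0, pos_estimate=0.0)
)
demo = sample_encoder(fake_axis, 3, interval_s=0.0)
print(demo, peak_error_rate(demo))
```

On real hardware the call would be `sample_encoder(odrv0.axis0, 500)` issued right after setting `requested_state`, so the peak can be compared before and after each physical fix.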

What I see is the error rate shooting up as soon as power goes to the motor, and the calibration cycle erroring out with ENCODER_ERROR_ABS_SPI_COM_FAIL.

Here is the screenshot of the typical results:

spi_error_rate is orange, pos_estimate is blue. 0.5 is the new bomb-out threshold for the spi_error_rate, so once the orange line gets higher than that the ODrive errors out.
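To get a feel for how quickly a run of bad reads could push the orange line over a threshold, here is a rough sketch. It assumes spi_error_rate behaves like a first-order low-pass filter that is pulled toward 1 on a failed read and toward 0 on a good one. The filter form, the 8 kHz loop period and the stricter 0.005 threshold are all assumptions for illustration, not values taken from the firmware source.

```python
def update_error_rate(rate, read_ok, dt):
    """One filter step: move the rate toward 0 on a good read, toward 1 on a failed one."""
    target = 0.0 if read_ok else 1.0
    return rate + dt * (target - rate)


def steps_until_trip(threshold, dt, max_steps=1_000_000):
    """Count consecutive failed reads needed before the rate exceeds the threshold."""
    rate = 0.0
    for step in range(1, max_steps + 1):
        rate = update_error_rate(rate, read_ok=False, dt=dt)
        if rate > threshold:
            return step
    return None


DT = 1.0 / 8000.0          # assumed control-loop period in seconds
RAISED_THRESHOLD = 0.5     # the value used in this thread
STRICT_THRESHOLD = 0.005   # hypothetical stricter value, for comparison only

for threshold in (STRICT_THRESHOLD, RAISED_THRESHOLD):
    steps = steps_until_trip(threshold, DT)
    print(f"threshold {threshold}: {steps} failed reads, about {steps * DT:.4f} s")
```

If the real filter is anything like this, raising the threshold only buys time during a burst of failures. It does not reduce the number of failed reads.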

I did not think to take a screenshot of the calibration cycle immediately after the ferrite ring installation, because I did not expect this to happen. But the graph then showed spi_error_rate as a completely flat line.

I can’t, obviously, reproduce the behaviour, because if I could this wouldn’t need an explanation.

Has anyone ever seen anything like this before? Does anyone out there have any ideas at all?

I feel your pain!! I hate it when my projects get smarter than I am. They start wanting to take control!!! :crazy_face:

How long is your SPI cable? Are they individual wires, or a pre-made cable?

Can you send pictures of the wire connection to the ODrive, and the wire connection to the motor and encoder?

I assume you are still using incremental mode, but want to use SPI to change some encoder values?

-John

Thanks for replying. I’ve been watching this post all day and feeling like an outcast.

I will take some photos when I can and upload them here.

SPI cable is a braid of individual wires, several inches long.

But I honestly don’t think it can be the wire length, because that doesn’t explain what I saw after installing the ferrite ring. The wires were the same length then, and spi_error_rate was perfectly steady, on the order of 1e-5.

I want SPI literally just to skip the startup calibration cycle, and that is all. I actually thought about whether I could cheat by automatically switching to SPI on start up, reading the position (SPI reads fine when the motor isn’t powered) then switching back to incremental and updating shadow_count to the right value.

Unfortunately ODrive didn’t like me doing that, and got confused.

I ran over and took some photos. It’s a bit dark but hopefully you can see what you need.

Here’s the general installation:

This is the motor mounted into the frame. The encoder is on one side, the other side is driving a belt. The “ferrite ring” is a timing pulley I was lucky enough to find lying around. There was barely enough length to do a single turn around the ring.

Close up of the encoder, bolted to the chassis. You can see the fatter, braided wires are SPI and they are causing the trouble. The four thin ones are ABI and ‘Test’, which I tied to GND. They’ve caused no trouble at all despite being ordinary ribbon cable.

Here is the view from up inside the chassis, where the ODrive is mounted. You can see the wires plugged into the header. The two braids are black+red = 3.3V and GND, white+green+blue = SCK, MOSI, MISO. Both braids are coming straight from the encoder.


yes, somewhat hard to see. However, it looks like you have the ferrite ring ( gear! :wink: ) on the three power leads to the motor. Is that where they are? They should be on the SPI lines.

It looks like SPI is a real PITA. There are a lot of topics talking about ringing on the SPI lines. I take it you do not have a scope…?

I don’t use SPI on the ODrive. The only thing I could suggest is looking into dropping the SPI CLK speed to the bare minimum, but I don’t see any way to do that without recompiling.

-John

However, it looks like you have the ferrite ring ( gear! :wink: ) on the three power leads to the motor. Is that where they are? They should be on the SPI lines.

Is that true? All the talk and images I’ve seen have shown it on the motor cables. They being the ones that carry the high power and cause the noise in the first place.

And, as I said, it worked that way, for a brief window, before mysteriously unworking again.

Yea, you are right.

Where does the ground line from the encoder go? Directly to the odrive and nowhere else, or does it go somewhere else?

Can you monitor the SPI error rate and encoder position when you turn the motor by hand? Does the ODrive set any error codes when you do this?

What is your power supply? Battery? What current capacity?

Starting to throw darts now…

-John

I’ve been stuck on a similar problem for a week now. I find that the encoder works only briefly in certain positions but will never calibrate properly. When I check the data with an oscilloscope, or move the motor manually, I see 15- or 16-bit numbers as shadow counts, even though the AMS encoders only send 14-bit packets. Try manually setting the offset and direction if you’re getting good-quality data before you calibrate or move the motor. Good luck, kind sailor!

Shadow count can exceed 14 bits because it keeps counting across turns. You want to monitor pos_abs instead.
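A small sketch of why that is, assuming shadow_count is a running total of signed steps while pos_abs is the raw 14-bit reading. This is a generic illustration rather than the ODrive firmware's code; `accumulate` and the sample readings are made up for the example.

```python
CPR = 1 << 14  # 16384 counts per turn for a 14-bit absolute encoder


def accumulate(shadow_count, prev_abs, new_abs, cpr=CPR):
    """Add the shortest signed step between two absolute readings to the running count."""
    half = cpr // 2
    delta = (new_abs - prev_abs + half) % cpr - half
    return shadow_count + delta


# Absolute readings over a little more than one forward turn (each step under half a turn).
readings = [16000, 16300, 200, 4000, 8000, 12000, 16000, 500]

shadow = readings[0]
for prev_abs, new_abs in zip(readings, readings[1:]):
    shadow = accumulate(shadow, prev_abs, new_abs)

print("last absolute reading:", readings[-1])
print("running count:", shadow, "bits needed:", shadow.bit_length())
print("running count modulo CPR:", shadow % CPR)
```

The running count needs more than 14 bits after one wrap, yet it still agrees with the absolute reading modulo the counts per turn, so a large shadow count on its own is not evidence of corrupt SPI data.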

Can you get hold of a real ferrite ring? I’m not sure about the high frequency impedance of aluminum… :joy:

Also, if you have been using the incremental interface without issues, I wonder if it would be easy to mod the firmware so that it initialises the position with SPI on boot, but then switches over to incremental?
It would mean adding a new encoder mode, combining the features of ENCODER_MODE_INCREMENTAL and ENCODER_MODE_SPI_ABS_AMS I suppose.

Also I think you could do it by setting encoder.config.offset to whatever you read over SPI ?

Do you have any resistors in your SPI wires? I found that adding 50R in series with MISO, MOSI and SCK helps.

Also, off-topic but I’d strongly recommend 3D printing some plastic covers for those motors (out of a heat resistant plastic like PETG or PC if possible). It’s easy to pick up tiny bits of iron swarf (especially with welded steel frames around), and they really wreck the motor if they get stuck in between the magnets and the windings.

I have tried to do exactly this, so far without success. In fairness, I was in a bit of a hurry and without much time to dedicate to it. And I was trying to do it from the outside, without messing around in the firmware if I could help it (just because I don’t want to do any harm.) I got the offset, no problem, but wasn’t able to persuade the ODrive that it had completed its calibration cycle, so it refused to enter closed loop control. I need to get back to this at some point.

It’s easy to pick up tiny bits of iron swarf (especially with welded steel frames around), and they really wreck the motor if they get stuck in between the magnets and the windings.

Yep, we’ve learned that one the hard way. And you can’t just shake them out either, because magnets. Fortunately my girlfriend is very deft with a pair of tweezers.

Can you get hold of a real ferrite ring? I’m not sure about the high frequency impedance of aluminum…

Do you have any idea why an aluminium ring might work once, then stop? Because that’s what I swear I saw.

I suppose I need to buy the 50Rs and a proper ferrite ring. The existing ring was a real bitch to get in there because of the cable length, so I’m really not looking forward to replacing it.

I just want to say that I had an error that manifested almost exactly like this in my dual AS5048A setup with two large motors (TMotor U8 KV100’s), but applying the fixes mentioned on this thread totally fixed it.

I was convinced for the longest time that this was solely due to EMI, but it looks like there is a race condition in the control loop around how long the encoders take to return the SPI data, and any delay will push up the spi_error_rate.

I’ll mention that the only real code difference at this point is that I also removed the 14th bit check as shown here, but I may revert this change too.
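For context on what that check looks at, here is a sketch of decoding one 16-bit read frame. It assumes the frame layout I understand these AMS parts to use: even parity in bit 15, an error flag in bit 14 and the 14-bit position in bits 13:0. Please verify against the datasheet for your exact part. `decode_frame` and `make_frame` are hypothetical helper names, not firmware functions.

```python
def even_parity_ok(frame):
    """True if the 16-bit frame contains an even number of 1 bits."""
    return bin(frame & 0xFFFF).count("1") % 2 == 0


def decode_frame(frame):
    """Split a 16-bit read frame into (position, error_flag, parity_ok)."""
    position = frame & 0x3FFF          # bits 13:0, the 14-bit angle
    error_flag = bool(frame & 0x4000)  # bit 14, assumed to be the encoder's error flag
    return position, error_flag, even_parity_ok(frame)


def make_frame(position, error_flag=False):
    """Build a test frame with a correct even-parity bit in bit 15."""
    body = (position & 0x3FFF) | (0x4000 if error_flag else 0)
    parity = bin(body).count("1") % 2
    return body | (parity << 15)


good = make_frame(1234)
flagged = make_frame(1234, error_flag=True)
corrupted = good ^ 0x0001  # one flipped bit, as line noise might cause

for name, frame in (("good", good), ("flagged", flagged), ("corrupted", corrupted)):
    print(name, hex(frame), decode_frame(frame))
```

Under that layout, skipping the bit 14 check means frames the encoder itself marked as bad are accepted, while the parity check still rejects single-bit corruption on the wire.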
