ODrive dies spontaneously

I’m using an ODrive to drive a fairly large 4-wheeled vehicle (2 driven wheels and 2 castor wheels). I’m using the 150 KV ODrive motors, each with a 45:1 reduction attached.

The desired position is streamed via step/dir. After driving around for a bit I stopped the vehicle (keeping STEP low). I walked away, and when I came back a few minutes later the ODrive no longer responded to STEP inputs. I connected via USB (without rebooting) and saw the following (using odrivetool):

system: no error
axis0
  axis: Error(s):
    AxisError.MOTOR_FAILED
  motor: Error(s):
    MotorError.DRV_FAULT
    MotorError.CURRENT_SENSE_SATURATION
    MotorError.UNKNOWN_CURRENT_MEASUREMENT
  DRV fault: FETLC_OC, FETHC_OC, FETLB_OC, FETHB_OC, FETLA_OC, FETHA_OC, OTW, OTSD, PVDD_UV, GVDD_UV, FAULT
  sensorless_estimator: no error
  encoder: no error
  controller: no error
axis1
  axis: Error(s):
    AxisError.MOTOR_FAILED
  motor: Error(s):
    MotorError.DRV_FAULT
  DRV fault: none
  sensorless_estimator: no error
  encoder: no error
  controller: no error

After disconnecting the motors and rebooting (leaving USB & vbatt disconnected for a few minutes) the status is as follows:

system: no error
axis0
  axis: Error(s):
    AxisError.MOTOR_FAILED
  motor: Error(s):
    MotorError.DRV_FAULT
  DRV fault: GVDD_UV, FAULT
  sensorless_estimator: no error
  encoder: no error
  controller: no error
axis1
  axis: Error(s):
    AxisError.MOTOR_FAILED
  motor: Error(s):
    MotorError.DRV_FAULT
  DRV fault: GVDD_UV, FAULT
  sensorless_estimator: no error
  encoder: no error
  controller: no error

After some research I suspect the DRVs are gone. Because I wasn’t there when it happened, I can’t tell if the motors did something on their own (sound, movement). I do, however, have a logfile containing values read via UART before, during and after the failure. It seems that the ODrive tried to pump 100A through each motor, pulling 28A from the battery. I have no idea why this happened or how it could kill the DRVs.

[19:05:40.304] [ODRIVE] {"error":0,"vbus":38.1619,"ibus":4.52997,"left":{"error":0,"ibus":3.65107,"iq":19.2277,"power":112.408,"velEstimate":-15.55,"posEstimate":79.9716},"right":{"error":0,"ibus":1.48753,"iq":-9.59826,"power":50.625,"velEstimate":-16.0625,"posEstimate":5.14733}}
[19:05:46.504] [ODRIVE] {"error":0,"vbus":38.2537,"ibus":4.64241,"left":{"error":0,"ibus":1.96734,"iq":17.7518,"power":139.644,"velEstimate":-15.8,"posEstimate":85.2567},"right":{"error":0,"ibus":1.06968,"iq":-7.54238,"power":39.8481,"velEstimate":-15.7375,"posEstimate":6.69043}}
[19:05:52.704] [ODRIVE] {"error":0,"vbus":38.1772,"ibus":3.65403,"left":{"error":0,"ibus":2.29992,"iq":21.4929,"power":135.026,"velEstimate":-15.5125,"posEstimate":92.8858},"right":{"error":0,"ibus":1.19017,"iq":-6.43982,"power":39.6608,"velEstimate":-15.6,"posEstimate":14.3272}}
[19:05:58.904] [ODRIVE] {"error":0,"vbus":38.1619,"ibus":4.27325,"left":{"error":0,"ibus":2.74448,"iq":20.2779,"power":97.3148,"velEstimate":-12.7,"posEstimate":99.4892},"right":{"error":0,"ibus":0.637381,"iq":-6.41559,"power":42.1311,"velEstimate":-10.4,"posEstimate":32.5859}}
[19:06:05.104] [ODRIVE] {"error":0,"vbus":38.7436,"ibus":8.80945,"left":{"error":0,"ibus":6.06071,"iq":19.9546,"power":80.2805,"velEstimate":-13.3375,"posEstimate":43.1483},"right":{"error":0,"ibus":0.001088,"iq":-0.557,"power":0.02735,"velEstimate":0,"posEstimate":96.973}}
[19:06:11.304] [ODRIVE] {"error":0,"vbus":38.8507,"ibus":0.010879,"left":{"error":0,"ibus":0.008961,"iq":1.8559,"power":0.168333,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001408,"iq":-0.549492,"power":0.027387,"velEstimate":0,"posEstimate":96.973}}
[19:06:17.504] [ODRIVE] {"error":0,"vbus":38.866,"ibus":0.009676,"left":{"error":0,"ibus":0.008315,"iq":1.75195,"power":0.165107,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.00141,"iq":-0.732296,"power":0.028585,"velEstimate":0,"posEstimate":96.973}}
[19:06:23.704] [ODRIVE] {"error":0,"vbus":38.8967,"ibus":0.010706,"left":{"error":0,"ibus":0.009094,"iq":1.75195,"power":0.166099,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001472,"iq":-0.722845,"power":0.027467,"velEstimate":0,"posEstimate":96.973}}
[19:06:29.904] [ODRIVE] {"error":0,"vbus":38.9579,"ibus":0.010657,"left":{"error":0,"ibus":0.009026,"iq":1.81598,"power":0.166648,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001538,"iq":-0.190595,"power":0.027127,"velEstimate":0,"posEstimate":96.973}}
[19:06:36.104] [ODRIVE] {"error":0,"vbus":38.9273,"ibus":0.010052,"left":{"error":0,"ibus":0.009887,"iq":1.63646,"power":0.166311,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001642,"iq":-0.72021,"power":0.027612,"velEstimate":0,"posEstimate":96.973}}
[19:06:42.304] [ODRIVE] {"error":0,"vbus":38.9426,"ibus":0.011328,"left":{"error":0,"ibus":0.009503,"iq":1.6628,"power":0.168355,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001321,"iq":-0.669704,"power":0.027724,"velEstimate":0,"posEstimate":96.973}}
[19:06:48.504] [ODRIVE] {"error":0,"vbus":38.9579,"ibus":0.011751,"left":{"error":0,"ibus":0.009334,"iq":1.72992,"power":0.168634,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001109,"iq":-0.562894,"power":0.027509,"velEstimate":0,"posEstimate":96.973}}
[19:06:54.704] [ODRIVE] {"error":0,"vbus":38.9885,"ibus":0.00977,"left":{"error":0,"ibus":0.008399,"iq":1.84611,"power":0.170529,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001214,"iq":-0.839205,"power":0.027648,"velEstimate":0,"posEstimate":96.973}}
[19:07:00.904] [ODRIVE] {"error":0,"vbus":38.9885,"ibus":0.009107,"left":{"error":0,"ibus":0.009005,"iq":1.78971,"power":0.165886,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001256,"iq":-0.567536,"power":0.027465,"velEstimate":0,"posEstimate":96.973}}
[19:07:07.104] [ODRIVE] {"error":0,"vbus":39.0191,"ibus":0.010835,"left":{"error":0,"ibus":0.009305,"iq":1.78788,"power":0.164361,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001421,"iq":-0.567948,"power":0.027417,"velEstimate":0,"posEstimate":96.973}}
[19:07:13.304] [ODRIVE] {"error":0,"vbus":39.0191,"ibus":0.010048,"left":{"error":0,"ibus":0.008193,"iq":1.72765,"power":0.168716,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001396,"iq":-0.81631,"power":0.027969,"velEstimate":0,"posEstimate":96.973}}
[19:07:19.504] [ODRIVE] {"error":0,"vbus":39.0038,"ibus":0.00821,"left":{"error":0,"ibus":0.006696,"iq":1.85447,"power":0.163993,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001393,"iq":-0.627854,"power":0.027609,"velEstimate":0,"posEstimate":96.973}}
[19:07:25.704] [ODRIVE] {"error":0,"vbus":39.0191,"ibus":0.010744,"left":{"error":0,"ibus":0.009135,"iq":1.82127,"power":0.166479,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001361,"iq":-0.547328,"power":0.027972,"velEstimate":0,"posEstimate":96.973}}
[19:07:31.904] [ODRIVE] {"error":0,"vbus":39.0038,"ibus":0.010122,"left":{"error":0,"ibus":0.010065,"iq":1.57669,"power":0.169283,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001374,"iq":-0.742856,"power":0.027831,"velEstimate":0,"posEstimate":96.973}}
[19:07:38.104] [ODRIVE] {"error":0,"vbus":38.9885,"ibus":0.010328,"left":{"error":0,"ibus":0.009614,"iq":1.7916,"power":0.16346,"velEstimate":0,"posEstimate":35.544},"right":{"error":0,"ibus":0.001401,"iq":-0.630611,"power":0.028189,"velEstimate":0,"posEstimate":96.9831}}
[19:07:44.304] [ODRIVE] {"error":0,"vbus":38.912,"ibus":0.295729,"left":{"error":0,"ibus":0.15486,"iq":9.69358,"power":1.30096,"velEstimate":0,"posEstimate":35.5389},"right":{"error":0,"ibus":0.141861,"iq":8.71433,"power":1.24091,"velEstimate":0,"posEstimate":96.9831}}
[19:07:50.504] [ODRIVE] {"error":0,"vbus":38.9273,"ibus":0.293477,"left":{"error":0,"ibus":0.1561,"iq":9.5432,"power":1.30506,"velEstimate":0,"posEstimate":35.5389},"right":{"error":0,"ibus":0.140122,"iq":8.75334,"power":1.24084,"velEstimate":0,"posEstimate":96.9854}}
[19:07:56.704] [ODRIVE] {"error":0,"vbus":34.2738,"ibus":28.2028,"left":{"error":0,"ibus":15.0571,"iq":99.9114,"power":14.7592,"velEstimate":0,"posEstimate":34.6642},"right":{"error":0,"ibus":15.4748,"iq":100.189,"power":-26.5018,"velEstimate":0,"posEstimate":97.3918}}
[19:08:02.904] [ODRIVE] {"error":0,"vbus":38.8201,"ibus":0,"left":{"error":64,"ibus":0,"iq":101.257,"power":41.4256,"velEstimate":0,"posEstimate":34.6689},"right":{"error":64,"ibus":0,"iq":100.189,"power":-26.5018,"velEstimate":0,"posEstimate":97.3918}}
[19:08:09.104] [ODRIVE] {"error":0,"vbus":38.8967,"ibus":0,"left":{"error":64,"ibus":0,"iq":101.257,"power":41.4256,"velEstimate":0,"posEstimate":34.6689},"right":{"error":64,"ibus":0,"iq":100.189,"power":-26.5018,"velEstimate":0,"posEstimate":97.3918}}
[19:08:15.304] [ODRIVE] {"error":0,"vbus":38.912,"ibus":0,"left":{"error":64,"ibus":0,"iq":101.257,"power":41.4256,"velEstimate":0,"posEstimate":34.6689},"right":{"error":64,"ibus":0,"iq":100.189,"power":-26.5018,"velEstimate":0,"posEstimate":97.3918}}
[19:08:21.504] [ODRIVE] {"error":0,"vbus":38.912,"ibus":0,"left":{"error":64,"ibus":0,"iq":101.257,"power":41.4256,"velEstimate":0,"posEstimate":34.6689},"right":{"error":64,"ibus":0,"iq":100.189,"power":-26.5018,"velEstimate":0,"posEstimate":97.3918}}
[19:08:27.704] [ODRIVE] {"error":0,"vbus":38.9273,"ibus":0,"left":{"error":64,"ibus":0,"iq":101.257,"power":41.4256,"velEstimate":0,"posEstimate":34.6689},"right":{"error":64,"ibus":0,"iq":100.189,"power":-26.5018,"velEstimate":0,"posEstimate":97.3918}}
[19:08:33.904] [ODRIVE] {"error":0,"vbus":38.9426,"ibus":0,"left":{"error":64,"ibus":0,"iq":101.257,"power":41.4256,"velEstimate":0,"posEstimate":34.6689},"right":{"error":64,"ibus":0,"iq":100.189,"power":-26.5018,"velEstimate":0,"posEstimate":97.3918}}
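
To find the moment of failure in a log like this without reading every line, the entries can be scanned automatically. This is a minimal Python sketch, not part of my actual logger: the three sample lines are copied verbatim from the log above, while the function names and the 50A alarm threshold are made up for illustration (the threshold is not an ODrive setting).

```python
import json
import re

# Three lines copied verbatim from the log above.
LOG = r"""
[19:07:50.504] [ODRIVE] {"error":0,"vbus":38.9273,"ibus":0.293477,"left":{"error":0,"ibus":0.1561,"iq":9.5432,"power":1.30506,"velEstimate":0,"posEstimate":35.5389},"right":{"error":0,"ibus":0.140122,"iq":8.75334,"power":1.24084,"velEstimate":0,"posEstimate":96.9854}}
[19:07:56.704] [ODRIVE] {"error":0,"vbus":34.2738,"ibus":28.2028,"left":{"error":0,"ibus":15.0571,"iq":99.9114,"power":14.7592,"velEstimate":0,"posEstimate":34.6642},"right":{"error":0,"ibus":15.4748,"iq":100.189,"power":-26.5018,"velEstimate":0,"posEstimate":97.3918}}
[19:08:02.904] [ODRIVE] {"error":0,"vbus":38.8201,"ibus":0,"left":{"error":64,"ibus":0,"iq":101.257,"power":41.4256,"velEstimate":0,"posEstimate":34.6689},"right":{"error":64,"ibus":0,"iq":100.189,"power":-26.5018,"velEstimate":0,"posEstimate":97.3918}}
"""

# "[time] [ODRIVE] {json}" as used by the logger above.
LINE_RE = re.compile(r"^\[([\d:.]+)\] \[ODRIVE\] (\{.*\})$")

IQ_ALARM_A = 50.0  # hypothetical alarm threshold, chosen only for this example


def parse_log(text):
    """Return a list of (timestamp, decoded JSON dict), skipping other lines."""
    samples = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            samples.append((match.group(1), json.loads(match.group(2))))
    return samples


def find_events(samples, iq_alarm=IQ_ALARM_A):
    """Flag every axis sample with a nonzero error or |iq| above the threshold."""
    events = []
    for timestamp, sample in samples:
        bus_power_w = sample["vbus"] * sample["ibus"]  # total DC bus power in W
        for side in ("left", "right"):
            axis = sample[side]
            if axis["error"] != 0:
                events.append((timestamp, side, "error " + hex(axis["error"]), bus_power_w))
            elif abs(axis["iq"]) > iq_alarm:
                events.append((timestamp, side, "iq %.1f A" % axis["iq"], bus_power_w))
    return events


for event in find_events(parse_log(LOG)):
    print(event)
```

On these lines the first flagged sample is 19:07:56.704, where vbus times ibus comes to a bit under 1kW drawn from the battery. The error value 64 is 0x40; if that field is the raw axis error bitmask (an assumption on my part), a single flag is set, which would fit the single AxisError.MOTOR_FAILED shown by odrivetool.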

Are the Iq values you logged the setpoint or measured ones?
It looks like there was a huge spike of motoring power. Did the bot move due to the fault event? I know you were not present, but maybe you know if it moved from where you left it?
Do you use the brake resistor circuit?
Overall pretty strange, I also don’t know how this could happen and/or kill the DRVs. Is it possible something shorted the board? Was there any visible damage?

iq is iq_measured. The bot didn’t “drive away”. I can’t tell whether it moved a bit (a few turns of the motor), though, as 1 turn leads to only ~30mm of movement (on a 1.5m long vehicle).
I don’t use a brake resistor (but there was no kinetic or potential energy at the time, so I think we can exclude overvoltage?).
The power source is a 12s LiFePO4 pack, 50Ah.
A short can be excluded, as nothing in the box containing the ODrive can move (everything is screwed or taped down). There was also no debris or liquid of any kind inside.
No visible damage on the top side of the ODrive. I will check the bottom as soon as I’m there again.

I’m also really confused, especially since there was no commanded movement and both channels behaved almost identically (first a ~0.01 turn change in reported position with 8-9A of current, then a big change in reported position with 100A iq, then both dead).
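
To put the reported position changes into perspective, they can be converted to vehicle travel. This is a rough Python sketch under two assumptions: posEstimate is in motor turns, and the ~30mm per turn figure quoted above applies per motor turn. The helper name is made up for illustration; the position values are copied from the log entries at 19:07:31.904 (idle), 19:07:50.504 (8-9A phase) and 19:07:56.704 (100A phase).

```python
MM_PER_TURN = 30.0  # approximate figure from above, assumed to be per motor turn

# posEstimate per axis at: idle, 8-9A phase, 100A phase (copied from the log)
POS_TURNS = {
    "left": (35.544, 35.5389, 34.6642),
    "right": (96.973, 96.9854, 97.3918),
}


def travel_mm(start_turns, end_turns, mm_per_turn=MM_PER_TURN):
    """Convert a change in reported position (turns) to travel at the wheel (mm)."""
    return (end_turns - start_turns) * mm_per_turn


for side, (idle, low_current, high_current) in POS_TURNS.items():
    small_step = travel_mm(idle, low_current)
    big_step = travel_mm(low_current, high_current)
    print(f"{side}: first step {small_step:+.2f} mm, second step {big_step:+.1f} mm")
```

If those assumptions hold, the second step works out to roughly 26mm on the left and 12mm on the right, which would be easy to miss on a 1.5m long vehicle.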

I’ll check whether the MOSFETs shorted vbatt (or GND) to their gates. I guess that could kill the DRV? I had MOSFETs die before due to a slipping encoder reporting very wrong positions, but I don’t think that can be the case here, as:

  1. it was stationary
  2. the encoder cannot slip anymore - it’s connected using an aluminium coupler, and I even ground a flat on the motor shaft to ensure zero possible movement
  3. the encoders are connected using shielded cables
  4. if the failure was external (mechanical) it would be unlikely to affect both channels simultaneously

All I can really think of is some sort of electrical interference causing both channels to either get a crazy input, or get encoder interference that looks like encoder slip. Nothing else really affects both channels.
Whatever happened, it’s not really good that the ODrive got permanent damage.

Can we send you a pair of ODrive Pro as replacements? They are much more robust and unlikely to get damaged in an event where there is an issue. Please email info@odriverobotics.com and include your original order number.

Cheers.

I was thinking of interference too, but I cannot imagine where it should have come from. There were no other electronics nearby except for the Raspberry Pi controlling the bot (connected via 1x serial and 2x step/dir, powered off a USB powerbank).

I was outside at the time, with no known sources of interference nearby. I’ve been using a 10,000-count encoder, so if the 0.6 rotations were due to interference, that would mean thousands of pulses, I think. I have a hard time imagining how that can happen with shielded cables. Also, it’s been working fine so far: I had the same setup (just without the gearboxes/wheels) running for hours while testing my software. The only other problem I had recently was the known bug in the ASCII protocol (ODrive stops responding over UART/ASCII - #2 by Nicholas_Schneider), which is part of why I started using STEP/DIR.
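
As a sanity check on the “thousands of pulses” estimate, here is a small Python sketch. It assumes that “10,000 count” means 10,000 counts per motor revolution as seen by the ODrive and that posEstimate is in turns; the helper name is made up. The positions are the logged values just before and at the 100A sample.

```python
COUNTS_PER_TURN = 10_000  # assumption: encoder counts per motor revolution


def turns_to_counts(delta_turns, counts_per_turn=COUNTS_PER_TURN):
    """Number of encoder counts corresponding to a position change in turns."""
    return round(delta_turns * counts_per_turn)


# posEstimate at 19:07:50.504 and 19:07:56.704, copied from the log
left_counts = turns_to_counts(34.6642 - 35.5389)
right_counts = turns_to_counts(97.3918 - 96.9854)

print("left:", left_counts, "counts")
print("right:", right_counts, "counts")
```

Under those assumptions, interference would have had to inject several thousand spurious counts on each axis within one 6.2s logging interval, and with opposite signs on the two axes.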

I’ll be very happy to try my setup with ODrive Pro - I sent you an email with my order number. Thanks a lot!

One more thing: did I understand correctly that even a slipping encoder shouldn’t lead to destruction of the MOSFETs? I had that happen twice (until I figured out that the encoder was slipping), but thought it was expected behavior. Fortunately I bought 2 ODrives and thus was able to use one as a MOSFET donor to repair the other when that happened.