Has anyone seen an ODrive stop responding to CAN messages after a few minutes? If I power cycle only the ODrive, CAN works again. During the ODrive power cycle, the Teensy and Jetson keep their CAN transceivers powered and keep running as normal. After power cycling, the ODrive again runs for a few minutes (around 10000-20000 heartbeat packets) before it stops responding. Opening odrivetool during such an event shows no errors in dump_errors, and all other ODrive functions still work normally (holding motor position, in this case).
The CAN baud rate is set to 250000. I am using the example node_ids and requesting the CANSimple messages with the RTR bit set. All messages are requested at 10 Hz. The CAN-H and CAN-L lines are around 30 cm long.
Currently running the latest stable firmware released in September 2020.
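For anyone trying to reproduce this from a Linux host, here is a rough sketch of the kind of RTR polling described above, using only the Python standard library and SocketCAN. It assumes the CANSimple ID layout (node ID in the upper bits, command ID in the lower 5 bits) and the Get Encoder Estimates command ID 0x09. The interface name `can0`, the node ID, and the function names are placeholders I made up, not the poster's actual setup (which uses a Teensy and a Jetson).

```python
import socket
import struct
import time

# Same value as CAN_RTR_FLAG in linux/can.h; defined here so the
# frame-building helpers also work where socket.CAN_RTR_FLAG is missing.
CAN_RTR_FLAG = 0x40000000
# Layout of struct can_frame: can_id, dlc, 3 padding bytes, 8 data bytes.
CAN_FRAME_FMT = "=IB3x8s"

CMD_HEARTBEAT = 0x01
CMD_GET_ENCODER_ESTIMATES = 0x09


def cansimple_id(node_id, cmd_id):
    """Build an 11-bit CANSimple arbitration ID: (node_id << 5) | cmd_id."""
    if not 0 <= node_id <= 0x3F:
        raise ValueError("node_id must fit in 6 bits")
    if not 0 <= cmd_id <= 0x1F:
        raise ValueError("cmd_id must fit in 5 bits")
    return (node_id << 5) | cmd_id


def rtr_frame(node_id, cmd_id, dlc=8):
    """Pack a SocketCAN remote-request frame (no payload) for one command."""
    can_id = cansimple_id(node_id, cmd_id) | CAN_RTR_FLAG
    return struct.pack(CAN_FRAME_FMT, can_id, dlc, bytes(8))


def poll(channel="can0", node_id=0, rate_hz=10.0, count=100):
    """Send `count` RTR requests at roughly `rate_hz` (Linux only).

    channel and node_id are assumptions; adjust them to your wiring.
    """
    with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as s:
        s.bind((channel,))
        for _ in range(count):
            s.send(rtr_frame(node_id, CMD_GET_ENCODER_ESTIMATES))
            time.sleep(1.0 / rate_hz)
```

Calling `poll()` on a machine with a configured `can0` interface would send the requests; the two helper functions can be checked without any hardware.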
Thanks @Wetmelon. We have since tried reducing the frequency at which we request messages. Over 2 test runs, I saw the same failure mode after about 2 hours on one of the tests.
We are only requesting data from axis0. During normal operation we see both the requested messages and the heartbeat messages from axis0 and axis1. On failure, no messages at all are received on the CAN bus. I usually monitor with the cansniffer tool on Linux, and will post a log of the last few messages. All messages look normal, with no errors reported. If I power cycle just the ODrive (the CAN transceivers are kept powered), the messages return as normal. The ODrive has a ground wire that runs along the CAN lines to keep everything (Jetson and Teensy) on a common reference. I have counted the number of heartbeat messages until failure, but the number varies, so it's not as deterministic as I had hoped.
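Counting heartbeats until failure can be automated from a recorded log. Below is a minimal sketch, assuming the traffic was captured with `candump -L can0` (lines of the form `(timestamp) can0 ID#DATA`, with RTR requests shown as `ID#R`) and that all frames use standard 11-bit CANSimple IDs with heartbeat command ID 0x01. The function name and the regular expression are my own, not part of any ODrive tooling.

```python
import re

# One line of `candump -L` output: "(1600000000.123456) can0 021#0000000008000000"
LINE_RE = re.compile(r"\((\d+\.\d+)\)\s+(\S+)\s+([0-9A-Fa-f]+)#(\S*)")
CMD_HEARTBEAT = 0x01


def heartbeat_summary(lines):
    """Return {node_id: (heartbeat_count, last_timestamp)} from candump -L lines.

    Assumes standard 11-bit IDs laid out as (node_id << 5) | cmd_id.
    """
    summary = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        timestamp = float(match.group(1))
        can_id = int(match.group(3), 16)
        if match.group(4).startswith("R"):
            continue  # remote request sent by the host, not an ODrive reply
        if can_id & 0x1F != CMD_HEARTBEAT:
            continue
        node_id = can_id >> 5
        count, _ = summary.get(node_id, (0, timestamp))
        summary[node_id] = (count + 1, timestamp)
    return summary
```

Running this over a full log gives, per node, how many heartbeats arrived and when the last one was seen, which makes it easy to compare the failure point across test runs.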
We haven't checked it with an oscilloscope yet, but if the symptom occurs again we will check and share the result.
In the meantime, how can we check the traffic on the bus?
Yes, odrivetool still works: changes to configuration values are applied correctly.
The command odrv0.axis0.controller.input_vel = 1 also runs the motor.
And dump_errors(odrv0) reports no errors.
To debug the ODrive firmware, we set up two blinking LEDs, one in analog_polling_thread and one in can_server_thread.
However, both LEDs stop blinking when the symptom occurs.
I am really concerned by this issue, as I am planning to use an ODrive over CAN for a multi-motor project. If the LEDs are not blinking anymore, there is a high probability that a firmware issue is causing the code to stop running for some reason. It might be a counter overflow. If so, you should also lose the physical CAN signal. As said before, it would be interesting to check with a scope. If the issue does not appear when CAN is not enabled, that clearly points to a firmware issue in the CAN-related code. Do you have the same issue over time when using the ODrive without enabling the CAN protocol? If not, you can try blinking an LED at points along the CAN code to find where the problem comes from.