USB communication issue / AttributeError: 'RemoteObject' object has no attribute 'axis0'

Oskar, I was able to capture the errors with just a try/except block in Python.
Here is a zoomed-out view (the blue vertical dashed line is the trigger):


With the sigrok signal decoder I don’t see any errors, but here is the suspicious portion
(USB disconnects about 20 frames after the circled frame):




Here is another capture session with the same pattern
(a seemingly random SETUP PID appears, then a reset shortly after):

I put the captured sigrok files on Google Drive: https://drive.google.com/drive/folders/1ZsbgBWiQUggRHibgv2StUjEyhWG2lpkM?usp=sharing

dmesg logs when this happens (I am using a 4-port USB hub):

[12443.847757] usb 3-1.1-port3: disabled by hub (EMI?), re-enabling...
[12443.849094] usb 3-1.1.3: USB disconnect, device number 29
[12443.851199] cdc_acm 3-1.1.3:1.0: failed to set dtr/rts

Motor that causes this USB error (it doesn’t happen when this motor is not in closed-loop control):
24 V, 8 poles
Also, the USB error occurs sooner if the motor is trying to hold a position rather than moving constantly.

"phase_resistance": 0.3050479292869568
"phase_inductance": 0.0002905561705119908

motor
let me know if you need more captures.

I’m downloading your sigrok files so I can look closer in the viewer, I’ll report back if I discover something. Thx for making the capture. I assume you captured this on the ODrive side of the hub (like TP 1 and 2 on the ODrive)?

Either way, I think we have a strong hint as to what is going on:

Together with the fact that it only dies during active PWM on your motor: it is likely capacitively coupled EMI (electromagnetic interference). So I have some follow-up questions:
How are the phase wires routed to the motor? The motor has an exposed metal mounting plate; is the chassis connected to mains ground? Does the ODrive share a GND connection (through DC- or otherwise) with the PC? For example, is V- of the power supply bridged to Protective Earth in your mains wiring? Are you using a laptop on battery, or a desktop or a charger with a ground pin?

The signal was captured on the ODrive side of the hub, though not at the TPs (I split a short USB cable in two to expose the wires).

The phase wires are bare, unshielded UL1007 wires, and the motor chassis is only grounded through the AMT encoder’s GND, not by anything else.

I am using a desktop PC. The GND is indeed connected to the PC, both through USB and through DC- and the power supply GND, which is bridged to Protective Earth (the desktop PC chassis is also bridged to Protective Earth).

Here are the phase wires and how the motor is sitting in the setup:


Here’s some info that may help analyze the capture:

#define CDC_IN_EP       0x81  /* EP1 for data IN */
#define CDC_OUT_EP      0x01  /* EP1 for data OUT */
#define CDC_CMD_EP      0x82  /* EP2 for CDC commands */
#define ODRIVE_IN_EP    0x83  /* EP3 IN: ODrive device TX endpoint */
#define ODRIVE_OUT_EP   0x03  /* EP3 OUT: ODrive device RX endpoint */

Note that the 0x80 is masked away on the IN endpoints, i.e. the ODRIVE endpoints are both endpoint 3, just that the IN ones get an extra 0x80 in how they are defined here.
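
To make the masking concrete, here is a small Python sketch (not taken from the ODrive firmware or odrivetool) that splits each endpoint address from the defines above into its endpoint number and direction. The helper name `decode_endpoint` is made up for illustration; the only facts it relies on are that bit 7 of a USB endpoint address is the direction bit and the low four bits are the endpoint number.

```python
# Illustrative helper, not part of the ODrive firmware or odrivetool.
USB_DIR_IN = 0x80        # bit 7 of the endpoint address: set = IN (device to host)
USB_EP_NUM_MASK = 0x0F   # bits 0..3: endpoint number


def decode_endpoint(address):
    """Split a USB endpoint address into (endpoint number, direction)."""
    direction = "IN" if address & USB_DIR_IN else "OUT"
    return address & USB_EP_NUM_MASK, direction


# Addresses copied from the #define lines above.
endpoints = {
    "CDC_IN_EP": 0x81,
    "CDC_OUT_EP": 0x01,
    "CDC_CMD_EP": 0x82,
    "ODRIVE_IN_EP": 0x83,
    "ODRIVE_OUT_EP": 0x03,
}

for name, address in endpoints.items():
    number, direction = decode_endpoint(address)
    print(f"{name}: EP{number} {direction}")
```

This is why the decoder shows both ODRIVE endpoints as EP3 and both CDC data endpoints as EP1.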

So looking at the captures, as far as I can tell everything is operating normally. It seems that the Python client sends data (custom protocol reads) on OUT EP3, and data comes back on IN EP3 soon after. When the ODrive has nothing to send, the USB subsystem still polls IN EP3 at a high rate, but it’s NAK most of the time (this is normal when the ODrive has no data to send).

However, some time later these SETUP packets come in: they activate EP1, which is the CDC device, a.k.a. the virtual serial port. This is likely a (Linux?) driver on the PC that decided it wanted to talk to the serial port. From then on the USB subsystem also starts polling IN EP1, and at what seems to be double the rate (from every 50ish microseconds to every 25ish microseconds).

My guess is that this faster poll rate is what pushes the error/interference density seen by the USB hub high enough to trigger the disconnect?

Not really a “true” solution, but you can try to make the Linux ModemManager ignore the ODrive; I suspect it is the process that opens the CDC (virtual serial) device. There is a guide here, which involves appending , ENV{ID_MM_DEVICE_IGNORE}="1" to the rule in /etc/udev/rules.d/91-odrive.rules and then reloading the rules.
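
As a rough sketch of that edit, the Python snippet below appends the tag to every rule line of a rules file's text. The helper name `add_mm_ignore` is made up, and the example rule line is hypothetical (only the vendor/product IDs 1209:0d32 come from the dmesg output in this thread), so check what your installed rules file actually contains before changing it.

```python
MM_IGNORE = 'ENV{ID_MM_DEVICE_IGNORE}="1"'


def add_mm_ignore(rules_text):
    """Append the ModemManager ignore tag to every rule line that lacks it."""
    out = []
    for line in rules_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments; everything else is treated as a rule.
        is_rule = bool(stripped) and not stripped.startswith("#")
        if is_rule and MM_IGNORE not in line:
            line = line.rstrip() + ", " + MM_IGNORE
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical rule line, for illustration only.
example = (
    "# ODrive udev rule (assumed content)\n"
    'SUBSYSTEM=="usb", ATTR{idVendor}=="1209", ATTR{idProduct}=="0d32", MODE="0666"\n'
)
patched = add_mm_ignore(example)
print(patched, end="")
```

The function only transforms text; writing the result back to the rules file needs root. The rules can then be reloaded with `sudo udevadm control --reload-rules` followed by `sudo udevadm trigger`.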

Of course the true solution is to eliminate the EMI. Do you have a ferrite on your USB cable?

Nice robot arm btw! ;D

I added the , ENV{ID_MM_DEVICE_IGNORE}="1" to the udev rules that get installed when you install odrivetool with pip, in this commit.

Will go out on the next release.

Thanks, Oskar, for your valuable time.
I updated the udev rules, reloaded them, and reran the script.
First of all, the USB ACM device is still detected just after all the ODrives are detected (dmesg below),
although the behavior certainly changed, as explained next:

[  128.406348] usb 3-1.1.4: New USB device found, idVendor=1209, idProduct=0d32
[  128.406354] usb 3-1.1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  128.406358] usb 3-1.1.4: Product: ODrive 3.5 CDC Interface
[  128.406362] usb 3-1.1.4: Manufacturer: ODrive Robotics
[  128.406366] usb 3-1.1.4: SerialNumber: 208037713548
[  128.445532] cdc_acm 3-1.1.2:1.0: ttyACM0: USB ACM device
[  128.448513] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  128.451809] cdc_acm 3-1.1.4:1.0: ttyACM2: USB ACM device
[  128.452195] usbcore: registered new interface driver cdc_acm

It took longer for the USB error to occur this time.
Before the error occurred, there was a whole page of the following in dmesg:

[  231.049779] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  231.151736] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  231.441773] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  231.543378] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  232.397740] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  232.685485] usb 3-1.1.3: device descriptor read/64, error -71
[  232.899498] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  232.981764] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  233.083210] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device

A similar EMI error appears after the exception is thrown:

[  474.122768] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  475.833540] usb 3-1.1-port3: disabled by hub (EMI?), re-enabling...
[  475.834883] usb 3-1.1.3: USB disconnect, device number 55
[  476.063461] usb 3-1.1.3: new full-speed USB device number 56 using xhci_hcd
[  476.170091] usb 3-1.1.3: New USB device found, idVendor=1209, idProduct=0d32
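
To compare runs (for example before and after the udev change), it may help to count these events per port instead of scrolling through dmesg. Below is a minimal Python sketch; the regular expressions are written against the exact message formats quoted in this thread and are an assumption for other kernels, and `count_usb_events` is just an illustrative name.

```python
import re
from collections import Counter

# Patterns written against the dmesg lines quoted in this thread.
EVENTS = {
    "reset": re.compile(r"usb (\S+): reset full-speed USB device"),
    "disconnect": re.compile(r"usb (\S+): USB disconnect"),
    "hub_disabled": re.compile(r"usb (\S+): disabled by hub \(EMI\?\)"),
    "descriptor_error": re.compile(r"usb (\S+): device descriptor read/64, error"),
}


def count_usb_events(log_text):
    """Count reset/disconnect/EMI events per USB port in dmesg-style text."""
    counts = Counter()
    for line in log_text.splitlines():
        for event, pattern in EVENTS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), event)] += 1
    return counts


sample = """\
[  231.049779] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  232.685485] usb 3-1.1.3: device descriptor read/64, error -71
[  232.981764] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  475.833540] usb 3-1.1-port3: disabled by hub (EMI?), re-enabling...
[  475.834883] usb 3-1.1.3: USB disconnect, device number 55
"""

counts = count_usb_events(sample)
for (port, event), n in sorted(counts.items()):
    print(port, event, n)
```

Feeding it the saved output of `dmesg` from each test run gives a quick number to compare, e.g. resets per run.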

Also, the exception that is being caught seems to have changed from
AttributeError: 'RemoteObject' object has no attribute ...
to
received unexpected ACK: 8068
(the number after ACK is different each time)
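
For anyone wanting to reproduce this kind of capture, here is a minimal sketch of a try/except wrapper that records when each exception happens and what type it is. It is not the script used above: `poll_with_logging` is a made-up name, and `fake_read` is a stand-in for whatever read you do on the ODrive object, so the example runs without hardware.

```python
import time


def poll_with_logging(read_fn, attempts, delay_s=0.0, log=print):
    """Call read_fn repeatedly; log the time and type of every exception."""
    failures = []
    for i in range(attempts):
        try:
            read_fn()
        except Exception as exc:  # broad on purpose: we want to see every failure type
            stamp = time.time()
            failures.append((stamp, type(exc).__name__, str(exc)))
            log(f"{stamp:.3f} attempt {i}: {type(exc).__name__}: {exc}")
        time.sleep(delay_s)
    return failures


# Stand-in for a real ODrive read: fails once, on the third call.
calls = {"n": 0}


def fake_read():
    calls["n"] += 1
    if calls["n"] == 3:
        raise AttributeError("'RemoteObject' object has no attribute 'axis0'")
    return 24.0


failures = poll_with_logging(fake_read, attempts=5)
```

The timestamps are wall-clock times, so they can be lined up with `dmesg -T` output to see which kernel message precedes each exception type.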

I added two sigrok captures again to the same link:
https://drive.google.com/drive/folders/1ZsbgBWiQUggRHibgv2StUjEyhWG2lpkM?usp=sharing

EDIT:
I can confirm the blacklist is working (I did upgrade ModemManager from 1.6.4 to 1.8.2 so I could set filter options), but I still couldn’t figure out how to stop Ubuntu from polling the ACM devices. I’ll post an edit when I get a chance to try again. I might try opening the USB hubs I have and comparing how different USB hub ICs behave. Glad to know there is not a single error packet caused by the ODrive in the capture.

>>> sudo mmcli -G DEBUG;
Successfully set logging level
>>> journalctl -f | grep "ModemManager.*\[filter\]"

(plug in ODrive)

ModemManager[1076]: <debug> [filter] (tty/ttyACM0): port filtered: device is blacklisted

How did you upgrade from ModemManager 1.6.4 to 1.8.2?
I did not have ModemManager installed at all while observing the ODrive USB error. apt installed ModemManager 1.6.4 on my Ubuntu 16.04.

When I try your commands to test USB filtering, I get:

root@MTBD00694:~/odrive_test# mmcli -G DEBUG;
error: couldn't find the ModemManager process in the bus
root@MTBD00694:~/odrive_test#

==> so it looks like I cannot tell whether USB filtering works or not.

Can you elaborate on the hardware/software part of your CAN communication?
Your post in the CAN discussion suggests this for CAN-RS232 conversion plus an FTDI-based RS232-USB adapter (can you send us a link to where you bought yours, please?), and the code screenshots suggest a custom C/C++ implementation that sends CAN commands (over USB, then).

@madcowswe => I’m really stuck here.
An unreliable system can’t go into production.
It really is a pity, because I remember massively sending ODrive USB position commands for 3 weeks 24/7 and everything was fine (without any load on the motor, though). OK, that was in late 2017, so many things must have changed since then. But the whole point was precisely to avoid the situation I’m facing now.

What should I do here? Switch to Windows (omg!)? Use Ubuntu 18? Use a Raspberry Pi? Leave my Ubuntu alone but use another ODrive interface? Surely someone has managed to use ODrive in a reliable way, right?

If your Ubuntu didn’t have ModemManager installed, then I don’t think ModemManager can have caused the USB crash in the first place.
But if you wish to try 1.8, you can install it with:

sudo add-apt-repository ppa:aleksander-m/modemmanager-xenial
sudo apt-get update
sudo apt-get --only-upgrade install modemmanager

I had to edit /lib/systemd/system/ModemManager.service

to have the following lines under [Service] section:

...
ExecStart=/usr/sbin/ModemManager --filter-policy=default
...
Environment="MM_FILTER_RULE_TTY_ACM_INTERFACE=0"

I think there is also a snap package which may be installed:

snap list | grep modem-manager

As for the CAN hardware, any CAN <-> serial converter would work. The one I am using is available on eBay: https://www.ebay.com/itm/SystemBase-sCAN-RS-232-DE-09S-DB9-female-CAN-DE-09P-DB9-male-serial/323363066646?hash=item4b49f0d316:g:0D4AAOSwqj5bVfod&frcectupt=true
However, I think for this price you have other options from larger manufacturers.
Using this kind of CAN converter makes the host-side software pretty much the same whether you use the ODrive’s UART or CAN interface (read/write to serial). I am using a Python program built on the pyserial_asyncio library.

After looking at the captures from the logic analyzer, I don’t think the cause is really in the ODrive’s USB implementation; rather, it is the EMI noise affecting the host side (in my case a USB hub). I think USB might not be the best choice for interfacing with a motor controller, because of electrical noise and the way the USB protocol maintains connections. I think your best bet is trying either the UART or the CAN interface. The Stanford Doggo project used UART.

I am confused.
Due in part to this bug, we moved the entire machine back to the lab a few days ago (change of building).

  • 2 days ago, I did the test with the ferrite on the USB cable near the ODrive board + the udev , ENV{ID_MM_DEVICE_IGNORE}="1" rule + ModemManager 1.6.4 installed => the machine ran in an infinite loop from 13h to 16h30 (3h30 straight). We turned it off because we were sick of the noise and thought maybe the issue was gone.
  • Yesterday, I reverted to how it was when the bug appeared (default udev rule + no ferrite + ModemManager uninstalled), in order to confirm that the bug reappears. It ran again in an infinite loop from 10h30 to 15h30 (5h straight) => the bug did not occur either.

=> I now seem unable to reproduce the bug. :cry: :cry: :cry:

I started a test at 15h30 and will ask the guard to hit the emergency stop in case it’s still running at 20h.

I’m really lost here.

If you need it to be running for hours without any interruption really try the other interfaces and save yourself lots of headache down the road… Especially if you have any cats around.:pouting_cat:

hello,

After a couple of weeks on standby, I worked with the ODrive again.

I made runs of unattended use (recorded on timelapse videos):
Tuesday May 7th from 14h to 23h => 9h
Thursday May 9th from 11h to 17h30 => 6h30
Friday May 10th from 8h30 to 17h => 8h30
=> all of those ran without any issue.

And this morning, the USB bug triggered once after about a minute and has not reproduced since…

May 13 11:44:41 MTBD00694 kernel: [  940.753530] usb 3-5: new full-speed USB device number 9 using xhci_hcd
May 13 11:44:41 MTBD00694 kernel: [  940.883351] usb 3-5: New USB device found, idVendor=1209, idProduct=0d32
May 13 11:44:41 MTBD00694 kernel: [  940.883354] usb 3-5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
May 13 11:44:41 MTBD00694 kernel: [  940.883357] usb 3-5: Product: ODrive 3.5 CDC Interface
May 13 11:44:41 MTBD00694 kernel: [  940.883358] usb 3-5: Manufacturer: ODrive Robotics
May 13 11:44:41 MTBD00694 kernel: [  940.883359] usb 3-5: SerialNumber: 367033663037
May 13 11:44:41 MTBD00694 kernel: [  940.883957] cdc_acm 3-5:1.0: ttyACM1: USB ACM device
May 13 11:44:41 MTBD00694 mtp-probe: checking bus 3, device 9: "/sys/devices/pci0000:00/0000:00:14.0/usb3/3-5"
May 13 11:44:41 MTBD00694 mtp-probe: bus: 3, device: 9 was not an MTP device
May 13 11:45:55 MTBD00694 kernel: [ 1015.409504] usb 3-5: reset full-speed USB device number 9 using xhci_hcd
May 13 11:45:55 MTBD00694 kernel: [ 1015.538705] cdc_acm 3-5:1.0: ttyACM1: USB ACM device
May 13 11:47:59 MTBD00694 kernel: [ 1139.413994] usb 3-5: USB disconnect, device number 9

usb 3-5: reset full-speed USB device number 9 using xhci_hcd ==> this reset message seems to be related.

Looking at my interface options (https://docs.odriverobotics.com/interfaces)

  • native USB => the recommended approach according to the documentation, but I do experience the random/phantom bug described here (+ the very annoying 10 sec it takes each time you open a connection to the ODrive)
  • step/dir => not recommended anyway
  • PWM => I need absolute positioning, so I guess this is not an option
  • CAN => not in the official documentation yet + requires extra hardware (CAN-RS232 + RS232-USB => and it’s USB in the end anyway!!!) + requires extensive code, since I currently use the Python ODrive client and would have to write an entire CAN client… not to mention the time spent waiting for the extra hardware to be sourced
  • UART-PC => the PC I use does not have a hardware serial port. I may find a PCI card that offers such a port, or use a classic RS232-USB device, but would that end up with the risk of yet another USB bug? Besides, I don’t know whether I need to write my own code or whether the existing ODrive Python lib will work with UART too.
  • UART-Arduino => I would need to port all my logic from Python on the PC to Arduino C (quite a lot of work), and since I precalculate all the intermediate absolute positions prior to sending commands to the ODrive, I’m not sure I would have enough memory on an Uno or a Mega.

i feel stuck :’(

I guess the quick and dirty way would be to implement some sort of watch dog: make main script start with powering on/off the odrive and regularly update a given file. then have a monitor script that checks the date/time of that file => if it’s older than 30sec, assume the main script has died and kill/restart it.
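
A minimal Python sketch of that heartbeat-file idea follows. The function names and the temporary path are placeholders; what the monitor does when the file is stale (kill and restart the main script, power-cycle the ODrive) is left out, since it depends on the setup.

```python
import os
import tempfile
import time


def touch_heartbeat(path):
    """Called regularly by the main script to mark that it is still alive."""
    with open(path, "a"):
        os.utime(path, None)  # set the modification time to "now"


def is_stale(path, max_age_s=30.0, now=None):
    """Return True if the heartbeat file is missing or older than max_age_s."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:  # file does not exist (yet) or cannot be read
        return True
    return age > max_age_s


# Demo with a temporary file instead of a real main script.
heartbeat = os.path.join(tempfile.mkdtemp(), "heartbeat")
missing = is_stale(heartbeat)                         # no file yet
touch_heartbeat(heartbeat)
fresh = is_stale(heartbeat)                           # just touched
expired = is_stale(heartbeat, now=time.time() + 60)   # pretend 60 s have passed
print(missing, fresh, expired)
```

In the monitor script, `is_stale()` would be checked in a loop every few seconds. Note that this only detects a hung or dead main script; it does nothing to stop the motors in the meantime.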

any suggestion is welcome really

You are right that both CAN and UART converters connect to the PC over USB; however, with CAN you don’t need a shared GND between the ODrive and the CAN hardware. If you want to keep using USB with the native protocol, galvanically isolating the ODrive from the PC with a USB isolator IC may help (there are prebuilt breakouts as well).

Have you tried writing your own native usb driver? From what you post about the issue, it seems to be a client side problem, and since the messages are not that complicated it might be worth your time.

Yes, I agree. The symptoms suggest a client-side issue. But from my perspective, since a client is supported by the project, it should work.

Anyway, as I was trying to build my simple watchdog concept + obstacle detection algo, I stumbled upon yet another weird behavior this morning. I observed some strange trajectories (typically some lines wouldn’t be straight any more). Sometimes the main “beam” of the CoreXY would even collide with the chassis, something I had only seen when the encoder wheel was physically detached from the motor shaft. But I checked everything, and both encoder wheels were tightly attached to both motors. Besides, the behavior would stop if I rebooted the ODrive, then reappear on the following reboot, then disappear, then reappear. I did not understand it at all.

Eventually the main “beam” collided heavily with the chassis and one belt was completely ripped off!!! (Pretty scary in fact; fortunately everything is enclosed.) This most likely happened because we changed from a lab power supply that automatically shuts down if it tries to deliver more than 10A to an industrial power supply that delivers up to 25A. It’s the latter that I have been using during the last 3 several-hour-long tests. I’m guessing that, instead of the power supply shutting off, the belt became the next weak point in the chain. Maybe we should have used some sort of fuse here :confused:

Now if I want to continue, I need to pay for about 300€ of belt to replace them both.

but i’m really feeling tired of this project all together.

Sorry to hear you’re so frustrated.

I had an ODrive force my axis into the end stops a few times, which caused quite a setback the first time it happened. Since then I include springs at both ends of all my axes to prevent damage. I really recommend doing this, and judging by the pictures you showed it would fit nicely. This is where I ordered mine: https://www.bearingboys.co.uk/Light-Load-Die-Springs--Green-2429-c - select an inner diameter that is larger than your beams.

Similarly, about fifteen years ago I was working on a bipedal robot, when it kicked me in the balls because the higher-level controller froze. Fun times.

Anyways, the current state of the project can be a disappointment to people. It does require a lot of configuration to get everything running smoothly, and there are bugs.

I get the impression Oskar and his team work hard to keep everybody happy. However, maybe there should be a clearer disclaimer when you buy an ODrive.

I used a controller from Roboteq before, and it is a great product, but I switched to ODrive because of the price point and the weight and size. The Roboteq controllers are about 2-4 times as expensive per axis, and about 2 times larger and heavier due to the casing and heatsink. However, their software is easier to use and has more features. https://www.roboteq.com/index.php/roboteq-products-and-services/brushless-dc-motor-controllers#prflag

Short update

  1. I got new belts for 100€ instead of 300€
  2. I’ve ordered springs
  3. I am strengthening the error sanity checks
  4. I am adding an electrical safety (if the power supply delivers more than X amps, it is immediately switched off)
    And most importantly
  5. I got rid of the PC, Python and the USB interface. I opted for an Arduino Mega, an RS232 interface and the ODriveArduino library, which I modified for my needs.

There was quite some work to rewrite the trajectory planning algorithm so that it fits the constraints of the Arduino board. I’m streaming position commands at 100Hz. I have now had the system working for a few hours and it seems rather stable… I guess we will continue to turn on “ODrive radio” (having the machine working non-stop for hours) for a few days to build the trust again.

Yeah, and I also:

  1. Ordered alternatives to ODrive, including from Roboteq and ingeniamc.com. In both cases their professional support seems just right, by the way.

Good to hear you’re making progress! If you end up finding that the UART is a bit noisy, you can always get a CAN shield for the Mega and run CAN. You may also want to consider upgrading to a Yun or a Teensy or STM32uino… all of which will run Arduino code, but have much faster 32-bit ARM processors.

I also had this problem, but I also found the solution. Do you by any chance use multi-threading in your Python?
Well, I do, and I was calling the “odrv0” object from two different threads (one set the speed and the other was collecting information). We all know that reading/writing a shared object from multiple threads can cause a collision, so I made sure to call lock.acquire() and lock.release() around every access.

Global variable:

import threading

lock = threading.Lock()

Inside the threads (example):

# "with lock" acquires the lock and always releases it, even if an exception
# is raised (equivalent to lock.acquire() ... lock.release())
with lock:
    if odrv0.axis1.current_state != AXIS_STATE_CLOSED_LOOP_CONTROL:
        odrv0.axis1.requested_state = AXIS_STATE_CLOSED_LOOP_CONTROL

and the issues went away.

Thanks for pointing that out. I was not using multi threading. Besides, I am also pretty sure there was no odrivetool nor any process I am aware of that would access the odrive.

Anyway, my python code is dead code now.

For the record, we have been testing the UART/Arduino interface for several hours over several days now. The instability seems gone. But we will know better when the real use case starts, hopefully in a few days.

Unfortunately the real test is postponed until late August at best :frowning:
I’ll update with results once we have seen the beast running for a few days
