Usb communication issue / AttributeError: 'RemoteObject' object has no attribute 'axis0'

#1

I am stress-testing my ODrive CoreXY system to prove its robustness: I run a sequence of motor commands in an infinite loop.

The system runs fine for about 15-30 min, but eventually I observe a USB communication issue with the ODrive that manifests itself when I try to read the motor error code:

AttributeError: 'RemoteObject' object has no attribute 'axis0'

The ODrive board is connected directly to a USB port on the PC motherboard (no USB hub).

When the problem occurs, the exception is not caught, so the Python script crashes. If I try to restart it, most of the time the USB connection is never established and I have to power-cycle the board.
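A minimal sketch of catching the failure instead of letting the script crash. It assumes the device handle comes from a `connect` callable (with the odrive package this would presumably be `odrive.find_any`, which is not called here) and that a lost link shows up as the AttributeError quoted above. `run_with_reconnect` and its parameters are hypothetical names, and the sketch does not help when the board needs a power cycle.

```python
import time


def run_with_reconnect(connect, action, max_retries=3, retry_delay=1.0,
                       errors=(AttributeError,), sleep=time.sleep):
    """Call action(device), reconnecting when the link appears to be lost.

    connect: callable returning a fresh device handle (assumption: something
             like odrive.find_any in a real script).
    action:  callable taking the device handle and performing one command.
    errors:  exception types treated as "connection lost".
    """
    device = connect()
    failures = 0
    while True:
        try:
            return action(device)
        except errors as exc:
            # Attributes such as axis0 vanish from the remote object once
            # the USB link is gone, so the access itself raises.
            failures += 1
            if failures > max_retries:
                raise RuntimeError("device did not come back") from exc
            sleep(retry_delay)
            device = connect()
```

In the script from the traceback, the read of `axis0.motor.error` would be the `action`; the retry limit keeps the loop from spinning forever when the device never re-enumerates.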

What can I do about this?

Some technical details below:

firware             => 0.4.8.0
hardware            => 3.5
variant             => 48
bus_voltage         => 23.925806045532227
vel_limit           => 400000.0	400000.0
pos_gain            => 7.0	7.0
vel_gain            => 0.0002500000118743628	0.0002500000118743628
vel_integrator_gain => 0.0010000000474974513	0.0010000000474974513
motor_error         => ERROR_NO_ERROR	ERROR_NO_ERROR

[... after about 15-30min...]

Traceback (most recent call last):
  File "automata_v2.2.py", line 166, in <module>
    test_cut()
  File "automata_v2.2.py", line 160, in test_cut
    cut(c)
  File "automata_v2.2.py", line 48, in cut
    cmd = "G0 X"+str(x)+" Y"+str(y) ; c.send_cmds(cmd) ; time.sleep(t)
  File "/root/odrive_test/corexy.py", line 235, in send_cmds
    res = self.__handle_cmd(cmd)
  File "/root/odrive_test/corexy.py", line 264, in __handle_cmd
    return self.__parse_goto(cmd[2:])
  File "/root/odrive_test/corexy.py", line 288, in __parse_goto
    return self.__goto(target_x, target_y)
  File "/root/odrive_test/corexy.py", line 328, in __goto
    self.__move_motors(dA_steps, dB_steps)
  File "/root/odrive_test/corexy.py", line 545, in __move_motors
    m0 = self.__get_motor_error_name(self.my_odrive.axis0.motor.error)
  File "/usr/local/lib/python3.5/dist-packages/fibre/remote_object.py", line 245, in __getattribute__
    return object.__getattribute__(self, name)
AttributeError: 'RemoteObject' object has no attribute 'axis0'
root@MTBD00694:~/odrive_test# 




root@MTBD00694:~/odrive_test# python3 -m pip show odrive
Name: odrive
Version: 0.4.8
Summary: Control utilities for the ODrive high performance motor controller
Home-page: https://github.com/madcowswe/ODrive
Author: Oskar Weigl
Author-email: oskar.weigl@odriverobotics.com
License: MIT
Location: /usr/local/lib/python3.5/dist-packages
Requires: pywin32, requests, PyUSB, ipython, PySerial, IntelHex, matplotlib
Required-by: 
root@MTBD00694:~/odrive_test# 


root@MTBD00694:~/odrive_test# python3 -i
Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
root@MTBD00694:~/odrive_test# cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.4 LTS"
root@MTBD00694:~/odrive_test#

#2

Hey alexisdal, I have a similar issue. When this happens and I am able to reconnect (after replugging USB, without a power cycle), all the properties, such as the encoder position, are intact.
To make sure it’s not a hardware issue I tried different ODrive boards, and it happens consistently with one type of motor I have.
I bought a 400 MHz logic analyzer so I can capture this with sigrok when it happens (with some pre-trigger sampling).
Do you have any idea how one would catch this USB error soon enough to trigger the sampling in time?


#3

No clue. I would record the capture with timestamps, timestamp the exception on the Python side as well, and then try to match the events in time.
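A rough sketch of the timestamping idea, assuming the capture and the PC share a common time base (which a real logic analyzer setup may not provide without extra work). `call_and_timestamp` and `closest_sample` are hypothetical helper names.

```python
import time


def call_and_timestamp(action, log, clock=time.time):
    """Run action(); on failure record (timestamp, description) and re-raise."""
    try:
        return action()
    except Exception as exc:
        log.append((clock(), "{}: {}".format(type(exc).__name__, exc)))
        raise


def closest_sample(event_time, sample_times):
    """Return the capture timestamp nearest to the logged exception time."""
    return min(sample_times, key=lambda t: abs(t - event_time))
```

Each motor command would be wrapped in `call_and_timestamp`, and the logged time can then be compared against the capture timestamps with `closest_sample`.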

This is pretty critical for my use case. It’s basically an ODrive showstopper.

I don’t know if it’s the PC side or the ODrive, but in the end the result is the same.

If the USB connection is unreliable, what is the most reliable option? Native protocol on UART maybe?


#4

For this same reason I started using the CAN interface. I haven’t had any connection problems since.
I watched the dmesg log after running with the CAN interface for a while and didn’t see any USB errors, so I guess the error only occurs when the host and the ODrive are actively communicating. I am curious about the cause and bought the logic analyzer anyway, so I’m going to have another try at capturing it. I’ll post a screenshot of PulseView if I get it.


#5

If you can capture the error that would be super awesome. I think you should be able to trigger on activity timeout if you use a script that sends commands continuously: when it crashes the USB comms will cease.
Thanks!
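The activity-timeout idea can be sketched offline on a list of packet timestamps. This is only an illustration: a real trigger would be configured in the analyzer itself, and `first_silence` is a hypothetical helper name.

```python
def first_silence(timestamps, timeout, end=None):
    """Return the time of the last packet before the first gap > timeout.

    timestamps: packet times in seconds, in ascending order.
    end:        optional time at which the capture stopped, so that a bus
                that goes quiet for good is also detected.
    Returns None when no gap exceeds the timeout.
    """
    times = list(timestamps)
    if end is not None:
        times.append(end)
    for previous, current in zip(times, times[1:]):
        if current - previous > timeout:
            return previous
    return None
```

With a script that sends commands continuously, the returned time marks roughly where the USB traffic stopped, which is the region of the capture worth inspecting.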


#6

Oskar, I was able to capture the errors with just a try/except block in Python.
Here is a zoomed-out view (the blue vertical dashed line is the trigger):


With the sigrok signal decoder I don’t see any errors, but here is the suspicious portion
(USB disconnects about 20 frames after the circled frame):




Here is another capture session with the same pattern
(a seemingly random appearance of a SETUP PID, then a reset shortly after):

I put the captured sigrok files on Google Drive: https://drive.google.com/drive/folders/1ZsbgBWiQUggRHibgv2StUjEyhWG2lpkM?usp=sharing

dmesg logs when this happens (I am using a 4-port USB hub):

[12443.847757] usb 3-1.1-port3: disabled by hub (EMI?), re-enabling...
[12443.849094] usb 3-1.1.3: USB disconnect, device number 29
[12443.851199] cdc_acm 3-1.1.3:1.0: failed to set dtr/rts

Motor that causes this USB error (it doesn’t happen when this motor is not in closed-loop control):
24 V, 8 poles
The USB error also occurs sooner if the motor is trying to hold a position rather than constantly moving.

"phase_resistance": 0.3050479292869568
"phase_inductance": 0.0002905561705119908

Let me know if you need more captures.


#7

I’m downloading your sigrok files so I can look closer in the viewer, I’ll report back if I discover something. Thx for making the capture. I assume you captured this on the ODrive side of the hub (like TP 1 and 2 on the ODrive)?

Either way, I think we have a strong hint as to what is going on:

Together with the fact that it only dies during active PWM on your motor, this suggests capacitively coupled EMI (electromagnetic interference). So I have some follow-up questions:
How are the phase wires routed to the motor? The motor has an exposed metal mounting plate; is the chassis connected to mains ground? Does the ODrive share a GND connection (through DC- or otherwise) with the PC? For example, is V- of the power supply bridged to Protective Earth in your mains wiring? Are you using a laptop on battery, or a desktop or a charger with a ground pin?


#8

The signal was captured on the ODrive side of the hub, though not at the TPs (I split a short USB cable in two to expose the wires).

The phase wires are bare, unshielded UL1007 wires, and the motor chassis is grounded only through the AMT encoder’s GND, nothing else.

I am using a desktop PC. The GND is indeed connected to the PC through USB DC-, and the power supply GND, which is bridged to Protective Earth (the desktop PC chassis is also bridged to Protective Earth).

Here are the phase wires and how the motor sits in the setup:



#9

Here’s some info that may help analyze the capture:

#define CDC_IN_EP       0x81  /* EP1 for data IN */
#define CDC_OUT_EP      0x01  /* EP1 for data OUT */
#define CDC_CMD_EP      0x82  /* EP2 for CDC commands */
#define ODRIVE_IN_EP    0x83  /* EP3 IN: ODrive device TX endpoint */
#define ODRIVE_OUT_EP   0x03  /* EP3 OUT: ODrive device RX endpoint */

Note that the 0x80 is masked away on the IN endpoints, i.e. the ODRIVE endpoints are both endpoint 3, just that the IN ones get an extra 0x80 in how they are defined here.
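As an illustration of that masking, here is a small sketch that splits an endpoint address into its number and direction, following the standard USB layout of bEndpointAddress (bit 7 is the direction, bits 0 to 3 the endpoint number). `decode_endpoint` is a hypothetical helper, not part of the ODrive firmware or tools.

```python
USB_DIR_IN = 0x80          # bit 7 set: IN endpoint (device to host)
USB_EP_NUMBER_MASK = 0x0F  # bits 0..3: endpoint number


def decode_endpoint(address):
    """Split a bEndpointAddress value into (endpoint number, direction)."""
    number = address & USB_EP_NUMBER_MASK
    direction = "IN" if address & USB_DIR_IN else "OUT"
    return number, direction


for name, address in [("CDC_IN_EP", 0x81), ("CDC_OUT_EP", 0x01),
                      ("CDC_CMD_EP", 0x82), ("ODRIVE_IN_EP", 0x83),
                      ("ODRIVE_OUT_EP", 0x03)]:
    print(name, decode_endpoint(address))
```

Both ODRIVE endpoints decode to endpoint 3, which is what a protocol decoder shows in the capture.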

So looking at the captures, as far as I can tell everything is operating normally. It seems what happens is that the Python side sends data (custom protocol reads) on OUT EP3, and data comes back on IN EP3 soon after. When the ODrive has nothing to send, the USB subsystem is still polling IN EP3 at a high rate, but it’s NAK most of the time (this is normal when the ODrive has no data to send).

However, some time later these SETUP packets come in: they are activating EP1, which is the CDC device, aka the virtual serial port. This is likely a (Linux?) driver on the PC that decided it wanted to talk to the serial port. From then on the USB subsystem also starts polling IN EP1, at what seems to be double the rate (from every 50ish microseconds to every 25ish microseconds).

My guess is that it’s this faster poll rate that ends up raising the error/interference density enough to trip the USB hub?

Not really a “true” solution, but you can try to disable the Linux ModemManager, which I suspect is the process that opens the CDC (virtual serial) device. There is a guide here, which involves tacking , ENV{ID_MM_DEVICE_IGNORE}="1" onto the rule in /etc/udev/rules.d/91-odrive.rules and then reloading the rules.
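The edit to the rule can be sketched as a small text transformation. The example rule line below is an assumption made up for illustration (only the vendor and product IDs 1209 and 0d32 appear in this thread); the contents of the shipped 91-odrive.rules may differ, and `add_mm_ignore` is a hypothetical helper.

```python
MM_IGNORE = 'ENV{ID_MM_DEVICE_IGNORE}="1"'


def add_mm_ignore(rule_line):
    """Append the ModemManager ignore flag to one udev rule line.

    Blank lines, comments, and rules that already carry the flag are
    returned unchanged.
    """
    stripped = rule_line.rstrip()
    if not stripped or stripped.startswith("#") or MM_IGNORE in stripped:
        return rule_line
    return stripped + ", " + MM_IGNORE


# Illustrative rule only; not copied from the real rules file.
example = 'SUBSYSTEM=="usb", ATTR{idVendor}=="1209", ATTR{idProduct}=="0d32", MODE="0666"'
print(add_mm_ignore(example))
```

After changing the file, the rules still have to be reloaded for the flag to take effect.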

Of course the true solution is to eliminate the EMI. Do you have a ferrite on your USB cable?

Nice robot arm btw! ;D


#10

I added the , ENV{ID_MM_DEVICE_IGNORE}="1" to the udev rules that get installed when you install odrivetool with pip, in this commit.

Will go out on the next release.


#11

Thanks, Oskar, for your valuable time.
I updated the udev rules, reloaded them, and reran the script.
First of all, the USB ACM device is still detected just after all the ODrives are detected (dmesg),
although the behavior has certainly changed, as explained next:

[  128.406348] usb 3-1.1.4: New USB device found, idVendor=1209, idProduct=0d32
[  128.406354] usb 3-1.1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  128.406358] usb 3-1.1.4: Product: ODrive 3.5 CDC Interface
[  128.406362] usb 3-1.1.4: Manufacturer: ODrive Robotics
[  128.406366] usb 3-1.1.4: SerialNumber: 208037713548
[  128.445532] cdc_acm 3-1.1.2:1.0: ttyACM0: USB ACM device
[  128.448513] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  128.451809] cdc_acm 3-1.1.4:1.0: ttyACM2: USB ACM device
[  128.452195] usbcore: registered new interface driver cdc_acm

It took longer for the USB error to occur this time.
Before the error occurred, there was a whole page of the following in dmesg:

[  231.049779] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  231.151736] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  231.441773] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  231.543378] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  232.397740] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  232.685485] usb 3-1.1.3: device descriptor read/64, error -71
[  232.899498] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  232.981764] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  233.083210] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device

A similar EMI error appears after the exception is thrown:

[  474.122768] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  475.833540] usb 3-1.1-port3: disabled by hub (EMI?), re-enabling...
[  475.834883] usb 3-1.1.3: USB disconnect, device number 55
[  476.063461] usb 3-1.1.3: new full-speed USB device number 56 using xhci_hcd
[  476.170091] usb 3-1.1.3: New USB device found, idVendor=1209, idProduct=0d32
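To compare runs, the dmesg output can be summarised per USB port. This is a sketch that only recognises the message formats quoted in this thread; other kernels may word them differently, and `count_usb_events` is a hypothetical helper.

```python
import re
from collections import Counter

# Patterns for the kernel messages seen in the logs above.
EVENT_PATTERNS = {
    "reset": re.compile(r"usb (\S+): reset \S+ USB device"),
    "disabled_by_hub": re.compile(r"usb (\S+): disabled by hub"),
    "disconnect": re.compile(r"usb (\S+): USB disconnect"),
    "descriptor_error": re.compile(r"usb (\S+): device descriptor read/\d+, error"),
}


def count_usb_events(dmesg_lines):
    """Count USB error events, keyed by (port or device path, event name)."""
    counts = Counter()
    for line in dmesg_lines:
        for name, pattern in EVENT_PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), name)] += 1
    return counts
```

Feeding it the lines of a saved dmesg dump gives a quick count of resets and disconnects per device, which makes it easier to judge whether a change (ferrite, udev rule) had any effect.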

Also, the exception being caught seems to have changed from
AttributeError: 'RemoteObject' object has no attribute ...
to
received unexpected ACK: 8068
(the number after ACK is different each time)

I added two sigrok captures again to the same link:
https://drive.google.com/drive/folders/1ZsbgBWiQUggRHibgv2StUjEyhWG2lpkM?usp=sharing

EDIT:
I can confirm the blacklist is working (I upgraded ModemManager from 1.6.4 to 1.8.2 so that I could set filter options), but I still couldn’t figure out how to stop Ubuntu from polling the ACM devices. I’ll post an edit when I get a chance to try again. I might try opening the USB hubs I have to compare how different USB hub ICs behave. Glad to know there is not a single error packet caused by the ODrive in the capture.

>>> sudo mmcli -G DEBUG;
Successfully set logging level
>>> journalctl -f | grep "ModemManager.*\[filter\]"

(plug in ODrive)

ModemManager[1076]: <debug> [filter] (tty/ttyACM0): port filtered: device is blacklisted

#12

How did you upgrade ModemManager from 1.6.4 to 1.8.2?
I did not have ModemManager installed at all while observing the ODrive USB error. apt installed ModemManager 1.6.4 on my Ubuntu 16.04.

When I try your commands to test USB filtering, I get:

root@MTBD00694:~/odrive_test# mmcli -G DEBUG;
error: couldn’t find the ModemManager process in the bus
root@MTBD00694:~/odrive_test#

==> it looks like I cannot tell whether USB filtering works or not

Can you elaborate on the hardware/software part of your CAN communication?
Your post in the CAN discussion suggests this for CAN-RS232 conversion + an FTDI-based RS232-USB adapter (can you send us a link to where you bought yours, please?), and the code screenshots suggest a custom implementation in C/C++ to send CAN commands (over USB, then).

@madcowswe => I’m really stuck here.
An unreliable system can’t go into production.
It really is a pity, because I remember sending the ODrive USB position commands en masse for 3 weeks, 24/7, and everything was fine (without any load on the motor, though). OK, that was in late 2017, so many things must have changed since then. But the whole point was precisely to avoid the situation I’m facing now.

What should I do here? Switch to Windows (omg!)? Use Ubuntu 18? Use a Raspberry Pi? Leave my Ubuntu alone but use another ODrive interface? Surely someone has managed to use the ODrive reliably, right?


#13

If your Ubuntu didn’t have ModemManager installed, then I don’t think ModemManager caused the USB crash in the first place.
But if you wish to try 1.8, you can install it with:

sudo add-apt-repository ppa:aleksander-m/modemmanager-xenial
sudo apt-get update
sudo apt-get --only-upgrade install modemmanager

I had to edit /lib/systemd/system/ModemManager.service

to have the following lines under the [Service] section:

...
ExecStart=/usr/sbin/ModemManager --filter-policy=default
...
Environment="MM_FILTER_RULE_TTY_ACM_INTERFACE=0"

I think there is also a snap package which may be installed:

snap list | grep modem-manager

As for the CAN hardware, any CAN <-> serial converter would work. The one I am using can be bought on eBay ( https://www.ebay.com/itm/SystemBase-sCAN-RS-232-DE-09S-DB9-female-CAN-DE-09P-DB9-male-serial/323363066646?hash=item4b49f0d316:g:0D4AAOSwqj5bVfod&frcectupt=true)
However, I think at this price you have other options from larger manufacturers.
Using this kind of CAN converter makes the host-side software pretty much the same whether you use the ODrive’s UART or CAN interface (read/write to serial). I am using a Python program built on the pyserial_asyncio library.

After looking at the captures from the logic analyzer, I don’t think the cause is really in the ODrive’s USB implementation, but rather EMI noise affecting the host side (in my case a USB hub). I think USB might not be the best choice for interfacing with a motor controller because of electrical noise and the way the USB protocol maintains connections. I think your best bet is to try either the UART or the CAN interface. The Stanford Doggo project used UART.


#14

I am confused.
Due in part to this bug, we moved the entire machine back to the lab a few days ago (change of building).

  • 2 days ago, I ran the test with the ferrite on the USB cable near the ODrive board + the udev , ENV{ID_MM_DEVICE_IGNORE}="1" rule + ModemManager 1.6.4 installed => the machine ran in an infinite loop from 13h to 16h30 (3h30 straight). We turned it off because we were sick of the noise and thought maybe the issue was gone.
  • Yesterday, I reverted to how it was when the bug appeared (default udev rule + no ferrite + ModemManager uninstalled) in order to check that the bug reappears. It ran again in an infinite loop from 10h30 to 15h30 (5h straight) => the bug did not occur either.

=> I now seem unable to reproduce the bug. :cry: :cry: :cry:

I started a test at 15h30 and will ask the guard to hit the emergency stop in case it’s still running at 20h.

I’m really lost here.


#15

If you need it to run for hours without any interruption, you should really try the other interfaces and save yourself a lot of headaches down the road… Especially if you have any cats around. :pouting_cat:
