USB communication issue / AttributeError: 'RemoteObject' object has no attribute 'axis0'

#1

I am trying to stress-test my ODrive CoreXY system to prove its robustness, so I am running a sequence of motor commands in an infinite loop.

The system runs fine for about 15-30 min, but eventually I observe a USB communication issue with the ODrive that manifests itself when I try to read the motor error code:

AttributeError: 'RemoteObject' object has no attribute 'axis0'

The ODrive board is connected directly to a USB port on the PC motherboard (no USB hub).

When the problem occurs, the exception is not caught, so the Python script crashes. If I try to restart it, most of the time the USB connection is never re-established and I have to power-cycle the board.
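As a stopgap, the crash itself can be avoided by wrapping each ODrive access in a helper that catches the exception and reconnects. A minimal sketch: `call_with_reconnect` and its parameters are hypothetical names (not part of the odrive package), and `connect` is assumed to be something like `lambda: odrive.find_any(timeout=10)`. It does not help in the cases where the board only comes back after a power cycle.

```python
import time


def call_with_reconnect(connect, action, device=None, retries=3, delay_s=1.0,
                        errors=(AttributeError,)):
    """Run action(device); on a comms error, reconnect and try again.

    connect: callable returning a fresh device handle (assumption: e.g.
             lambda: odrive.find_any(timeout=10) with the odrive package).
    action:  callable taking the device handle and returning a value.
    Returns (device, result) so the caller can keep the fresh handle.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            if device is None:
                device = connect()
            return device, action(device)
        except errors as exc:
            # The handle is stale (e.g. 'RemoteObject' lost its attributes):
            # drop it so the next attempt reconnects from scratch.
            last_error = exc
            device = None
            if attempt < retries:
                time.sleep(delay_s)
    raise RuntimeError(
        "device unreachable after %d attempts" % (retries + 1)) from last_error
```

Usage would look like `dev, err = call_with_reconnect(connect, lambda d: d.axis0.motor.error, dev)`. Exceptions that are not listed in `errors` (including anything raised by `connect` itself) still propagate, so the tuple may need extending for other failure modes.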

What can I do about this?

Some tech details below:

firware             => 0.4.8.0
hardware            => 3.5
variant             => 48
bus_voltage         => 23.925806045532227
vel_limit           => 400000.0	400000.0
pos_gain            => 7.0	7.0
vel_gain            => 0.0002500000118743628	0.0002500000118743628
vel_integrator_gain => 0.0010000000474974513	0.0010000000474974513
motor_error         => ERROR_NO_ERROR	ERROR_NO_ERROR

[... after about 15-30min...]

Traceback (most recent call last):
  File "automata_v2.2.py", line 166, in <module>
    test_cut()
  File "automata_v2.2.py", line 160, in test_cut
    cut(c)
  File "automata_v2.2.py", line 48, in cut
    cmd = "G0 X"+str(x)+" Y"+str(y) ; c.send_cmds(cmd) ; time.sleep(t)
  File "/root/odrive_test/corexy.py", line 235, in send_cmds
    res = self.__handle_cmd(cmd)
  File "/root/odrive_test/corexy.py", line 264, in __handle_cmd
    return self.__parse_goto(cmd[2:])
  File "/root/odrive_test/corexy.py", line 288, in __parse_goto
    return self.__goto(target_x, target_y)
  File "/root/odrive_test/corexy.py", line 328, in __goto
    self.__move_motors(dA_steps, dB_steps)
  File "/root/odrive_test/corexy.py", line 545, in __move_motors
    m0 = self.__get_motor_error_name(self.my_odrive.axis0.motor.error)
  File "/usr/local/lib/python3.5/dist-packages/fibre/remote_object.py", line 245, in __getattribute__
    return object.__getattribute__(self, name)
AttributeError: 'RemoteObject' object has no attribute 'axis0'
root@MTBD00694:~/odrive_test# 




root@MTBD00694:~/odrive_test# python3 -m pip show odrive
Name: odrive
Version: 0.4.8
Summary: Control utilities for the ODrive high performance motor controller
Home-page: https://github.com/madcowswe/ODrive
Author: Oskar Weigl
Author-email: oskar.weigl@odriverobotics.com
License: MIT
Location: /usr/local/lib/python3.5/dist-packages
Requires: pywin32, requests, PyUSB, ipython, PySerial, IntelHex, matplotlib
Required-by: 
root@MTBD00694:~/odrive_test# 


root@MTBD00694:~/odrive_test# python3 -i
Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
root@MTBD00694:~/odrive_test# cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.4 LTS"
root@MTBD00694:~/odrive_test#

#2

Hey alexisdal, I have a similar issue. When this happens and I am able to reconnect (after replugging USB, without a power cycle), all the properties, such as encoder position, are intact.
To make sure it's not a hardware issue I tried different ODrive boards, and it happens consistently with one type of motor I have.
I bought a 400 MHz logic analyzer so I can capture this with sigrok when it happens (with some pre-trigger sampling).
Do you have any idea how one would catch this USB error soon enough to trigger the sampling in time?

#3

No clue. I would record the capture with timestamps, timestamp the exception on the Python side too, and then try to match the events in time.
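The Python side of that could be as simple as the sketch below. `run_and_timestamp` is a hypothetical helper name; it records both a wall-clock time (to line up with dated log entries) and a monotonic time (for measuring intervals) before re-raising.

```python
import time
import traceback


def run_and_timestamp(action, log):
    """Call action(); if it raises, append a timestamped record to log and re-raise."""
    try:
        return action()
    except Exception as exc:
        log.append({
            "wall_time": time.time(),       # seconds since the epoch
            "monotonic": time.monotonic(),  # unaffected by clock adjustments
            "error": repr(exc),
            "traceback": traceback.format_exc(),
        })
        raise
```

Wrapping the one call that talks to the ODrive keeps the timestamp as close as possible to the moment the failure is first noticed on the host.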

This is pretty critical for my use case. It’s basically an ODrive show stopper.

I don’t know if it’s the PC side or odrive but in the end, it’s the same.

If the USB connection is unreliable, what is the most reliable option? Native protocol on UART maybe?

#4

For this same reason I started using the CAN interface. I haven’t had any connection problems since.
I watched the dmesg log after running with the CAN interface for a while and didn’t see any USB errors, so I guess the error only occurs when the host and the ODrive are actively communicating. I am curious about the cause, and I bought the logic analyzer anyway, so I’m going to have another try at capturing it. I’ll post a PulseView screenshot if I get it.

#5

If you can capture the error that would be super awesome. I think you should be able to trigger on activity timeout if you use a script that sends commands continuously: when it crashes the USB comms will cease.
Thanks!

#6

Oskar, I was able to capture the errors with just a try/except block in Python.
Here is a zoomed-out view (the blue vertical dashed line is the trigger):


With sigrok signal decoder I don’t see any errors but here is the suspicious portion
(usb disconnects about 20 frames after the circled frame):




here is another capture session with same pattern
(seemingly random appearance of SETUP PID then shortly after resets):

I put the captured sigrok files on google drive link here: https://drive.google.com/drive/folders/1ZsbgBWiQUggRHibgv2StUjEyhWG2lpkM?usp=sharing

dmesg logs when this happens (I am using a 4-port USB hub):

[12443.847757] usb 3-1.1-port3: disabled by hub (EMI?), re-enabling...
[12443.849094] usb 3-1.1.3: USB disconnect, device number 29
[12443.851199] cdc_acm 3-1.1.3:1.0: failed to set dtr/rts

Motor that causes this USB error (it doesn’t happen when this motor is not in closed-loop control):
24 V, 8 poles
Also, the USB error occurs sooner if the motor is trying to hold a position rather than constantly moving.

"phase_resistance": 0.3050479292869568
"phase_inductance": 0.0002905561705119908

let me know if you need more captures.

#7

I’m downloading your sigrok files so I can look closer in the viewer, I’ll report back if I discover something. Thx for making the capture. I assume you captured this on the ODrive side of the hub (like TP 1 and 2 on the ODrive)?

Either way, I think we have a strong hint as to what is going on:

Together with the fact that it only dies during active PWM on your motor, it is likely capacitively coupled EMI (electromagnetic interference). So I have some follow-up questions:
How are the phase wires routed to the motor? The motor has an exposed metal mounting plate; is the chassis connected to mains ground? Does the ODrive share a GND connection (through DC- or otherwise) with the PC? For example, is V- of the power supply bridged to Protective Earth in your mains wiring? Are you using a laptop on battery, a desktop, or a charger with a ground pin?

#8

The signal was captured on the ODrive side of the hub, though not at the TPs (I split a short USB cable in two to expose the wires).

The phase wires are bare, unshielded UL1007 wires, and the motor chassis is grounded only through the AMT encoder’s GND, nothing else.

I am using a desktop PC. The GND is indeed connected to the PC through USB DC- and the power supply GND, which is bridged to Protective Earth (the desktop PC chassis is also bridged to Protective Earth).

Here are the phase wires and how the motor is sitting in the setup:


#9

Here’s some info that may help analyze the capture:

#define CDC_IN_EP       0x81  /* EP1 for data IN */
#define CDC_OUT_EP      0x01  /* EP1 for data OUT */
#define CDC_CMD_EP      0x82  /* EP2 for CDC commands */
#define ODRIVE_IN_EP    0x83  /* EP3 IN: ODrive device TX endpoint */
#define ODRIVE_OUT_EP   0x03  /* EP3 OUT: ODrive device RX endpoint */

Note that the 0x80 bit is masked away on the IN endpoints, i.e. the ODRIVE endpoints are both endpoint 3; the IN ones just get an extra 0x80 in how they are defined here.
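To make the masking concrete when reading the decoder output, here is a small sketch that splits each address from the defines above into endpoint number and direction (bit 7 set means IN, i.e. device to host). `decode_endpoint` is just an illustrative helper name.

```python
def decode_endpoint(address):
    """Split a USB endpoint address into (endpoint_number, direction)."""
    number = address & 0x0F                        # low bits: endpoint number
    direction = "IN" if address & 0x80 else "OUT"  # bit 7: 1 = device-to-host
    return number, direction


ENDPOINTS = [
    ("CDC_IN_EP", 0x81),
    ("CDC_OUT_EP", 0x01),
    ("CDC_CMD_EP", 0x82),
    ("ODRIVE_IN_EP", 0x83),
    ("ODRIVE_OUT_EP", 0x03),
]

for name, addr in ENDPOINTS:
    number, direction = decode_endpoint(addr)
    print("%-14s 0x%02X -> EP%d %s" % (name, addr, number, direction))
```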

So looking at the captures, as far as I can tell, everything is operating normally. It seems what happens is that the Python script sends data (custom protocol reads) on OUT EP3, and data comes back on IN EP3 soon after. When the ODrive has nothing to send, the USB subsystem is still polling IN EP3 at a high rate, but it’s all NAK most of the time (this is normal when the ODrive has no data to send).

However, some time later these SETUP packets come in: they activate EP1, which is the CDC device, aka the virtual serial port. This is likely a (Linux?) driver on the PC that decided it wanted to talk to the serial port. From then on the USB subsystem also starts polling IN EP1, at what seems to be double the rate (from every 50ish microseconds to every 25ish microseconds).

My guess is that it’s this faster poll rate that ends up raising the error/interference density enough to trip the USB hub?

Not really a “true” solution, but you can try to disable the Linux ModemManager, which I suspect is the process that opens the CDC (virtual serial) device. There is a guide here, which involves appending , ENV{ID_MM_DEVICE_IGNORE}="1" to the rule in /etc/udev/rules.d/91-odrive.rules and then reloading the rules.
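A sketch of that edit as a small, idempotent text transformation. `add_mm_ignore` is a hypothetical helper, and the rule line in the example is only illustrative (it uses the idVendor/idProduct the ODrive reports); the actual contents of 91-odrive.rules may differ.

```python
MM_IGNORE = 'ENV{ID_MM_DEVICE_IGNORE}="1"'


def add_mm_ignore(rules_text):
    """Append the ModemManager ignore tag to every rule line that lacks it."""
    out = []
    for line in rules_text.splitlines():
        stripped = line.strip()
        is_rule = bool(stripped) and not stripped.startswith("#")
        if is_rule and "ID_MM_DEVICE_IGNORE" not in stripped:
            line = line.rstrip() + ", " + MM_IGNORE
        out.append(line)
    return "\n".join(out) + "\n"


# Illustrative rule line only; check it against the installed file.
example = ('SUBSYSTEM=="usb", ATTR{idVendor}=="1209", '
           'ATTR{idProduct}=="0d32", MODE="0666"\n')
print(add_mm_ignore(example), end="")
```

After writing the result back (as root), the rules can be reloaded with `sudo udevadm control --reload-rules` followed by `sudo udevadm trigger`, or by replugging the board.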

Of course the true solution is to eliminate the EMI. Do you have a ferrite on your USB cable?

Nice robot arm btw! ;D

#10

I added the , ENV{ID_MM_DEVICE_IGNORE}="1" to the udev rules that get installed when you install odrivetool with pip, in this commit.

Will go out on the next release.

#11

Thanks Oskar for your valuable time.
I updated the udev rules, reloaded them and reran the script.
First of all, the USB ACM device is still detected right after all the ODrives are detected (dmesg),
although the behavior certainly changed, as explained next:

[  128.406348] usb 3-1.1.4: New USB device found, idVendor=1209, idProduct=0d32
[  128.406354] usb 3-1.1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  128.406358] usb 3-1.1.4: Product: ODrive 3.5 CDC Interface
[  128.406362] usb 3-1.1.4: Manufacturer: ODrive Robotics
[  128.406366] usb 3-1.1.4: SerialNumber: 208037713548
[  128.445532] cdc_acm 3-1.1.2:1.0: ttyACM0: USB ACM device
[  128.448513] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  128.451809] cdc_acm 3-1.1.4:1.0: ttyACM2: USB ACM device
[  128.452195] usbcore: registered new interface driver cdc_acm

It took longer for the USB error to occur this time.
Before the error occurred, there was a whole page of the following in dmesg:

[  231.049779] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  231.151736] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  231.441773] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  231.543378] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  232.397740] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  232.685485] usb 3-1.1.3: device descriptor read/64, error -71
[  232.899498] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  232.981764] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  233.083210] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device

A similar EMI error after the exception is thrown:

[  474.122768] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  475.833540] usb 3-1.1-port3: disabled by hub (EMI?), re-enabling...
[  475.834883] usb 3-1.1.3: USB disconnect, device number 55
[  476.063461] usb 3-1.1.3: new full-speed USB device number 56 using xhci_hcd
[  476.170091] usb 3-1.1.3: New USB device found, idVendor=1209, idProduct=0d32
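To compare runs (with and without the udev change, ferrite, etc.) it may help to count these events instead of eyeballing pages of dmesg. A minimal sketch, with patterns taken from the kernel messages quoted in this thread; `summarize_dmesg` is a hypothetical helper and only understands the `[ seconds] message` format shown here.

```python
import re
from collections import Counter

# Patterns taken from the kernel messages quoted in this thread.
PATTERNS = {
    "reset": re.compile(r"reset full-speed USB device"),
    "descriptor_error": re.compile(r"device descriptor read/\d+, error -\d+"),
    "hub_disabled": re.compile(r"disabled by hub \(EMI\?\)"),
    "disconnect": re.compile(r"USB disconnect"),
}
LINE_RE = re.compile(r"^\[\s*(?P<t>\d+\.\d+)\]\s+(?P<msg>.*)$")


def summarize_dmesg(lines):
    """Count USB trouble events and record when each kind first appeared."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not in "[  123.456] message" form
        for kind, pattern in PATTERNS.items():
            if pattern.search(match.group("msg")):
                counts[kind] += 1
                first_seen.setdefault(kind, float(match.group("t")))
    return counts, first_seen
```

It accepts any iterable of lines, so `summarize_dmesg(sys.stdin)` works with `dmesg | python3 yourscript.py`.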

Also, the exception being caught seems to have changed from
AttributeError: 'RemoteObject' object has no attribute ...
to
received unexpected ACK: 8068
(the number after ACK is different each time)

I added two sigrok captures again to the same link:
https://drive.google.com/drive/folders/1ZsbgBWiQUggRHibgv2StUjEyhWG2lpkM?usp=sharing

EDIT:
I can confirm the blacklist is working (I did upgrade ModemManager from 1.6.4 to 1.8.2 so I could set the filter options). But I still couldn’t figure out how to stop Ubuntu from polling the ACM devices. I’ll post an edit when I get a chance to try again. I might try opening up the USB hubs I have to compare how different USB hub ICs behave. Glad to know there is not a single error packet caused by the ODrive in the capture.

>>> sudo mmcli -G DEBUG;
Successfully set logging level
>>> journalctl -f | grep "ModemManager.*\[filter\]"

(plug in ODrive)

ModemManager[1076]: <debug> [filter] (tty/ttyACM0): port filtered: device is blacklisted

#12

How did you upgrade ModemManager from 1.6.4 to 1.8.2?
I did not have ModemManager installed at all while observing the ODrive USB error. apt installed ModemManager 1.6.4 on my Ubuntu 16.04.

When I try your commands to test USB filtering, I get:

root@MTBD00694:~/odrive_test# mmcli -G DEBUG;
error: couldn't find the ModemManager process in the bus
root@MTBD00694:~/odrive_test#

==> looks like I cannot tell whether USB filtering works or not

Can you elaborate on the hardware/software side of your CAN communication?
Your post in the CAN discussion suggests this for CAN-RS232 conversion + an FTDI-based RS232-USB adapter (can you send us a link to where you bought yours, please?), and the code screenshots suggest a custom implementation in C/C++ to send CAN commands (over USB in the end, then).

@madcowswe => I’m really stuck here.
An unreliable system can’t go into production.
It really is a pity, because I remember massively sending USB position commands to an ODrive 24/7 for 3 weeks and everything was fine (without any load on the motor, though). OK, that was in late 2017, so many things must have changed since then. But the whole point was precisely to avoid the situation I’m facing now.

What should I do here? Switch to Windows (omg!)? Use Ubuntu 18? Use a Raspberry Pi? Leave my Ubuntu alone but use another ODrive interface? Surely someone has managed to use ODrive reliably, right?

#13

If your Ubuntu didn’t have ModemManager installed, I think it couldn’t have caused the USB crash in the first place.
But if you wish to try 1.8, you can install it with:

sudo add-apt-repository ppa:aleksander-m/modemmanager-xenial
sudo apt-get update
sudo apt-get --only-upgrade install modemmanager

I had to edit /lib/systemd/system/ModemManager.service

to have the following lines under [Service] section:

...
ExecStart=/usr/sbin/ModemManager --filter-policy=default
...
Environment="MM_FILTER_RULE_TTY_ACM_INTERFACE=0"

I think there is also a snap package which may be installed:

snap list | grep modem-manager

As for the CAN hardware, any CAN <-> serial converter would work. The one I am using can be bought on eBay ( https://www.ebay.com/itm/SystemBase-sCAN-RS-232-DE-09S-DB9-female-CAN-DE-09P-DB9-male-serial/323363066646?hash=item4b49f0d316:g:0D4AAOSwqj5bVfod&frcectupt=true)
However, I think for this price you have other options from larger manufacturers.
Using this kind of CAN converter makes the host-side software pretty much the same whether you use the ODrive’s UART or CAN interface (read/write to serial). I am using a Python program built on the pyserial_asyncio library.

After looking at the logic analyzer captures, I don’t think the cause is really in the ODrive’s USB implementation, but rather EMI noise affecting the host side (in my case a USB hub). I think USB might not be the best choice for interfacing with a motor controller, because of electrical noise and the way the USB protocol maintains connections. I think your best bet is to try either the UART or the CAN interface. The Stanford Doggo project used UART.
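For the UART route, the host side can be very small. A rough sketch, assuming the ODrive ASCII protocol is enabled on the UART and that `port` is any file-like object with `write()` and `flush()` (for example a pyserial `serial.Serial`). The helper names are hypothetical, and the command forms used (`p <axis> <position>` for a position setpoint, `r <property>` for a read) should be checked against the ASCII protocol documentation for the firmware in use.

```python
import io


def position_command(axis, position_counts):
    """Build an ASCII-protocol position setpoint line, e.g. b'p 0 10000\\n'."""
    return "p {} {:d}\n".format(int(axis), int(round(position_counts))).encode("ascii")


def read_request(prop):
    """Build an ASCII-protocol read request, e.g. b'r vbus_voltage\\n'."""
    return "r {}\n".format(prop).encode("ascii")


def send(port, line):
    """Write one command line to a file-like port and flush it."""
    port.write(line)
    port.flush()


# Demonstration with an in-memory buffer standing in for the serial port.
fake_port = io.BytesIO()
send(fake_port, position_command(0, 10000))
send(fake_port, read_request("axis0.motor.error"))
print(fake_port.getvalue())
```

Keeping the line-building separate from the port makes it easy to test the command strings without hardware attached.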

#14

I am confused.
Due in part to this bug, we moved the entire machine back to the lab a few days ago (change of building).

  • 2 days ago, I ran the test with a ferrite on the USB cable near the ODrive board + the udev , ENV{ID_MM_DEVICE_IGNORE}="1" rule + ModemManager 1.6.4 installed => the machine ran in an infinite loop from 13h to 16h30 (3h30 straight). We turned it off because we were sick of the noise and thought maybe the issue was gone.
  • Yesterday, I reverted to the setup in which the bug appeared (default udev rule + no ferrite + ModemManager uninstalled), in order to confirm that the bug reappears. It ran again in an infinite loop from 10h30 to 15h30 (5h straight) => the bug did not occur either.

=> I now seem unable to reproduce the bug. :cry: :cry: :cry:

I have started a test at 15h30 and will ask the guard to hit the emergency stop in case it’s still running at 20h.

i’m really lost here.

#15

If you need it to be running for hours without any interruption really try the other interfaces and save yourself lots of headache down the road… Especially if you have any cats around.:pouting_cat:

#16

Hello,

After a couple of weeks on standby, I worked with the ODrive again.

I made unattended runs (recorded on time-lapse videos):
Tuesday May 7th from 14h to 23h => 9h
Thursday May 9th from 11h to 17h30 => 6h30
Friday May 10th from 8h30 to 17h => 8h30
=> all of those ran without any issue.

And this morning, the USB bug triggered once after about a minute and has not reproduced since…

May 13 11:44:41 MTBD00694 kernel: [  940.753530] usb 3-5: new full-speed USB device number 9 using xhci_hcd
May 13 11:44:41 MTBD00694 kernel: [  940.883351] usb 3-5: New USB device found, idVendor=1209, idProduct=0d32
May 13 11:44:41 MTBD00694 kernel: [  940.883354] usb 3-5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
May 13 11:44:41 MTBD00694 kernel: [  940.883357] usb 3-5: Product: ODrive 3.5 CDC Interface
May 13 11:44:41 MTBD00694 kernel: [  940.883358] usb 3-5: Manufacturer: ODrive Robotics
May 13 11:44:41 MTBD00694 kernel: [  940.883359] usb 3-5: SerialNumber: 367033663037
May 13 11:44:41 MTBD00694 kernel: [  940.883957] cdc_acm 3-5:1.0: ttyACM1: USB ACM device
May 13 11:44:41 MTBD00694 mtp-probe: checking bus 3, device 9: "/sys/devices/pci0000:00/0000:00:14.0/usb3/3-5"
May 13 11:44:41 MTBD00694 mtp-probe: bus: 3, device: 9 was not an MTP device
May 13 11:45:55 MTBD00694 kernel: [ 1015.409504] usb 3-5: reset full-speed USB device number 9 using xhci_hcd
May 13 11:45:55 MTBD00694 kernel: [ 1015.538705] cdc_acm 3-5:1.0: ttyACM1: USB ACM device
May 13 11:47:59 MTBD00694 kernel: [ 1139.413994] usb 3-5: USB disconnect, device number 9

usb 3-5: reset full-speed USB device number 9 using xhci_hcd ==> this reset message seems to be related.

Looking at my interface options (https://docs.odriverobotics.com/interfaces)

  • native USB => the recommended approach according to the documentation, but I do experience the random/phantom bug described here (+ the very annoying 10 sec delay each time you open a connection to the ODrive)
  • step/dir => not recommended anyway
  • PWM => I need absolute positioning, so I guess this is not an option
  • CAN => not in the official documentation yet + requires extra hardware (CAN-RS232 + RS232-USB => and it’s USB in the end anyway!!!) + requires extensive code, since I currently use the Python odrive client and would have to write an entire CAN client… not to mention the time spent waiting for the extra hardware to be sourced
  • UART-PC => the PC I use does not have a hardware serial port. I may find a PCI card that offers one, or use a classic RS232-USB adapter, but would I end up with the risk of yet another USB bug? Besides, I don’t know whether I need to write my own code or whether the existing odrive Python lib works over UART too.
  • UART-Arduino => I would need to port all my logic from Python on the PC to Arduino C (quite a lot of work), and since I precalculate all the intermediate absolute positions before sending commands to the ODrive, I’m not sure I would have enough memory on an Uno or a Mega.

i feel stuck :’(

I guess the quick and dirty way would be to implement some sort of watchdog: have the main script start by power-cycling the ODrive and then regularly update a given file, and have a monitor script check the modification time of that file => if it’s older than 30 sec, assume the main script has died and kill/restart it.
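Roughly what I have in mind, as a minimal sketch: the function names and the `restart` callback are placeholders, and the actual power-cycling and kill/relaunch logic is not shown.

```python
import os
import time


def touch_heartbeat(path):
    """Called regularly by the main script to prove it is still alive."""
    with open(path, "a"):
        os.utime(path, None)  # set the modification time to now


def is_stale(path, max_age_s=30.0, now=None):
    """Return True if the heartbeat file is missing or older than max_age_s."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:
        return True  # no heartbeat file at all counts as dead
    return age > max_age_s


def monitor(path, restart, max_age_s=30.0, poll_s=5.0, max_checks=None):
    """Poll the heartbeat and call restart() whenever it goes stale."""
    checks = 0
    while max_checks is None or checks < max_checks:
        if is_stale(path, max_age_s):
            restart()              # placeholder: kill and relaunch the main script
            touch_heartbeat(path)  # give the relaunched script a fresh window
        time.sleep(poll_s)
        checks += 1
```

The main loop would call `touch_heartbeat()` once per motion sequence, and the monitor would run as a separate process so that a hang in the main script cannot take the watchdog down with it.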

any suggestion is welcome really

#18

You are right that both CAN and UART converters interface with the PC over USB; however, with CAN you don’t need a shared GND between the ODrive and the CAN hardware. If you want to keep using USB with the native protocol, galvanically isolating the ODrive from the PC with a USB isolator IC may help (there are prebuilt breakouts as well).

#19

Have you tried writing your own native USB driver? From what you posted about the issue, it seems to be a client-side problem, and since the messages are not that complicated it might be worth your time.

#20

Yes, I agree. The symptoms suggest a client-side issue. But from my perspective, since a client is supported by the project, it should work.

Anyway, as I was trying to build my simple watchdog concept + obstacle detection algo, I stumbled upon yet another weird behavior this morning. I observed some strange trajectories (typically some lines wouldn’t be straight any more). Sometimes the main “beam” of the CoreXY would even collide with the chassis => something I had only seen when the encoder wheel was physically detached from the motor shaft. But I did check everything, and both encoder wheels were tightly attached on both motors. Besides, the behavior would stop if I rebooted the ODrive, then reappear on the following reboot, then disappear, then reappear. I did not understand it at all.

Eventually the main “beam” collided heavily with the chassis and one belt was completely ripped off!!! (Pretty scary in fact. Fortunately everything is enclosed.) This most likely happened because we changed from a lab power supply that automatically shuts down if it tries to deliver more than 10 A to an industrial power supply that delivers up to 25 A. It’s the latter that I have been using during the 3 latest several-hour-long tests. I’m guessing that instead of the power supply shutting off, the belt became the next weak point in the chain. Maybe we should have used some sort of fuse here :confused:

Now if I want to continue, I need to pay about 300€ for belts to replace them both.

But I’m really feeling tired of this project altogether.

#21

Sorry to hear you’re so frustrated.

I had an ODrive force my axis into the end stops a few times, which caused quite a setback the first time it happened. Since then I include springs at both ends of all my axes to prevent damage. I really recommend doing this, and judging by the pictures you showed it would fit nicely. This is where I ordered mine: https://www.bearingboys.co.uk/Light-Load-Die-Springs--Green-2429-c - select an inner diameter that is larger than your beams.

Similarly, about fifteen years ago I was working on a bipedal robot, when it kicked me in the balls because the higher-level controller froze. Fun times.

Anyways, the current state of the project can be a disappointment to people. It does require a lot of configuration to get everything running smoothly, and there are bugs.

I get the impression Oskar and his team work hard to keep everybody happy. However, maybe there should be a clearer disclaimer when you buy an ODrive.

I used a controller from Roboteq before, and it is a great product, but I switched to ODrive because of the price point and the weight & size. The Roboteq controllers are about 2-4 times as expensive per axis, and about 2 times larger & heavier due to the casing and heatsink. However, their software is easier to use and has more features. https://www.roboteq.com/index.php/roboteq-products-and-services/brushless-dc-motor-controllers#prflag