USB communication issue / AttributeError: 'RemoteObject' object has no attribute 'axis0'

Thanks, Oskar, for your valuable time.
I updated the udev rules, reloaded them, and reran the script.
First of all, the USB ACM devices are still detected just after all the ODrives are detected (dmesg), although the behavior has certainly changed, as explained next:

[  128.406348] usb 3-1.1.4: New USB device found, idVendor=1209, idProduct=0d32
[  128.406354] usb 3-1.1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  128.406358] usb 3-1.1.4: Product: ODrive 3.5 CDC Interface
[  128.406362] usb 3-1.1.4: Manufacturer: ODrive Robotics
[  128.406366] usb 3-1.1.4: SerialNumber: 208037713548
[  128.445532] cdc_acm 3-1.1.2:1.0: ttyACM0: USB ACM device
[  128.448513] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  128.451809] cdc_acm 3-1.1.4:1.0: ttyACM2: USB ACM device
[  128.452195] usbcore: registered new interface driver cdc_acm

It took longer for the USB error to occur this time.
Before the error occurred, there was a whole page of the following in dmesg:

[  231.049779] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  231.151736] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  231.441773] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  231.543378] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  232.397740] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  232.685485] usb 3-1.1.3: device descriptor read/64, error -71
[  232.899498] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  232.981764] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd
[  233.083210] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device

A similar EMI message appears after the exception is thrown:

[  474.122768] cdc_acm 3-1.1.3:1.0: ttyACM1: USB ACM device
[  475.833540] usb 3-1.1-port3: disabled by hub (EMI?), re-enabling...
[  475.834883] usb 3-1.1.3: USB disconnect, device number 55
[  476.063461] usb 3-1.1.3: new full-speed USB device number 56 using xhci_hcd
[  476.170091] usb 3-1.1.3: New USB device found, idVendor=1209, idProduct=0d32
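
To quantify how often each port resets, a log like the ones above can be tallied with a few lines of Python. This is only a minimal sketch: the regular expressions are written against the exact dmesg wording shown in this thread and may need adjusting for other kernels, and `count_usb_events` is a name made up for illustration.

```python
import re
from collections import Counter

# Matches kernel lines such as
# "[  231.049779] usb 3-1.1.3: reset full-speed USB device number 14 using xhci_hcd"
RESET_RE = re.compile(r"usb (\S+): reset \S+ USB device number \d+")
# Matches kernel lines such as
# "[  475.833540] usb 3-1.1-port3: disabled by hub (EMI?), re-enabling..."
EMI_RE = re.compile(r"usb (\S+): disabled by hub \(EMI\?\)")


def count_usb_events(lines):
    """Count reset and 'disabled by hub (EMI?)' messages per USB port."""
    resets, emi = Counter(), Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            resets[match.group(1)] += 1
            continue
        match = EMI_RE.search(line)
        if match:
            emi[match.group(1)] += 1
    return resets, emi
```

It could be fed a saved log, for example `count_usb_events(open("dmesg.txt"))`, where `dmesg.txt` is a hypothetical file holding the output of `dmesg`.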

Also the exception that is being caught seems to have changed from
AttributeError: 'RemoteObject' object has no attribute ...
to
received unexpected ACK: 8068
(number after ACK is different each time)

I added two sigrok captures again to the same link:
https://drive.google.com/drive/folders/1ZsbgBWiQUggRHibgv2StUjEyhWG2lpkM?usp=sharing

EDIT:
I can confirm the blacklist is working (I upgraded ModemManager from 1.6.4 to 1.8.2 so that I can set filter options), but I still couldn’t figure out how to stop Ubuntu from polling the ACM devices. I’ll post an edit when I get a chance to try again. I might try opening the USB hubs I have to compare how different USB hub ICs behave. Glad to know there is not a single error packet caused by the ODrive in the capture.

>>> sudo mmcli -G DEBUG;
Successfully set logging level
>>> journalctl -f | grep "ModemManager.*\[filter\]"

(plug in ODrive)

ModemManager[1076]: <debug> [filter] (tty/ttyACM0): port filtered: device is blacklisted

How did you upgrade from ModemManager 1.6.4 to 1.8.2?
I did not have ModemManager installed at all while observing the ODrive USB error. apt installed ModemManager 1.6.4 on my Ubuntu 16.04.

When I try your commands to test USB filtering, I get:

root@MTBD00694:~/odrive_test# mmcli -G DEBUG;
error: couldn’t find the ModemManager process in the bus
root@MTBD00694:~/odrive_test#

==> so it looks like I cannot tell whether USB filtering works or not.

Can you elaborate on the hardware/software side of your CAN communication?
Your post in the CAN discussion suggests a CAN-RS232 converter plus an FTDI-based RS232-USB adapter (can you send us a link to where you bought yours, please?), and the code screenshots suggest a custom C/C++ implementation that sends the CAN commands (over USB in the end).

@madcowswe => I’m really stuck here.
An unreliable system can’t go into production.
It really is a pity, because I remember massively sending USB position commands to the ODrive 24/7 for 3 weeks and everything was fine (without any load on the motor, though). OK, that was in late 2017, so many things must have changed since then. But the whole point was precisely to avoid the situation I’m facing now.

What should I do here? Switch to Windows (omg!)? Use Ubuntu 18? Use a Raspberry Pi? Leave my Ubuntu alone but use another ODrive interface? Surely someone has managed to use ODrive reliably, right?

If your Ubuntu didn’t have ModemManager installed, I don’t think ModemManager could have caused the USB crash in the first place.
But if you wish to try 1.8, you can install it with:

sudo add-apt-repository ppa:aleksander-m/modemmanager-xenial
sudo apt-get update
sudo apt-get --only-upgrade install modemmanager

I had to edit /lib/systemd/system/ModemManager.service

to have the following lines under [Service] section:

...
ExecStart=/usr/sbin/ModemManager --filter-policy=default
...
Environment="MM_FILTER_RULE_TTY_ACM_INTERFACE=0"

I think there is also a snap package which may be installed:

snap list | grep modem-manager

As for the CAN hardware, any CAN <-> serial converter would work. The one I am using can be bought on eBay: https://www.ebay.com/itm/SystemBase-sCAN-RS-232-DE-09S-DB9-female-CAN-DE-09P-DB9-male-serial/323363066646?hash=item4b49f0d316:g:0D4AAOSwqj5bVfod&frcectupt=true
However, I think for this price you have other options from larger manufacturers.
Using this kind of CAN converter makes the host-side software pretty much the same whether you use the ODrive’s UART or CAN interface (read/write to a serial port). I am using a Python program built on the pyserial_asyncio library.

After looking at the logic analyzer captures, I don’t think the cause lies in the ODrive’s USB implementation; it looks like EMI noise affecting the host side (in my case a USB hub). I think USB might not be the best choice for interfacing with a motor controller, because of electrical noise and the way the USB protocol maintains connections. I think your best bet is to try either the UART or the CAN interface. The Stanford Doggo project used UART.

I am confused.
Due in part to this bug, we moved the entire machine back to the lab a few days ago (change of building).

  • 2 days ago, I ran the test with a ferrite on the USB cable near the ODrive board + the udev ENV{ID_MM_DEVICE_IGNORE}="1" rule + ModemManager 1.6.4 installed => the machine ran in an infinite loop from 13h to 16h30 (3h30 straight). We turned it off because we were sick of the noise and thought maybe the issue was gone.
  • Yesterday, I reverted to how it was when the bug appeared (default udev rule + no ferrite + ModemManager uninstalled), in order to check that the bug reappears. It ran again in an infinite loop from 10h30 to 15h30 (5h straight) => the bug did not occur either.

=> I now seem unable to reproduce the bug. :cry: :cry: :cry:

I started a test at 15h30 and will ask the guard to hit the emergency stop in case it’s still running at 20h.

i’m really lost here.

If you need it to be running for hours without any interruption really try the other interfaces and save yourself lots of headache down the road… Especially if you have any cats around.:pouting_cat:

hello,

After a couple of weeks on standby, I worked with the ODrive again.

I made unattended runs (recorded as timelapse videos):
Tuesday May 7th from 14h to 23h => 9h
Thursday May 9th from 11h to 17h30 => 6h30
Friday May 10th from 8h30 to 17h => 8h30
=> all of those ran without any issue.

And this morning, the USB bug triggered once after about a minute and has not reproduced since…

May 13 11:44:41 MTBD00694 kernel: [  940.753530] usb 3-5: new full-speed USB device number 9 using xhci_hcd
May 13 11:44:41 MTBD00694 kernel: [  940.883351] usb 3-5: New USB device found, idVendor=1209, idProduct=0d32
May 13 11:44:41 MTBD00694 kernel: [  940.883354] usb 3-5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
May 13 11:44:41 MTBD00694 kernel: [  940.883357] usb 3-5: Product: ODrive 3.5 CDC Interface
May 13 11:44:41 MTBD00694 kernel: [  940.883358] usb 3-5: Manufacturer: ODrive Robotics
May 13 11:44:41 MTBD00694 kernel: [  940.883359] usb 3-5: SerialNumber: 367033663037
May 13 11:44:41 MTBD00694 kernel: [  940.883957] cdc_acm 3-5:1.0: ttyACM1: USB ACM device
May 13 11:44:41 MTBD00694 mtp-probe: checking bus 3, device 9: "/sys/devices/pci0000:00/0000:00:14.0/usb3/3-5"
May 13 11:44:41 MTBD00694 mtp-probe: bus: 3, device: 9 was not an MTP device
May 13 11:45:55 MTBD00694 kernel: [ 1015.409504] usb 3-5: reset full-speed USB device number 9 using xhci_hcd
May 13 11:45:55 MTBD00694 kernel: [ 1015.538705] cdc_acm 3-5:1.0: ttyACM1: USB ACM device
May 13 11:47:59 MTBD00694 kernel: [ 1139.413994] usb 3-5: USB disconnect, device number 9

usb 3-5: reset full-speed USB device number 9 using xhci_hcd ==> this reset message seems to be related.

Looking at my interface options (https://docs.odriverobotics.com/interfaces)

  • native USB => the recommended approach according to the documentation, but I do experience the random/phantom bug described here (+ the very annoying 10 s delay each time you open a connection to the ODrive)
  • step/dir => not recommended anyway
  • PWM => I need absolute positioning, so I guess this is not an option
  • CAN => not in the official documentation yet + requires extra hardware (CAN-RS232 + RS232-USB => and it’s USB in the end anyway!!!) + requires extensive code to write, since I currently use the Python odrive client and would have to write an entire CAN client… not to mention the time spent waiting for the extra hardware to be sourced
  • UART-PC => the PC I use does not have a hardware serial port. I may find a PCI card that offers one, or use a classic RS232-USB adapter, but then I would risk yet another USB bug. Besides, I don’t know whether I need to write my own code or whether the existing odrive Python lib works over UART too.
  • UART-Arduino => I would need to port all my logic from Python on the PC to Arduino C (quite a lot of work), and since I precalculate all the intermediate absolute positions before sending commands to the ODrive, I’m not sure I would have enough memory on an Uno or a Mega.

i feel stuck :’(

I guess the quick and dirty way would be to implement some sort of watchdog: make the main script start by powering the ODrive off and on, and then regularly update a given file. Then have a monitor script check the date/time of that file => if it’s older than 30 s, assume the main script has died and kill/restart it.
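
The file-based watchdog described above could look roughly like the following. It is a minimal sketch under stated assumptions: the heartbeat path, the function names and the `restart` callback are all hypothetical, and how the main script is actually killed and restarted (and how the ODrive is power-cycled) is left to the caller.

```python
import os
import time

HEARTBEAT_FILE = "/tmp/odrive_main.heartbeat"  # hypothetical path
MAX_AGE_S = 30.0  # the 30 s limit mentioned above


def touch_heartbeat(path=HEARTBEAT_FILE):
    """Called regularly by the main script to show that it is alive."""
    with open(path, "a"):
        os.utime(path, None)  # set the modification time to now


def heartbeat_is_stale(path=HEARTBEAT_FILE, max_age_s=MAX_AGE_S, now=None):
    """Return True if the file is missing or older than max_age_s."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except FileNotFoundError:
        return True
    return age > max_age_s


def check_once(path, restart, max_age_s=MAX_AGE_S, now=None):
    """Monitor step: call restart() if the heartbeat is stale."""
    if heartbeat_is_stale(path, max_age_s, now):
        restart()  # hypothetical callback, e.g. kill and relaunch the main script
        return True
    return False
```

The monitor script would call `check_once` in a loop with a short sleep, while the main script calls `touch_heartbeat` from its control loop.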

any suggestion is welcome really

You are right that both CAN and UART converters connect to the PC over USB; however, with CAN you don’t need a shared GND between the ODrive and the CAN hardware. If you want to keep using USB with the native protocol, galvanically isolating the ODrive from the PC with a USB isolator IC may help (there are prebuilt breakouts as well).

Have you tried writing your own native usb driver? From what you post about the issue, it seems to be a client side problem, and since the messages are not that complicated it might be worth your time.

Yes, I agree. The symptoms suggest a client-side issue. But from my perspective, since a client is supported by the project, it should work.

Anyway, as I was trying to build my simple watchdog concept + obstacle detection algo, I stumbled upon yet another weird behavior this morning. I observed some strange trajectories (typically some lines wouldn’t be straight any more). Sometimes the main “beam” of the CoreXY would even collide with the chassis => something I had seen only when the encoder wheel was physically detached from the motor shaft. But I checked everything and both encoder wheels were tightly attached to both motors. Besides, the behavior would stop if I rebooted the ODrive, then reappear on the following reboot, then disappear, then reappear. I did not understand it at all.

Eventually the main “beam” collided heavily with the chassis and one belt was completely ripped off!!! (Pretty scary, in fact. Fortunately everything is enclosed.) This most likely happened because we changed from a lab power supply that automatically shuts down if it tries to deliver more than 10A to an industrial power supply that delivers up to 25A. It’s the latter that I have been using during the last 3 several-hour-long tests. I’m guessing that, with the power supply no longer shutting off, the belt became the next weak point in the chain. Maybe we should have used some sort of fuse here :confused:

Now if I want to continue, I need to pay about 300€ for belts to replace them both.

But I’m really feeling tired of this project altogether.

Sorry to hear you’re so frustrated.

I had an ODrive force my axis into the end stops a few times, which caused quite a setback the first time it happened. Since then I include springs at both ends of all my axes to prevent damage. I really recommend doing this, and judging by the pictures you showed it would fit nicely. This is where I ordered mine: https://www.bearingboys.co.uk/Light-Load-Die-Springs--Green-2429-c - select an inner diameter that is larger than your beams.

Similarly, about fifteen years ago I was working on a bipedal robot, when it kicked me in the balls because the higher-level controller froze. Fun times.

Anyways, the current state of the project can be a disappointment to people. It does require a lot of configuration to get everything running smoothly, and there are bugs.

I get the impression Oskar and his team work hard to keep everybody happy. However, maybe there should be a clearer disclaimer when you buy an ODrive.

I used a controller from roboteq before, and it is a great product, but I switched to ODrive because of the price point and the weight & size. The roboteq controllers are about 2-4 times as expensive per axis, and about 2 times larger & heavier due to the casing and heatsink. However, their software is easier to use and has more features. https://www.roboteq.com/index.php/roboteq-products-and-services/brushless-dc-motor-controllers#prflag

Short update

  1. I got new belts for 100€ instead of 300€
  2. I’ve ordered springs
  3. I am strengthening the error sanity checks
  4. I am adding an electrical safety (if the power supply delivers more than X amps, it is immediately switched off)
    And most importantly
  5. I got rid of the PC, Python and the USB interface. I opted for an Arduino Mega, an RS232 interface and the ODriveArduino library, which I modified for my needs.

There was quite some work to rewrite the trajectory planning algorithm to fit the constraints of the Arduino board. I’m streaming position commands at 100 Hz. The system has now been working for a few hours and seems rather stable… I guess we will keep “ODrive radio” turned on (having the machine work non-stop for hours) for a few days to build trust again.
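
For readers wondering what streaming setpoints at a fixed rate involves, here is a rough illustration in Python rather than Arduino C. It is not the planner used on the machine: it assumes plain linear interpolation between two absolute positions, and `interpolate_setpoints` is a hypothetical helper name.

```python
def interpolate_setpoints(start, end, duration_s, rate_hz=100):
    """Return evenly spaced absolute positions from start to end.

    One setpoint is produced per tick at rate_hz; the start position is
    excluded (the axis is assumed to already be there) and the final
    setpoint equals end.
    """
    n_steps = max(1, round(duration_s * rate_hz))
    return [start + (end - start) * i / n_steps for i in range(1, n_steps + 1)]
```

At 100 Hz each setpoint would be sent every 10 ms, so a 2 s move needs 200 values. Computing them on the fly, one per tick, avoids storing the whole list on a board with little RAM.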

Yeah, and I also:

  1. Ordered alternatives to ODrive, including from roboteq and ingeniamc.com. In both cases their professional support seems just right, by the way.

Good to hear you’re making progress! If you end up finding that the UART is a bit noisy, you can always get a CAN shield for the Mega and run CAN. You may also want to consider upgrading to a Yun or a Teensy or STM32uino… all of which will run Arduino code, but have much faster 32-bit ARM processors.

I also had this problem, and I found a solution. Do you by any chance use multi-threading in your Python code?
I do, and I was calling the “odrv0” object from two different threads (one set the speed and the other collected information). Reading/writing a shared object from multiple threads can cause collisions, so I made sure to call lock.acquire() and lock.release() around each access.

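import threading
from odrive.enums import AXIS_STATE_CLOSED_LOOP_CONTROL

# Global variable, shared by every thread that talks to odrv0
lock = threading.Lock()

# Inside the threads (example)
lock.acquire()
try:
    if odrv0.axis1.current_state != AXIS_STATE_CLOSED_LOOP_CONTROL:
        odrv0.axis1.requested_state = AXIS_STATE_CLOSED_LOOP_CONTROL
finally:
    # Always release, even if the ODrive call raises
    lock.release()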

and the issues went away.
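
The same pattern can be tried without hardware. The sketch below is only an illustration: `FakeAxis` stands in for `odrv0.axis1`, `CLOSED_LOOP` is a placeholder value, and `request_state` and `run_workers` are hypothetical helpers. It uses `with lock:`, which acquires the lock and releases it even if the body raises.

```python
import threading

CLOSED_LOOP = 8  # placeholder for the requested axis state


class FakeAxis:
    """Stand-in for odrv0.axis1 so the sketch runs without an ODrive."""

    def __init__(self):
        self.requested_state = 0
        self.writes = 0  # counts how many times the state was written


lock = threading.Lock()  # shared by every thread that touches the axis


def request_state(axis, state):
    # The check and the write happen under the same lock, so no other
    # thread can slip in between them.
    with lock:
        if axis.requested_state != state:
            axis.requested_state = state
            axis.writes += 1


def run_workers(axis, state, n_threads=8):
    """Have several threads request the same state and count the writes."""
    threads = [
        threading.Thread(target=request_state, args=(axis, state))
        for _ in range(n_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return axis.writes
```

With the lock in place only the first thread performs the write; the others see the state already set and do nothing.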

Thanks for pointing that out. I was not using multi threading. Besides, I am also pretty sure there was no odrivetool nor any process I am aware of that would access the odrive.

Anyway, my python code is dead code now.

For the record, we have been testing the UART/Arduino interface for several hours over several days now. The instability seems gone. But we will know better when the real use case starts, hopefully in a few days.

Unfortunately the real test is postponed until late August at best :frowning:
I’ll update with results once we have seen the beast running for a few days

Well I guess the topic name might be seriously misleading now :grinning:

The ODrive-powered machine has been back in the pre-production stage since last Monday, after several dozen pallets in the lab to fine-tune its behavior.

Today we cut about 500 boxes with it. No particular issue.

The whole thing seems rather stable. Such tests will continue this week and likely next week too. But at the moment it’s being manually loaded with boxes.
Then, likely in November/December, we will change the current production layout to connect the main conveyor directly to the machine. And then we will finally be able to test ODrive in a serious industrial setup, in a torture-like use case.

If everything goes well – and maybe even if it doesn’t – I may write an entire article entitled “box cutter / an odrive story”

@madcowswe would you be interested ?

What branch are you using, btw? Master firmware?

the default fw that you can download from the main site.

the binary i can see in our directory (along with sha1sum) is

fe0c15963bfbdeda33135d78ac4644a147a83792 *ODriveFirmware_v3.5-48V_0.4.10.elf

We are using the ODrive 3.5 in the 48V version, I believe.

[edit] I made the decision to never touch the odrive firmware and always use master/production fw. [/edit]

We have cut about 2200 boxes with the machine in the last few days, and we are still counting.
It’s a manual “test and validation” setup for the moment, meaning that the machine sits parallel to the main conveyor belt: boxes are taken from the main belt by hand, run through the box cutter, and the newly cut boxes are put back on the belt by hand.

We still have a few issues with non-standard boxes that get jammed in the machine (unrelated to ODrive). We need to get that under control before connecting the main belt to the machine.

but overall it’s looking good.

Loss of USB communication might be caused by the ODrive’s power supply not providing enough current. That was the reason I was getting this error while trying to calibrate the motors.