The RPC topic


#1

We currently use Fibre as the main method of communicating between the Python tools and the ODrive. This has many good features, but it is not that mature, and there could be some risk associated with proceeding with Fibre instead of some more established protocol. Hence I would like to take the time at this stage to re-evaluate what the roadmap in terms of communication should be.

Vision for ODrive

Let me start by giving the picture of how I see the priorities:

  1. Compatibility with existing standards.
    Where it makes sense, it would be nice to be able to plug-and-play with existing systems. Supporting ubiquitous existing standards makes this possible. Examples are G-code, step/dir/enable, basic ascii position commands and maybe even (a subset of) CiA 402. Note that these are protocols that exist “on the side”, and are not the native protocol of the ODrive: I guess the point is that whichever we use as the native protocol, it should be capable of coexisting on the ODrive with these other ones.

  2. Easy to change, remix, prototype and develop with.
    I have no problem sacrificing a bit of memory footprint, bandwidth, CPU-time etc, to gain a protocol that is extremely plug-and-play: between nodes, between vendors, between hardware platforms.
    It should be completely automatic and with short turnaround time to add or change a message.
    The ultimate aim is to have the machinery to very easily add some definition in a single place, and then you can automatically get a float somewhere in firmware memory, along with sensible ways to refer/interface to this float, and the ability to spawn a slider in a GUI for playing with this float.

  3. Transport agnostic
    One day we use USB, another we use UART, or CAN, or Ethernet, or communication lasers.
    We don’t necessarily demand that this exists already for stm32, but it is important that it should be fairly straightforward to add.

  4. Maintenance and support
    ODrive Robotics is extremely resource limited. We are not Google, Apache or Mozilla, and we don’t want to be in the business of maintaining protocol standards. Any and all reuse of existing actively maintained implementations is very good. This doesn’t mean that we will be able to just point our finger at an existing protocol and hope that it will work: we may need to do some legwork to get it suitable for use with ODrive. But ideally we do some integration work that we try to keep as simple as possible, and let someone else maintain the rest.

  5. Popular
    This is somewhat related to the above: we want to maximise the chance that someone else has already written, or will write, a module that supports the protocol, so that we get compatibility for free.
    Example: some completely unrelated entity may write a general gRPC or Thrift interface module for ROS, but is much less likely to write one for a less popular protocol.

  6. An RPC protocol
    To easily implement a decentralized robotics architecture, it’s much more valuable to be able to seamlessly call, invoke, signal and query the remote objects, code and functions, rather than just plainly sending around data by itself.
    As noted above, the machinery required to expose/translate calls should be as automated as possible. This is similar to how in Fibre we can easily point to a C++ function and it’s automatically exposed to Python, or how in other protocols like gRPC you specify the arguments/return/name of the function and interfaces are generated in both languages.
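
To make points 2 and 6 a bit more concrete, here is a minimal sketch in Python of the "define it in a single place" idea: one registration call yields the stored value, a by-name way to get/set/call it, and enough metadata for a GUI to spawn a slider. Everything here (`Registry`, `prop`, `function`, `vel_gain`, the dict-shaped requests) is hypothetical and made up for illustration; it is not Fibre's or gRPC's API, and real firmware would do this in C++.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class Endpoint:
    """One remotely visible item: either a property or a function."""
    name: str
    kind: str  # "property" or "function"
    getter: Optional[Callable[[], Any]] = None
    setter: Optional[Callable[[Any], None]] = None
    func: Optional[Callable[..., Any]] = None
    meta: Dict[str, Any] = field(default_factory=dict)


class Registry:
    """Hypothetical single place where everything remote-visible is declared."""

    def __init__(self):
        self.endpoints = {}
        self._values = {}

    def prop(self, name, default, **meta):
        # One call creates the storage, the accessors and the GUI metadata.
        self._values[name] = default
        cast = type(default)  # coerce incoming values to the declared type
        self.endpoints[name] = Endpoint(
            name, "property",
            getter=lambda: self._values[name],
            setter=lambda v: self._values.__setitem__(name, cast(v)),
            meta=meta,
        )

    def function(self, func):
        # Decorator: expose a plain function under its own name.
        self.endpoints[func.__name__] = Endpoint(func.__name__, "function", func=func)
        return func

    def describe(self):
        # What a client or GUI would fetch to discover the interface.
        return [{"name": e.name, "kind": e.kind, **e.meta}
                for e in self.endpoints.values()]

    def handle(self, request):
        # Dispatch a decoded request; the wire format is out of scope here.
        ep = self.endpoints[request["name"]]
        op = request["op"]
        if op == "get":
            return ep.getter()
        if op == "set":
            ep.setter(request["value"])
            return None
        if op == "call":
            return ep.func(*request.get("args", []))
        raise ValueError(f"unknown op: {op}")


registry = Registry()
registry.prop("vel_gain", 0.02, min=0.0, max=1.0)  # enough info for a slider


@registry.function
def clear_errors():
    return "ok"
```

A GUI would only need `registry.describe()` to decide that `vel_gain` is a float with a range, and a client library would only need `handle`-style requests to read, write or invoke it.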

Evaluation of Fibre today

The vision behind fibre is very aligned with the vision for ODrive, so it’s no surprise that even in the currently fairly immature state, it does a very good job. I can point a finger at some data member or function on the ODrive firmware, and it appears by sheer magic in Python. And this is without having to distribute some schema to anywhere, or regenerate code, or anything.

On the other hand, there is no client side implementation for ODrive, there is no facility to initiate transfers on the ODrive’s initiative (subscription), the latency and bandwidth seen on USB has lots of room for improvement, the density of TODO is immense, I happen to know that some parts were whipped together very quickly, the testing is limited, and the only people testing it are ODrive users. In other words, it is very not mature.

Exacerbating this is the fact that it is all written by @Samuel, who no longer works at ODrive Robotics. So when there are issues, it’s very painful for me because the burden falls on me to fix it. While I am fully capable of digging into the meat of this kind of stuff, I don’t have the time or resources to do so: I need to focus on ODrive’s core features and product-oriented stuff.

Use something established?

The alternative would be to integrate with some established RPC standard. Maybe it doesn’t align as precisely with the vision of ODrive, but it would be a lot more mature and have other active maintainers. Other people to hit bugs first, and smaller chance of issues that require me to deal with them.

The idea would be to spend some focused effort integrating it with ODrive, and then the maintenance burden would be a lot less from then on. Importantly, we can still contribute or add pieces we need to the existing standard, rather than recreate everything. Basically avoid this:

Some things I found

I had a very brief look around to see what might be suitable, and the most appropriate thing I found would be to add a USB transport to gRPC using nanopb. There are others who seem to have expressed interest in communicating gRPC to microcontrollers, including over USB.

Of note is also Apache Avro, specifically its self-describing schema-exchange handshake.
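
For anyone unfamiliar with the idea, here is a rough Python sketch of what a self-describing handshake buys us: the client only has to remember a fingerprint of the interface description, and the device sends the full description only when that fingerprint does not match. This is a simplification written for this thread and not Avro's actual wire format; `Device`, `Client`, `fingerprint` and the example schema are all made-up names, and MD5 is used here purely as a cheap fingerprint.

```python
import hashlib
import json


def fingerprint(schema: dict) -> str:
    """Stable hash of a schema, independent of dict key order."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


class Device:
    """Hypothetical firmware side: owns the authoritative schema."""

    def __init__(self, schema: dict):
        self.schema = schema
        self.hash = fingerprint(schema)

    def handshake(self, client_known_hash):
        # Cheap path: the client already has our schema, send nothing big.
        if client_known_hash == self.hash:
            return {"match": True, "hash": self.hash}
        # Slow path: ship the full schema so the client can cache it.
        return {"match": False, "hash": self.hash, "schema": self.schema}


class Client:
    """Hypothetical host side: caches schemas by fingerprint."""

    def __init__(self):
        self.cache = {}        # fingerprint -> schema
        self.last_hash = None  # fingerprint from the previous session, if any

    def connect(self, device: Device) -> dict:
        reply = device.handshake(self.last_hash)
        if not reply["match"]:
            self.cache[reply["hash"]] = reply["schema"]
        self.last_hash = reply["hash"]
        return self.cache[reply["hash"]]


schema_v1 = {"properties": {"vel_gain": "float"}, "functions": ["clear_errors"]}
device = Device(schema_v1)
client = Client()
first = client.connect(device)   # full schema is transferred once
second = client.connect(device)  # later connects only exchange the hash
```

The appeal for ODrive is that there is no schema file to distribute by hand: flashing firmware with a changed interface changes the fingerprint, and the next connection picks up the new description automatically.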


Using google protobuf for communication protocol instead of json files?
#2

I can generally agree with this analysis. While I do believe that from a technical standpoint Fibre is (or will be) the most suitable choice, I agree that its lack of maturity and popularity is a major drawback.

To aid in a systematic comparison of the technical aspects I made a comparison chart here: https://github.com/samuelsadok/fibre/tree/devel#comparison
It’s not complete so feel free to extend or correct if there are factual errors.

For instance I’m not sure how to list Apache Avro. I understand Avro itself is just a serialization/deserialization protocol, somewhat like protobuf. Avro RPC is an RPC protocol based on Avro, like gRPC is based on protobuf. However you still need to provide a way to transmit data and address devices. It seems like the Apache Kafka distributed streaming framework is usually used to this end.

IMO it’s not clear if the support/maintenance benefit of integrating/adapting another framework (which will inevitably be a non-trivial amount of work) outweighs the benefits of putting the same amount of work into Fibre, however this is not my call to make.


#3

Speaking objectively, I think that the optimal choice is to focus on an existing/mature library/framework, assuming one exists. I do agree that the design goal of Fibre fits ODrive (and a thousand other projects) like a glove; its relative infancy and lack of contributors are my reasons for preferring something established. With that said, devoting more resources to Fibre may very well be a tie between what’s readily available to us and the effort to incorporate it. You’ve got my spare time either way.

Addressing the points/requirements @madcowswe has enumerated (numeric list below corresponds to numeric list in OP)…

  1. I don’t think that an ascii protocol should be a consideration. Topic #2 should solve any demand for it, IMO. So long as a client like odrivetool (albeit maybe more featureful) is available, it should suffice for the human-readable aspect of interaction. Beyond that, just use the library for your preferred language.

  2. Versioning should be included in this.

  3. I think one of the most important criteria when examining our options is the existence of an interface for handling request/response asynchronously. For example, gRPC’s core provides for exactly this, but the devs who wrote the various language-specific implementations (which call the core dll) didn’t necessarily carry over the abstraction. As a result, targeting a different transport than the “standard” HTTP/2 is not as straightforward as it could be (depending on the language you’re working in). How often we hit this issue will be proportional to our desire for individual language implementations.

  4. I would be fine with something unmaintained so long as it isn’t massive and there aren’t a lot of outstanding/abandoned issues and the project saw adequate popularity during its heyday to have ferreted out significant issues.
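
To illustrate what I mean in point 3 by an asynchronous request/response interface that is independent of the transport, here is a small Python sketch using `asyncio`. Each request carries an id, the caller awaits a future, and replies are matched to requests by id, so they may arrive in any order. `LoopbackTransport`, `RpcClient` and `fake_device` are hypothetical names invented for this example; the loopback stands in for USB/UART/CAN and is not any library's real API.

```python
import asyncio
import itertools


class LoopbackTransport:
    """Hypothetical stand-in for USB/UART/CAN: replies after a per-method delay."""

    def __init__(self, handler, delays):
        self._handler = handler
        self._delays = delays   # method name -> artificial latency in seconds
        self.on_receive = None  # set by whoever consumes incoming frames

    def send(self, frame):
        reply = {"id": frame["id"],
                 "result": self._handler(frame["method"], frame["args"])}
        delay = self._delays.get(frame["method"], 0.0)
        # Deliver the reply later, as a real link would.
        asyncio.get_running_loop().call_later(delay, self.on_receive, reply)


class RpcClient:
    """Matches replies to pending requests by id; knows nothing about the link."""

    def __init__(self, transport):
        self._transport = transport
        self._transport.on_receive = self._on_reply
        self._pending = {}  # request id -> future awaiting the reply
        self._ids = itertools.count(1)

    async def call(self, method, *args):
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        self._transport.send({"id": request_id, "method": method, "args": list(args)})
        return await future

    def _on_reply(self, frame):
        future = self._pending.pop(frame["id"], None)
        if future is not None and not future.done():
            future.set_result(frame["result"])


def fake_device(method, args):
    # Hypothetical device-side dispatch.
    if method == "add":
        return args[0] + args[1]
    if method == "echo":
        return args[0]
    raise ValueError(method)


async def demo():
    # "add" is answered late, so the "echo" reply overtakes it on the wire.
    client = RpcClient(LoopbackTransport(fake_device, {"add": 0.05}))
    return await asyncio.gather(client.call("add", 2, 3), client.call("echo", "hi"))
```

Running `asyncio.run(demo())` resolves both calls even though the replies come back out of order. Swapping the loopback for another object with the same `send`/`on_receive` shape is all a new transport would need, which is the abstraction I would like a candidate framework to expose in every language binding.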

A few questions that would help narrow the scope of options and thereby get us closer to breaking the tie between effort and Fibre…

  1. What languages are absolute musts as far as clients go (aside from the obvious c/c++)? I imagine Python. I’ll inevitably write a cross platform dll in C# for whatever we wind up with. What else?
  2. How much of a sacrifice are you able to tolerate as far as memory and cpu go? Say, on a scale of 1-10 where 10 is “can’t add any more features 'cause full” and how much storage or execution memory would put ODrive at a 10.
  3. Are we able to say that we do or do not support multiple transports simultaneously on a given ODrive? e.g. an arduino issuing step/dir/enable and a PC polling values via USB at the same time.
  4. Who/what do you think our clients primarily are? USB? UART? Step/dir/enable?
  5. Any particular licenses you want to avoid?
  6. I’m sure I’ll think of another question later.

I did come across ZeroCM the other day, which seems very much in line with Fibre’s goals (@Samuel you should take a look at it), albeit less featureful than the recent roadmap update. It has minimal scaffolding (as in define schema -> generate code; it generates types/messages but that’s it), but there are several other projects that can fill that role (e.g. Avro, protobuf/nanopb).

PJON is also of note, and appears to provide appropriate interfaces for what we want to do, but I’m really put off by the obscure “standard”.