Update on ARM64: ROS2 on Pi

When I last looked at running ROS on a Raspberry Pi robot brain, I noticed Ubuntu now releases images for Raspberry Pi in both 32-bit and 64-bit flavors, but I didn't know of any compelling reason to move to 64-bit. The situation has now changed, especially for anyone considering a move to ROS2, the future of ROS.

The update came courtesy of an announcement on ROS Discourse notifying the community that supporting 32-bit ARM builds has become troublesome and, furthermore, that telemetry indicated very few ROS2 robot builders were using 32-bit anyway. Thus support for that platform has been demoted to Tier 3 for the current release, Foxy Fitzroy.

This was made official in REP 2000, ROS 2 Releases and Target Platforms, which now shows arm32 as a Tier 3 platform. Per that document, Tier 3 means:

Tier 3 platforms are those for which community reports indicate that the release is functional. The development team does not run the unit test suite or perform any other tests on platforms in Tier 3. Installation instructions should be available and up-to-date in order for a platform to be listed in this category. Community members may provide assistance with these platforms.

Looking at the history of ROS 2 releases, we can see 64-bit has always been the focus. The first release, Ardent Apalone (December 2017), only supported amd64 and arm64. Support for arm32 was only added a year ago for Dashing Diademata (May 2019), and only at Tier 2. It stayed at Tier 2 for one more release, Eloquent Elusor (November 2019), but now it is being dropped to Tier 3.

Another contributing factor is the release of the Raspberry Pi 4 with 8GB of memory, which exceeds the 4GB limit of 32-bit addressing. It was accompanied by an update to the official Raspberry Pi operating system, renamed from Raspbian to Raspberry Pi OS. That OS is still 32-bit, but includes mechanisms to use all 8GB of RAM across the operating system even though each individual process is limited to 3GB. The real way forward is to move to a 64-bit operating system, and there's a beta 64-bit build of Raspberry Pi OS.

Or we can go straight to Ubuntu's release of a 64-bit operating system for Raspberry Pi.

And the final note on ROS2: a bunch of new tutorials have been posted! The barrier for transitioning to ROS2 is continually getting dismantled, one brick at a time.

Learning DOT and Graph Description Languages Exist

One of the conventions of ROS is the /cmd_vel topic. Short for "command velocity", it is commonly how high-level robot planning logic communicates "I want to move in this direction at this speed" to the lower-level nodes that control the robot chassis. In ROS tutorials, this is usually the first practical topic that gets discussed. This convention supports one of the core promises of ROS: portability of modules. High-level logic can be created to output to /cmd_vel without worrying about low-level motor control details, and robot chassis builders know that teaching their hardware to understand /cmd_vel allows them to support a wide range of different robot modules.
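To make that concrete, here is a minimal sketch (not from any particular robot, just an illustration) of what publishing to /cmd_vel looks like in ROS2 Python code, using rclpy and the standard geometry_msgs/Twist message:

```python
# Minimal ROS2 (rclpy) sketch: publish a velocity command on /cmd_vel.
# Assumes a working ROS2 installation with rclpy and geometry_msgs available.
import rclpy
from geometry_msgs.msg import Twist

def main():
    rclpy.init()
    node = rclpy.create_node('cmd_vel_demo')
    publisher = node.create_publisher(Twist, '/cmd_vel', 10)

    msg = Twist()
    msg.linear.x = 0.2   # forward at 0.2 m/s
    msg.angular.z = 0.5  # while turning at 0.5 rad/s

    # Publish at 10 Hz; whatever chassis node subscribes to /cmd_vel is
    # responsible for turning this into actual motor commands.
    node.create_timer(0.1, lambda: publisher.publish(msg))
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()

if __name__ == '__main__':
    main()
```

Any chassis that understands /cmd_vel can act on this without knowing anything about the planner that produced it, which is the portability promise in action.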

Sounds great in theory, but there are limitations in practice, and every once in a while a discussion arises on how to improve things. I was reading one such discussion when I noticed one message had an illustrative graph accompanied by a "source" link. That link went to a GitHub Gist with just a few simple lines of text describing that graph, and it took me down a rabbit hole learning about graph description languages.

In my computer software experience, I've come across graphical description languages like OpenGL, PostScript, and SVG. But those are complex and designed for general-purpose computer graphics; I had no idea there were entire languages designed just for describing graphs. This particular file was DOT, with more information available on Wikipedia, including the limitations of the language.
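As an illustration of how little text is needed, here is a quick sketch using the Python graphviz package (my choice for this example; the Gist itself was plain DOT text) to describe a two-node graph and print its DOT source:

```python
# Minimal sketch using the Python 'graphviz' package to emit DOT text.
# The same graph could just as easily be written by hand as plain DOT.
import graphviz

g = graphviz.Digraph('cmd_vel_example')
g.node('planner', 'navigation planner')
g.node('base', 'chassis controller')
g.edge('planner', 'base', label='/cmd_vel')

print(g.source)  # prints the DOT source describing this graph
# g.render('cmd_vel_example', format='png')  # optional: requires Graphviz installed
```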

I'm filing this under the "TIL" (Today I Learned) section of the blog, but it's more accurately a "How did I never come across this before?" post. It seems like an obvious and useful tool, but not adopted widely enough for me to have seen it before. I'm adding it to my toolbox and look forward to the time when it is the right tool for the job, versus something heavyweight like firing up Inkscape or Microsoft Office just to create a graph to illustrate an idea.

Another Z-Axis End Stop For Geeetech A10

Once the power situation was improved to something more acceptable, I revisited the Z-axis end stop, because the bare wire hack attached with tape was never going to cut it long term.

The first order of business was to transfer the circuit to a small circuit board instead of just wires hanging in the air. This little board was broken off from a larger prototype board, an easy task as the board was already perforated. The inexpensive switch I used (*) had two mounting holes that conveniently lined up with holes on the perforated prototype circuit board, so I soldered two pins at those locations as primary load bearers. I pushed the switch against those two pins as I soldered the three signal pins, so hopefully any downward force from the homing procedure will be directed into the two mounting pins and not the three electrical pins.

Geeetech A10 Z axis end stop clip old and new

Geeetech A10 Z axis end stop clip CAD

Once I had a small circuit board to hold the switch, wired to the appropriate JST-XH (*) 3P (3-pin or 3-position) connector, I designed and 3D-printed a small bracket to hold it to the machine. I saw no signs of how the original Z-axis end stop may have been fastened, certainly no obvious holes to reuse, so I designed a clip-on bracket. The tool-less installation is a plus, but it came with the downside that the bracket could not grip solidly enough to reliably hold its position along the Z-axis.

Right now it sits at the bottom against a cross beam, at a height that I guessed is relatively close to the original Z-axis end stop switch position. If that is too high, I will either have to print a shorter bracket or take a knife and trim some of the bottom off this one. If it is too low, I can add something underneath this bracket to act as a spacer, or print a taller bracket.


(*) Disclosure: As an Amazon Associate I earn from qualifying purchases.

Replacement Power Panel for Geeetech A10

A very crude Z-axis end stop switch allowed me to verify this partial chassis of an old Geeetech A10 could still move in the X, Y, and Z axes. Once that was proven, I went back to refine the hacks done in the interest of expediency for those tests. The first task was the power input, which had been a cheap barrel jack of not quite the correct dimensions for reliable electrical contact with the 12V DC power adapter I've been using.

The 12V supply itself was a hack, as the Geeetech A10 is actually designed as a 24V printer, but I didn't have a 24V power brick handy. Since this printer has been deprived of its print nozzle and heated bed, the majority of the power draw is absent, leaving only the motors. I understand the stepper motor current-chopper drivers would still keep the current within limits and give me nearly equivalent holding torque. However, halving the voltage meant the machine couldn't sustain as high a maximum speed, and I saw this on the Z-axis. The X-axis is super light (as there is no print head) and had no problem running quickly on 12V. The Y-axis has to move the print bed carriage (minus heated print bed) and had a little more difficulty, but was still plenty quick. So it was the Z-axis that ran into limitations first, as it had to push the entire carriage upwards, and it would lose steps at higher speeds well before reaching the firmware speed limits that are presumably achievable if given 24V.

Geeetech A10 power panel CAD

A reduced top speed was still good enough for me to proceed, so I drew up a quick 3D-printable power panel for the printer. Since the 12V DC power supply came from my disassembled Monoprice Mini printer, I decided to reuse its jack and power switch as well. Two protrusions in the printed plastic fit into the extrusion rails, though it took a few prints to dial in the best size to fit within the rails.

With this power panel I could use the 12V DC power adapter and the connection is reliable. No more power resets from jiggled power cables! It also allows me to turn the printer off and on without unplugging the power jack.

With this little power panel in place, I moved on to build a better Z-axis end stop.

Successful Launch Of Mars-Bound Perseverance

The rover formerly known as Mars 2020 is on its way to the red planet! Today marked the start of the interplanetary journey for a robotic geologist. The Perseverance rover is tasked with the next step in the sequence of searching for life on Mars, a mission plan informed by the findings of its predecessor, the Curiosity rover.

I first saw the mission identifier (logo) when it was on the side of the rocket payload fairing and thought it was an amazing minimalist design. It also highlighted the ground clearance for the rover’s rocker-bogie suspension, which has direct application to Sawppy and other 3D-printed models here on Earth. I’ll come back to this topic later.

Speaking of Sawppy, of course I wasn't going to let this significant event pass without some kind of celebration, but plans sometimes go awry. I try to keep project documentation on this blog focused by project, in an effort to avoid disorienting readers with the reality that I constantly jump from one project to another and back on a regular basis. I've been writing about my machine automation project over the past few days and I have a few more posts to go, but here's a sneak peek at another project that will be detailed on this blog soon: the baby Sawppy rover.

It was originally intended to be up and running by Perseverance launch day (today), but I missed that self-imposed deadline. The inspiration was the cartoon rover used as mascot for Perseverance's naming contest; I wanted to bring that drawing to life, and it fit well with my goal to make a smaller, less expensive, and easier-to-build rover model. Right now I'm on the third draft, whose problems will inform a fourth draft, and I expect many more drafts. The first draft was aborted when its problems came to a head before I even printed all the pieces. The second draft was printed, assembled, and motorized.

Getting the second draft up and running highlighted several significant problems with the steering knuckle design. Fixing them required changing not just the knuckles but also the suspension arms that attach to them, and it ended up easier to just restart from scratch on a third draft. I couldn't devote the time to get the little guy up and running by launch day, so I had to resort to a video where I moved it by hand.

Still, people loved baby Sawppy rover, including some people on the NASA Perseverance social media team! The little guy got a tiny slice of time in their Countdown to Mars video, at roughly the one-minute mark, alongside a few other hobbyist rovers.

More details on baby Sawppy rover will be coming to this blog, stay tuned.

Crude Z Axis End Stop For Geeetech A10

Preliminary exploration of a retired Geeetech A10 has gone well so far, well enough that I felt confident discarding the control panel I did not intend to use. Before I tossed the control panel in a box, I verified each of the motors could move via jogging commands. But before I could toss more complex commands at the machine, I needed a way to reset the machine to a known state. In machine tools this is called a "homing" operation, and this 3D printer does so via the G28 Auto Home command, which drives each axis to its end stop switch.

Problem: While the X and Y axes still had their respective end stop switches, this machine is missing the Z-axis switch, so I wanted to whip up a quick hack to test the machine's capabilities. If it works, I'll revisit the problem and spend more time on a proper one. If it doesn't work, at least I haven't wasted a lot of time and effort.

The existing X-axis end stop was buried inside the mechanism, but the Y-axis end stop was visible. I was surprised to see a circuit board with several surface-mount components on board. Unlike most of my other 3D printers, the end stop mechanism isn't just a pair of wires hooked up to a single switch; there are actually three wires.

Geeetech A10 Y endstop

I removed the Y-axis switch to probe the circuit and search online. It appears to be close but not quite identical to the RepRap design, with a few additions like an LED and its associated current-limiting resistor. The LED is a nice indicator of switch toggling status, but it is not strictly necessary. This end stop boiled down to a switch that directly connects the normally open leg to common, and a resistor between the normally closed leg and common.

Once I understood the circuit, I grabbed a micro switch waiting in my parts bin (*) and created a free-form wire-soldered version for the test, attached with double-sided foam tape. (Picture up top.) The foam tape did not hold position well enough, so additional structural support was added in the form of blue painter's tape.

Geeetech A10 Z endstop hack with tape

Hacks upon hacks, it’s hacks all the way down.

But it was good enough for G28 Auto Home to succeed, which opened the door for more tests to verify this 3D printer chassis could still execute motion control commands coordinated across all three dimensions. Once I was satisfied it was working well enough for further tinkering, I revisited the power hack to make it more reliable.
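For anyone curious what verifying a hacked switch might look like from a script, here's a minimal sketch of checking the end stop over USB serial before trusting it with a homing cycle. The port name and baud rate are assumptions based on my setup; Marlin's M119 command reports the state of each end stop switch:

```python
# Minimal sketch: check that a hacked end stop reads correctly before homing.
# Port name and baud rate are assumptions based on my setup (Marlin over USB serial).
import serial
import time

with serial.Serial('/dev/ttyUSB0', 250000, timeout=2) as printer:
    time.sleep(2)                  # the board resets when the serial port opens
    printer.reset_input_buffer()

    # M119 reports each end stop as "open" or "TRIGGERED". Run it once with the
    # switch released, then hold the switch closed and run it again; the z_min
    # line should change state if the wiring is correct.
    printer.write(b'M119\n')
    print(printer.read_until(b'ok').decode(errors='replace'))
```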


(*) Disclosure: As an Amazon Associate I earn from qualifying purchases.

Geeetech A10 Control Panel Removed

Once I had the retired Geeetech A10 3D printer powered up, I could start poking around to see what was working and what was not. Obviously the control panel was my entry point to jog each axis. I was very happy to see the individual motors move on command, but I couldn't command a homing cycle just yet due to the missing Z-axis switch.

However, the control panel itself was annoying to use. The screen contrast was poor, and user responsiveness was lacking. I frequently found that encoder steps were ignored, as were some of my wheel presses to select menu options. I experienced the same frustration with the Monoprice Maker Select, and I had thought those issues were specific to that printer. Now I'm starting to wonder if this is common with 3D printers running Marlin on an ATmega328 control board.

The good news is that I don't plan to interact with the control panel for much more than this initial test. Once I established the board was functional, I no longer feared the USB port damaging my computer, so I found an appropriate USB cable and plugged it in. The expected USB serial device showed up. With the popular settings of 250000 baud 8N1, I could command the printer via Marlin G-code. This is how I intend to control this machine as a three-axis motion control platform.
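As a sketch of what that looks like, here's a minimal Python example using pyserial. The port name is a placeholder for illustration; the 250000 baud 8N1 settings match what worked for me above:

```python
# Minimal sketch: drive the printer as a three-axis motion platform over USB serial.
# The port name is a placeholder; 250000 baud 8N1 matches the settings above.
import serial
import time

def send(printer, gcode):
    """Send one G-code line and return the printer's response up to 'ok'."""
    printer.write((gcode + '\n').encode())
    return printer.read_until(b'ok').decode(errors='replace')

with serial.Serial('/dev/ttyUSB0', 250000, timeout=60) as printer:
    time.sleep(2)                # the board resets when the serial port opens
    printer.reset_input_buffer()

    send(printer, 'G28')                     # home all axes
    send(printer, 'G0 X50 Y50 Z10 F1200')    # move to a position at 1200 mm/min
    print(send(printer, 'M114'))             # report current position
```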

I didn't intend to use the control panel anymore, and I could have just left it alone. But it also sticks out to the side of the printer, awkwardly taking up space. After a particularly painful meeting between a body part and an outer corner of the panel, I took a closer look at how it was connected to the control board. It turned out to be a single ribbon cable plugged into a single connector, with two dabs of hot glue to help keep it in place.

Geeetech A10 control panel ribbon cable connector

I removed the hot glue and the cable to see if this printer would continue functioning as a USB serial peripheral in the absence of the control panel. Good news: it does! I could move all three axes (X, Y, and Z) via G0 commands. So after removing two M5 bolts, the control panel went to live in a box, cleaning up the printer outline and hopefully reducing painful episodes in the future.

Now I need to install a replacement Z-axis homing switch in order to try a homing cycle.

Power Input Replacement for Geeetech A10

I've received the gift of a retired Geeetech A10 3D printer. It is missing some important components for 3D printing, but its three-axis motion control components are superficially intact. The machine is in unknown condition with no warranties expressed or implied. Ashley Stillson, the previous owner, didn't remember everything that was wrong with it, but she did not remember anything dangerous. (My specific question was: "Will I burn down the house if I power it up?")

Not burning down the house was a good baseline, so I'll begin by supplying the machine with some power to see what wakes up. The first task was to replace the XT60 power connector. The XT60 isn't a type I use, and hence I had nothing to plug into it. This type is an excellent connector for high-current applications, but since I'm not planning to run a heated print bed or a filament nozzle heater, I can start with something less capable and more generic. So instead of buying some XT60 connectors (*), I replaced it with a jack for a barrel plug (*) that I already had on hand.

The cheap jack I have on hand is listed with an outer diameter of 5.5mm and an inner diameter of 2.1mm. It is very close but not exactly the correct type to connect to the 12V DC power supply from my disassembled Monoprice Mini printer, which I guess is actually the very similar and popular 5.5mm OD / 2.5mm ID type. But what I have is close enough for a little hacking to permit power to flow.

Later I learned I had made an assumption I didn't even realize I was making at the time: I assumed the printer wanted 12V power. The Geeetech A10 is actually a 24V printer! This is irrelevant to the electronics, which run on a stepped-down voltage, probably 5V. It matters most for the heater elements, which are absent anyway. In the middle is the stepper motor subsystem, where 12V is not ideal, leaving the motors less capable than if they were fed 24V, but they should function well enough to let me evaluate the situation.

When power was supplied, a fan started spinning, a red LED illuminated, followed by the control panel coming to life. We are in business.


(*) Disclosure: As an Amazon Associate I earn from qualifying purchases.

Retired Geeetech A10 3D Printer

My herd of 3D printers has gained a new member: a Geeetech A10. Or at least, most of one. It was a gift from Ashley Stillson, who retired this printer after moving on to other machines. Wear on the rollers indicated it has lived a productive life. Its age also showed in the absence of several of the improvements visible in the product listing for the current version. (And here it is on Amazon. *)

Beyond lacking those newer features, this particular printer is missing several critical components of a 3D printer. There is no print head to deposit melted plastic filament, and no extruder to push filament into the print head. The Bowden tube connecting those two components is missing. There is no print bed to deposit filament onto, and there is no power supply to feed all the electrical appetite.

It does, however, still have all three motorized axes (X, Y, and Z) and a logic board with control panel. The X and Y axes still had their end stop switches, but the Z-axis switch is absent, leaving only a connector for the switch.

Geeetech A10 Z endstop connector

The only remnant of the power supply system is an XT60 plug. I don't use XT60 in my own projects and have none on hand, so I will either need to buy some (*) or swap out the connector to match a power supply I have on hand.

Geeetech A10 XT60 power connector

It would take some work to bring it back into working condition as a 3D printer, but that's not important right now because my idea for this chassis is not to bring it back to printing duty. I'm interested in putting its three-axis motion control capability to other uses. But first, I need to get its three axes moving, which means giving it some power.


(*) Disclosure: As an Amazon Associate I earn from qualifying purchases.

And Now I’m Up To (Most Of) Five 3D Printers

When I first got started in 3D printing, I was well aware of the trend for enthusiasts in the field to quickly find themselves with an entire flock of them. I can confirm that stereotype, as I am now in possession of (most of) five printers.

My first printer, a Monoprice Select Mini, was still functional, but due to its limitations I had not used it for many months. I had been contemplating taking it apart to reuse its parts. When I talked about that idea with some local people, I found a mutually beneficial trade: my functioning printer in exchange for a nearly identical but non-functioning unit to take apart.

My second, a Monoprice Maker Ultimate, has experienced multiple electrical failures with an infamous relay, and I suspect those failures had secondary repercussions that triggered other failures in the system. It is currently not working and awaiting a control board upgrade.

My third printer, a Monoprice Maker Select, was very affordable but there were trade-offs made to reach that price point. I’ve since had to make several upgrades to make it moderately usable, but it was never a joyous ownership experience.

Those three printers were the topic of the tale of 3D printing adventures I told to the Robotics Society of Southern California. One piece of my parting advice was that, once we get to the ~$700 range of the Maker Ultimate, there are many other solid options. The canonical default choice is a Prusa i3, and I came very close to buying one of my own several times.

What I ended up buying is a MatterHackers Pulse, a derivative of the Prusa i3. I bought it during 2019’s “Black Friday” sale season, when MatterHackers advertised their Pulse XE variant at a hefty discount. Full of upgrades that I would have contemplated installing anyway, it has performed very well and I can happily recommend this printer.

Why would I buy a fifth printer when I had a perfectly functioning Pulse XE? Well, I wouldn't. I didn't get this printer because it was better; I picked it up because it was free. I have some motion control (not 3D printing) projects on the candidate list, and a retired partial Geeetech A10 printer may prove useful.

OpenCV AI Kit

For years I’ve been trying to figure out how to do machine vision affordably so I could build autonomous robots. I looked at hacking cheap LIDAR from a Neato robot vacuum. I looked at an old Kinect sensor bar. I looked at Google AIY Vision. I looked at JeVois. I tried to get a grounding in OpenCV. And I was in the middle of getting up to speed on Google ARCore when the OpenCV AI Kit (OAK) Kickstarter launched.

Like most Kickstarters, the product description is written to make it sound like a fantastic dream come true. The difference between this and every other Kickstarter is that it is describing my dream of an affordable robot vision sensor coming true.

The Kickstarter is launching two related products. The first is OAK-1, a single camera backed by hardware acceleration for computer vision algorithms. This sounds like a supercharged competitor to machine vision cameras like the JeVois and OpenMV. However, it is less relevant to a mobile autonomous robot than its stablemate, the OAK-D.

Armed with two cameras for stereoscopic vision plus a third for full-color high-resolution image capture, the OAK-D promises a tremendous amount of capability for (at least for the current batch of backers) a relatively affordable $149. That capability ranges from relatively straightforward stereo distance calculations to more sophisticated inferences (like image segmentation) aided by that distance information.
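To illustrate why the stereo distance part is relatively straightforward: once the stereo pair is calibrated, depth follows from disparity by similar triangles. Here's a tiny sketch with made-up numbers (not OAK-D specifications):

```python
# Tiny sketch of the stereo depth principle: depth = focal length * baseline / disparity.
# All numbers below are made up for illustration; they are not OAK-D specifications.

focal_length_px = 800.0   # camera focal length, in pixels
baseline_m = 0.075        # distance between the two cameras, in meters
disparity_px = 12.0       # how far a feature shifts between left and right images

depth_m = focal_length_px * baseline_m / disparity_px
print(f"Estimated distance: {depth_m:.2f} m")  # 5.00 m for these made-up numbers
```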

Relative to the $99 Google AIY Vision, the OAK-D has far more promise for helping a robot understand the structure of its environment. I hope it ships and delivers on all its promises, because then the OAK-D would become the camera of choice for autonomous robot projects, hands down. But even if not, it is still a way to capture stereo footage for calculation elsewhere, and only moderately overpriced for a three-camera peripheral. Or at least, that's how I justified backing an OAK-D for my own experiments. The project has easily surpassed its funding goals, so now I have to wait and see if the team can deliver the product by December 2020 as promised.

Window Shopping ARCore: API Documentation

While investigating Google ARCore for potential robotics use, it was useful to review their Fundamental Concepts and Design Guidelines documents, because they tell us the motivation behind various details and the priorities of the project. That gives us context for what we see in the nuts and bolts of the actual APIs.

But the APIs are where “the rubber meets the road” and where we leave all the ambitions and desires behind: the actual APIs implemented in shipping phones define the limitations of reality.

We get a dose of reality pretty immediately: estimation of phone pose in the world comes with basically no guarantees on global consistency.

World Coordinate Space
As ARCore’s understanding of the environment changes, it adjusts its model of the world to keep things consistent. When this happens, the numerical location (coordinates) of the camera and Anchors can change significantly to maintain appropriate relative positions of the physical locations they represent.

These changes mean that every frame should be considered to be in a completely unique world coordinate space. The numerical coordinates of anchors and the camera should never be used outside the rendering frame during which they were retrieved. If a position needs to be considered beyond the scope of a single rendering frame, either an anchor should be created or a position relative to a nearby existing anchor should be used.

Since it is on a per-frame basis, we could get a Pose and PointCloud from a Frame. And based on that text, these would then need to be translated through anchors somehow? The first line of the Anchor page makes it sound that way:

Describes a fixed location and orientation in the real world. To stay at a fixed location in physical space, the numerical description of this position will update as ARCore’s understanding of the space improves.

However, I saw no way to retrieve any kind of identifying context for these points. Ideally I would want "put an anchor on that distinctive corner of the table" or some such. Still, "Working with anchors" has basic information on how it is useful. But as covered at many points throughout the ARCore documentation, use of anchors must be kept to a minimum due to computational expense. Each Anchor is placed relative to a Trackable, and there are many ways to get one. The biggest hammer seems to be getAllTrackables from Session, which has a shortcut of createAnchor. There are more narrowly scoped ways to query for Trackable points depending on the scenario.

Given what I see of the ARCore APIs right now, I'm still a huge fan of the future potential. Unfortunately its current state is not a slam dunk for robotics applications, and that is not likely to change in the near future due to explicit priorities set by the product team.

But while I had my head buried in studying ARCore documentation, another approach popped up on the radar: the OpenCV AI Kit.

Window Shopping Google ARCore: Design Guidelines

After I got up to speed on the fundamental concepts of the Google ARCore SDK, I moved on to their design recommendations. There are two parts to their design guidelines: a user experience focused document, and a software development focused variant. They cover many of the same points, but from slightly different perspectives.

Augmented reality is fascinating because it has the potential to create some very immersive interactive experiences. The downside is that a user may get so immersed in the interaction they lose track of their surroundings. Much of the design document described things to avoid in an AR app that boiled down to: please don't let the user hurt themselves. Many potential problems were illustrated by animated cartoon characters, like this one of a user walking backwards, so focused on their phone that they trip over an obstacle. Hence one of the recommendations is to avoid making users walk backwards.

Image source: Google

Some of the user experience guidelines help designers avoid weaknesses in ARCore capabilities, like an admission that vertical surfaces can be challenging because they usually have fewer identifiable features compared to floors and tabletops. I found this interesting because some of the advertised capabilities, such as augmented images, are primarily targeted at vertical surfaces, yet it isn't something they've entirely figured out yet.

What I found most interesting was the discouragement of haptic feedback in both the UX design document and the developer document. Phone haptic feedback is usually implemented as a small electric motor spinning an unbalanced weight, causing vibration. This harms both parts of the Structure from Motion calculations at the heart of phone AR: vibration adds noise to the IMU (inertial measurement unit) tracking motion, and vibration blurs the video captured by the camera.

From a robotics adaptation viewpoint, this is discouraging. A robot chassis will have motors and their inevitable vibrations, some of which would be passed on to a phone bolted to the chassis. The characteristics of this vibration noise would be different from shaky human hands, and given the priorities of the ARCore team, they would work to damp out human shakiness, but robot-induced vibration would not be a priority.

These tidbits of information have been very illuminating, leading to the next step: finding more details in the nitty-gritty API documentation.

Window Shopping Google ARCore: Tracking

I started learning about Google ARCore SDK by reading the “Fundamental Concepts” document. I’m not in it for augmented reality, but to see if I can adapt machine vision capabilities for robotics. So while there are some interesting things ARCore could tell me about a particular point in time, the real power is when things start moving and ARCore works to track them.

The Trackable object types in the SDK represent data points in three-dimensional space that ARCore, well, tracks. I'm inferring these are visible features that are unique enough in the visible scene to be picked out across multiple video frames, and whose movement across those frames was sufficient for ARCore to calculate their positions. Since those conditions won't always hold, individual points of interest will come and go as the user moves around in the environment.

From there we can infer that such an ephemeral nature would require a fair amount of work to make the data useful for augmented reality apps. We'd need to follow multiple feature points so that we can tolerate individual disappearances without losing our reference. And when new interesting features come onto the scene, we'd need to decide if they should be added to the set of information followed. Thankfully, the SDK offers the Anchor object to encapsulate this type of work in a form usable by app authors, letting us designate a particular trackable point as important and telling ARCore it needs to put in extra effort to make sure that point does not disappear. This anchor designation apparently brings in a lot of extra processing, because ARCore can only support a limited number of simultaneous anchors and there are repeated reminders to release anchors that are no longer necessary.

So anchors are a limited but valuable resource for tracking specific points of interest within an environment, and that led to the even more interesting possibilities opened up by the ARCore Cloud Anchor API. This is one of Google's cloud services, remembering an anchor in general enough terms that another user on another phone can recognize the same point in real-world space. In robot navigation terms, it means multiple different robots could share a set of navigation landmarks, which would be a fascinating feature if it can be adapted to serve as such.

In the meantime, I moved on to the ARCore Design Guidelines document.

Window Shopping Google ARCore: Concepts

I thought Google’s ARCore SDK offered interesting capabilities for robots. So even though the SDK team is explicitly not considering robotics applications, I wanted to take a look.

The obvious starting point is ARCore's "Fundamental Concepts" document. Here we can confirm the theory of operation is consistent with an application of Structure from Motion algorithms. Out of all the possible types of information that can be extracted via SfM, a subset is exposed to applications using the ARCore SDK.

Under "Environmental Understanding" we see the foundation supporting AR applications: an understanding of the phone's position in the world, and of surfaces that AR objects can interact with. ARCore picks out horizontal surfaces (tables, floor) upon which an AR object can be placed, or vertical surfaces (walls) upon which AR images can be hung like a picture. All other features build on top of this basic foundation, which also feels useful for robotics: most robots only navigate on horizontal surfaces and try to avoid vertical walls. Knowing where those surfaces are relative to the robot's current position in the world would help with collision detection.

The depth map is the new feature that caught my attention in the first place; in AR it is used for object occlusion. There is also light estimation, helping to shade objects to fit in with their surroundings. Both of these allow a more realistic rendering of a virtual object in real space. The depth map has obvious applications for collision detection and avoidance, more useful than merely detecting vertical wall surfaces. Light estimation isn't obviously useful for a robot, but maybe interesting ideas will pop up later.

In order for users to interact with AR objects, the SDK includes the ability to map the user’s touch coordinate in 2D space into the corresponding location in 3D space. I have a vague feeling it might be useful for a robot to know where a particular point in view is in 3D space, but again no immediate application comes to mind.

ARCore also offers “Augmented Images” that can overlay 3D objects on top of 2D markers. One example offered: “for instance, they could point their phone’s camera at a movie poster and have a character pop out and enact a scene.” I don’t see this as a useful capability in a robotics application.

But as interesting as these capabilities are, they are focused on a static snapshot of a single point in time. Things get even more interesting once we are on the move and correlate data across multiple points in space or, even more exciting, across multiple devices.

Robotic Applications for “Structure From Motion” and ARCore

I was interested in exploring whether I could adapt the capabilities of augmented reality on mobile devices to an entirely different problem domain: robot sensing. First I had to do a little study to verify that it (or more specifically, the Structure from Motion algorithms underneath) isn't fundamentally incompatible with robots in some way. Once I gained some confidence that I wasn't barking up the wrong tree, a quick search online using keywords like "ROS SfM" returned several resources for applying SfM to robotics, including several built on OpenCV. A fairly consistent theme is that such calculations are very computationally intensive. I found that curious, because such traits are inconsistent with the fact that they run on cell phone processors for ARCore and ARKit. A side trip explored whether these calculations were assisted by specialized hardware like the "AI Neural Coprocessor" that phone manufacturers like to tout on their spec sheets, but I decided that was unlikely for two reasons. (1) If deep learning algorithms were at play here, I should be able to find something about doing this fast on the Google AIY Vision kit, Google Coral dev board, or NVIDIA Jetson, but I came up empty-handed. (2) ARCore can run on some fairly low-frills midrange phones like my Moto X4.
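For a taste of what those OpenCV-based resources build upon, here is a minimal two-frame sketch of the core idea: detect features, match them between frames, and recover the relative camera motion. The file names and camera matrix are placeholders I made up; a real SfM pipeline adds triangulation, bundle adjustment, and much more, which is where the computational cost piles up:

```python
# Minimal sketch of the core of two-view Structure from Motion with OpenCV.
# File names and camera matrix K are placeholders; a real pipeline would use
# calibrated intrinsics and many more frames.
import cv2
import numpy as np

img1 = cv2.imread('frame1.png', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('frame2.png', cv2.IMREAD_GRAYSCALE)

# Detect and describe features in both frames
orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Match features between the two frames
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Assumed camera intrinsics (focal length and principal point, in pixels)
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Recover the relative camera rotation R and translation direction t
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
print("Rotation:\n", R, "\nTranslation direction:\n", t)
```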

Finding a way to do SfM on a cell phone class processor would be useful, because that means we could potentially put it on a Raspberry Pi, the darling of hobbyist robotics. Even better if I can leverage neural net hardware like those listed above, but that's not required. So far my searches have come up empty, but something might turn up later.

Turning focus back to ARCore, a search for previous work applying ARCore to robotics returned a few hits. The first hit is the most discouraging: ARCore for robotics is explicitly not a goal for Google, and the issue was closed without resolution.

But that didn’t prevent a few people from trying:

  • An Indoor Navigation Robot Using Augmented Reality by Corotan, et al. is a paper on doing exactly this. Unfortunately, it's locked behind the IEEE paywall. The Semantic Scholar page at least lets me see the figures and tables, where I can spot a few tantalizing details that just make me want to find this paper even more.
  • Indoor Navigation Using AR Technology (PDF) by Kumar et al. is not about robot navigation but human navigation, making it less applicable to my interest. Their project used ARCore to implement an indoor navigation aid, but it required the environment to be known and already scanned into a 3D point cloud. It mentions the Corotan paper above as part of its "Literature Survey"; sadly, none of the other papers in that section were specific to ARCore.
  • Localization of a Robotic Platform using ARCore (PDF) sounded great but, when I brought it up, I was disappointed to find it was a school project assignment and not a report of results.

I wish I could bring up that first paper; I think it would be informative. But even without that guide, I can start looking over the ARCore SDK itself.

Augmented Reality Built On “Structure From Motion”

When learning about a new piece of technology in a domain I don’t know much about, I like to do a little background research to understand the fundamentals. This is not just for idle curiosity: understanding theoretical constraints could save a lot of grief down the line if that knowledge spares me from trying to do something that looked reasonable at the time but is actually fundamentally impossible. (Don’t laugh, this has happened more than once.)

For the current generation of augmented reality technology that can run on cell phones and tablets, the fundamental area of research is "Structure from Motion". Motion is right in the name, and that key component explains how a depth map can be calculated from just a 2D camera image. A cell phone does not have a distance sensor like Kinect's infrared projector/camera combination, but it does have motion sensors. Phones and tablets started out with only a crude low-resolution accelerometer for detecting orientation, but that's no longer the case thanks to rapid advancements in mobile electronics. Recent devices have high-resolution, high-speed sensors that integrate an accelerometer, gyroscope, and compass across the X, Y, and Z axes. These 9-DOF sensors (3 types of data * 3 axes = 9 degrees of freedom) allow the phone to accurately detect motion. And given motion data, an algorithm can correlate movement against the camera video feed to extract parallax motion. That then feeds into code which builds a digital representation of the structure of the phone's physical surroundings.

This method of operation also explains how such technology could not replace a Kinect sensor, which is designed to sit on the fireplace mantle and watch game players jump around in the living room. Because the Kinect sensor bar does not move, there is no motion from which to calculate structure, making SfM useless for such tasks. This educational side quest has thus accomplished the "understand what's fundamentally impossible" item I mentioned earlier.

But mounted on a mobile robot moving around in its environment? That should have no fundamental incompatibilities with SfM, and might be applicable.

Google ARCore Depth Map Caught My Attention

Once I decided to look over an augmented reality SDK with an eye toward robotics applications, I went to look at Google's ARCore instead of Apple's ARKit for a few reasons. The first is hardware: I have been using Android phones, so I have several pieces of ARCore compatible hardware on hand. I also have access to computers that I might be able to draft into Android development duty. In contrast, Apple ARKit development requires macOS desktop machines and iOS hardware, which are more expensive and rarer in my circles.

The second reason was their announcement that ARCore now has a Depth API. Their announcement included two animated GIFs that caught my immediate attention. The first shows that they can generate a depth map, with color corresponding to distance from camera.

ARCore depth map
Image source: Google

This is the kind of data I had previously seen from an Xbox 360 Kinect sensor bar, except the Kinect used an infrared beam projector and infrared camera to construct that depth information on top of its RGB camera. In comparison, Google's demo implies that they can derive similar information from just an RGB camera. And given such a depth map, it should be theoretically possible to use it in a similar fashion to a Kinect. Except now the sensor would be far smaller, battery powered, and, unlike the Kinect, able to work in bright sunlight.

ARCore occlusion
Image source: Google

Here is that data used in an ARCore context: letting augmented reality objects be properly occluded by obstacles in the real world. I found this clip comforting because its slight imperfections assured me this is live data from a new technology, and not a Photoshop rendering of what they hope to accomplish.

It's always the first question we need to ask of anything we see on the internet: is it real? The depth map animation isn't detailed enough for me to judge whether it's too perfect to be true. But the occlusion demo is definitely not too perfect: there are flaws in the object occlusion as the concrete wall moved in and out of the line of sight between us and the animated robot. This is most apparent in the second half of the clip: as the concrete wall retreated, we could see bits of stair that should have been covered up by the robot but are still visible because the depth map hadn't caught up yet.

Incomplete occlusion

So this looks nifty, but what was the math magic that made it possible?

Might A Robot Utilize Google ARCore?

Machine vision is a big field, because there are a lot of useful things we can do when a computer understands what it sees. In narrow machine-friendly niches it has become commonplace. For example, the UPC bar code on everyday merchandise is something created for machines to read, and a bar code reader is a very simplified and specific niche of machine vision.

But that is a long, long way from a robot understanding its environment through cameras, with many subsections along the path that are entire topics in their own right. Again we have successes in narrow machine-friendly domains, such as a factory floor set up for automation. Outside of environments tailored for machines, it gets progressively harder. Roomba and similar robot home vacuums like Neato can wander through a human home, but their success depends on a neat, tidy, and spacious home. As a home becomes more cluttered, the success rate of robot vacuums declines.

But they're still using specialized sensors, not a camera with vision comparable to human sight. Computers have no problem chugging through a 2D array of pixel data, but extracting useful information from it is hard. The recent breakthrough in deep learning algorithms opened up more frontiers. The typical example is a classifier, and it's one of the demos that shipped with the Google AIY Vision kit. (Though not the default, which was the "Joy Detector.") With a classifier, the computer can say "that's a cat", which is a useful step toward something a robot needs, which is more like "there's a house pet in my path and I need to maneuver around it, and I also need to be aware it might get up and move." (This is a very advanced level of thinking for a robot…)

The skill to pick out relevant physical structure from a camera image is useful for robots, but not exclusive to robots. Both Google and Apple are building augmented reality (AR) features into phones and tablets. Underlying that feature is some level of ability to determine structure from an image, in order to overlay an AR object on the real world. Maybe that capability can be used for a robot? Time for some research.

I Do Not (Yet?) Meet The Prerequisites For Multiple View Geometry in Computer Vision

Python may not be required for performing computer vision, with or without OpenCV, but it does make exploration easier. There are unfortunately limits to the magic of Python, contrary to glowing reviews, humorous or serious. An active area of research that is still very challenging is extracting world geometry from an image, something very important for robots that wish to understand their surroundings for navigation.

My understanding of computer vision says image segmentation is very close to an answer here, and while it is useful for robotic navigation applications such as autonomous vehicles, it is not quite the whole picture. In the example image, pixels are assigned to a nearby car, but such an assignment doesn't tell us how big that car is or how far away it is. For a robot to successfully navigate that situation, it doesn't even really need to know if a certain blob of pixels corresponds to a car. It just needs to know there's an object, and it needs to know the movement of that object to avoid colliding with it.

For that information, most of today's robots use an active sensor of some sort: expensive LIDAR for self-driving cars capable of highway speeds, repurposed gaming peripherals for indoor hobby robot projects. But those active sensors each have their own limitations. For the Kinect sensor I had experimented with, the limitations were a very limited range and indoor-only operation. Ideally I would want something using passive sensors, like stereoscopic cameras, to extract world geometry much as humans do with our eyes.

I did a bit of research to figure out where I might get started learning the foundations of this field, following citations. One text that came up frequently is Multiple View Geometry in Computer Vision (*). I found the web page for this book, where I was able to download a few sample chapters. Those sample chapters were enough for me to decide I do not (yet) meet the prerequisites for this class. Having a robot make sense of the world via multiple cameras and computer vision is going to take a lot more work than telling Python to import vision.

Given the prerequisites, it looks pretty unlikely I will do this kind of work myself. (Or more accurately, I'm not willing to dedicate the amount of study I'd need to do so.) But that doesn't mean it's out of reach; it just means I have to find some related previous work to leverage. "Understand the environment seen by a camera" is a desire that applies to more than just robotics.


(*) Disclosure: As an Amazon Associate I earn from qualifying purchases.