First Few Issues of ROS on Ubuntu on Crouton on Chrome OS

Some minor wrong turns aside, I think I’ve successfully installed ROS Melodic on Ubuntu 18 running within a Crouton chroot on a Toshiba Chromebook 2 (CB35-B3340). The first test is to launch roscore, verify it is up and running without errors, then run rostopic list to verify the default set of topics is listed.

With that done, the next challenge is to see if ROS works across machines. First I tried running roscore on another machine, and set the Chromebook’s ROS_MASTER_URI to point to that remote machine. With this configuration, rostopic list shows the expected list of topics.

Then I tried the reverse: I started roscore on the Chromebook and pointed another machine’s ROS_MASTER_URI to the IP address my WiFi router assigned to the Chromebook. In this case rostopic list failed to communicate with the master. There’s probably some sort of network translation or tunneling between Chrome OS and an installation of Ubuntu running inside a Crouton chroot, and that’s something I’ll need to dig into and figure out. Or it might be a firewall issue similar to what I encountered when running ROS under Windows Subsystem for Linux.
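Before digging into the Chrome OS networking layers, it would help to confirm exactly what’s failing. The ROS 1 master is just an XML-RPC server on port 11311, so it can be probed directly. Here’s a minimal sketch, assuming Melodic’s default Python 2 and the standard master API (the example address is made up):

    #!/usr/bin/env python
    # Reachability probe for a ROS 1 master, bypassing rostopic.
    # Expects ROS_MASTER_URI in the environment, e.g. the address
    # the router assigned to the Chromebook: http://192.168.1.20:11311
    import os
    import xmlrpclib  # xmlrpc.client on Python 3

    master_uri = os.environ.get("ROS_MASTER_URI", "http://localhost:11311")
    master = xmlrpclib.ServerProxy(master_uri)
    try:
        code, status, state = master.getSystemState("/reachability_probe")
        publishers, subscribers, services = state
        print("Master OK at %s: %d published topics" % (master_uri, len(publishers)))
    except Exception as e:
        # Connection refused or a timeout here points at routing or
        # firewall issues, not at ROS itself.
        print("Could not reach master: %s" % e)

If this probe times out when aimed at the Chromebook but succeeds when aimed at the other machine, that would be strong evidence port 11311 simply isn’t reachable from outside the chroot.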

In addition to the networking issue, if I want to embed this Chromebook into a robot as its brain, I’ll also need to figure out the power-up procedure.

First: upon power-up, a Chromebook in developer mode puts up a dialog box notifying the user of that fact, letting normal users know a Chromebook in developer mode is not trustworthy for their personal data. This screen is held for about 30 seconds, with an audible beep, unless the user presses the key combination prescribed onscreen. How might this work when the machine is embedded in a robot?

Second: when Chrome OS boots up, how do I also launch Ubuntu 18 inside the Crouton chroot? The good news is that this procedure is covered in the Crouton wiki; the bad news is that it is pretty complex and involves removing a few more Chromebook security provisions.

Third: once Ubuntu 18 is up and running inside the Crouton chroot, how do I launch ROS automatically? My current favorite “run on bootup” procedure for Linux is to create a service, but systemctl does not run inside a chroot so I’ll need something else, like the respawn loop sketched below.
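The simplest substitute I can think of is a watchdog script: a plain loop that launches the ROS stack and respawns it if it dies, invoked from whatever startup hook is available to the chroot. A minimal sketch, with a made-up package and launch file standing in for the real thing:

    #!/usr/bin/env python
    # Poor man's service manager for a chroot without systemd:
    # launch the ROS stack, and respawn it if it ever exits.
    # The package and launch file names here are hypothetical.
    import subprocess
    import time

    while True:
        ros = subprocess.Popen(["roslaunch", "sawppy_bringup", "rover.launch"])
        ros.wait()
        time.sleep(5)  # brief pause so a crash loop doesn't spin the CPU

This gives none of systemd’s dependency management or logging, but it covers the basic “keep ROS running” job until I find a better chroot-friendly answer.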

And that’s only what I can foresee right now; I’m sure there are other issues I haven’t even thought about yet. There will be several more challenges to overcome before a Chrome OS machine can be a robot brain. Perhaps instead of wrestling with Chrome OS, I should consider bypassing it entirely?

Ubuntu 18 and ROS on Toshiba Chromebook 2 (CB35-B3340)

Following the default instructions, I was able to put Ubuntu 16 on a Chromebook in developer mode. But the current LTS (Long Term Support) release for ROS (Robot Operating System) is their “M” or Melodic Morenia release, whose corresponding Ubuntu LTS is 18 (Bionic Beaver).

As of this writing, Ubuntu 18 is not officially supported by Crouton. It’s not explicitly forbidden, but it does come with a warning: “May work with some effort.” I didn’t know exactly what the problem might be, but given how easy it is to erase and restart on a Chromebook, I decided to try it and see what happened.

It failed with a hash sum failure during download. This wasn’t the kind of failure I expected from an unsupported build; a hash sum failure during download seems more like a flawed or compromised download server. I didn’t understand enough about the underlying infrastructure to know what went wrong, never mind fix it. So in an attempt to tackle a smaller problem with a smaller surface area, I backed off to the minimalist “cli-extra” install of Bionic, which skips the graphical user interface components. This path succeeded without errors, and I now have a command line interface that reported itself to be Ubuntu 18 Bionic.

As a quick test to see if hardware is visible to software running inside this environment, I plugged in a USB to serial adapter. I was happy to see dmesg reported the device was visible and accessible via /dev/ttyUSB0. Curiously, the device’s group showed up as serial instead of the usual dialout I see on Ubuntu installations.
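To confirm the port is actually usable and not just visible, a couple of lines of pyserial will do. This assumes the python-serial package is installed in the chroot, and that the current user has been added to that serial group:

    # Quick sanity check that /dev/ttyUSB0 can be opened from the chroot.
    import serial

    port = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)
    print("Opened %s at %d baud" % (port.name, port.baudrate))
    port.close()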

A visible serial peripheral was promising enough for me to proceed and install ROS Melodic. I thought I’d try installation with Python 3 as the Python executable, but that went awry. I then repeated installation with the default Python 2. Since I have no GUI, I installed the ros-melodic-ros-base package. Its installation completed with no errors, allowing me to poke around and see how ROS works in this environment.

Window Shopping: ElectronJS

The Universal Windows Platform allows Windows application developers to create UIs that can dynamically adapt to different screen sizes and resolutions, as well as to different input methods like mouse vs. touchscreen. The selling point is to make this as easy and robust as it is on a web page.

So… why not use a web page? Web developers were the pioneers in solving these problems, and we might want to adapt their existing solutions instead of relying on Microsoft’s effort to replicate them on Windows. But a web page has limitations relative to native applications, and hardware access is definitely one such category. (For USB specifically there is WebUSB, but that is not a general hardware access solution.)

Thus developers familiar with web technology occasionally need to build platform-native applications. Some of them decided to build their own native application framework to support web-style interfaces across multiple platforms. This is why we have Electron. (Sometimes called ElectronJS to differentiate it from its namesake.)

All the major x86_64 desktop operating systems are supported: Windows, MacOS, and Linux are first tier platforms. There’s no fundamental reason Electron won’t work elsewhere, but apparently users need to be prepared to deal with various headaches to run it on platforms like a Raspberry Pi. And that’s just getting it to run; it doesn’t even touch on the most interesting part of running on a Raspberry Pi: its GPIO pins.

Like UWP, given graphical capabilities of modern websites, I have no doubt I can display arbitrary data visualization under Electron. My favorite demo of what modern WebGL is capable of is this fluid dynamics simulation.

Attention then turns to serial communication, and a web search quickly pointed me to the electron-serialport GitHub repo. At first glance this looks promising, though I have to be careful when building it into an Electron app. The tricky part is that this serial support is native code and must be compiled to match a particular release of Electron. It appears the electron-rebuild tool can take care of this particular case. However, it sets the expectation that any Electron app dealing with hardware will likely also require a native code component.

If I ever need to build a graphically dynamic application that needs to run across multiple operating systems, plus hardware access that is restricted to native applications, I’ll come back and take a closer look at Electron. But it’s not the only game in town for an offline local application based on web technology. For applications whose purpose is less about local hardware and more about online connectivity, we also have the option of Progressive Web Applications.

Window Shopping: Universal Windows Platform Fluent Design

Looking over National Instruments’ Measurement Studio reinforced the possibility that there really isn’t anything particularly special about what I want to do for a computer front-end to control my electronics projects. I am confident that whatever I want to do in such a piece of software, I can put it in a Windows application.

The only question is what kind of trade-offs are involved in the different approaches, because there is certainly no shortage of options. There have been many application frameworks over the long history of Windows. I criticised LabWindows for faithfully following the style of an older generation of Windows applications and failing to keep up since. So if I’m so keen on the latest flashy gizmo, I might as well look over the latest in Windows application development: the Universal Windows Platform.

People not familiar with Microsoft platform branding might get unduly excited about “Universal” in the name, as it would be amazing if Microsoft released a platform that worked across all operating systems. The next word dispels that fantasy: “Universal Windows” just means across multiple Microsoft platforms: PC, Xbox, and Hololens. UWP was going to cover Windows Phone as well, but, well, you know.

Given the reduction in scope and the lack of adoption, some critics are calling UWP a dead end. History will show if they are right or not. However that shakes out, I do like the Fluent Design system that launched alongside UWP. It competes with Google’s Material Design, and I think they both have potential for building some really good user interactivity.

Given the graphical capabilities, I’m not worried about displaying my own data visualizations. But given UWP’s intent to be compatible across different Windows hardware platforms, I am worried about my ability to communicate with my own custom built hardware. If it was difficult to rationalize a standard API for some piece of hardware across PC, Xbox, and Hololens, that hardware might not be supported at all.

Fortunately that worry is unfounded. There is a UWP section of the API for serial communication which I expect to work for USB-to-serial converters. Surprisingly, it actually went beyond that: there’s also an API for general USB communication even with devices lacking standard Windows USB support. If this is flexible enough to interface arbitrary USB hardware other than USB-to-serial converters, it has a great deal of potential.

The downside, of course, is that UWP is limited to Windows PCs, excluding Apple Macintosh and Linux computers. If the objective is to build a graphically rich and dynamically adaptable user interface across multiple desktop platforms (not just Windows), we have to use something else.

A Quick Look At NI Measurement Studio

While digging through National Instruments’ online documentation to learn about LabVIEW and LabWindows/CVI, I also came across something called Measurement Studio. This trio of products makes up their category of Programming Environments for Electronic Test and Instrumentation. Since I’ve looked at two out of three, I might as well look at the whole set and jot down some notes.

Immediately we see a difference in the product description. Measurement Studio is not a standalone application, but an extension to Microsoft Visual Studio. By doing so, National Instruments takes a step back and allows Microsoft Visual Studio to handle most of the common overhead of writing an application, stepping in only when necessary to deliver functionality valuable to their target market. What are these functions? The product page lists three bullet points:

  • Connect to Any Hardware – support for electronics industry standard communication protocols (GPIB, VISA, etc.)
  • Engineering UI Controls – on-screen representation of tasks an electronics engineer would want to perform.
  • Advanced Analysis Libraries – data processing capabilities valuable to electronics engineers.

Basically, all the parts of LabVIEW and LabWindows/CVI that I did not care about for my own projects! Thus if I build a computer control application in Microsoft Visual Studio, I’m likely to just use Visual Studio by itself without the Measurement Studio extension. I am not quite the target market for LabVIEW or LabWindows, and I am completely the wrong market for Measurement Studio.

Even if I needed Measurement Studio for some reason, the price of admission is steep. Because Measurement Studio is not compatible with the free Community Edition of Visual Studio, developing with it requires buying a license for a paid tier of Microsoft Visual Studio in addition to the license for Measurement Studio itself.

And finally, it has been noted that these National Instruments products require low level Win32 API access, which prevents them from being part of the new generation of Windows apps that can be distributed via the Microsoft Store. These newer apps promise a better installation and removal experience, automatic updates, and better isolation from each other to avoid incompatibilities like “DLL Hell”. None of those benefits are available to an application that pulls in National Instruments software components, which is a pity.

Earlier I said “if I build a computer control application in Microsoft Visual Studio, I’ll just use Visual Studio by itself without the Measurement Studio extension” which got me thinking: that’s a good point! What if I went ahead and wrote a standard Windows application with Visual Studio?

Digging Further Into LabWindows/CVI

Following initial success of RS-232 serial communication in a LabWindows/CVI sample program, I dived deeper into the documentation. This RS-232 library accesses the serial port at a very low level. On the good side, it allows me to communicate with non-VISA peripherals like a consumer 3D printer. On the bad side, it means I’ll be responsible for all the overhead of running a serial port.

The textbook solution is to leave the main program thread to maintain a responsive UI, and spin up another thread to keep an eye on the serial port so we know when data comes in. The good news here is that the LabWindows/CVI help files say the RS-232 library code is thread safe; the bad news is that I’m responsible for thread management myself. I failed to find much in the help files, but I did find something online about LabWindows/CVI multi-threading. Not super easy to use, but powerful enough to handle the scenario. I can probably make this work.
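The pattern itself is generic, so here is its shape sketched in Python with pyserial rather than in LabWindows’ C API (an illustration of the approach, not LabWindows code): a background thread blocks on serial reads and hands completed lines to the main thread through a thread-safe queue.

    # The "reader thread plus queue" pattern, sketched with pyserial.
    # A LabWindows/CVI version would use its multi-threading library
    # and a thread-safe queue instead, but the shape is the same.
    import queue
    import threading
    import serial

    incoming = queue.Queue()

    def reader(port):
        # Block on the serial port; never touch the UI from here.
        while True:
            line = port.readline()
            if line:
                incoming.put(line)

    port = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)
    threading.Thread(target=reader, args=(port,), daemon=True).start()

    # The main/UI thread polls the queue whenever convenient,
    # so it never blocks waiting on hardware.
    while True:
        try:
            print(incoming.get(timeout=0.1))
        except queue.Empty:
            pass  # keep the UI responsive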

Earlier I noted that the LabWindows/CVI design seems to reflect the state of the art of about ten years ago, not having advanced since. This was most apparent in the visual appearance of both the tool itself and of the programs it generated. Perhaps the target paying audience doesn’t put much emphasis on visual design, but I like to put some effort into my own projects.

Which is why it really pained me when I realized the layout of a LabWindows/CVI program is fixed. Once the controls are laid out in the drag-and-drop tool, that’s it, forever. Maximizing the window only makes the window larger; all the controls stay the same and we just get more blank space. I searched for an option to scale windows and found this article in National Instruments support, but it only meant scaling in the literal sense: when this option is used and I maximize a window, all the controls keep the same layout, just proportionally larger. There’s no easy way to take advantage of additional screen real estate in a productive way.

This means a default LabWindows/CVI program will be unable to adapt to a screen with different aspect ratio, or be stacked side-by-side with another window, or any of the dynamic layout capabilities I’ve come to expect of applications today. This makes me sad, because the low-level capabilities are quite promising. But due to the age of the design and the high cost, I’m likely to look elsewhere for my own projects. But before I go, a quick look at one other National Instruments product: Measurement Studio.

LabWindows/CVI Serial Communication Test

Once I was done with LabWindows’ Hello World tour, it was time for some independent study focused on areas I’m personally interested in. Top of the list was serial port communication. My research ahead of time indicated LabWindows was capable of arbitrary serial protocols. Was that correct? I dived into the RS-232 API to find out.

Before we can open a serial port for communication, we must first find it. And the LabWindows/CVI RS-232 library’s provision for enumerating serial ports is… nothing. There isn’t one. A search on user forums indicates this is the consensus: if someone wants to enumerate serial ports, they have to go straight to the underlying Win32 API.

Puzzled at how a program is supposed to know which COM port to open without an enumeration API, I went into the sample applications directory and found a generic serial terminal program. How did they solve this problem? They did not: they punted it to the user. There is a slider control for the user to select the COM port to open. If the user doesn’t know which device is mapped to which COM port, that is not LabWindows’ problem. So much for user-friendliness.
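For contrast, this is what enumeration looks like in an environment that does provide it. Python’s pyserial ships it as a utility module:

    # Serial port enumeration, the facility LabWindows/CVI lacks:
    # pyserial walks the system's ports and their descriptions.
    from serial.tools import list_ports

    for port in list_ports.comports():
        print(port.device, "-", port.description)

A LabWindows program wanting the same list has to drop down to the Win32 API, as the forum posts suggested.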

I had a 3D printer handy for experimentation, so I tried to use the sample program to send some Marlin G-code commands. The first obstacle was baud rate: USB serial communication can handle much faster speeds than old school RS-232, so my printer defaults to 250,000 baud. The sample program’s baud selection control only went up to 57,600 baud, so it had to be modified to add a 250,000 baud option. After that was done, everything worked: I could command the printer to home its axes, move to a position, etc.
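As an aside, the same exchange is easy to replicate outside LabWindows for comparison. A minimal sketch with pyserial, assuming a Marlin printer on /dev/ttyUSB0 that acknowledges each command with an “ok” line:

    # Send a few Marlin G-code commands and wait for the "ok" replies.
    # Assumes Marlin firmware at 250,000 baud on /dev/ttyUSB0.
    import time
    import serial

    printer = serial.Serial("/dev/ttyUSB0", 250000, timeout=5)
    time.sleep(2)  # most printers reset on connect; let Marlin boot
    printer.reset_input_buffer()

    for command in [b"G28\n", b"G1 X50 Y50 F3000\n"]:  # home, then move
        printer.write(command)
        while True:
            line = printer.readline()
            print(line)
            if line.startswith(b"ok"):
                break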

First test: success! Time to dig deeper.

LabWindows/CVI Getting Started Guide

A quick look through the help files for LabWindows/CVI found it to be an interesting candidate for further investigation. It’s not exactly designed for my own project goals, but there is enough alignment to justify a closer look.

Download is accomplished through the National Instruments Package Manager. Once it was installed and updated, I could scroll through all of the National Instruments software and select LabWindows/CVI for installation. As is typical of development tools, it’s not just one package but many (~20) separate packages that get installed, ranging from the actual IDE to runtime library redistributables.

Once up and running I found that my free trial period lasted only a week, but that’s fine as I only wanted to run through the Hello World tutorial in the LabWindows/CVI Getting Started Guide (PDF). The tutorial walks through generating a simple application with a few buttons and a graphing control that displays a generated sine wave. I found the LabWindows/CVI interface to be familiar, with a strong resemblance to Microsoft Visual Studio, which is probably not a complete coincidence. The code editor, file browser, debug features, and drag-and-drop UI editor are all features I’ve experienced before.

The biggest difference worth calling out is the “Function Panel”, a UI-based tool for generating library calls. While library calls can be typed directly in text like any other C API, there’s the option to do it with a visual representation. The function panel is a dialog box with a description of the function and all its parameters listed in text boxes the developer can fill in. When applicable, the panel also shows an example of the resulting configuration. Once we are satisfied with a particular setup, selecting “Code”/”Insert Function Call” puts all parameters in their proper order in C source code. It’s a nifty way to go beyond a help page of text, and a distinct point of improvement over the Visual Studio I knew.

Not the modern Microsoft Visual Studio, though; more like the Visual Studio of many years ago. The dated visual appearance of the tool itself is consistent with the dated appearance of the available user controls. They are also consistent with the documentation: that Getting Started PDF was dated October 2010 and I couldn’t find anything more recent. The latest edition of the more detailed LabWindows/CVI Programmer’s Reference Manual (PDF) is even older, dated June 2003.

All of these data points make LabWindows appear to be a product of an earlier generation. But never mind the age – how well does it work?

Window Shopping: LabWindows/CVI

I’ve taken a quick look over Keysight VEE and LabVIEW, both tools that present software development in a format resembling physical components and wires: software modules are virtual instruments, data flows are virtual wires. This is very powerful for expressing certain problem domains and naturally imposes a structure. From a software perspective, explicit description of data flow also makes it easier to take advantage of the parallel execution possible on modern multicore processors.

But imposing certain structures also makes it hard to venture off the beaten path, which is why attention now turns to LabVIEW’s stablemate, LabWindows/CVI. They both offer access to industry standard communication protocols plus data analysis and visualization tools, but the approach to data flow and program structure is entirely different. Instead of LabVIEW’s visual “G” language, LabWindows/CVI uses ANSI C to connect all its components and control the flow of data and execution. I am optimistic it will be more aligned with my software experience.

As with LabVIEW, the help files for LabWindows/CVI are available for download and perusal. Things look fairly promising at first glance.

I found a serial communication API that can read and write raw bytes under:

  • Library Reference
    • RS-232 Library
      • function tree

For user display, I found something that resembles LabVIEW’s “2D Picture Control” here called a “Canvas Control”. An overview of drawing with Canvas Control’s basic drawing primitives can be found under:

  • Library Reference
    • User Interface Library
      • Controls
        • Control Types
          • Canvas Controls
            • Programming with Canvas Controls

I’m encouraged by what I found looking through LabWindows/CVI help files, enough to download the actual development tool and get hands-on with it.

Window Shopping: LabVIEW 2019

After taking a quick look over Keysight VEE, I switched focus to LabVIEW by National Instruments. I don’t know how directly these two products compete in the broader market, but I do know they have some overlap relating to instrument control. I had some exposure to LabVIEW many years ago thanks to LEGO Mindstorms, which used a version of LabVIEW for programming the NXT brick. Back then the Mindstorms-specific version was very closely guarded and, when I lost track of my CD-ROM, I was out of luck because neither NI nor LEGO made it available for download. Thankfully that has since changed and the Mindstorms flavor of LabVIEW is available for download.

But I’m not focused on LEGO right now; today’s aim is to see how I might fulfill my general computer control goals with this tool. For that information I was thankful National Instruments made the help files for LabVIEW available for download, so I could investigate without installing the full tool suite. It took a bit of hunting around to find them, though, and the download page was titled LabVIEW 2018 even though it had a download link for the 2019 help files.

I found a help page “Serial Port Communication” under the section:

  • Controlling Instruments
    • Types of Instruments

It assumes the user will only be controlling devices that communicate via the VISA protocol, not performing general serial communication. There was more serial communication information in the section:

  • VISA Resource
    • I/O Session
      • Serial Instr

There’s also an online tutorial for instrument communication. This page has a flowchart implying there’s a “Direct I/O” mode we can fall back on if all else fails, but I found no mention of how to perform this direct I/O in the help files.

The graphics rendering side was more straightforward. There’s no mention of ActiveX control here, but under:

  • Fundamentals
    • Graphs and Charts
      • Graphics and Sound VIs

There are multiple pages of information for a “2D Picture Control” with drawing primitives like points, lines, arcs, etc. Details on this drawing API are found under:

  • VI and Function Reference
    • Programming VIs and Functions
      • Graphics & Sound VIs
        • Picture Plot VIs

However, it’s not clear this functionality scales to complex drawings with thousands (or more) of primitives. It certainly wouldn’t be the first time I used an API that stumbled as the data size grew.

So the drawing side looks workable pending a question mark on how well it scales, but the serial communication side is blocked. Until I find a way to perform that mystical direct I/O, I’m going to set LabVIEW aside and look at its sibling LabWindows/CVI.

[UPDATE: I’ve since found LabVIEW MakerHub and LINX, which allows LabVIEW to communicate with maker level hardware over serial.]

Window Shopping: Keysight VEE Custom Data Display

Looking over Keysight VEE’s support for device communication, I found there is only support for a limited subset of USB serial communication patterns. And even the supported transaction model seems quite labor intensive to craft. It left me with the impression that venturing outside VEE’s supported list of equipment is something to be avoided.

Attention then turned to VEE’s support for arbitrary display of data. Like its competitors in the space of test instrumentation software, it has an extensive library for data analysis common to the problem domain. This is useful for their paying customers, but again quite restrictive if we want to venture outside their supported list.

As far as I can tell by just reading their Advanced Techniques PDF, the way to add a custom data visualization component is to create an ActiveX control. This is a technology I haven’t thought about in years! I first learned of it in the context of Microsoft Visual Basic decades ago, where people could drag and drop UI components to build their application. Each of those visual components was built with technology that eventually became known as ActiveX controls. This technology is so old not even Microsoft is investing in it now; they have moved on, handing stewardship to an open standards body.

The fact that ActiveX is the state-of-the-art technology for extending VEE is telling. Looking over the recent history of VEE software releases, it has all the signs of a piece of software living on life support. They are still releasing new versions on a regular basis, but the advances between releases are mostly new instrument support (GPIB and otherwise) and certification that it will run on the latest edition of Windows. I have seen very little in the way of new feature development or general evolution.

VEE seems to be perfectly suited to its target market: electronics engineers trying to automate a collection of instruments, every one of which supports industry standard protocols (especially instruments made by Keysight), then analyzing that data in the ways typically needed by electrical engineers. But since my goal is to control arbitrary equipment communicating over USB serial, then process and display that data in ways unrelated to electrical engineering, I should set VEE aside and look at other options.

Window Shopping: Keysight VEE Serial Communication

When I was learning about industry standards for electronics test and measurement equipment automation, I quickly came across GPIB, which has its roots in something Hewlett-Packard developed for their equipment. It made sense, then, that they would have a software suite to run on a PC and talk to these instruments via GPIB. This turned out to be something called VEE, but it is no longer an HP product. It has had multiple custodians: from HP it moved to Agilent, and now it is in the hands of Keysight Technologies.

So it was no surprise the focus would be on professional equipment with GPIB or the closely related USB-based successor USBTMC. There is also built-in support for a few other instrument standards, all packaged together in the IO Libraries Suite of a full VEE installation. However, I had a hard time finding any mention of how to communicate with custom-built equipment outside of supported protocols. It certainly wasn’t covered in their Quick Start Guide (PDF), so I moved on to Advanced Techniques (PDF).

I thought perhaps I would have to create what they call a “panel driver” for installation into VEE in order to support custom equipment, but a search for “How to write a VEE Panel Driver” failed to retrieve useful links. How do instrument manufacturers create and release VEE panel drivers for their equipment? So far that is still a mystery.

In the absence of a custom panel driver, the next option is to specify a custom communication protocol directly in VEE, and such a thing can be built with their Transaction I/O mechanism, which is suitable at least for query/response styles of interaction. The exact commands being sent out and expected in response are crafted step by step using the VEE GUI. This seems very labor intensive, but has the advantage of avoiding the annoying and common byte processing bugs typical when such serial byte stream processing is written in C.
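For a sense of scale, a complete query/response transaction of the kind VEE builds up through its dialog boxes boils down to a write followed by a read. Sketched in Python against a hypothetical SCPI-style instrument (*IDN? is the standard SCPI identification query; the port and baud rate are made up):

    # The query/response "transaction" pattern, written out directly.
    import serial

    instrument = serial.Serial("/dev/ttyUSB0", 9600, timeout=2)
    instrument.write(b"*IDN?\n")   # the WRITE transaction
    reply = instrument.readline()  # the READ transaction
    print(reply.decode(errors="replace"))

Of course, writing it out this way also exposes all the byte-processing pitfalls that VEE’s structured approach is meant to paper over.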

It’s not obvious from reading the document what happens if the VEE transaction specification is wrong and incoming serial data doesn’t match. This is not encouraging, and neither are there any mechanisms to help support development of transaction I/O. There’s just trial and error. Seriously, this is a direct quote from the manual:

Many times the best way to develop the transactions you need is by using trial and error

(Chapter 4: Using Transaction I/O / Creating and Reading Transactions / Editing the Data Field / Suggestions for Developing Transactions)

I didn’t find any mention of how to deal with a continuous data stream from devices that do not perform transaction-based communication: for example, a thermometer that continuously reports temperature without prompting, or the Neato LIDAR. VEE does have a data polling feature, but that seems to be restricted to devices on a specific subset of supported protocols, not arbitrary serial communication.

From this brief survey, it appears VEE’s support for arbitrary USB serial communication is quite limited. The next step is to look at how VEE supports displaying arbitrary data.

Preparing For ROS 2 Transition Looks Complicated

Before I decided to embark on a ROS Melodic software stack for Sawppy, I thought about ignoring the long legacy of ROS 1 and going straight to the newer ROS 2, built on more modern infrastructure. I mean, I told people to look into it, so I should walk the walk, right? Eventually I decided against putting Sawppy on ROS 2; the deal breaker was that the Raspberry Pi is not a tier 1 platform for ROS 2. This means there’s no guarantee of regular binary releases for it, or that it will always function. I might have to build my own arm32 binaries for Raspbian from source code, and I would be on my own to verify functionality. I’ve done a superficial survey of other candidates for a Sawppy brain, but for today Sawppy is still thinking with a Raspberry Pi.

But even after making that decision I wanted to keep ROS 2 in mind. Open Robotics has a ROS 2 migration guide for helping ROS node authors navigate the transition, and it doesn’t look trivial to me. But then again, I don’t have the ROS expertise to accurately judge the effort involved.

The biggest headache for some nodes will be the lack of Python 2 support. This mainly impacts ROS nodes with a long legacy of Python 2 code; it does not impact a new project written against ROS Melodic, which is supposed to support Python 3.

The next headache is the fact that it’s not possible to write if/else blocks to allow a single ROS node to simultaneously support ROS 1 and ROS 2. The recommendation is to put all the specialized logic into generic, non-ROS-specific code in a library that can be shared, then write separate code tailored to the infrastructure paradigms of ROS 1 and ROS 2. This way all the code integrating with a ROS platform is kept separate while calling into a shared library.
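As a concrete (if simplified) sketch of that recommendation: keep the interesting logic free of ROS imports, then write a thin wrapper per ROS version. The names below are made up for illustration; the ROS 2 branch would hold an equivalent rclpy wrapper calling the very same shared function.

    # Shared, ROS-agnostic logic. In a real project this lives in its
    # own library, imported by both the ROS 1 and ROS 2 nodes; it is
    # inlined here to keep the sketch self-contained.
    def wheel_velocities(linear, angular, track_width):
        # Differential-drive math with no ROS dependencies anywhere.
        left = linear - angular * track_width / 2.0
        right = linear + angular * track_width / 2.0
        return left, right

    # The thin ROS 1 wrapper; only this part knows about rospy.
    import rospy
    from geometry_msgs.msg import Twist

    TRACK_WIDTH = 0.43  # meters; a made-up number for illustration

    def on_cmd_vel(msg):
        left, right = wheel_velocities(msg.linear.x, msg.angular.z, TRACK_WIDTH)
        rospy.loginfo("wheel velocities: %.2f %.2f", left, right)

    rospy.init_node("rover_driver")
    rospy.Subscriber("cmd_vel", Twist, on_cmd_vel)
    rospy.spin()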

And it also sounds like the ROS 1 and ROS 2 build systems conflict, so they can’t even coexist side by side. Different variants of a node have to live in separate branches of a repository, with the shared library code merged across branches as development continues, while the ROS 1 and ROS 2 specific infrastructure code lives in its own branch.

I can see why a vocal fraction of ROS developers are unhappy with this “best practice”. And since ROS is open source, I foresee one or more groups joining forces to keep ROS 1 alive and working with old code even as Open Robotics moves on to ROS 2. Right now there are noises being made by people who proclaim to be doing a similar thing for Python 2, saying they’ll keep it alive past its official EOL. In a few years we can look back and see if those Python 2 holdouts actually thrived, and we can also see how the ROS 1/ROS 2 situation has evolved.

Using Adobe Photoshop Perspective Warp To Get Top View On Large Chalk Drawings

And now, my own little behind-the-scenes feature for yesterday’s post about Pasadena Chalk Festival 2019. When organizing my photos from the event, I realized it might be difficult to see progression from one picture to the next due to changing viewing angles. When I revisited a specific piece, I could never take another picture from the same perspective. Most of the time it was due to someone else in the crowd blocking my view, though occasionally it was the artist themselves.

Since these chalk drawings were large, we could only take pictures from an oblique angle, making the problem even worse. So for yesterday’s post I decided it was time to learn the Perspective Warp tool in Adobe Photoshop and present a consistent view across shots. There are plenty of tutorials on how to do this online, and now we have one more:

Perspective correct 10

Step 1: Load original image

Optional: Use “Image Rotation…” under the “Image” menu to rotate the image to most closely approximate the final orientation. In this specific example, the camera was held in landscape mode (see top) and so the image had to be rotated 90 degrees counterclockwise. Photoshop doesn’t particularly care about the orientation of your subject, but it’s easier for our human brains to pick up problems as we go when it’s properly oriented.


Perspective correct 20

Step 2: Under the “Edit” menu, select “Perspective Warp”


Perspective correct 30

Step 3: Enter Layout Mode

Once “Perspective Warp” has been selected, we should automatically enter layout mode. If not, explicitly select “Layout” from the top toolbar.


Perspective correct 40

Step 4: Create Plane

Draw and create a single rectangle. There are provisions for multiple planes in layout mode, but we only need one to correspond to the chalk drawing.


Perspective correct 50

Step 5: Adjust Plane Corners

Drag the corners of the perspective plane to match the intended surface. Most chalk drawings are conveniently rectangular, with distinct corners we can use for the task.


Perspective correct 60

Step 6: Enter Warp Mode

Once the trapezoid is representative of the rectangle we want in the final image, click “Warp” on the top toolbar.


Perspective correct 70

Step 7: Warp to desired rectangle

Drag the corners again, this time back into a rectangle of the shape we want. Photoshop provides tools to help align edges to vertical and horizontal (see the tools to the right of “Warp”), but establishing the proper aspect ratio is up to the operator.


Perspective correct 80

Step 8: Perspective Warp Complete

Once perspective correction is done, click the check mark (far right of the top toolbar) to complete the process. At this point we have a good rectangle of chalk art, but the image around it is a distorted trapezoid. Use the standard crop tool to trim the excess and obtain the desired rectangular chalk art from its center.


Chalk festival Monsters Inc 20

Art by Jazlyn Jacobo (Instagram deyalit_jacobo)

Unity 3D Editor for Ubuntu Is Almost Here

So far I’ve dipped my toes in the water of reinforcement learning, reinstalled Ubuntu and TensorFlow, and looked into Unity ML-Agents. It looks like I have a tentative plan for building my own reinforcement learning agents trained with TensorFlow in a Unity 3D environment.

There’s one problem with this plan, though: I have GPU-accelerated TensorFlow running in Ubuntu, but today the Unity editor only supports MacOS and Windows. If I wanted to put them all together, on paper it means I’d have to get Nvidia GPU support up and running on my Windows partition and take on all the headaches that entails.

Thankfully, I’m not under a deadline to make this work immediately, so I can hope that Unity brings their editing and creation environment to Ubuntu. The latest preview build was released only a few days ago, and they expect Linux to be a fully supported operating system for the Unity Editor by the end of the year.

I suspect they’ll be ready before I am, because I still have to climb the newcomer learning curve of reinforcement learning. I first have to learn the ropes using prebuilt OpenAI environments. It’ll be a while before I can realistically contemplate designing my own agents and simulation environments.

Once I reach that point I hope I will be able to better evaluate whether my plan will actually work. Will Unity ML-Agents work with GPU-accelerated TensorFlow running in a Docker container on Ubuntu? I’ll have to find out when I get there.

First ROS 2 LTS Has Arrived, Let’s Switch

Making the decision to explore the less popular path of smarter software for imperfect robot hardware has a secondary effect: it also means I can switch to ROS 2 going forward. One of the downsides of going over to ROS 2 now is that I lose access to the vast library of open ROS nodes freely available online. But if I’ve decided I’m not going to use most of them anyway, there’s less of a draw to stay in the ROS 1 ecosystem.

ROS 2 offers a lot of infrastructure upgrades that should be, on paper, very helpful for work going forward. First and foremost on my list is the fact that I can now use Python 3 to write code for ROS 2. ROS 1 is coupled to Python 2, whose support stops in January 2020, and there’s been a great deal of debate in ROS land on what to do about it. Open Robotics has declared that their future work along this line is all Python 3 on ROS 2, leaving the community to devise its own ways of making Python 3 work on ROS 1. Switching to ROS 2 now lets me use Python 3 in a fully supported manner, no workarounds necessary.
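To show what that looks like in practice, here is a minimal ROS 2 publisher in unapologetic Python 3 (f-string included), written against the rclpy API as I understand it in Dashing:

    # Minimal ROS 2 node in Python 3, using rclpy from ROS 2 Dashing.
    import rclpy
    from rclpy.node import Node
    from std_msgs.msg import String

    class Talker(Node):
        def __init__(self):
            super().__init__('talker')
            self.pub = self.create_publisher(String, 'chatter', 10)
            self.count = 0
            self.create_timer(1.0, self.tick)  # publish once per second

        def tick(self):
            msg = String()
            msg.data = f'hello from Python 3, message {self.count}'
            self.count += 1
            self.pub.publish(msg)

    def main():
        rclpy.init()
        rclpy.spin(Talker())
        rclpy.shutdown()

    if __name__ == '__main__':
        main()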

And finally, investing in learning ROS 2 now has a much lower risk of that time being thrown away by a future update. ROS 2 Dashing Diademata has just been released, and it is the first long term support (LTS) release for ROS 2. I read this as a sign that Open Robotics is confident the period of major code churn for ROS 2 is coming to an end. No guarantees, naturally, especially if they learn of something that affects the long term viability of ROS 2, but the odds have dropped significantly with evolution over the past few releases.

The only drawback for my personal exploration is the fact that ROS 2 has not yet released binaries for running on a Raspberry Pi. I could build my own Raspberry Pi 3 version of ROS 2 from open source code, but I’m more likely to use the little Dell Inspiron 11 (3180) I had bought as a candidate robot brain. It is already running Ubuntu 18.04 LTS on an amd64 processor, making it a directly supported Tier 1 platform for ROS 2.

Let’s Learn To Love Imperfect Robots Just The Way They Are

A few months ago, as part of preparing to present Sawppy to the Robotics Society of Southern California, I described a few of the challenges involved in putting ROS on my Sawppy rover. That was just the tip of the iceberg and I’ve been thinking and researching in this problem area on-and-off over the past few months.

Today I see two divergent paths ahead for a ROS-powered rover.

I can take the traditional route, where I work to upgrade Sawppy components to meet expectations from existing ROS libraries. It means spending a lot of money on hardware upgrades:

  • Wheel motors that can deliver good odometry data.
  • Laser distance scanners faster and more capable than one salvaged from a Neato vacuum.
  • Depth camera with better capabilities than a first generation Kinect.
  • etc…

This conforms to a lot of what I see in robotics hardware evolution: more accuracy, more precision, an endless pursuit of perfection. I can’t deny the appeal of having better hardware, but it comes at a steeply rising cost. As anyone dealing with precision machinery or machining knows, physical accuracy costs money: how far can you afford to go? My budget is quite limited.

I find more appeal in pursuing the nonconformist route: instead of spending ever more money on precision hardware, make the software smarter to deal with imperfect mechanicals. Computing power today is astonishingly cheap compared to what it cost only a few years ago. We can add more software smarts for far less money than buying better hardware, making upgrades far more affordable. It is also less wasteful: retired software is just bits, while retired hardware gathers dust, sitting there reminding us of past spending.

And we know there’s nothing fundamentally wrong with looking for a smarter approach, because we have real world examples in our everyday life. Autonomous vehicle researchers brag about sub-centimeter accuracy in their 3D LIDAR… but I can drive around my neighborhood without knowing the number of centimeters from one curb to another. A lot of ROS navigation is built on an occupancy grid data structure, but again I don’t need a centimeter-aligned grid of my home in order to make my way to a snack in the kitchen. We might not yet understand how it could be done with a robot, but we know these tasks are possible without the precision and accuracy demanded by certain factions of robotics research.

This is the path less traveled by, and trying to make less capable hardware function using smarter software will definitely have its moments of frustration. However, the less beaten path is always a good place to go looking for something interesting and different. I’m optimistic there will be rewarding moments to balance out the frustrating ones. Let’s learn to love imperfect robots just the way they are, and give them the intelligence to work with what they have.

Researching Simulation Speed in Gazebo vs. Unity

In order to train reinforcement learning agents quickly, we want our training environment to provide high throughput. There are many variables involved, but I started by looking at two of them: how fast a single simulation can run, and how easy it is to run multiple simulations in parallel.

The Gazebo simulator commonly associated with ROS research projects has never been known for its speed. The Gazebo environment for the NASA Space Robotics Challenge was infamous for slowing far below real time, taking over 6 hours to simulate a 30 minute event. There are ways to speed up Gazebo simulation, but this forum thread implies it’s unrealistic to expect more than 2-3 times real time speed.

In contrast, Unity simulation can be cranked all the way up to 100 times real time speed. It’s not clear where the maximum limit of 100 comes from, but it is documented under limitations.md. Furthermore, it doesn’t seem to be a theoretical limit no one can realistically reach: at least one discussion on Unity ML-Agents indicates people do indeed crank the time multiplier up to 100 for training agents.

On the topic of running simulations in parallel, Gazebo is such a resource hog that it is difficult to get multiple instances running. This forum thread explains it is possible and how it could be done, but at best it still feels like shoving a square peg into a round hole, and it’ll be a tough act to keep multiple Gazebo instances running. And that’s before we even consider the effort to coordinate learning activity across those instances.

Things weren’t much better in Unity until recently. This announcement blog post describes how Unity has just picked up the ability to run multiple simulations on a single machine and, just as importantly, coordinate learning knowledge across all instances.

These bits of information further cement Unity as something I should strongly consider as my test environment for playing with reinforcement learning. Faster than real time simulation and the option of multiple parallel instances are quite compelling reasons.


Quick Overview: Unity ML Agents

Out of all the general categories of machine learning, I find myself most interested in reinforcement learning. These problems (and associated solutions) are most applicable to robotics, forming the foundation of projects like Amazon’s DeepRacer. And the fundamental requirement of reinforcement learning is a training environment where our machine can learn by experimentation.

While it is technically possible to train a reinforcement learning algorithm in the real world with real robots, it is not really very practical. First, because a physical environment will be subject to wear and tear, and second, because doing things in the real world at real time takes too long.

For that reason there are many digital simulation environments in which to train reinforcement learning algorithms. I thought this would be an obvious application of robot simulation software like Gazebo for ROS, but that turned out to be only partially true. Gazebo addresses half of the requirements: a virtual environment that can be easily rearranged and rebuilt, and is not subject to wear and tear. However, Gazebo is designed to run as a single instance, and its simulation engine is complex enough that it can fall behind real time, meaning it takes longer to simulate something than it would take in the real world.

For faster training of reinforcement learning algorithms, what we want is a simulation environment that can scale up to run multiple instances in parallel and can run faster than real time. This is why people started looking at 3D game engines. They were designed from the start to represent a virtual environment for entertainment, and they were built with performance in mind to sustain high frame rates.

The physics simulation inside Unity would be less accurate than Gazebo’s, but it might be good enough for exploring different concepts. Certainly the results would be good enough if the whole goal is to build something for a game, with no aspirations of adapting it to the real world.

Hence the Unity ML-Agents toolkit for training reinforcement learning agents inside the Unity game engine. The toolkit is nominally focused on building smart agents for game non-player characters (NPCs), but that is a big enough toolbox to offer possibilities well beyond that. It has definitely earned a spot on my to-do list for closer examination in the future.

One Year Of Daily New Screwdriver Posts

When asked how to be a better writer, many successful writers give the same advice: start writing and keep writing. It doesn’t matter the topic or the length. It doesn’t matter if there is only an audience of one (yourself). Write. As much and as often as you can, write. There’s no guarantee that writing more will necessarily lead anywhere, but it is certain that not writing will not lead to success.

On a parallel front, I had been worried about my chosen path of independent study. Without an established curriculum or schedule, it was all too easy to lose track of what I’ve tried, what I’ve learned, and how I’ve improved as a result. To prevent this, and to keep myself motivated and accountable, I decided to start recording my progress in the format of blog posts.

Hence was born this blog site, NewScrewdriver.com. Named after an offhand reference made by Dr. Who to “inventing a new type of screwdriver,” this is a record of my adventures and also a long term, continuing writing exercise. The very first post was about WordPress itself, because that’s obviously what I had to learn to set up this blog. The adventures went on from there. I described projects big and small. I documented significant advances and useless distractions. I wrote about new discoveries and popular knowledge that was nevertheless new to me. Some posts were packed with useful information, some were just the mindless rants of a flailing man. But they had one thing in common: they recorded what I had been doing with my time.

I eventually settled on a target length of 300 words: any shorter and it’d be difficult to tell a whole story with a clear beginning, middle, and end; any longer and we see diminishing returns on the time it takes to write and read. 300 words takes an average of about half an hour to write, and less than five minutes to read. It seemed like a practical size for the project.

A distressing number of internet blogs start with a few strong posts, then trail off into inactivity. (Sometimes they have only a single post!) I was determined not to let this happen to my own blog, but it took a conscious effort to keep things going. After some pauses and restarts, eventually I could keep up the pace of a post every three days, then a post every other day, then a post every single day. I was content with that pace. Some days I have successes that take multiple ~300 word posts to write up, and those fill in the days when I am either struggling and have little to talk about, or fully occupied and unable to write.

Today I celebrate a milestone marked by a screenshot from the WordPress blog dashboard: there has been a blog post every single day for the past twelve months. I expect that at some point in the future I won’t be able to keep up the daily pace, but today I’m writing down a reminder for myself: I was able to stay with the program for a whole year. Hooray!