I Do Not (Yet?) Meet The Prerequisites For Multiple View Geometry in Computer Vision

Python may not be required for performing computer vision with or without OpenCV, but it does make exploration easier. Unfortunately there are limits to the magic of Python, contrary to glowing reviews both humorous and serious. One area of research that remains very challenging is extracting world geometry from an image, something very important for robots that need to understand their surroundings for navigation.

My understanding of computer vision says image segmentation comes close to an answer here, and while it is useful for robotic navigation applications such as autonomous vehicles, it is not quite the whole picture. In the example image, pixels are assigned to a nearby car, but that assignment doesn’t tell us how big the car is or how far away it is. For a robot to successfully navigate that situation, it doesn’t even really need to know whether a certain blob of pixels corresponds to a car. It just needs to know there’s an object, and it needs to know the movement of that object to avoid colliding with it.

For that information, most of today’s robots use an active sensor of some sort: expensive LIDAR for self-driving cars capable of highway speeds, repurposed gaming peripherals for indoor hobby robot projects. But those active sensors each have their own limitations. For the Kinect sensor I had experimented with, the limitations were a very short range and the fact that it only worked indoors. Ideally I would want something using passive sensors like stereoscopic cameras to extract world geometry much as humans do with our eyes.

I did a bit of research, following citations, to figure out where I might get started learning the foundations of this field. One hit that came up frequently is the text Multiple View Geometry in Computer Vision (*). I found the web page for this book, where I was able to download a few sample chapters. These sample chapters were enough for me to decide I do not (yet) meet the prerequisites for this text. Having a robot make sense of the world via multiple cameras and computer vision is going to take a lot more work than telling Python to import vision.

Given the prerequisites, it looks pretty unlikely I will do this kind of work myself. (Or more accurately, I’m not willing to dedicate the amount of study I’d need to do so.) But that doesn’t mean it’s out of reach; it just means I have to find some related previous work to leverage. “Understand the environment seen by a camera” applies to more than just robotics.


(*) Disclosure: As an Amazon Associate I earn from qualifying purchases.

Notes On OpenCV Outside of Python

It was fun taking a brief survey of PyImageSearch.com guides for computer vision, and I’m sure I will return to that site, but I’m also aware there are large areas of computer vision that the site regards as out of scope.

The Python programming language is obviously the focus of that site as it’s right in the name PyImageSearch. However, Python is not the only or even the primary interface for OpenCV. According to the official OpenCV introduction, it started as a C library and has since moved to a C++ API. Python is but one of several language bindings built on top of that API.

Using OpenCV via its Python binding is advantageous not only because of Python itself, but also because it opens access to a large world of Python libraries. The most significant one in this context is NumPy. Other languages may have similar counterparts, but Python and NumPy together are a powerful combination. There are valid reasons to use OpenCV without Python, but those users would have to find their own counterpart to NumPy for the number-crunching heavy lifting.
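
As a concrete illustration of that combination (a minimal sketch of my own, with a hypothetical file name, not from any particular tutorial): an image loaded through OpenCV’s Python binding arrives as a plain NumPy array, so all of NumPy’s tools apply to it directly.

import cv2
img = cv2.imread("example.jpg")  # hypothetical file; cv2.imread returns None if it is missing
print(type(img), img.shape, img.dtype)  # a plain numpy.ndarray of shape (rows, columns, 3 channels)
print(img.mean(axis=(0, 1)))  # per-channel average color, computed with ordinary NumPy operations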

Just for the sake of exercise, I looked at a few of the other platforms I recently examined.

OpenCV is accessible from JavaScript, or at least Node.js, via projects like opencv4nodejs. This also means OpenCV can be embedded in desktop applications written in ElectronJS, demonstrated by the opencv-electron example project.

If I wanted to use OpenCV in a Universal Windows Platform app, it appears some people have shared compiled builds of OpenCV on Microsoft’s NuGet repository. As I understand it, NuGet is to .NET as PyPI is to Python. Maybe there are important differences, but it’s a good enough analogy for a first cut. Microsoft’s UWP documentation describes using OpenCV via an OpenCVHelper component. And since UWP apps can be written in C++, and OpenCV is written in C++, there’s always the option of compiling from source code.

As promising as all this material is, it is merely the foundation for applying computer vision to the kind of problems I’m most interested in: helping a robot understand its environment for mapping, obstacle avoidance, and manipulation. Unfortunately that field starts to get pretty complex for a casual hobbyist to pick up.

Notes After Skimming PyImageSearch

I’m glad I learned of PyImageSearch from Evan and spent some time sitting down to look it over. The amount of information available on this site is large enough that I resorted to skimming, with the intent to revisit specific subjects later as the need arises.

I appreciate the intent of making computer vision accessible to beginners; it is always good to make sure people interested in exploring an area are not frustrated by problems unrelated to the problem domain. Kudos to the guides on command line basics, and on Python’s NoneType errors that are bewildering to beginners.

That said, this site does frequently dive into areas that I felt lacked sufficient explanation for beginners. I remember the difficulty I had in understanding how matrix math related to computer graphics. The guide on rotation discussed the corresponding rotation matrix. Readers get the assurance “This matrix looks scary, but I promise you: it’s not,” but the explanation that followed would not have been enlightening to me back when I was learning the topic. Perhaps a link to more details would be helpful? Still, the effort is appreciated.

There are also bits of Python code that would be confusing to a beginner: not just Python itself, but also code leveraging the very powerful NumPy library. I had no idea what was going on between tuple and argmin in the code on this page:

extLeft = tuple(c[c[:, :, 0].argmin()][0])

Right now it’s a black box of voodoo magic to me, a string of non-alphanumeric operators that more closely resembles something I associate with Perl programming. At some point I need to sit down with the Python documentation and work through this step by step in a Python REPL (read-evaluate-print loop) to understand the syntax. It would be good if the author included footnotes with links to the appropriate Python terminology for these operations.
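
For future reference when I do sit down with it, here is my current best guess (unverified) at how the pieces break down, assuming c is a single contour as returned by cv2.findContours, a NumPy array of shape (N, 1, 2):

import numpy as np
c = np.array([[[10, 5]], [[3, 8]], [[7, 2]]])  # a stand-in contour of three (x, y) points
xs = c[:, :, 0]           # all the x coordinates, shape (N, 1)
i = xs.argmin()           # flat index of the point with the smallest x
extLeft = tuple(c[i][0])  # that point as a plain (x, y) tuple
print(extLeft)            # the leftmost point, x == 3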

A fact of life of learning from PyImageSearch is the sales pitch for the author’s books. It’s not necessarily a good thing or a bad thing, but it is very definitely a thing: a constant, repetitive reminder of “and in my book you will also learn…” on every page. This site exists to draw people in and, if they want to take it further, sell them on the book. I appreciate this openly stated routine over the underhanded ways some other people make money online, but that doesn’t make it any less repetitive.

Likely related to the above is the fact that this site also wants to collect e-mail addresses. None of the code download links take us to an actual download; they instead take us to a form where we have to fill in our e-mail address before we are given a download link. Fortunately the simple bits I’ve followed along with so far are easy to recreate without the download, but I’m sure it will be unavoidable if I go much further.

And finally, this site is focused on OpenCV in Python running on Unix-derived operating systems. Other language bindings for OpenCV are out of scope, as is the Windows operating system. For my project ideas that involve embedded platforms without Python, or those that will be deployed on Windows, I would need to go elsewhere for help.

But what is within scope is covered well, with an eye towards beginner friendliness, and available freely online in a searchable collection. For that I am thankful to the author, even as I acknowledge that there are interesting OpenCV resources beyond this scope.

Change Is Only Possible If People Have Hope

It’s been over six weeks since the United States added “Widespread Civil Unrest” to the list of everything else going wrong with the year 2020. I personally chose to reduce my workshop activities and make time to read up on some things that were left out of my school history textbooks. There were a lot of important events missing! I was a good student who paid attention and did well on tests, but that only covered what was in the book.

On the national stage, I’m glad to see this wasn’t “just another thing” getting brushed aside (as much as some people in positions of leadership tried), but the majority of the immediate positive responses are just symbolic gestures. Painting “Black Lives Matter” across a street won’t do anything to actually make Black lives matter.

But that doesn’t mean such symbolic gestures are useless. They set a low bar that is easy to clear, a basic floor for discussion on how we can move forward. When that fails to establish common ground, when that becomes controversial, it is really informative. If people can’t even agree on the basic premise that Black lives matter, it really lowers the chances we can have productive discussion on how to provide liberty and justice for all. If some people aren’t even willing to support symbolic gestures, how will they react to real and meaningful changes?

And real and meaningful changes will be required, because ignoring all the underlying problems won’t make them go away. The bad news is that real change takes time, meaning it’s too early to declare either victory or defeat. There are a lot of policy decisions to be made, legislation to be enacted or revoked, and court decisions to be handed down before we can point to any real change in direction. And that is far too slow to be noticeable in this age of instant gratification and fleeting social media exposure, so we’ll just have to wait and see. But as long as people hold on to hope for a better society where Black lives do matter, change is possible.


Notes from workshop tinkering will resume tomorrow, starting with previously scheduled backlog.

Words of Hope, Words of Change

Notes from workshop tinkering are on hold, reading words by others instead.

How to Make this Moment the Turning Point for Real Change by Barack Obama. I think he’s qualified to say a few words, based on his firsthand experience with politics in the United States.

With a decades-long career in journalism, Dan Rather has seen some shit. His recent essay posted on Facebook acknowledges things are pretty bad now, but things have been really bad before, too. He wants to remind us that every time before, people holding on to the ideals of this nation carried it through, and that can happen again.

Skimming Remainder Of PyImageSearch Getting Started Guide

Following through part of, then skimming the rest of, the first section of the PyImageSearch Getting Started guide taught me there’s a lot of fascinating information here. Certainly more than enough for me to know I’ll be returning to consult it as I tackle project ideas in the future. For now I wanted to skim through the rest and note the problem areas it covers.

The Deep Learning section immediately follows the startup section, because deep learning is a huge part of recent advancements in computer vision. Like most tutorials I’ve seen on deep learning, this section goes through how to set up and train a convolutional neural network to act as an image classifier. Discussions about training data, tuning training parameters, and applications are built around these tasks.

After the Deep Learning section are several more sections, each focused on a genre of popular applications for machine vision.

  • Face Applications range from recognizing the presence of human faces to recognizing individual faces, and applications thereof.
  • Optical Character Recognition (OCR) helps a computer read human text.
  • Object detection is a more generalized form of detecting faces or characters, and there’s a whole range of tools. This will take time to learn in order to know which tools are the right ones for specific jobs.
  • Object tracking: once detected, sometimes we want an object tracked.
  • Segmentation: Detect objects and determine which pixels are and aren’t part of that object.

To deploy the algorithms described above, the guide then talks about hardware. Apart from theoretical challenges, there are also hardware constraints that are especially acute on embedded hardware like the Raspberry Pi, Google Coral, etc.

After hardware, there are a few specific application areas, from medical computer vision, to video processing, to image search engines.

This is an impressively comprehensive overview of computer vision. I think it’ll be a very useful resource for me in the future, as long as I keep in mind a few characteristics of this site.

Skimming “Build OpenCV Mini-Projects” by PyImageSearch: Contours

Getting a taste of OpenCV color operations was interesting, but I didn’t really understand what made OpenCV more powerful than other image processing libraries until we got to contours, which cover most of the second half of PyImageSearch’s Start Here guide Step 4: Build OpenCV Mini-Projects.

This section started with an example for finding the center of a contour, which in this case meant examining a picture of a collection of non-overlapping paper cut-out shapes. The most valuable concept here is that of image moments, which I think of as a “summary” of a particular shape found by OpenCV. We also got names for operations we’ve seen earlier: binarization operations turn an image into a binary yes/no highlight of potentially interesting features, and edge detection and thresholding are the two we’ve seen.
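
A minimal sketch of that center-of-contour idea as I understand it so far (the image file name is hypothetical and the threshold value is arbitrary, not taken from the tutorial):

import cv2
image = cv2.imread("shapes.png")  # hypothetical picture of paper cut-out shapes
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY)  # binarization step
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x signature
for c in contours:
    M = cv2.moments(c)  # the "summary" of this shape
    if M["m00"] != 0:   # skip degenerate contours to avoid dividing by zero
        center = (int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"]))
        cv2.circle(image, center, 5, (0, 0, 255), -1)  # mark the centroid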

Things get exciting when we start putting contours to work. The tutorial starts out easy by finding the extreme points in contours, which breaks down roughly what goes on inside OpenCV’s boundingRect function. Such code is then used in tutorials for calculating the size of objects in view, which is close to a project idea on my to-do list.

A prerequisite for that project is code to order coordinates clockwise which, reading the code, I was surprised to learn was done in Cartesian space. If the objective is clockwise ordering, I thought it would have been a natural candidate for processing in polar coordinates. This algorithm was apparently originally published with a boundary condition bug that, as far as I can tell, would not have happened if the coordinate sorting were done in polar coordinates.
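
If I ever take a stab at the polar coordinate idea myself, I imagine it would look something like this untested sketch of my own (not the PyImageSearch implementation): compute each corner’s angle about the centroid and sort by that angle.

import numpy as np

def order_points_clockwise(pts):
    center = pts.mean(axis=0)
    angles = np.arctan2(pts[:, 1] - center[1], pts[:, 0] - center[0])
    # Image coordinates grow downward in y, so ascending angle is clockwise on screen.
    return pts[np.argsort(angles)]

corners = np.array([[10, 10], [90, 12], [88, 80], [12, 78]], dtype=float)
print(order_points_clockwise(corners))  # top-left, top-right, bottom-right, bottom-left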

These components are brought together beautifully in an example document scanner application that detects the trapezoidal shape of a receipt in the image and performs perspective correction to deliver a straight rectangular image of the receipt. This is my favorite feature of Office Lens and if I ever decide to write my own I shall return to this example.

By the end of this section, I was suitably impressed by what I’ve seen of OpenCV, but I also have the feeling a few of my computer vision projects would not be addressed by the parts of OpenCV covered in the rest of PyImageSearch’s Start Here guide.

Skimming “Build OpenCV Mini-Projects” by PyImageSearch: Colors

I followed through PyImageSearch’s introductory Step 3: Learn OpenCV by Example (Beginner) line by line, both to get a feel for using the Python binding of OpenCV and to learn this particular author’s style. Once I felt I had that, I started skimming at a faster pace just to get an idea of the resources available on this site. For Step 4: Build OpenCV Mini-Projects I only read through the instructions without actually following along with my own code.

I was impressed that the first part of Step 4 is dedicated to Python’s NoneType errors. The author is right: this is a very common thing to crop up for anyone experimenting with Python. It’s the inevitable downside of Python’s lack of static type checking. I understand the upsides of flexible runtime types and really enjoy the power it gives Python programmers, but when it goes bad it can go really bad, and only at runtime. NoneType is not the only way it can manifest, but it is probably the most common, and I’m glad there’s an overview of what beginners can do about it.

Which made the following section more puzzling. The topic was image rotation, and the author brought up the associated rotation matrix. I feel that anyone who would need an explanation of NoneType errors would not know how a mathematical matrix is involved in image rotation. Most people would only know image rotation from selecting a menu in Photoshop or, at most, grabbing the rotate handle with a mouse. Such beginners to image processing would need an explanation of how matrix math is involved.

The next few sections were focused on color, which I was happy to see because most of Step 3 dealt with grayscale images stripped of their color information. OpenCV enables some very powerful operations I want to revisit when I have a project that can make use of them. I am most fascinated by the CIE L*a*b color space, something I had never heard of before. A color space focused on how humans perceive color rather than how computers represent it means code working in that space will have more human-understandable results.
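
The conversion itself looks simple enough. Here is a quick sketch of how I expect it to work, with a hypothetical file name:

import cv2
image = cv2.imread("example.jpg")  # hypothetical input image
lab = cv2.cvtColor(image, cv2.COLOR_BGR2LAB)  # convert BGR to CIE L*a*b*
L, a, b = cv2.split(lab)  # lightness channel plus two color-opponent channels
print(L.mean())  # average perceptual lightness (scaled to 0-255 for 8-bit images)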

But operations like rotation, scaling, and color spaces are relatively common things shared with many other image manipulation libraries. The second half goes into operations that make OpenCV uniquely powerful: contours.

Notes On “Learn OpenCV by Example” By PyImageSearch

Once basic prebuilt binaries of OpenCV had been installed in an Anaconda environment on my Windows PC, Step #2 of the PyImageSearch Start Here Guide goes into command line arguments. This section was an introduction for people who have little experience with the command line, so I was able to skim through it quickly.

Step #3 Learn OpenCV by Example (Beginner) is where I finally got some hands-on interaction with basic OpenCV operations, starting with basic image manipulation routines like scale, rotate, and crop. These are pretty common to any image library, and are illustrated with a still frame from the movie Jurassic Park.

The next two items were more specific to OpenCV: edge detection attempts to extract edges from an image, and thresholding drops detail above and below certain thresholds. I’ve seen thresholding (or a close relative) in some image libraries, but edge detection is new to me.
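
Roughly what those two operations look like in code, with arbitrary threshold values and a hypothetical file name of my own choosing:

import cv2
gray = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 30, 150)  # Canny edge detection with lower/upper thresholds
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)  # below 127 goes black, above goes white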

Then we return to relatively common image manipulation routines, like drawing operations on an image. This is not unique to OpenCV but very useful because it allows us to annotate an image for human-readable interpretation. Most commonly drawing boxes to mark regions of interest, but also masking out areas not of interest.

Past those operations, the tutorial concludes with a return to OpenCV specialties in the form of contour and shape detection algorithms, executed on a very simple image with a few Tetris shapes.

After following along through these exercises, I wanted to try those operations on one of my own pictures. I selected a recent image on this blog that I thought would be ideal: high contrast with clear simple shapes.

Xbox One

As expected, my first OpenCV run was not entirely successful. I thought this would be an easy image for edge detection, and I learned I was wrong. There were false negatives caused by the shallow depth of field: vents on the left side of the Xbox towards the rear were out of focus, so their edges were not picked up. False positives in areas of sharp focus came from two major categories: molded texture on the front of the Xbox, and little bits of lint left by the towel I used to wipe off dust. In hindsight I should have taken a picture before dusting so I could compare how dust vs. lint behaved in edge detection. I could mitigate the false positives somewhat by adjusting the threshold parameters of the edge detection algorithm, but I could not eliminate them completely.

Xbox Canny Edge Detect 30 175

With such noisy results, a naive application of the contour and shape detection algorithms used in the tutorial returned a lot of data I don’t yet know how to process. It is apparent those algorithms require more processing, and I still have a lot to learn to deliver what they need. But still, it was a fun first run! I look forward to learning more in Step 4: Build OpenCV Mini-Projects.

Notes on OpenCV Installation Guide by PyImageSearch

Once I decided to try PyImageSearch’s Getting Started guide, the obvious step #1 is about installing OpenCV. Like many popular open source projects, there are two ways to get it on a computer system: (1) use a package manager, or (2) build from source code. Since the focus here is using OpenCV from Python, the package manager of choice is pip.

Packages that can be installed via pip are not necessarily built by the original authors of a project. If a project is popular enough, someone will take on the task of building from the open source code and making those binaries available to others, and the PyImageSearch pip install opencv guide says that is indeed the case for OpenCV.

I appreciated the explanation of the differences between the four available packages, a result of two independent yes/no options: headless or not, and contrib modules or not. The headless option is appropriate for machines used strictly for processing that do not need to display any visual interface, and contrib describes a set of modules contributed by people outside the core OpenCV team that have grown popular enough to be offered as a packaged bundle.

Even more useful was an explanation of what is not in any of these packages available via pip: modules that implement patented algorithms. These “non-free” components are commonly treated as part of OpenCV, but are not distributed in compiled binary form. We may build them from source code for exploration, but any binary distribution (for example, use in a commercial software product) requires dealing with lawyers representing the owners of those patents.

Which brings us to the less easy part of the OpenCV installation guide: building from source code. PyImageSearch offers instructions to do so on macOS, Ubuntu, and Raspbian for a Raspberry Pi. The author specifically does not support Windows as a platform for learning OpenCV. If I want to work in Windows, I’m on my own.

Since I’m just starting out, I’m going to choose the easy method of using pre-built binaries. Like most Python tutorials, PyImageSearch highly recommends a Python environment manager and includes instructions for virtualenv. My Windows machine already had Anaconda installed, so I used that instead to install opencv-contrib-python in an environment created for this first phase of OpenCV exploration.
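
A quick sanity check I ran afterwards to make sure the new environment sees the package (the aruco check is just my guess at an easy way to confirm the contrib modules came along):

import cv2
print(cv2.__version__)        # the installed OpenCV version
print(hasattr(cv2, "aruco"))  # True suggests the contrib modules are present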

Trying OpenCV Getting Started Guide By PyImageSearch

I am happy that I made some headway in writing desktop computer applications that control hardware peripherals over a serial port, in the form of a test program that can perform a few simple operations with a 3D printer. But how will I put this idea to work doing something useful? I have a few potential project ideas that leverage the computing power of a desktop computer, several of them in the form of machine vision.

Which meant it was time to fill another gap in my toolbox for solving problems with software: get a basic understanding of what I can and can’t do with machine vision. There are two meanings of “can” in that sentence, and both apply: the “is this even theoretically possible” sense and the “is this within the reach of my abilities” sense. The latter will obviously be more limiting, but that limit is something I can fix by devoting the time to learn. Getting an idea of the former is also useful so I don’t go off on a doomed project trying to build something impossible.

Which meant it was time to learn about OpenCV, the canonical computer vision library. I have come across OpenCV in various contexts, but it has just been a label on a black box. I never devoted the time to sit down, learn more about this box, and figure out how I might leverage it in my own projects. Given my interest in robotics, I knew OpenCV was on my path but didn’t know when. I guess now is the time.

Given that OpenCV is the starting point for a lot of computer vision algorithms and education, there are many tutorials to choose from, and I will probably go through several different ones before I feel comfortable with OpenCV. Still, I need to pick a starting point. On a recommendation from Evan, whom I met at Superconference, I’ll try the Getting Started guide by PyImageSearch. First step: installing OpenCV.

Simple Logger Extended With Subset List

One of the ETW features I liked was LoggingLevel. It meant I no longer had to worry about whether something might be too verbose to log, or that certain important messages might get buried in a lot of usually-unimportant verbose details. By assigning a logging level, developers have the option to filter messages by level during later examination. Unfortunately I got lost with ETW and had to take a step back with my own primitive logger, but that didn’t make the usefulness go away. In fact I quickly found that I wanted it as things got complex.

In my first experiment Hello3DP I put a big text box on the application to dump data. For the second experiment PollingComms I have a much smaller text area so I could put some real UI on the screen. However, the limited area meant the verbose messages quickly overflowed the area, pushing potentially important information off screen. I still want to have everything in the log file, but I only need a subset displayed live in the application.

I was motivated to take another stab at ETW but was similarly unsuccessful. In order to resolve my immediate needs I started hacking away at my simple logger. I briefly toyed with the idea of using a small database like SQLite. Microsoft even put in the work for easy SQLite integration in UWP applications. Putting everything into a database would allow me to query by LoggingLevel, but I thought it was overkill for solving the problem at hand.

I ended up adding a separate list of strings. Whenever the logger receives a message, it looks at the level and decides whether it should be added to the subset list as well. By default I limited the subset to 5 entries, and only accepted messages at LoggingLevel.Information or higher. This was something I could pass into the on-screen text box to display, notifying me in real time (or at least within about a second) of what is going wrong in my application.
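
The actual implementation lives in my C#/UWP code, but the idea is simple enough that a rough Python sketch of the concept (not my real code) looks like this:

from collections import deque
from enum import IntEnum

class Level(IntEnum):
    VERBOSE = 0
    INFORMATION = 1
    ERROR = 2

class SubsetLogger:
    def __init__(self, subset_size=5, subset_level=Level.INFORMATION):
        self.full_log = []                       # everything, destined for the log file
        self.subset = deque(maxlen=subset_size)  # only the recent important entries
        self.subset_level = subset_level

    def log(self, level, message):
        self.full_log.append((level, message))
        if level >= self.subset_level:
            self.subset.append(message)          # what the on-screen text box shows

logger = SubsetLogger()
logger.log(Level.VERBOSE, "polling serial port")    # file only
logger.log(Level.ERROR, "printer did not respond")  # file and on screen
print(list(logger.subset))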

Once again I avoided putting in the work to learn how to work with ETW. I know I can’t put it off forever but this simple hack kicked that can further down the road.

[This kick-the-can exercise is publicly available on GitHub]

Window Shopping Firmata: Connect Microcontrollers To Computers

Reading about LabVIEW LINX stirred up a memory of something with a similar premise. I had forgotten its name, and it took a bit of research to rediscover Firmata. Like LINX, Firmata is a protocol for communicating between microcontrollers and desktop computers. Like LINX, there are a few prebuilt implementations available for direct download, such as the standard Arduino implementation of Firmata.

There’s one aspect of the Firmata protocol I found interesting: its relationship to MIDI messages. I had originally thought it was merely inspired by MIDI messages, but the Firmata protocol documentation says it is actually a proper subset of MIDI. This means Firmata messages have the option to coexist with MIDI messages on the same channel, conveying data that is mysterious to MIDI instruments but well-formed enough not to cause problems. This was an interesting assertion, even with the disclaimer that in practice Firmata typically runs at a higher serial speed on its own bus.

Like LINX, Firmata is intended to be easily implemented by simple hardware. The standard Arduino implementation can be customized for specific projects, and anything else that can communicate over a serial port is a candidate hardware endpoint for Firmata.

But on the computer side, Firmata is very much unlike LINX in its wide range of potential software interfaces. LINX is a part of LabVIEW, and that’s the end of the LINX story. Firmata can be implemented by anything that can communicate over a serial port, which should cover almost everything.

Firmata’s own GitHub hosts some Python sample code, which is but one of five options for Python client libraries listed on the protocol web site. These come with some useful tips, like using Python’s ord()/chr() to convert byte values to and from Firmata packets. Beyond Python, every programming language I know of is invited to the Firmata party: Processing, Ruby, JavaScript, etc.
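
The ord()/chr() tip in isolation, since it took me a moment to see why it matters: Firmata packets travel as raw byte values, and these Python built-ins convert between a byte value and the one-character string that actually gets sent or received. (The 0x90 below is just an example value, not a claim about the protocol.)

value = 0x90          # an example byte value
as_char = chr(value)  # number -> one-character string to transmit
back = ord(as_char)   # received character -> number, 0-255
print(hex(back))      # 0x90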

Since I had been playing with C# and .NET recently, I took a quick glance at their section of Firmata. These are older than UWP and use older non-async APIs. The first one on the list used System.IO.Ports.SerialPort, and needed some workaround for Mono. The second one isn’t even C#: It’s aimed at Visual Basic. I haven’t looked at the third one on the list.

If I wanted to write a UWP application that controls hardware via Firmata, writing a client library with the newer async Windows.Devices.SerialCommunication.SerialDevice API might be a fun project.

Window Shopping LINX: Connecting LabVIEW To Maker Hardware

When I looked over LabVIEW earlier with the eyes of a maker, my biggest stumbling block was trying to connect to the kind of hardware a maker would play with. LabVIEW has a huge library of hardware interface components for sophisticated professional level electronics instrumentation. But I found nothing for simple one-off projects like the kind I have on my workbench. I got as far as finding a reference to a mystical “Direct I/O” mechanism, but no details.

In hindsight, that was perfectly reasonable. I was browsing LabVIEW information presented on their primary site targeted to electronics professionals. I thought the lack of maker-friendly information meant National Instruments didn’t care about makers, but I was wrong. It actually meant I was not looking at the right place. LabVIEW’s maker-friendly face is on an entirely different site, the LabVIEW MakerHub.

Here I learned about LINX, an architecture for interfacing with maker-level hardware, starting with the ubiquitous Arduino and Raspberry Pi and extensible to others. From the LINX FAQ and the How LINX Works page, I got the impression it allows individual LabVIEW VIs (virtual instruments) to correspond to individual pieces of functionality on an Arduino. But very importantly, it implies that this representation is distinct from the physical transport layer, where there’s only one serial (or WiFi, or Ethernet) connection between the computer running LabVIEW and the microcontroller.

If my interpretation is true, this is a very powerful mechanism. It allows the bulk of a LabVIEW program to be set up without worrying about the underlying implementation. Here’s one example that came to mind: a project can start small with a single Arduino handling all hardware interfacing. Then, as the project grows and the serial link becomes saturated, functions can be split off onto separate Arduinos, each with its own serial link to the computer. Yet doing so would not change the LabVIEW program.

That design makes LabVIEW much more interesting. What dampens my enthusiasm is the lack of evidence of active maintenance on LabVIEW MakerHub. I see support for the BeagleBone Black, but not any of the newer BeagleBone boards (the PocketBeagle is the obvious candidate). The list of supported devices covers the Raspberry Pi only up to the 2, the Teensy only up to 3.1, the Espressif ESP8266 but not the ESP32, etc. Balancing that discouraging sight is that the code is on GitHub, and we see more recent traffic there as well as on the MakerHub forums. So it’s probably not dead?

LINX looks very useful when the intent is to interface with LabVIEW on the computer side. But when we want something on the computer other than LabVIEW, we can look at Firmata, another implementation of the same concept.

UPDATE: And just after I found it (a few years after it launched), NI is killing MakerHub, with big bold red text across the top of the site: “This site will be deprecated on August 1, 2020.”

Communicating With 3D Printer Added A Twist

I chose to experiment with UWP serial device communication with my 3D printer because (1) it sends out a sequence of text immediately upon connection and (2) it was already sitting on the table. Just trying to read that text was an educational exercise, including a side trip through the world of logging.

The next obvious step was to send a command and read the printer’s response. This is where I learned that 3D printers like this MatterHackers Pulse XE behave a little differently from serial devices I’ve worked with before. RoboClaw motor controllers and serial bus servos like the Dynamixel AX-12 or LewanSoul/Hiwonder LX-16A have one behavior in common: they listen for a command in a known format, then they send a response, also in a known format. This isn’t what my 3D printer control board does.

It wasn’t obvious to me at the time, but in hindsight I should have known as soon as I saw it send out information upon connection, before receiving any commands. That’s not the only time the printer sends out unprompted information. Sometimes it sends text about SD card status, or to indicate it is busy processing the previous command. Without a 1:1 mapping between command and response, the logic to read and interpret printer responses to commands has to be a little more sophisticated than what I’ve needed to write for earlier projects.

Which is a great opportunity to learn how to structure my code to solve problems with the async/await pattern. When I had a strict command/response pattern, it was easy to write code that assumed the information I read was in direct response to the command I sent. Now that data may arrive unprompted, the read and write operations have to be separated into their own asynchronous processing loops. When the read loop receives data, it needs to be able to interpret it, possibly in the absence of a corresponding command. But if there is a corresponding command, it needs to pair up the response with the command sent. That meant I needed a queue of commands awaiting responses, plus logic to decide when to dequeue them and send a response back to their caller.
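
My actual code is C# with async/await, but here is a rough sketch of the same idea in Python’s asyncio (with a simulated printer standing in for the serial port) just to capture the shape of the pattern: a dedicated read loop that pairs responses with pending commands when one exists, and treats everything else as unprompted chatter.

import asyncio
from collections import deque

class PrinterLink:
    def __init__(self, wire_in, wire_out):
        self.wire_in = wire_in    # lines arriving from the printer
        self.wire_out = wire_out  # lines we send to the printer
        self.pending = deque()    # commands awaiting a response

    async def send_command(self, command):
        waiter = asyncio.get_running_loop().create_future()
        self.pending.append(waiter)       # remember someone is waiting for a reply
        await self.wire_out.put(command)
        return await waiter               # resolved by the read loop

    async def read_loop(self):
        while True:
            line = await self.wire_in.get()
            if line.startswith("ok") and self.pending:
                self.pending.popleft().set_result(line)  # pair response with the oldest command
            else:
                print("unprompted:", line)               # greeting, SD card status, "busy", etc.

async def fake_printer(wire_in, wire_out):
    await wire_in.put("start")            # unprompted greeting on connect
    while True:
        cmd = await wire_out.get()
        await wire_in.put("ok (" + cmd + " done)")

async def main():
    wire_in, wire_out = asyncio.Queue(), asyncio.Queue()
    link = PrinterLink(wire_in, wire_out)
    asyncio.create_task(fake_printer(wire_in, wire_out))
    asyncio.create_task(link.read_loop())
    print(await link.send_command("M105"))  # e.g. a temperature query

asyncio.run(main())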

Looking at code behavior, I can see another potential case: commands that do not expect a corresponding response. Thankfully I haven’t had to deal with that combination just yet; what I have on hand is adding enough challenge for this beginner. It is certainly getting confusing enough that I was motivated to extend my logging mechanism to help understand the flow.

[The crude result of this exercise is available publicly on GitHub]

Simple Logging To Text File

Even though I aborted my adventures into Windows ETW logging, I still wanted a logging mechanism to support future experimentation into Universal Windows Platform. This turned into an educational project in itself, learning about other system interfaces of this platform.

Where do I put this log file?

UWP applications are not allowed arbitrary access to the file system, so if I wanted to write out a log file without explicit user interaction, there are only a few select locations available. I found the KnownFolders enumeration, but those were all user data folders, and I didn’t want log files clogging up “My Documents” and such. I ended up putting the log file in ApplicationData.TemporaryFolder. This folder is subject to occasional cleanup by the operating system, which is fine for a log file.

When do I open and close this log file?

This required a trip into the world of the UWP application lifecycle. I check whether the log file exists and, if not, create and open it from three places: OnLaunched, OnActivated, and OnResuming. In practice it looks like I mostly see OnLaunched. The flip side is OnSuspending, where the application template has already set up a suspension deferral, buying me time to write out and close the log file.

How do I write data out to this log file?

There is a helpful Getting Started with file input/output document. In it, the standard recommendation is to use the FileIO class. It links to a section of the UWP developer’s guide titled Files, folders, and libraries. The page Create, write, and read a file was helpful for me to see how these differ from the classic C file I/O API.

These FileIO classes promise to take care of all the complicated parts, including async/await methods so the application is not blocked on file access. This way the user interface doesn’t freeze until the load or save operation completes, instead remaining responsive while file access was in process.

But when I used the FileIO API naively, writing upon every line of the log, I received a constant stream of exceptions. Digging into the exception’s call stack (actually several levels deep in the chain) told me there was a file access collision problem. It was the page Best practices for writing to files that cleared things up for me: these async FileIO methods create a temporary file for each asynchronous action and copy over the original file upon success. When I was writing once per line, too many operations were happening in too short a time, resulting in the temporary files colliding with each other.

The solution was to write less frequently: buffer up a set of log messages and write them as a batch with each FileIO access, rather than calling once per log entry. Reducing the frequency of write operations resolved my collision issue.

[This simple text file logging class is available on GitHub.]

Complexity Of ETW Leaves A Beginner Lost

When experimenting with something new in programming, it’s always useful to step through the code in a debugger the first time to see what it does. An unfortunate side effect is execution far slower than normal speed, which interferes with timing-sensitive operations. An alternative is to have a logging mechanism that doesn’t slow things down (as much), so we can read the logs afterwards to understand the sequence of events.

Windows has something called Event Tracing for Windows (ETW) that has evolved over the decades. This mechanism is implemented in the Windows kernel and offers dynamic control over which events to log. The mechanism itself was built to be lean, impacting system performance as little as possible while logging. The goal is for it to be so fast and efficient that it barely affects timing-sensitive operations. This matters because one of the primary purposes of ETW is to diagnose system performance issues, and obviously it can’t be useful for that if running ETW itself causes severe slowdowns.

ETW infrastructure is exposed to Universal Windows Platform applications via the Windows.Foundation.Diagnostics namespace, with utility classes that sounded simple enough at first glance: we create a logging session, we establish one or more channels within that session, and we log individual activities to a channel.

Trying to see how it works, though, can be overwhelming to a beginner. All I wanted was a timestamp and a text message, and optionally an indicator of the importance of the message. The timestamp is automatic in ETW. The text message can be done with LogEvent, and I can pass in a LoggingLevel to signify whether it is verbose chatter, an informative message, a warning, an error, or a critical event.

In the UWP sample library there is a logging sample application showcasing use of these logging APIs. The source code looks straightforward, and I was able to compile and run it. The problem came when trying to read this log: as part of its low-overhead goal and powerful complexity, the output of ETW is not a simple log file I can browse through. It is a task-specific ETL file format that requires its own applications to read. Such tools are part of the Windows Performance Toolkit, but fortunately I didn’t have to download and install the whole thing. The Windows Performance Analyzer can be installed by itself from the Windows store.

I opened up the ETL file generated by the sample app and… got no further. I could get a timeline of the application, and I could unfold a long list of events. But while I could see a timestamp for each event, I couldn’t figure out how to retrieve the messages. The sample application called LogEvent with a chunk of “Lorem ipsum” text, and I could not figure out how to retrieve it.

Long term I would love to know how to leverage the ETW infrastructure for my own application development and diagnosis. But after spending too much time unable to perform a very basic logging task, I shelved ETW for later and wrote my own simple logger that outputs to a plain text file.

What To Do When Await Waits Too Long

I had originally planned to defer learning about how to cancel out of an asynchronous operation, but the unexpected “timeout doesn’t time out” behavior of asynchronous serial port read gave me the… opportunity… to learn about cancellation now.

My first approach was hilariously clunky in hindsight: I found Task.WhenAny first, which will complete the await operation if any of the given list of Task objects completed. So I built a new Task object whose only job is to wait through a short time and complete. I packed it and the serial read operation task together into an array, and when the await operation completed I could see whether the read or the timeout Task completed first.

It seemed to work, but I was unsatisfied. I felt this must be a common enough operation that there would be other options, and I was right: digging through the documentation revealed there’s a very similar-sounding Task.WaitAny which has an overload that accepts a TimeSpan as one of the parameters. This is a shorter version of what I did earlier, but I still had to pack the read operation into a Task array of a single element.

Two other overloads of Task.WaitAny accepted a CancellationToken instead, and I initially dismissed them. Creating a CancellationTokenSource is the most flexible way to give me control over when to trigger a cancellation, but I thought that was for times when I had more sophisticated logic deciding when to cancel. Spinning up a whole separate timer callback to call Cancel() felt like overkill.

Except it didn’t have to be that bad: CancellationTokenSource has a constructor that accepts a count of milliseconds before canceling, so that timer mechanism was already built into the class. Furthermore, by using CancellationTokenSource, I still retain the flexibility of canceling earlier than the timeout if I should choose. This felt like the best choice when I only have a single Task at hand. I can reserve Task.WhenAny or Task.WaitAny for times when I have multiple Task objects to coordinate, which is also something I hope to defer until later, as I’m having a hard enough time understanding all the nuances of a single Task in practice. Maybe some logging can help?
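
For comparison, here is what the same “give up on an await after a timeout” idea looks like in Python’s asyncio. This is not my C# code, just an analogous sketch: wrap the awaited operation so it gets cancelled if it takes too long.

import asyncio

async def slow_read():
    await asyncio.sleep(10)  # stand-in for a read that may never return
    return b"data"

async def main():
    try:
        data = await asyncio.wait_for(slow_read(), timeout=2.0)  # cancel after 2 seconds
    except asyncio.TimeoutError:
        data = None  # timed out; give up and move on
    print(data)

asyncio.run(main())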

[This Hello3DP programming exercise is publicly available on GitHub]

Unexpected Behavior: Serial Device Read Timeout Only Applies When There’s Data

After playing with the Custom Serial Device Access demonstration application to read a 3D printer’s greeting message, I created a blank C# application from the Universal Windows Platform application template in Visual Studio and copy/pasted the minimum bits of code to read that same printer greeting message and display it as text on screen.

The sample application only showed a small selection of the text, but I wanted to read the entire message in my test application. This is where I ran into unexpected behavior. I had set the SerialDevice.ReadTimeout property to various TimeSpan values on the scale of a few seconds. Sometimes I would get the timeout behavior I expected, returning with some amount of data less than the buffer size. But other times my read operation would hang indefinitely, well past the timeout period.

I thought I did something wrong with the async/await pattern, causing me to await forever, but I cut the code way back to the minimum while still following the precedent of the sample app, and it was still happening unpredictably. Examining the data that was returned, it looked like the same greeting message I saw when I connected via the PuTTY serial terminal, with nothing to indicate a problem.

Eventually I figured out the deciding factor wasn’t anything in the data I had read, but the data I had not yet read. Specifically, the hanging behavior occurs when there is no further data at all waiting to be read from the serial port. If there is even just one byte, everything is fine: the platform will pull that byte from the serial port, put it in my allocated buffer (I experimented with buffer sizes of 1 KB, 2 KB, and 4 KB; it didn’t matter), and return to me after the timeout period. But if there are no bytes at all, it hangs waiting.

I suppose this makes some sort of sense; it’s just not what I had expected. The documentation for ReadTimeout mentions that there’s an underlying Win32 data structure, SERIAL_TIMEOUTS, dictating the underlying behavior. A quick glance through that page failed to find anything that corresponds to what I think is happening, which worries me somewhat. Fortunately, there are ways to break out of an await that has waited longer than desired.

[This Hello3DP programming exercise is publicly available on GitHub]

3D Printer as Serial Communication Test Device

Reading about novel programming patterns and contemplating unexpected hardware platform potential are great… but none of it is real until I sit down and make some code run. Since my motivation centered around controlling external peripherals via serial port, I’ll need a piece of hardware to experiment against. In the interest of reducing variables, I didn’t want to start with one of my own past projects. I expect to run into enough headaches debugging what went wrong; I don’t need to wonder if the problem is my code or my project hardware.

So what do I have sitting around at home that counts as serial-controlled hardware? The easy answer is the brains of a 3D printer, and the most easily accessible item is my MatterHackers Pulse XE printer, which is conveniently sitting on the same table as the computer.

In order to get an idea of what I was getting into, I connected to the printer via the PuTTY serial terminal. I saw that a greeting message is sent out over serial as soon as I connect. This is great, because it means I have some data to read immediately upon connecting, with no need to worry about sending a command before getting a response.

Moving to the development platform, I loaded up the UWP example program Custom Serial Device Access. After reading the source code to get a rough feel of what the application did, I compiled and ran it. It was able to enumerate the USB serial connection, but when I connected, I did not see the greeting message, even though the parameters I used for PuTTY (250000 N 8 1) were also used here.

As tempting as it might have been to blame the example program and say it is wrong, I thought it was more likely that one of the other SerialDevice parameters had to be changed from its default value. Flipping settings one by one to see if they changed the behavior, I eventually figured out that I needed to set IsDataTerminalReadyEnabled to true in order to receive data from the Pulse XE. I’m lucky it was only a single boolean value I had to change, because if I had to change multiple values to a specific combination, there would have been too many possibilities to find by trial and error.

It’s always good to start as simple as possible, because I never know what seemingly basic issue will crop up. IsDataTerminalReadyEnabled wasn’t even the only surprise I found; ReadTimeout behavior was also unexpected.