Larson Scanner Demo for Tape Deck LCD

I feel a sense of accomplishment after deciphering all the information necessary to utilize this circuit board, formerly the faceplate of a salvaged car tape deck. I started this investigation when I found I could power it up under control of the original mainboard. Now, I can work with the LCD and read all knobs and buttons with an Arduino, independent of the original mainboard. My original intent was just to see if I could get to this point. I thought I would learn a lot whether I succeeded or failed trying to control this faceplate. I have gained knowledge and experience I didn’t have before, and a faceplate I can control.

Now what?

It feels like I should be able to build something nifty with this faceplate, and there’s room to be creative in repurposing it. At the moment I don’t have any ideas that would creatively utilize the display and button/knob input, but I could build a simple demo. This LCD is wide and not very tall, so I thought I would make it into a simple Larson Scanner (a.k.a. Cylon lights, a.k.a. K.I.T.T. lights).

First, I divided the LCD segments into 16 groups of roughly similar size. I started using my segment map to generate the corresponding bit patterns by hand, but then decided I should make a software tool to do it instead. I had already written code to light up one segment at a time for generating the segment map, so it took only a bit of hacking to make it into a drawing tool. I used the knob to move between segments as I did before, but now I could press the knob to toggle the selected segment. Every time I pressed the knob, the program printed the corresponding bit pattern to the serial terminal in a format that I could copy into C source code.
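As a rough illustration of the drawing tool idea (not the code from the GitHub repository), here is a desktop C++ sketch that toggles one segment bit and formats the buffer as a C array initializer. The function names and the bit layout (segment Dn stored at bit n-1, least significant bit first within each byte) are assumptions for illustration and may not match the order the LC75853N expects on the wire. On the Arduino itself the text would go out through Serial.print() rather than std::string.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Flip the bit for segment Dn (1-based numbering).
// Assumed layout: Dn lives at bit (n-1), LSB first within each byte.
void toggleSegment(uint8_t* bits, unsigned segment) {
  const unsigned index = segment - 1;
  bits[index / 8] ^= static_cast<uint8_t>(1u << (index % 8));
}

// Render the buffer as text that can be pasted into C source,
// for example "{0x00, 0x10}".
std::string formatAsCSource(const uint8_t* bits, std::size_t length) {
  std::string text = "{";
  char hex[8];
  for (std::size_t i = 0; i < length; i++) {
    std::snprintf(hex, sizeof(hex), "0x%02X", static_cast<unsigned>(bits[i]));
    text += hex;
    if (i + 1 < length) {
      text += ", ";
    }
  }
  text += "}";
  return text;
}
```

Toggling the same segment twice returns the buffer to its previous state, which is what makes a single knob press work as an on/off switch for the selected segment.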

I then added a different operating mode to my Arduino test program. Pressing the Volume knob would toggle between drawing mode and Larson Scanner mode. While in Larson Scanner mode, I would select two of those 16 groups based on scanner position, and bitwise OR them together into my display. This gives me a nice little demo that is completely unrelated to this LCD’s original purpose, and confidence I no longer need this tape deck’s original mainboard.
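The scanner logic can be sketched roughly as follows. This is a simplified desktop C++ illustration, not the code from the GitHub repository: the names, the 16-byte pattern size, and the choice to light the group at the scanner position plus its right-hand neighbor are all assumptions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr int kGroupCount = 16;            // 16 groups of segments
constexpr std::size_t kPatternBytes = 16;  // assumed: room for 126 segment bits

using Pattern = std::array<uint8_t, kPatternBytes>;
using GroupTable = std::array<Pattern, kGroupCount>;

// Build one display frame by OR-ing together the bit patterns of two
// adjacent groups. position must be in the range 0..kGroupCount-2.
Pattern scannerFrame(const GroupTable& groups, int position) {
  Pattern frame{};
  const Pattern& first = groups[static_cast<std::size_t>(position)];
  const Pattern& second = groups[static_cast<std::size_t>(position) + 1];
  for (std::size_t i = 0; i < kPatternBytes; i++) {
    frame[i] = static_cast<uint8_t>(first[i] | second[i]);
  }
  return frame;
}

struct Scanner {
  int position = 0;
  int direction = 1;  // +1 moving right, -1 moving left
};

// Move one step, reversing direction at either end of the display.
void stepScanner(Scanner& scanner) {
  const int next = scanner.position + scanner.direction;
  if (next < 0 || next > kGroupCount - 2) {
    scanner.direction = -scanner.direction;
  }
  scanner.position += scanner.direction;
}
```

Each pass through the main loop would call stepScanner(), build a frame with scannerFrame(), and send that frame to the LCD controller.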

Source code for this demo is publicly available on GitHub.

Pinout of Tape Deck Faceplate (Toyota 86120-08010)

It is time to wrap up investigation into the workings of a tape deck faceplate, salvaged from the stock audio head unit of a 1998 Toyota Camry LE. I believe I’ve deciphered all the information necessary to reuse this faceplate independently from the rest of the tape deck. It is summarized in this pinout report, with links to more details.

The faceplate circuit board is largely built around a Sanyo LC75853N chip, which communicates via a Sanyo proprietary protocol called CCB (Computer Control Bus). An external microcontroller (I used an Arduino Nano in experiments to date) can dictate what is displayed on the LCD (see segment map here) and scan pressed/not-pressed state of buttons (see button map here).

Some faceplate components are independent of LC75853N:

From right to left, the functionality I observed on these pins is:

  • ACC5V: Power supply for digital logic. +5V relative to GND.
  • LCD-DO: CCB digital data out. LC75853N address 0x43. Single 4-byte transmission to microcontroller.
  • LCD-DI: CCB digital data in. Microcontroller to LC75853N address 0x42. Three 7-byte transmissions to LC75853N.
  • LCD-CLK: CCB clock signal. Active-low, generated by microcontroller.
  • LCD-CE: CCB enable. Active-high, generated by microcontroller.
  • CD-EJE: Eject button. Normally open, shorts to GND when “Eject” button is pressed.
  • ILL: Illumination power supply (positive). +5V to +14.4V (~60mA) relative to ILL- for variable button backlight brightness.
  • LCD-BL: LCD backlight power supply (positive). +6V (~60mA) relative to BL-.
  • VOL.CON: Volume control potentiometer. Voltage between ACC5V (full clockwise) and GND (full counterclockwise).
  • PULS-A: Audio mode quadrature encoder knob, A. ACC5V or GND, will be opposite of B when at a detent. A and B briefly identical during transition between detents.
  • PULS-B: Audio mode quadrature encoder knob, B. ACC5V or GND, will be opposite of A when at a detent. A and B briefly identical during transition between detents.
  • GND: Digital logic power (negative). Relative to ACC5V.
  • ILL-: Illumination power supply (negative). Relative to ILL.
  • BL-: LCD backlight power supply (negative). Relative to LCD-BL.
  • RESET: Unknown. Observed 0V relative to GND. LC75853N has no reset pin. Seems OK to leave it unconnected/floating.
  • (unnamed pin): Unknown. Observed 0V relative to GND. Seems OK to leave it unconnected/floating.

Source code for this investigation (and accompanying demo) is publicly available on GitHub.

Button Presses on Tape Deck Faceplate (Toyota 86120-08010)

I’ve mapped out all LCD segments on a Toyota 86120-08010 tape deck faceplate, allowing me to control what is displayed from an Arduino. Constrained by the nature of a segmented LCD, of course. These segments were customized for a tape deck, doing what it needs and no more. Any projects trying to repurpose it would have to get creative. This will be even more difficult than abstract Nyan Cat on a VCR VFD!

But I have more than just the LCD. I have the entire faceplate, which gives me options for user interactivity: two knobs which I still have, and buttons which are now gone but whose electrical traces are still present. I can find something conductive to bridge these traces like the buttons used to do, or I can solder wires connecting to switches elsewhere. Either way, I needed to write code for my Arduino to read key scan data from the Sanyo LC75853N chip. Just like LCD segment control data, my goal is to emulate the original mainboard as closely as I can. I will be guided by the datasheet and what my logic analyzer captured.

Thanks to lessons learned from doing LCD segment control CCB communication, reading keyscan data over CCB was relatively straightforward and I could dump control data out to Arduino serial monitor.

I will follow the KD1-KD30 numbering system used in the datasheet timing diagram.

Button map for Toyota 86120-08010 faceplate.

This key map shows the KD number for almost all of the buttons on the faceplate, including the push action on both knobs. The lone exception is the “Eject” button, which has its own dedicated wire and is not part of LC75853N key matrix. These numbers may look odd at first glance, but they make sense once we look at how they would be wired to the LC75853N:

These fifteen buttons make use of KS3-6 and KI2-5, leaving only KD14 unused in its matrix. If I want to add a single button, I will try to find KS3 and KI4 traces on the faceplate to wire in KD14. If I want to add more buttons, I might need to solder directly to the unused IC pins KI1, KS1, and KS2, as I wouldn’t expect any traces for those unused pins on the faceplate circuit board.
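The relationship between KD numbers and matrix lines can be written down as simple arithmetic. The sketch below assumes KD numbers count through KI1 to KI5 for KS1, then KI1 to KI5 for KS2, and so on. That assumption is consistent with KD1-KD30 covering six KS lines and five KI lines, and with KD14 sitting at KS3 and KI4, but the datasheet remains the authority. The struct and function names are made up for this illustration.

```cpp
#include <cstdint>

struct KeyLocation {
  uint8_t ks;  // key scan output line, 1..6
  uint8_t ki;  // key input line, 1..5
};

constexpr uint8_t kInputsPerScanLine = 5;  // KI1..KI5

// Convert a KD number (1..30) to its assumed KS/KI position.
constexpr KeyLocation keyLocation(uint8_t kd) {
  return KeyLocation{
      static_cast<uint8_t>((kd - 1) / kInputsPerScanLine + 1),
      static_cast<uint8_t>((kd - 1) % kInputsPerScanLine + 1)};
}

// Convert a KS/KI position back to its KD number.
constexpr uint8_t keyNumber(uint8_t ks, uint8_t ki) {
  return static_cast<uint8_t>((ks - 1) * kInputsPerScanLine + ki);
}
```

Under this numbering, KS3-6 crossed with KI2-5 gives KD12-15, KD17-20, KD22-25, and KD27-30: sixteen positions, of which fifteen are used by buttons and KD14 is the spare.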

Feeling good that I’ve figured out the input & output of this faceplate, it’s time to wrap it all up.

Source code for this investigation is publicly available on GitHub.

Segmented LCD on Tape Deck Faceplate (Toyota 86120-08010)

Thanks to well-labeled connectors and an online datasheet, I can write Arduino code to control the faceplate LCD of the stock tape deck audio unit from a 1998 Toyota Camry LE (86120-08010). However, knowledge of the digital wiring doesn’t tell me anything about the physical location and shape of each segment in the LCD. I will build a segment map with the help of a knob already on the faceplate. The Sanyo LC75853N chip could control up to 126 segments. I edited my Arduino program to turn on all of them, so I could take this picture to see where segments even existed.

It reflected the custom nature of a segmented LCD. Some of these digits would only ever display numbers, so they had the standard 7 segments for numeric display. Others have a secondary use to display letters for a few audio settings, and those digits had more than 7 segments. But they can’t display arbitrary letters, only exactly what was needed and no more.

With the full set of available segments in hand, I changed the program to turn on just one segment at a time, interactively selected via the “Audio Mode” knob. Since each segment is a single bit, I could print out my control bits to Arduino serial monitor to see the currently active segment’s programmatic address.

I’m using the same numbering system as used in the datasheet, D1 to D126. From there, I generated this segment map:

LCD segment map for 86120-08010.

When I first started my Arduino program, I saw nothing for the first few turns of the dial. I thought perhaps my program was faulty, but a few turns later I saw my first result for D13. From there on it was relatively straightforward, working from bottom-to-top and right-to-left. Some notes:

  • D24 is missing, a mysterious gap.
  • There were a few segments that I had thought were separate but were actually the same segment. For example, two visually distinct segments were actually just D54. This was disappointing, because it restricted the number of letters we could show. (Example: a good looking “M” would be possible, but a good looking “N” wouldn’t.)
  • Along the same lines, some numeric segments looked like separate segments purely for the sake of preserving the 7-segment aesthetic. The three segments D101, D102, and D105 together could display either “1” or “2” but visually looked like 6 segments instead of 3.
  • The leftmost large numeric digit puzzled me. It could only display 1, which is fine. But given that, why are the two segments D91 and D92 individually addressable? I can’t think of a reason why we’d want to display only the top or bottom half of a “1”.

Here is a table from the LC75853N datasheet mapping segment numbers to pins. I colored in the segments that were seen and a clear pattern emerged: This LCD allocation avoided using the first four pins (S1-S4), leaving the option of using them as general-purpose output wires (P1-P4). At the end, stopping at D105 meant they didn’t have to wire up the final seven output pins. (The final two, S41 and S42, would have had to be reallocated from key scan duty KS1 and KS2, if used.)
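The pattern can be checked with a little arithmetic. The sketch below assumes each segment output pin drives three consecutive D numbers (S1 carries D1-D3, S2 carries D4-D6, and so on). That matches 126 segments spread over 42 pins, the first visible segment D13 landing on S5, and the last one, D105, landing on S35, but the datasheet table is the real reference. The function name is made up for this illustration.

```cpp
#include <cstdint>

constexpr uint8_t kSegmentsPerPin = 3;  // assumed: 126 segments / 42 pins

// Segment output pin (S1..S42) that carries segment Dn (n = 1..126),
// under the three-consecutive-segments-per-pin assumption.
constexpr uint8_t segmentPin(uint8_t d) {
  return static_cast<uint8_t>((d - 1) / kSegmentsPerPin + 1);
}
```

With this mapping, D1 through D12 fall on S1-S4, and everything past D105 falls on S36-S42, the seven pins that were left unwired.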

None of this explains why D24 is missing. The answer to this mystery must lie elsewhere.

Now that I know what is on this LCD available to display, maybe I can think of a creative way to reuse it. While I think that over, I’ll proceed to work through how to read input from this faceplate.

Source code for this investigation is publicly available on GitHub.

Reading Faceplate “Audio Mode” Knob

Once I got a retired faceplate’s LCD up and running, I realized I was wrong about its backlight circuitry. Now that it’s been sorted out, attention returns to the LCD. I want to map out all the segments, which means I need some way to interactively select individual segments. Recently I’ve used ESPHome’s network capabilities for interactivity, but this project uses an Arduino Nano for its 5V operating voltage. I can wire up my own physical control, but the faceplate already had some on board. Earlier probing established Power/Volume knob is a potentiometer, and Audio Mode knob is a quadrature encoder with detent.

The encoder is perfect for selecting individual segments. I can write code to activate one segment at a time. When the knob is turned one way, I can move to adjacent segments in one direction. When the knob is turned the other way, segment selection can follow suit. However, this knob does have one twist relative to my prior experience: Each detent on this encoder is actually two steps. When the knob is at rest (at a detent) the A and B pins are always different. (High/low or low/high). Intermediate values where A and B pins are the same (high/high or low/low) occur between detents.
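To illustrate the two-steps-per-detent behavior, here is a small table-driven quadrature decoder in desktop C++. It is only a sketch of the general technique, not the library I ended up using and not the code from the GitHub repository. The class name is made up, and which rotation direction counts as positive is an arbitrary assumption.

```cpp
#include <cstdint>

class QuadratureDecoder {
 public:
  // Start from the current pin levels so the first update is a valid step.
  QuadratureDecoder(bool a, bool b) : state_(encode(a, b)) {}

  // Feed in the latest A and B pin levels. Call on every pin change.
  void update(bool a, bool b) {
    // Indexed by (previous state << 2) | new state. Valid single steps
    // give +1 or -1; no change or an impossible jump gives 0.
    static const int8_t kDelta[16] = {0, +1, -1, 0, -1, 0, 0, +1,
                                      +1, 0, 0, -1, 0, -1, +1, 0};
    const uint8_t next = encode(a, b);
    steps_ += kDelta[(state_ << 2) | next];
    state_ = next;
  }

  long steps() const { return steps_; }

  // This knob takes two steps to move from one detent to the next.
  long detents() const { return steps_ / 2; }

 private:
  static uint8_t encode(bool a, bool b) {
    return static_cast<uint8_t>((a ? 2 : 0) | (b ? 1 : 0));
  }

  uint8_t state_;
  long steps_ = 0;
};
```

If update() misses the brief intermediate state where A and B are equal, the decoder sees A and B both flip at once, which the table treats as an impossible jump worth zero steps. That is exactly how a detent gets lost when the pins are read too slowly.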

This places timing demands on reading that knob. For encoders where each detent is a single step change and turned by human hands, polling once every hundred milliseconds or so would be fast enough. However, since this knob can flash through a step very quickly between detents, I need to make sure those steps do not get lost.

There are many quadrature encoder libraries available for the Arduino platform. I selected this one by Paul Stoffregen, who I know as the brains behind the Teensy line of products. When working with interrupt-capable pins, this library will set up hardware monitoring of any changes in encoder state. This makes it very unlikely for encoder steps to get lost. According to Arduino documentation for attachInterrupt(), all ATmega328-based Arduino boards (including the Arduino Nano I’m using) have two interrupt-capable pins: 2 and 3. Using those pins resulted in reliable reading of knob position for mapping out segments of this LCD.

Source code for this investigation is publicly available on GitHub.

Successful Arduino Test of LC75853N Control

Using a Saleae Logic 8 Analyzer, I’ve examined the communication protocol between the mainboard and faceplate of a car tape deck. These signals match expectations of a Sanyo LC75853N LCD controller which uses Sanyo’s proprietary CCB (Computer Control Bus) protocol. CCB has some resemblance to SPI and I2C but is neither, though close enough to SPI for me to use the SPI analyzer mode on a Saleae analyzer.

But “close enough” won’t be good enough for the next step: take an Arduino Nano and write code to talk to the LCD controller via CCB, copying the data waveform behavior I saw as closely as I can. Arduino has a library for SPI that assumes control of the enable pin, which has different behavior under CCB so that would not work here. I investigated using the shiftIn() and shiftOut() routines, which are part of the standard Arduino library. They are software implementations of a clocked serial data transfer routine, but unfortunately their clock signal behavior is different from what I saw of CCB under the logic analyzer. (Active-low vs. active-high.) In order to emulate behavior of the tape deck mainboard, I would have to write my own software implementation of CCB serial data transfer.
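To make the clock difference concrete, here is a rough desktop C++ sketch (not the code from the GitHub repository) of a software shift-out routine that records pin writes in a list instead of calling digitalWrite(). With clockIdleHigh set to false it follows the same order as Arduino’s shiftOut(): set data, clock high, clock low. With clockIdleHigh set to true, the clock rests high and pulses low instead. The names and the LSB-first bit order are assumptions for illustration, and the enable pin handling that CCB also requires is left out.

```cpp
#include <cstdint>
#include <vector>

enum class Pin : uint8_t { Data, Clock };

struct PinEvent {
  Pin pin;
  bool level;  // true = high, false = low
};

// Shift one byte out, least significant bit first (assumed bit order),
// appending every pin write to trace in the order it would happen.
void shiftOutByte(uint8_t value, bool clockIdleHigh,
                  std::vector<PinEvent>& trace) {
  for (uint8_t bit = 0; bit < 8; bit++) {
    trace.push_back({Pin::Data, ((value >> bit) & 1) != 0});
    trace.push_back({Pin::Clock, !clockIdleHigh});  // leave idle level
    trace.push_back({Pin::Clock, clockIdleHigh});   // return to idle level
  }
}
```

Capturing the sequence of pin transitions this way mirrors how I compared waveforms on the logic analyzer: the order of transitions matters, the exact timing between them does not.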

On the hardware side, I could no longer avoid soldering to small surface-mount connector pins on the back of the faceplate. I started simple by soldering the four data communication wires: LCD-DO, LCD-DI, LCD-CLK, and LCD-CE. Probing the circuit board with my meter, the only alternative soldering points were directly to the LC75853N, and those pins are even smaller. However, I found alternatives for ACC5V and GND: those were directly connected to the volume control potentiometer, which has nice big through-hole pins for me to solder to. I soldered these wires to a small prototype board with header pins, which then plugged into a breadboard alongside my Arduino Nano.

As a “Hello World” for CCB, I wrote code to replicate the control signals as closely as I could. I won’t try to replicate the exact timing of every pulse captured by my logic analyzer because (1) Arduino doesn’t make that level of control easy and (2) the CCB spec has no explicit requirement for precise timing anyway. However, I aim to make sure the relationship between every clock, data, and enable pin high/low transition is preserved. I can verify this by capturing my Arduino output, comparing it to what I captured from the tape deck mainboard, looking for where they differ, and fixing differences over several iterations. Finally, I was satisfied the data waveforms looked the same (minus the timing caveat above) and connected the faceplate.

These are almost the same LCD segments that are visible when I captured the data communication between mainboard and faceplate.

The only difference I see is “ST” in the upper right, which lights up when the FM tuner has a good enough signal to obtain stereo audio. Since this tape deck didn’t have an antenna attached, “ST” blinks on and off. Apparently, I had taken this picture when “ST” was on, and the recorded control signal I played back on an Arduino was when it was off. This is close enough to call my first test a success.

The other visible difference was the backlight: illuminated when I captured the data control message, but dark when I played it back. I had hoped the backlight was under LC75853N control somehow, but it looks like those LEDs are actually separately controlled.

Source code for this investigation is publicly available on GitHub.

Programming Mr Robot Badge Mk. 2 with Arduino

This particular Mr. Robot Badge Mk. 2 was deemed a defective unit with several dark LEDs. I used it as a practice subject for working with surface-mounted electronic devices, bringing two LEDs back into running order though one of them is the wrong color. Is the badge fully repaired? I couldn’t quite tell. The default firmware is big on blinking and flashing patterns, making it difficult to determine if a specific LED is functioning or not. What I needed was a test pattern, something as simple as illuminating all of the LEDs to see if they come up. Fortunately, there was a URL right on the badge that took me to a GitHub repository with sample code and instructions. It used Arduino framework to generate code for this ESP8266, and that’s something I’ve worked with. I think we’re in business.

On the hardware side, I soldered sockets to the unpopulated programmer header and then created a programming cable to connect to my FTDI serial adapter (*). For the software, I cloned the “Starter Pack” repository, followed installation directions, and encountered a build failure:

Arduino\libraries\Brzo_I2C\src\brzo_i2c.c: In function 'brzo_i2c_write':
Arduino\libraries\Brzo_I2C\src\brzo_i2c.c:72:2: error: cannot find a register in class 'RL_REGS' while reloading 'asm'
   72 |  asm volatile (
      |  ^~~
Arduino\libraries\Brzo_I2C\src\brzo_i2c.c:72:2: error: 'asm' operand has impossible constraints
exit status 1
Error compiling for board Generic ESP8266 Module.

This looks like issue #44 in the Brzo library, unfixed at time of this writing. Hmm, this is a problem. Reading the code some more, I learned Brzo is used to create I2C communication routines with the IS31FL3741 driver chip controlling the LED array. Aha, there’s a relatively easy solution. Since the time the badge was created, Adafruit has released a product using the same LED driver chip and corresponding software libraries to go with it. I could remove this custom I2C communication code using Brzo and replace it with Adafruit’s library.

Most of the conversion was straightforward except for the LED pixel coordinate lookup. The IS31FL3741 chip treats the LEDs as a linear array, and something had to translate the linear index to their X,Y coordinates. The badge example code has a lookup table mapping linear index to X,Y coordinates. To use Adafruit library’s frame buffer, I needed the reverse: a table that converts X,Y coordinates to linear index. I started typing it up by hand before deciding that was stupid: this is the kind of task we use computers for. So I wrote this piece of quick-and-dirty code to cycle through the existing lookup table and print the information back out organized by X,Y coordinates.

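      // Mark every location as unused before filling in the table.
      // (0xFFFF is an assumed placeholder value.)
      for(uint8_t x = 0; x < 18; x++)
        for(uint8_t y = 0; y < 18; y++)
          reverseLookup[x][y] = 0xFFFF;
      for(uint8_t i = 0; i < PAGE_0_SZ; i++)
        reverseLookup[page0LUT[i].x][page0LUT[i].y] = i;
      for(uint16_t i = 0; i < PAGE_1_SZ; i++)
        // Unused locations were marked with (-1, -1) but x and y are
        // declared as unsigned which means -1 is actually 0xFF so
        // instead of checking for >0 we check for <18
        if(page1LUT[i].x < 18 && page1LUT[i].y < 18)
          reverseLookup[page1LUT[i].x][page1LUT[i].y] = i+PAGE_0_SZ;
      // Print one row per Y coordinate, formatted for pasting into C
      // source. (The exact print calls are a reconstruction.)
      for(uint8_t y = 0; y < 18; y++) {
        Serial.print("{");
        for(uint8_t x = 0; x < 18; x++) {
          Serial.print(reverseLookup[x][y]);
          if (x<17)
            Serial.print(",");
        }
        if (y<17)
          Serial.println("},");
        else
          Serial.println("}");
      }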
This gave me the numbers I needed in Arduino serial monitor, and I could copy it from there. Some spacing adjustment to make things a little more readable, and we have this:

const uint16_t Lookup[ARRAY_HEIGHT][ARRAY_WIDTH] = {
  {  17,  47,  77, 107, 137, 167, 197, 227, 257,  18,  48,  78, 108, 138, 168, 198, 228, 258},
  {  16,  46,  76, 106, 136, 166, 196, 226, 256,  19,  49,  79, 109, 139, 169, 199, 229, 259},
  {  15,  45,  75, 105, 135, 165, 195, 225, 255,  20,  50,  80, 110, 140, 170, 200, 230, 260},
  {  14,  44,  74, 104, 134, 164, 194, 224, 254,  21,  51,  81, 111, 141, 171, 201, 231, 261},
  {  13,  43,  73, 103, 133, 163, 193, 223, 253,  22,  52,  82, 112, 142, 172, 202, 232, 262},
  {  12,  42,  72, 102, 132, 162, 192, 222, 252,  23,  53,  83, 113, 143, 173, 203, 233, 263},
  {  11,  41,  71, 101, 131, 161, 191, 221, 251,  24,  54,  84, 114, 144, 174, 204, 234, 264},
  {  10,  40,  70, 100, 130, 160, 190, 220, 250,  25,  55,  85, 115, 145, 175, 205, 235, 265},
  {   9,  39,  69,  99, 129, 159, 189, 219, 249,  26,  56,  86, 116, 146, 176, 206, 236, 266},
  {   8,  38,  68,  98, 128, 158, 188, 218, 248,  27,  57,  87, 117, 147, 177, 207, 237, 267},
  {   7,  37,  67,  97, 127, 157, 187, 217, 247,  28,  58,  88, 118, 148, 178, 208, 238, 268},
  {   6,  36,  66,  96, 126, 156, 186, 216, 246,  29,  59,  89, 119, 149, 179, 209, 239, 269},
  {   5,  35,  65,  95, 125, 155, 185, 215, 245, 270, 279, 288, 297, 306, 315, 324, 333, 342},
  {   4,  34,  64,  94, 124, 154, 184, 214, 244, 271, 280, 289, 298, 307, 316, 325, 334, 343},
  {   3,  33,  63,  93, 123, 153, 183, 213, 243, 272, 281, 290, 299, 308, 317, 326, 335, 344},
  {   2,  32,  62,  92, 122, 152, 182, 212, 242, 273, 282, 291, 300, 309, 318, 327, 336, 345},
  {   1,  31,  61,  91, 121, 151, 181, 211, 241, 274, 283, 292, 301, 310, 319, 328, 337, 346},
  {   0,  30,  60,  90, 120, 150, 180, 210, 240, 275, 284, 293, 302, 311, 320, 329, 338, 347}
};

This lookup table got Mr. Robot Badge Mk. 2 up and running without Brzo library! I could write that simple “turn on all LED” test I wanted.

The test exposed two more problematic LEDs. One of them was intermittent: I tapped it and it illuminated for this picture. If it starts flickering again, I’ll give it a dab of solder to see if that helps. The other one is dark and stayed dark through (unscientific) tapping and (scientific) LED test equipment. It looks like I need to find two more surface-mount red LEDs to fully repair this array.

In case anybody else wants to play with Mr. Robot Badge and runs into the same problem, I have collected my changes and updated the README installation instructions in pull request #3 of the badge code sample repository. If that PR is rejected, clone my fork directly.

(*) Disclosure: As an Amazon Associate I earn from qualifying purchases.

Window Shopping LovyanGFX

One part of having an open-source project is that anyone can offer their contribution for others to use in the future. Most of them were help that I was grateful to accept, such as people filling gaps in my Sawppy documentation. But occasionally, a proposed contribution unexpectedly pops out of left field and I needed to do some homework before I could even understand what’s going on. This was the case for pull request #30 on my ESP_8_BIT_composite Arduino library for generating color composite video signals from an ESP32. The author “riraosan” says it merged LovyanGFX and my library, to which I thought “Uh… what’s that?”

A web search found LovyanGFX, which is a graphics library for embedded controllers, including ESP32 but also many others that ESP_8_BIT_composite does not support. While the API mimics AdafruitGFX, this library adds features like sprite support and palette manipulation. It looks like a pretty nifty library! Based on the README of that repository, the author’s primary language is Japanese and they are a big fan of M5Stack modules. So in addition to the software technical merits, LovyanGFX has extra appeal to native Japanese speakers who are playing with M5Stack modules. Roughly two dozen display modules were listed, but I don’t think I have any of them on hand to play with LovyanGFX myself.

Given this information and riraosan’s Instagram post, I guess the goal was to add ESP_8_BIT composite video signal generation as another supported output display for LovyanGFX. So I started digging into how the library was architected to enable support for different displays. I found that each supported display unit has corresponding files in the src/lgfx/v1/panel subdirectory. Each of them has a class that derives from the Panel_Device base class, which implements the IPanel interface. So if we want to add a composite video output capability to this library, that’s the code I expected to see. With this newfound knowledge, I returned to my pull request to see how it was handled. I saw nothing of what I expected. No IPanel implementation, no Panel_Device derived class. That work is in the contributor’s fork of LovyanGFX. The pull request for me has merely the minimal changes needed for ESP_8_BIT_composite to be used in that fork.

Since those changes are for a specialized usage independent of the main intention of my library, I’m not inclined to incorporate such changes. I suggested to riraosan that they fork the code and create a new LovyanGFX-focused library (removing AdafruitGFX support components) and it appears that will be the direction going forward. Whatever else happens, I now know about LovyanGFX and that knowledge would not have happened without a helpful contributor. I am thankful for that!

Initial Lessons on ESP8266 Arduino Sketch for InfluxDB

Dipping my toes in building a data monitoring system, I have an ESP8266 Arduino sketch that reads its analog input pin, converts it to original input voltage, and logs that information to InfluxDB. Despite the simplicity of the sketch, I’ve already learned some very valuable lessons.

The Good

The Arduino libraries do a very good job of recovering from problems on their own, taking care of it so my Arduino sketch does not have to. The ESP8266 Arduino WiFi library can reconnect lost WiFi as long as I call it periodically. And the InfluxDB library doesn’t need me to do anything special at all. I call InfluxDbClient.writePoint() whenever I want to write data. If the connection was lost since initial connection, it will be re-established and the data point written with no extra work on my part. I’ve had this sketch up and running as I’ve taken the InfluxDB docker container offline to upgrade to newer versions, or performed firmware updates on my WiFi access point, which takes wireless offline for a few minutes. This sketch recovered and resumed logging data, no sweat.

The Bad

ESP8266 ADC (analog-to-digital converter) is pretty darned noisy when asked to measure within a narrow range of its full range of voltages as I am. The full range is 0-22V, and I’m typically only within a narrow band between 12-14V. I tried taking multiple measurements and averaging them, but that didn’t seem to help very much.

This noisiness also made it hard to calibrate readings against voltage values as measured by my multi-meter. It’s not difficult to take a meter reading and calculate a conversion factor for an ADC reading taken at the same time. But if the ADC value can drift even as the actual voltage is held steady, the conversion factor is unreliable. Even worse, since the conversion is done in my Arduino sketch, every time I want to refine this value, I have to hook up a computer and re-upload my Arduino sketch.
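The averaging and calibration arithmetic can be sketched as follows. This is desktop C++ for illustration, not the sketch from the GitHub repository, and the function names are made up. On the ESP8266 the raw readings would come from analogRead(), which returns values from 0 to 1023.

```cpp
#include <cstddef>
#include <cstdint>

// Average several raw ADC readings to smooth out some of the noise.
double averageReading(const uint16_t* readings, std::size_t count) {
  if (count == 0) {
    return 0.0;
  }
  uint32_t sum = 0;
  for (std::size_t i = 0; i < count; i++) {
    sum += readings[i];
  }
  return static_cast<double>(sum) / static_cast<double>(count);
}

// Volts per ADC count, derived from one multi-meter reading and the
// ADC reading taken at the same moment.
double conversionFactor(double meterVolts, double adcReading) {
  return meterVolts / adcReading;
}

// Convert an ADC reading to the original input voltage.
double toVolts(double adcReading, double factor) {
  return adcReading * factor;
}
```

As a made-up example, if the meter shows 12.9V while the averaged ADC reading is 600, the conversion factor is 0.0215 volts per count. If the reading drifts by 10 counts while the real voltage holds steady, the computed voltage moves by about 0.2V, which is a large error inside a 12-14V band.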

Since I expect to add more data sources to this system, I also expected to query by source to see data returned by each. For this first iteration, I tagged points with the MAC address of my ESP8266. This is a pretty good guarantee of uniqueness, but it is not very human-friendly. Or at least not to me, as I’m not in the habit of memorizing MAC addresses of my devices.

The Ugly

As is typical of Arduino sketches, this sketch is running loop() as fast as it can. Functionally speaking, this is fine. But it means the ESP8266 is always in a state of high power draw, with the WiFi stack always active and the CPU running as fast as it can. When the objective is merely to record measurements every minute or so, I could be far more energy efficient here.

Addressing these issues (and much more, I’m sure) will be the topic of future iterations. In the meantime, I have some data points and I want to graph them.

[Source code for this project is publicly accessible on GitHub]

Setting Up ESP8266 Arduino Sketch for InfluxDB

I think I’ve got the hardware side sorted out for building an ESP8266 that monitors voltage output of a solar panel, so it’s time to dive into the software side. I want to log this data into InfluxDB as a learning exercise, and the list of client libraries included a pointer to a GitHub repository for an InfluxDB client library for ESP8266 and ESP32 running on their Arduino core.

Arduino is not exactly the most fully featured development environment, so I’ve been doing my Arduino development using PlatformIO plugin of Visual Studio Code. However, this is the first time I’ve had to manually add a third-party library and it’s not the same as Arduino’s Library Manager so I had to go online for a little help like this forum thread. I learned I should go to and search for the desired library. Searching on “InfluxDB” resulted in several results, one of which is the same client library I found earlier but now with instructions on how to install into PlatformIO.

After compiling a simple test sketch, my serial output monitor returned gibberish. The key here is that baud rate must match between my platformio.ini configuration file:

monitor_speed = 115200

And my serial output initialization code in setup() function of my sketch:

Serial.begin(115200);

Another configuration issue concerns the information necessary to connect to my home WiFi and my InfluxDB server. This little sketch needs that information to run, but I don’t want to include it in my source code since I intend to upload this to GitHub in a publicly accessible repository. I found a solution on StackOverflow: put my secret information in a separate secrets.h file. After I committed a basic version without any actual information, I used the command git update-index --skip-worktree secrets.h to remove it from further Git activity. After that point, I could edit secrets.h and Git would not care to upload that information, leaving my local secrets local, which is what I want.
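The committed placeholder version of secrets.h could look something like the following. The macro names and placeholder strings are made up for illustration. The actual file in the GitHub repository may use different names.

```cpp
// secrets.h: committed with placeholder values only.
// Real values are filled in locally after running
//   git update-index --skip-worktree secrets.h
#ifndef SECRETS_H
#define SECRETS_H

// Hypothetical names; use whatever the sketch expects.
#define WIFI_SSID      "your-wifi-name"
#define WIFI_PASSWORD  "your-wifi-password"
#define INFLUXDB_URL   "http://your-influxdb-server:8086"

#endif  // SECRETS_H
```

The sketch then pulls in these values with #include "secrets.h", so the main source file never contains the real credentials.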

Once all of these setup details were taken care of, I could dive into code and learn some valuable lessons out of the experience.

[Source code for this project is publicly accessible on GitHub]

Cat and Galactic Squid

Emily Velasco whipped up some cool test patterns to help me diagnose problems with my port of AnimatedGIF Arduino library example, rendering to my ESP_8_BIT_composite color video out library. But that wasn’t where she first noticed a problem. That honor went to the new animated GIF she created upon my request for something nifty to demonstrate my library.

This started when I copied an example from the AnimatedGIF library for the port. After I added the code to copy between my double buffers to keep them consistent, I saw it was a short clip of Homer Simpson from The Simpsons TV show. While the legal department of Fox is unlikely to devote resources to prosecute authors of an Arduino library, I was not willing to take the risk. Another popular animated GIF is Nyan Cat, which I had used for an earlier project. But despite its online spread, there is actual legal ownership associated with the rainbow-pooping pop tart cat. Complete with lawsuits enforcing that right and, yes, an NFT. Bah.

I wanted to stay far away from any legal uncertainties. So I asked Emily if she would be willing to create something just for this demo as an alternate to Homer Simpson and Nyan Cat. For the inspirational subject, I suggested a picture she posted of her cat sleeping on her giant squid pillow.

A few messages back and forth later, Emily created Cat and Galactic Squid, complete with a backstory of an intergalactic adventuring duo.

Here they are on a seamlessly looping background, flying off to their next adventure. Emily has released this art under the CC BY-SA (Creative Commons Attribution-ShareAlike) 4.0 license. And I have happily incorporated it into ESP_8_BIT_composite library as an example of how to show animated GIFs on an analog TV. When I showed the first draft, she noticed a visual artifact that I eventually diagnosed as missing X-axis offsets. After I fixed that, the animation played beautifully on my TV. Caveat: the title image of this post is hampered by the fact it’s hard to capture a CRT on camera.

Finding X-Offset Bug in AnimatedGIF Example

Thanks to a little debugging, I figured out my ESP_8_BIT_composite color video out Arduino library required a new optional feature to make my double-buffering implementation compatible with libraries that rely on a consistent buffer such as AnimatedGIF. I was happy that my project, modified from one of the AnimatedGIF examples, was up and running. Then I swapped out its test image for other images, and it was immediately clear the job is not yet done. These test images were created by Emily Velasco and released under Creative Commons Attribution-ShareAlike 4.0 license (CC BY-SA 4.0).

This image resulted in the flawed rendering visible as the title image of this post. Instead of numbers continuously counting upwards in the center of the screen, various numbers are rendered at wrong places and not erased properly in the following screens. Here is another test image to get more data.

Between the two test images and observing where they were on screen, I narrowed down the problem. Animated GIF files might only update part of the frame, and when that happens, the frame subset is to be rendered at an X/Y offset relative to the origin. The Y offset was accounted for correctly, but the X offset went unused, meaning delta frames were rendered against the left edge rather than at the correct offset. This problem was not in my library but inherited from the AnimatedGIF example, where it went unnoticed because the trademark-violating animated GIF used by that example didn’t have an X-axis offset. Once I understood the problem, I went digging into AnimatedGIF code, where I found the unused X-offset and added it into the example where it belonged. These test images now display correctly, but they’re not terribly interesting to look at. What we need is a cat with a galactic squid friend.
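To make the fix concrete, here is a minimal sketch of a per-line draw routine that honors both offsets. The LineInfo struct and drawLine() function are hypothetical stand-ins written for illustration; they are not the AnimatedGIF API or the actual example code. The only assumption is that the decoder reports the origin of the updated sub-rectangle along with each decoded line.

```cpp
#include <cstdint>

constexpr int kScreenWidth = 256;
constexpr int kScreenHeight = 240;

// Hypothetical description of one decoded line of a partial (delta) frame.
struct LineInfo {
  int frameX;             // X offset of the delta frame's sub-rectangle
  int frameY;             // Y offset of the delta frame's sub-rectangle
  int line;               // line index within the sub-rectangle
  int width;              // number of pixels in this line
  const uint8_t* pixels;  // decoded pixel values for this line
};

// Copy one decoded line into a frame buffer organized as an array of
// pointers to horizontal lines.
void drawLine(uint8_t** lines, const LineInfo& info) {
  const int y = info.frameY + info.line;
  if (y < 0 || y >= kScreenHeight) return;
  for (int i = 0; i < info.width; ++i) {
    const int x = info.frameX + i;  // the bug was equivalent to using plain i here
    if (x >= 0 && x < kScreenWidth) {
      lines[y][x] = info.pixels[i];
    }
  }
}
```

With frameX left out, every delta frame lands against the left edge, which matches what the test images showed.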

Animated GIF Decoder Library Exposed Problem With Double Buffering

Once I resolved all the problems I knew existed in version 1.0.0 of my ESP_8_BIT_composite color video out Arduino library, I started looking around for usage scenarios that would unveil other problems. In that respect, I can declare my next effort a success.

My train of thought started with ease of use. Sure, I provided an adaptation of Adafruit’s GFX library designed to make drawing graphics easy, but how could I make things even easier? What is the easiest way for someone to throw up a bit of colorful motion picture on screen to exercise my library? The answer came pretty quickly: I should demonstrate how to display an animated GIF on an old analog TV using my library.

This is a question I’ve contemplated before in the context of the Hackaday Supercon 2018 badge. Back then I decided against porting a GIF decoder and wrote my own run-length encoding instead. The primary reason was that I was short on time for that project and didn’t want to risk losing time debugging an unfamiliar library. Now I have more time and can afford the time to debug problems porting an unfamiliar library to a new platform. In fact, since the intent was to expose problems in my library, I fully expected to do some debugging!

I looked around online for an animated GIF decoder library written in C or C++ code with the intent of being easily portable to microcontrollers. Bonus if it has already been ported to some sort of Arduino support. That search led me to the AnimatedGIF library by Larry Bank / bitbank2. The way it was structured made input easy: I don’t have to fuss with file I/O or SPIFFS, I can feed it a byte array. The output was also well matched to my library, as the output callback renders the image one horizontal line at a time, a great match for the line array of ESP_8_BIT.

Looking through the list of examples, I picked ESP32_LEDMatrix_I2S as the most promising starting point for my test. I modified the output call from the LED matrix I2S interface to my Adafruit GFX based interface, which required only minor changes. On my TV I could almost see a picture, but it was mostly gibberish. As the animation progressed, I could see deltas getting rendered, but they were not matching up with their background.

After chasing a few dead ends, the key insight was noticing my noisy background of uninitialized memory was flipping between two distinct values. That was my reminder I’m performing double-buffering, where I swap between front and back buffers for every frame. AnimatedGIF is efficient about writing only the pixels changed from one frame to the next, but double buffering meant each set of deltas was written over not the previous frame, but two frames prior. No wonder I ended up with gibberish.

Aside: The gibberish amusingly worked in my favor for this title image. The AnimatedGIF example used a clip from The Simpsons, copyrighted material I wouldn’t want to use here. But since the image is nearly unrecognizable when drawn with my bug, I can probably get away with it.

The solution is to add code to keep the two buffers in sync. This way libraries minimizing drawing operations would be drawing against the background they expected instead of an outdated background. However, this would incur a memory copy operation which is a small performance penalty that would be wasted work for libraries that don’t need it. After all of my previous efforts to keep API surface area small, I finally surrendered and added a configuration flag copyAfterSwap. It defaults to false for fast performance, but setting it to true will enable the copy and allow using libraries like AnimatedGIF. It allowed me to run the AnimatedGIF example, but I ran into problems playing back other animated GIF files due to missing X-coordinate offsets in that example code.
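The idea behind the flag can be sketched in a few lines. This is a simplified illustration rather than the library's code: the DoubleBuffer struct, its tiny 8-byte buffers, and the swapBuffers() name are all made up for the example. Only the copyAfterSwap name and its default of false come from the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>

// Illustrative double buffer with an optional copy to keep both buffers in sync.
struct DoubleBuffer {
  static constexpr std::size_t kSize = 8;  // tiny on purpose; a real frame is ~60KB
  uint8_t bufferA[kSize] = {};
  uint8_t bufferB[kSize] = {};
  uint8_t* front = bufferA;    // being displayed
  uint8_t* back = bufferB;     // being drawn on
  bool copyAfterSwap = false;  // off by default to skip the copy cost

  void swapBuffers() {
    std::swap(front, back);
    if (copyAfterSwap) {
      // The new back buffer holds a frame that is two swaps old.
      // Copy the frame just shown so delta drawing starts from it.
      std::memcpy(back, front, kSize);
    }
  }
};
```

With the flag off, the new back buffer still holds stale content, which is harmless for code that redraws everything each frame and fatal for code that only draws deltas.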

TIL Some Video Equipment Support Both PAL and NTSC

Once I sorted out memory usage of my ESP_8_BIT_composite Arduino library, I had just one known issue left on the list. In fact, the very first one I filed: I don’t know if PAL video format is properly supported. When I pulled this color video signal generation code from the original ESP_8_BIT project, I worked to keep all the PAL support code intact. But I live in NTSC territory, how am I going to test PAL support?

This is where writing everything on GitHub paid off. Reading my predicament, [bootrino] passed along a tip that some video equipment sold in NTSC geographic regions also support PAL video, possibly as a menu option. I poked around the menu of the tube TV I had been using to develop my library, but didn’t see anything promising. For the sake of experimentation I switched my sketch into PAL mode just to see what happens. What I saw was a lot of noise with a bare ghost of the expected output, as my TV struggled to interpret the signal in a format it could almost but not quite understand.

I knew the old Sony KP-53S35 RPTV I helped disassemble is not one of these bilingual devices. When its signal processing board was taken apart, there was an interface card to host an NTSC decoder chip, strongly implying that support for PAL required a different interface card. It also implies newer video equipment has a better chance of having multi-format support, as it would have been built in an age when manufacturing a single worldwide device is cheaper than manufacturing separate region-specific hardware. I dug into my hardware hoard looking for a relatively young piece of video hardware. Success came in the shape of a DLP video projector, the BenQ MS616ST.

I originally bought this projector as part of a PC-based retro arcade console with a few work colleagues, but that didn’t happen for reasons not important right now. What’s important is that I bought it for its VGA and HDMI computer interface ports so I didn’t know if it had composite video input until I pulled it out to examine its rear input panel. Not only does this video projector support composite video in both NTSC and PAL formats, it also had an information screen where it indicates whether NTSC or PAL format is active. This is important, because seeing the expected picture isn’t proof by itself. I needed the information screen to verify my library’s PAL mode was not accidentally sending a valid NTSC signal.

Further proof that I am verifying a different code path was that I saw a visual artifact at the bottom of the screen absent from NTSC mode. It looks like I inherited a PAL bug from ESP_8_BIT, where rossumur was working on some optimizations for this area but left it in a broken state. This artifact would have easily gone unnoticed on a tube TV as they tend to crop off the edges with overscan. However this projector does not perform overscan so everything is visible. Thankfully the bug is easy to fix by removing an errant if() statement that caused PAL blanking lines to be, well, not blank.

Thanks to this video projector fluent in both NTSC and PAL, I can now confidently state that my ESP_8_BIT_composite library supports both video formats. This closes the final known issue, which frees me to go out and find more problems!

[Code for this project is publicly available on GitHub]

Allocating Frame Buffer Memory 4KB At A Time

Getting insight into computational processing workload was not absolutely critical for version 1.0.0 of my ESP_8_BIT_composite Arduino library. But now that the first release is done, it was very important to get those tools up and running for the development toolbox. Now that people have a handle on speed, I turned my attention to the other constraint: memory. An ESP32 application only has about 380KB to work with, and it takes about 61KB to store a frame buffer for ESP_8_BIT. Adding double-buffering also doubled memory consumption, and I had actually half expected my second buffer allocation to fail. It didn’t, so I got double-buffering done, but how close are we skating to the edge here?

Fortunately I did not have to develop my own tools here to gain insight into memory allocation, as the ESP32 SDK already had one in the form of heap_caps_print_heap_info(). For my purposes, I called it with the MALLOC_CAP_8BIT flag because pixels are accessed at the single byte (8 bit) level. Here is the memory output running my test sketch, before I allocated the double buffers. I highlighted the blocks that are about to change in red:

Heap summary for capabilities 0x00000004:
  At 0x3ffbdb28 len 52 free 4 allocated 0 min_free 4
    largest_free_block 4 alloc_blocks 0 free_blocks 1 total_blocks 1
  At 0x3ffb8000 len 6688 free 5872 allocated 688 min_free 5872
    largest_free_block 5872 alloc_blocks 5 free_blocks 1 total_blocks 6
  At 0x3ffb0000 len 25480 free 17172 allocated 8228 min_free 17172
    largest_free_block 17172 alloc_blocks 2 free_blocks 1 total_blocks 3
  At 0x3ffae6e0 len 6192 free 6092 allocated 36 min_free 6092
    largest_free_block 6092 alloc_blocks 1 free_blocks 1 total_blocks 2
  At 0x3ffaff10 len 240 free 0 allocated 128 min_free 0
    largest_free_block 0 alloc_blocks 5 free_blocks 0 total_blocks 5
  At 0x3ffb6388 len 7288 free 0 allocated 6784 min_free 0
    largest_free_block 0 alloc_blocks 29 free_blocks 1 total_blocks 30
  At 0x3ffb9a20 len 16648 free 5784 allocated 10208 min_free 284
    largest_free_block 4980 alloc_blocks 37 free_blocks 5 total_blocks 42
  At 0x3ffc1f78 len 123016 free 122968 allocated 0 min_free 122968
    largest_free_block 122968 alloc_blocks 0 free_blocks 1 total_blocks 1
  At 0x3ffe0440 len 15072 free 15024 allocated 0 min_free 15024
    largest_free_block 15024 alloc_blocks 0 free_blocks 1 total_blocks 1
  At 0x3ffe4350 len 113840 free 113792 allocated 0 min_free 113792
    largest_free_block 113792 alloc_blocks 0 free_blocks 1 total_blocks 1
    free 286708 allocated 26072 min_free 281208 largest_free_block 122968

I was surprised at how fragmented the memory space already was even before I started allocating memory in my own code. There are ten blocks of available memory, only two of which are large enough to accommodate an allocation for 60KB. Here is the memory picture after I allocated the two 60KB frame buffers (and two line arrays, one for each frame buffer.) With the changed sections highlighted in red.

Heap summary for capabilities 0x00000004:
  At 0x3ffbdb28 len 52 free 4 allocated 0 min_free 4
    largest_free_block 4 alloc_blocks 0 free_blocks 1 total_blocks 1
  At 0x3ffb8000 len 6688 free 3920 allocated 2608 min_free 3824
    largest_free_block 3920 alloc_blocks 7 free_blocks 1 total_blocks 8
  At 0x3ffb0000 len 25480 free 17172 allocated 8228 min_free 17172
    largest_free_block 17172 alloc_blocks 2 free_blocks 1 total_blocks 3
  At 0x3ffae6e0 len 6192 free 6092 allocated 36 min_free 6092
    largest_free_block 6092 alloc_blocks 1 free_blocks 1 total_blocks 2
  At 0x3ffaff10 len 240 free 0 allocated 128 min_free 0
    largest_free_block 0 alloc_blocks 5 free_blocks 0 total_blocks 5
  At 0x3ffb6388 len 7288 free 0 allocated 6784 min_free 0
    largest_free_block 0 alloc_blocks 29 free_blocks 1 total_blocks 30
  At 0x3ffb9a20 len 16648 free 5784 allocated 10208 min_free 284
    largest_free_block 4980 alloc_blocks 37 free_blocks 5 total_blocks 42
  At 0x3ffc1f78 len 123016 free 56 allocated 122880 min_free 56
    largest_free_block 56 alloc_blocks 2 free_blocks 1 total_blocks 3
  At 0x3ffe0440 len 15072 free 15024 allocated 0 min_free 15024
    largest_free_block 15024 alloc_blocks 0 free_blocks 1 total_blocks 1
  At 0x3ffe4350 len 113840 free 113792 allocated 0 min_free 113792
    largest_free_block 113792 alloc_blocks 0 free_blocks 1 total_blocks 1
    free 161844 allocated 150872 min_free 156248 largest_free_block 113792

The first big block, which previously had 122,968 bytes available, became the home of both 60KB buffers leaving only 56 bytes. That is a very tight fit! A smaller block, which previously had 5,872 bytes free, now had 3,920 bytes free indicating that’s where the line arrays ended up. A little time with the calculator with these numbers arrived at 16 bytes of overhead per memory allocation.
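For anyone who wants to repeat that calculator session, the arithmetic can be written out as below. The overheadPerAllocation() helper is just a name I made up for this illustration; the numbers are taken directly from the two heap summaries above.

```cpp
#include <cstdint>

// Bookkeeping overhead per allocation, inferred from a heap region's "free"
// and "allocated" figures before and after a known number of allocations.
constexpr uint32_t overheadPerAllocation(uint32_t freeBefore, uint32_t freeAfter,
                                         uint32_t allocatedBefore, uint32_t allocatedAfter,
                                         uint32_t allocationCount) {
  return ((freeBefore - freeAfter) - (allocatedAfter - allocatedBefore)) / allocationCount;
}

// Region at 0x3ffc1f78: free dropped by 122,912 bytes while allocated rose by
// 122,880 bytes for the two frame buffers.
static_assert(overheadPerAllocation(122968, 56, 0, 122880, 2) == 16, "frame buffers");

// Region at 0x3ffb8000: free dropped by 1,952 bytes while allocated rose by
// 1,920 bytes for the two line arrays.
static_assert(overheadPerAllocation(5872, 3920, 688, 2608, 2) == 16, "line arrays");
```

Both regions agree on 16 bytes, which is reassuring given they hold allocations of very different sizes.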

This is good information to inform some decisions. I had originally planned to give the developer a way to manage their own memory, but I changed my mind on that one just as I did for double buffering and performance metrics. In the interest of keeping API simple, I’ll continue handling the allocation for typical usage and trust that advanced users know how to take my code and tailor it for their specific requirements.

The ESP_8_BIT line array architecture allows us to split the raw frame buffer into smaller pieces. Not just a single 60KB allocation as I have done so far, it can accommodate any scheme all the way down to allocating 240 horizontal lines individually at 256 bytes each. That will allow us to make optimal use of small blocks of available memory. But doing 240 instead of 1 allocation for each of two buffers means 239 additional allocations * 16 bytes of overhead * 2 buffers = 7,648 extra bytes of overhead. That’s too steep of a price for flexibility.

As a compromise, I will allocate the frame buffer in 4 kilobyte chunks. These will fit in seven out of ten available blocks of memory, an improvement from just two. Each frame would consist of 15 chunks. This works out to an extra 14 allocations * 16 bytes of overhead * 2 buffers = 448 bytes of overhead. This is a far more palatable price for flexibility. Here are the results with the frame buffers allocated in 4KB chunks, again with changed blocks in red:

Heap summary for capabilities 0x00000004:
  At 0x3ffbdb28 len 52 free 4 allocated 0 min_free 4
    largest_free_block 4 alloc_blocks 0 free_blocks 1 total_blocks 1
  At 0x3ffb8000 len 6688 free 784 allocated 5744 min_free 784
    largest_free_block 784 alloc_blocks 7 free_blocks 1 total_blocks 8
  At 0x3ffb0000 len 25480 free 724 allocated 24612 min_free 724
    largest_free_block 724 alloc_blocks 6 free_blocks 1 total_blocks 7
  At 0x3ffae6e0 len 6192 free 1004 allocated 5092 min_free 1004
    largest_free_block 1004 alloc_blocks 3 free_blocks 1 total_blocks 4
  At 0x3ffaff10 len 240 free 0 allocated 128 min_free 0
    largest_free_block 0 alloc_blocks 5 free_blocks 0 total_blocks 5
  At 0x3ffb6388 len 7288 free 0 allocated 6776 min_free 0
    largest_free_block 0 alloc_blocks 29 free_blocks 1 total_blocks 30
  At 0x3ffb9a20 len 16648 free 1672 allocated 14304 min_free 264
    largest_free_block 868 alloc_blocks 38 free_blocks 5 total_blocks 43
  At 0x3ffc1f78 len 123016 free 28392 allocated 94208 min_free 28392
    largest_free_block 28392 alloc_blocks 23 free_blocks 1 total_blocks 24
  At 0x3ffe0440 len 15072 free 15024 allocated 0 min_free 15024
    largest_free_block 15024 alloc_blocks 0 free_blocks 1 total_blocks 1
  At 0x3ffe4350 len 113840 free 113792 allocated 0 min_free 113792
    largest_free_block 113792 alloc_blocks 0 free_blocks 1 total_blocks 1
    free 161396 allocated 150864 min_free 159988 largest_free_block 113792

Instead of almost entirely consuming the block with 122,968 bytes and leaving just 56 bytes, the two frame buffers are now distributed among smaller blocks, leaving 28,392 contiguous bytes free in that big block. And we still have another big block free with 113,792 bytes to accommodate large allocations.
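Here is a rough sketch of how a line array can sit on top of 4KB chunks. It uses plain C++ heap allocation so it can run anywhere; the ChunkedFrameBuffer type and its member names are hypothetical, and the library's own allocation calls may look different. What it shows is the layout: 15 chunks of 16 lines each, with 240 line pointers hiding the seams.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr std::size_t kLineBytes = 256;   // one pixel per byte, 256 pixels wide
constexpr std::size_t kLineCount = 240;   // visible lines per frame
constexpr std::size_t kChunkBytes = 4096;
constexpr std::size_t kLinesPerChunk = kChunkBytes / kLineBytes;  // 16
constexpr std::size_t kChunkCount = kLineCount / kLinesPerChunk;  // 15

// Illustrative frame buffer whose storage is split into 4KB chunks.
struct ChunkedFrameBuffer {
  std::vector<std::unique_ptr<uint8_t[]>> chunks;  // owns the memory
  std::vector<uint8_t*> lines;                     // one pointer per horizontal line

  ChunkedFrameBuffer() {
    lines.reserve(kLineCount);
    for (std::size_t c = 0; c < kChunkCount; ++c) {
      // Each chunk is a separate, zero-initialized allocation.
      chunks.push_back(std::unique_ptr<uint8_t[]>(new uint8_t[kChunkBytes]()));
      uint8_t* base = chunks.back().get();
      for (std::size_t l = 0; l < kLinesPerChunk; ++l) {
        lines.push_back(base + l * kLineBytes);
      }
    }
  }
};
```

Drawing code only ever goes through lines[y][x], so it never needs to know whether two adjacent lines live in the same chunk.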

Looking at this data, I could also see allocating in smaller chunks would have led to diminishing returns. Allocating in 2KB chunks would have doubled the overhead but not improved utilization. Dropping to 1KB would double the overhead again, and only open up one additional block of memory for use. Therefore allocating in 4KB chunks is indeed the best compromise, assuming my ESP32 memory map is representative of user scenarios. Satisfied with this arrangement, I proceeded to work on my first and last bug of version 1.0.0: PAL support.
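The overhead side of that tradeoff is simple enough to tabulate. The extraOverhead() helper below is a made-up name for this illustration, and it assumes the 16 bytes of per-allocation overhead measured earlier applies to every chunk size.

```cpp
#include <cstdint>

constexpr uint32_t kFrameBytes = 256 * 240;  // 61,440 bytes per frame buffer

// Extra allocation overhead compared to one allocation per frame buffer.
constexpr uint32_t extraOverhead(uint32_t chunkBytes,
                                 uint32_t overheadPerAllocation = 16,
                                 uint32_t bufferCount = 2) {
  return (kFrameBytes / chunkBytes - 1) * overheadPerAllocation * bufferCount;
}

static_assert(extraOverhead(4096) == 448, "15 chunks per buffer");
static_assert(extraOverhead(2048) == 928, "30 chunks per buffer");
static_assert(extraOverhead(1024) == 1888, "60 chunks per buffer");
static_assert(extraOverhead(256) == 7648, "240 individual lines per buffer");
```

Each halving of chunk size roughly doubles the overhead, so the question is only whether it opens up enough additional memory blocks to be worth it.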

Lightweight Performance Metrics Have Caveats

Before I implemented double-buffering for my ESP_8_BIT_composite Arduino library, the only way we know we’re overloaded is when we start seeing visual artifacts on screen. After I implemented double-buffering, when we’re overloaded we’ll see the same data shown for two or more frames because the back buffer wasn’t ready to be swapped. A binary good/no-good feedback is better than nothing but it would be frustrating to work with and I knew I could do better. I wanted to collect some performance metrics a developer can use to know how close they’re running to the edge before going over.

This is another feature I had originally planned as some type of configurable data. My predecessor ESP_8_BIT handled it as a compile-time flag. But just as I decided to make double-buffering run all the time in the interest of keeping the Arduino API easy to use, I’ve decided to collect performance metrics all the time. The compromise is that I only do so for users of the Adafruit GFX API, who have already chosen ease of use over maximum raw performance. The people who use the raw frame buffer API will not take the performance hit, and if they want performance metrics they can copy what I’ve done and tailor it to their application.

The key counter underlying my performance measurement code goes directly down to a feature of the Tensilica CPU. CCount, which I assume to mean cycle count, is incremented at every clock cycle. When the CPU is running at full speed of 240MHz, it increments by 240 million within each second. This is great, but the fact it is a 32-bit unsigned integer limits its scope, because that means the count will overflow every 2^32 / 240,000,000 = 17.895 seconds.

I started thinking of ways to keep a 64-bit performance counter in sync with the raw CCount, but in the interest of keeping things simple I abandoned that idea. I will track data through each of these ~18 second periods and, as soon as CCount overflows, I’ll throw it all out and start a new session. This will result in some loss of performance data but it eliminates a ton of bookkeeping overhead. Every time I notice an overflow, statistics from the session are output to logging INFO level. The user can also query the running percentage of the session at any time, or explicitly terminate a session and start a new one for the purpose of isolating different code.
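The throw-it-out-on-overflow approach can be sketched as follows. This is an illustration, not the library's code: the Session struct and its members are hypothetical, and the cycle count is passed in as a parameter so the sketch does not depend on any particular way of reading CCount.

```cpp
#include <cstdint>

constexpr uint32_t kCpuHz = 240000000;  // ESP32 at full speed

// Whole seconds before a 32-bit cycle counter wraps at the given clock rate.
constexpr uint32_t secondsUntilWrap(uint32_t hz) {
  return static_cast<uint32_t>((UINT64_C(1) << 32) / hz);
}

// Illustrative session tracker that restarts whenever the counter wraps.
struct Session {
  uint32_t lastCount = 0;
  uint32_t framesRendered = 0;

  // Returns true if a wrap was detected and the session was restarted.
  bool update(uint32_t cycleCountNow) {
    // A counter that only goes up can appear smaller only if it wrapped.
    const bool wrapped = cycleCountNow < lastCount;
    if (wrapped) {
      framesRendered = 0;  // discard the old session's data
    }
    lastCount = cycleCountNow;
    return wrapped;
  }
};
```

The wrap check only works if update() is called at least once per ~18 second window, which is a safe assumption for code that runs every video frame.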

The percentage reported is the ratio of clock cycles spent in waitForFrame() relative to the amount of time between calls. If the drawing loop does no work, like this:

void loop() {
  videoOut.waitForFrame();  // no drawing at all; "videoOut" is an assumed instance name
}

Then 100% of the time is spent waiting. This is unrealistic because it’s not useful. For realistic drawing loops that do more work, the percentage will be lower. This number tells us roughly how much margin we have to spare to take on more work. However, “35% wait time” does not mean 35% CPU free, because other work happens while we wait. For example, the composite video signal generation ISR is constantly running, whether we are drawing or waiting. Actual free CPU time will be somewhere lower than this reported wait percentage.

The way this percentage is reported may be unexpected, as it is an integer in the range from 0 to 10000 where each unit is a hundredth of a percent. I did this because the floating-point unit on an ESP32 imposes its own overhead that I wanted to avoid in my library code. If the user wants to divide by 100 for a human-friendly percentage value, that is their choice to accept the floating-point performance overhead. I just didn’t want to force it on every user of my library.
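An integer-only version of that calculation might look like the sketch below. The function name is made up for this illustration, and the 64-bit intermediate is my choice here to keep the multiplication from overflowing; the library may arrange the math differently.

```cpp
#include <cstdint>

// Wait time as an integer from 0 to 10000, in hundredths of a percent.
// No floating point is involved.
uint32_t waitHundredthsOfPercent(uint32_t waitCycles, uint32_t totalCycles) {
  if (totalCycles == 0) return 0;  // no data yet, avoid dividing by zero
  // Widen before multiplying so waitCycles * 10000 cannot overflow 32 bits.
  const uint64_t scaled = static_cast<uint64_t>(waitCycles) * 10000;
  return static_cast<uint32_t>(scaled / totalCycles);
}
```

A caller who wants a friendly number can print value / 100 and value % 100 on either side of a decimal point and still stay in integer math.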

Lastly, the session statistics include frames rendered & missed, and there is an overflow concern for those values as well. The statistics will be nonsensical in the ~18 second session window where either of them overflow, though they’ll recover by the following session. Since these are unsigned 32-bit values (uint32_t) they will overflow at 2^32 frames. At 60 frames per second, that’s a loss of ~18 seconds of data once every 2.3 years. I decided not to worry about it and turn my attention to memory consumption instead.

Double Buffering Coordinated via TaskNotify

Eliminating work done for pixels that will never be seen is always a good change for efficiency. Next item on the to-do list is to work on pixels that will be seen… but we don’t want to see them until they’re ready. Version 1.0.0 of ESP_8_BIT_composite color video out library used only a single buffer, where code is drawing to the buffer at the same time the video signal generation code is reading from the buffer. When those two separate pieces of code overlap, we get visual artifacts on screen ranging from incomplete shapes to annoying flickers.

The classic solution to this is double-buffering, which the predecessor ESP_8_BIT did not do. I hypothesize there were two reasons for this: #1 emulator memory requirements did not leave enough for a second buffer and #2 emulators sent their display data in horizontal line order, managing to “race ahead” of the video scan line and avoid artifacts. But now both of those are gone. #1 no longer applies because emulators had been cut out, freeing memory. And we lost #2 because Adafruit GFX is built around vertical lines, so it is orthogonal to the scan line and no longer able to “race ahead” of it, resulting in visual artifacts. Thus we need two buffers: a back buffer for the Adafruit GFX code to draw on, and a front buffer for the video signal generation code to read from. At the end of each NTSC frame, I have an opportunity to swap the buffers. Doing it at that point ensures we’ll never try to show a partially drawn frame.

I had originally planned to make double-buffering an optional configurable feature. But once I saw how much of an improvement this was, I decided everyone will get it all of the time. In the spirit of Arduino library style guide recommendations, I’m keeping the recommended code path easy to use. For simple Arduino apps the memory pressure would not be a problem on an ESP32. If someone wants to return to single buffer for memory needs, or maybe even as a deliberate artistic decision to have flickers, they can take my code and create their own variant.

Knowing when to swap the buffer was easy: video_isr() had a conveniently commented section // frame is done. At that point I can swap the front and back buffers if the back buffer is flagged as ready to go. My problem was that I didn’t know how to signal the drawing code that it has a new back buffer and can start drawing the next frame. The existing video_sync() (which I use for my waitForFrame() API) forecasts the amount of time to render a frame and uses vTaskDelay(), which I am somewhat suspicious of. FreeRTOS documentation has the disclaimer that vTaskDelay() has no guarantee that it will resume at the specified time. The synchronization was thus inferred rather than explicit, and I wanted something that ties the two pieces of code more concretely together. My research eventually led to vTaskNotifyGiveFromISR(), which I can use in video_isr() to signal its counterpart ulTaskNotifyTake(), which I will use for a replacement implementation of video_sync(). I anticipate this will prove to be a more reliable way for the application code to know it can start working on the next frame. But how much time does it have to spare between frames? That’s the next project: some performance metrics.
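To show the difference between inferred and explicit synchronization without needing an ESP32, here is a desktop analogue built from standard C++ primitives. It does not use FreeRTOS at all: the FrameNotifier class is hypothetical, with give() playing the role of the notification sent from video_isr() and take() playing the role of the wait inside video_sync(). Returning and clearing the pending count is modeled on my understanding of how a counting task notification behaves.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Desktop stand-in for a "frame is done" notification.
class FrameNotifier {
 public:
  // Producer side: called once each time a frame has finished.
  void give() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ++count_;
    }
    ready_.notify_one();
  }

  // Consumer side: blocks until at least one give() has happened, then
  // returns how many were pending and resets the count to zero.
  uint32_t take() {
    std::unique_lock<std::mutex> lock(mutex_);
    ready_.wait(lock, [this] { return count_ > 0; });
    const uint32_t pending = count_;
    count_ = 0;
    return pending;
  }

 private:
  std::mutex mutex_;
  std::condition_variable ready_;
  uint32_t count_ = 0;
};
```

The drawing code wakes up because a frame actually ended, not because a timer guessed that it should have. A return value greater than one would also reveal that the drawing code took longer than a frame.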

The Fastest Pixels Are Those We Never Draw

It’s always good to have someone else look over your work, because they find things you miss. When Emily Velasco started writing code to run on my ESP_8_BIT_composite library, her experiment quickly ran into flickering problems with large circles. But that’s not as embarrassing as another problem, which triggered an ESP32 Core Panic system reset.

When I started implementing a drawing API, I specified X and Y coordinates as unsigned integers. With a frame buffer 256 pixels wide and 240 pixels tall, it was a great fit for 8-bit unsigned integers. For input verification, I added a check to make sure Y stayed below 240 and left X alone as it would be a valid value by definition.

When I put Adafruit’s GFX library on top of this code, I had to implement a function with the same signature as Adafruit used. The X and Y coordinates are now 16-bit numbers, so I added a check to make sure X isn’t too large either. But these aren’t just 16-bit numbers, they are int16_t signed integers. Meaning coordinate values can be negative, and I forgot to check that. Negative coordinate values would step outside the frame buffer memory, triggering an access violation, hence the ESP32 Core Panic and system reset.
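The missing check is small. Below is a minimal sketch of a pixel write that validates signed coordinates on both ends of the range; inBounds() and drawPixelChecked() are hypothetical names for illustration, not the library's actual override.

```cpp
#include <cstdint>

constexpr int16_t kWidth = 256;
constexpr int16_t kHeight = 240;

// True only when (x, y) lands inside the frame buffer.
bool inBounds(int16_t x, int16_t y) {
  return x >= 0 && x < kWidth && y >= 0 && y < kHeight;
}

// Write one pixel into a frame buffer organized as an array of line pointers.
void drawPixelChecked(uint8_t** lines, int16_t x, int16_t y, uint8_t color) {
  if (!inBounds(x, y)) return;  // negative values used to slip through here
  lines[y][x] = color;
}
```

With 8-bit unsigned coordinates the lower bound never needed checking, which is exactly why it was easy to forget once the type became int16_t.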

I was surprised to learn Adafruit GFX default implementation did not have any code to enforce screen coordinate limits. Or if they did, it certainly didn’t kick in before my drawPixel() override saw them. My first instinct is to clamp X and Y coordinate values within the valid range. If X is too large, I treat it as 255. If it is negative, I treat it as zero. Y is also clamped between 0 and 239 inclusive. In my overrides of drawFastHLine and drawFastVLine, I also wrote code to gracefully handle situations when their width or heights are negative, swapping coordinates around so they remain valid commands. I also used the X and Y clamping functions here to handle lines that were partially on screen.

This code to try to gracefully handle a wide combination of inputs added complexity. Which added bugs, one of which Emily found: a circle that is on the left or right edge of the screen would see its off-screen portion wrap around to the opposite edge of the screen. This bug in X coordinate clamping wasn’t too hard to chase down, but I decided the fact it even exists is silly. This is version 1.0, I can dictate the behavior I support or not support. So in the interest of keeping my code fast and lightweight, I ripped out all of that “plays nice” code.

A height or a width is negative? Forget graceful swapping, I’m just not going to draw. Something is completely off screen? Forget clamping to screen limits, stuff off-screen is just not going to get drawn. Lines that are partially on screen still need to be gracefully handled via clamping, but I discarded all of the rest. Simpler code leaves fewer places for bugs to hide. It is also far faster, because the fastest pixels are those that we never draw. These optimizations complete the easiest updates to make on individual buffers, the next improvement comes from using two buffers.
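Those rules translate into a short clipping routine for horizontal lines. This is a sketch under my own naming: the Span struct and clipHLine() are hypothetical, and the real override may be organized differently. It shows the three cases: refuse negative or empty widths, refuse anything fully off screen, and trim what is partially on screen.

```cpp
#include <cstdint>

constexpr int32_t kWidth = 256;
constexpr int32_t kHeight = 240;

// Result of clipping: the visible part of a horizontal line, if any.
struct Span {
  int16_t x;
  int16_t width;
  bool visible;
};

Span clipHLine(int16_t x, int16_t y, int16_t w) {
  // Negative or zero width: no graceful swapping, just don't draw.
  if (w <= 0) return {0, 0, false};
  // Entire line above or below the screen.
  if (y < 0 || y >= kHeight) return {0, 0, false};

  // Work in 32 bits so x + w cannot overflow int16_t.
  int32_t left = x;
  int32_t right = static_cast<int32_t>(x) + w;  // one past the last pixel

  // Entire line off the left or right edge.
  if (right <= 0 || left >= kWidth) return {0, 0, false};

  // Partially visible: trim to the screen edges.
  if (left < 0) left = 0;
  if (right > kWidth) right = kWidth;
  return {static_cast<int16_t>(left), static_cast<int16_t>(right - left), true};
}
```

For example, a 20 pixel line starting at X = -10 comes back as 10 pixels starting at X = 0, with nothing wrapping around to the opposite edge.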

Overriding Adafruit GFX HLine/VLine Defaults for Performance

I had a lot of fun building a color picker for 256 colors available in the RGB332 color space, gratuitous swoopy 3D animation and all. But at the end of the day it is a tool in service of the ESP_8_BIT_composite video out library. Which has its own to-do list, and I should get to work.

The most obvious work item is to override some Adafruit GFX default implementations, starting with the ones explicitly recommended in comments. I’ve already overridden fillScreen() for blanking the screen on every frame, but there are more. The biggest potential gain is the degenerate horizontal-line drawing method drawFastHLine() because it is a great fit for ESP_8_BIT, whose frame buffer is organized as a list of horizontal lines. This means drawing a horizontal line is a single memset() which I expect to be extremely fast. In contrast, vertical lines via drawFastVLine() would still involve a loop iterating over the list of horizontal lines and won’t be as fast. However, overriding it should still gain benefit by avoiding repetitious work like validating shared parameters.

Given those facts, it is unfortunate Adafruit GFX default implementations tend to use VLine instead of the HLine that would be faster in my case. Some default implementations like fillRect() were easy to switch to HLine, but others like fillCircle() are more challenging. I stared at that code for a while, grumpy at the lack of comments explaining what it is doing. I don’t think I understand it enough to switch to HLine so I aborted that effort.
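The difference between the two line directions, and why a rectangle is easy to build from horizontal lines, can be sketched like this. The free functions below are simplified illustrations with made-up names, and they assume coordinates have already been validated and clipped.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Horizontal line: all pixels are contiguous in one line, so one memset does it.
void hline(uint8_t** lines, int x, int y, int w, uint8_t color) {
  std::memset(lines[y] + x, color, static_cast<std::size_t>(w));
}

// Vertical line: one pixel in each of h different lines, so we have to loop.
void vline(uint8_t** lines, int x, int y, int h, uint8_t color) {
  for (int i = 0; i < h; ++i) {
    lines[y + i][x] = color;
  }
}

// Filled rectangle built from horizontal lines: h memset calls in total,
// instead of w loops of h single-pixel writes.
void fillRect(uint8_t** lines, int x, int y, int w, int h, uint8_t color) {
  for (int row = 0; row < h; ++row) {
    hline(lines, x, y + row, w, color);
  }
}
```

A circle can in principle be filled the same way, one horizontal span per row, but that means rewriting the algorithm rather than swapping one call for another.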

Since VLine isn’t ESP_8_BIT_composite’s strong suit, these default implementations using VLine did not improve as much as I had hoped. Small circles drawn with fillCircle() are fine, but as the number of circles increases and/or their radius increases, we start seeing flickering artifacts on screen. It is actually a direct reflection of the algorithm, which draws the center vertical line and fills out to either side. When there is too much work to fill a circle before the video scanlines start, we can see the failure in the form of flickering triangles on screen, caused by those two algorithms tripping over each other on the same frame buffer. Adding double buffering is on the to-do list, but before I tackle that project, I wanted to take care of another optimization: clipping off-screen renders.

Initial Issues With ESP_8_BIT Color Composite Video Out Library

I honestly didn’t expect my little project to be accepted into the Arduino Library Manager index on my first try, but it was. Now that it is part of the ecosystem, I feel obligated to record my mental to-do list in a format that others can reference. This lets people know that I’m aware of these shortcomings and see the direction I’ve planned to take. And if I’m lucky, maybe someone will tackle them before I do and give me a pull request. But I can’t realistically expect that, so putting them down on record would at least give me something to point to. “Yes, it’s on the to-do list.” So I wrote down the known problems in the issues section of the project.

The first and foremost problem is that I don’t know if PAL code still works. I intended to preserve all the PAL functionality when I extracted the ESP_8_BIT code, but I don’t know if I successfully preserved it all. I only have an NTSC TV so I couldn’t check. And even if someone tells me PAL is broken, I wouldn’t be able to do anything about it. I’m not dedicated enough to go out and buy a PAL TV just for testing. [bootrino] helpfully tells me there are TVs that understand both standards, which I didn’t know. I’m not dedicated enough to go out and get one of those TVs for the task, but at least I know to keep an eye open for such things. This one really is waiting for someone to test and, if there are problems, submit a pull request.

The other problems I know I can handle. In fact, I had a draft of the next item: give the option to use caller-allocated frame buffer instead of always allocating our own. I had this in the code at one point, but it was poorly tested and I didn’t use it in any of the example sketches. The Arduino API Style Guide suggests trimming such extraneous options in the interest of keeping the API surface area simple, so I did that for version 1.0.0. I can revisit it if demand comes back in the future.
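For illustration only, one possible shape for a caller-allocated option is a pair of constructors: one that allocates its own storage and one that borrows storage from the caller. The FrameBuffer class below is hypothetical and is not the library's API or the draft mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical wrapper showing "own the buffer" versus "borrow the buffer".
class FrameBuffer {
 public:
  // Allocate and own zero-initialized pixel storage.
  FrameBuffer(std::size_t width, std::size_t height)
      : owned_(new uint8_t[width * height]()),
        pixels_(owned_.get()),
        size_(width * height) {}

  // Borrow caller-provided storage. The caller must keep it alive and
  // make sure it holds at least width * height bytes.
  FrameBuffer(std::size_t width, std::size_t height, uint8_t* callerPixels)
      : pixels_(callerPixels), size_(width * height) {}

  uint8_t* data() { return pixels_; }
  std::size_t size() const { return size_; }
  bool ownsStorage() const { return owned_ != nullptr; }

 private:
  std::unique_ptr<uint8_t[]> owned_;  // empty when storage is borrowed
  uint8_t* pixels_;
  std::size_t size_;
};
```

The extra constructor is exactly the kind of added API surface the style guide warns about, which is the trade-off behind leaving it out of version 1.0.0.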

One thing I left behind in ESP_8_BIT and want to revive is a performance metric of some sort. For smooth display, the developer must perform all drawing work between frames. The waitForFrame() API exists so drawing can start as soon as one frame ends, but right now there’s no way to know how much time was left before the next frame begins. This will be useful as people start to probe the limits.
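As a rough sketch of what such a metric could report, the helper below computes how much of a frame period is left after drawing finishes. The function names and the frame period constant are assumptions for illustration (about 16,683 microseconds, corresponding to roughly 59.94 frames per second); the library does not provide them.

```cpp
#include <cstdint>

// Assumed frame period in microseconds (about 1 / 59.94 seconds).
constexpr uint32_t kFramePeriodUs = 16683;

// Microseconds left in the frame after drawing finished, or 0 if drawing
// overran the frame. Unsigned subtraction keeps the elapsed time correct
// even if the microsecond counter wrapped around between the two readings.
uint32_t frameSlackUs(uint32_t frameStartUs, uint32_t drawDoneUs) {
  uint32_t elapsed = drawDoneUs - frameStartUs;
  return elapsed >= kFramePeriodUs ? 0 : kFramePeriodUs - elapsed;
}

// The same slack expressed as a whole percentage of the frame period.
uint32_t frameSlackPercent(uint32_t frameStartUs, uint32_t drawDoneUs) {
  return frameSlackUs(frameStartUs, drawDoneUs) * 100 / kFramePeriodUs;
}
```

In an Arduino sketch, the two timestamps could plausibly come from micros(), read once right after waitForFrame() returns and once after the last drawing call.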

After performance metrics are online, that data can be used to inform the next phase: performance optimizations. The only performance override I’ve done over the default Adafruit GFX library was fillScreen() since all the examples call that immediately after waitForFrame() to clear the buffer. There are many more candidates to override, but we won’t know how much benefit they give unless we have performance metrics online.

The final item on this initial list of issues is support for double- or triple-buffering. I don’t know if I’ll ever get to it, but I wrote it down because it’s such a common thing to want in a graphics stack. This is a rather advanced usage and it consumes a lot of memory. At 61KB per buffer, the ESP32 can’t really afford many of them. At the very least this needs to come after the implementation of user-allocated buffers. It’s going to be a game of Tetris to find enough memory in between developer code to create all these buffers, and developers know best how they want to structure their application.
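The memory cost is easy to tally. The sketch below assumes a 256 x 240 frame at one byte per pixel, which works out to 61,440 bytes and lines up with the per-buffer figure above; the constant and function names are made up for this example.

```cpp
#include <cstddef>

// Assumed frame dimensions and pixel depth (8-bit color).
constexpr std::size_t kWidth = 256;
constexpr std::size_t kHeight = 240;
constexpr std::size_t kBytesPerPixel = 1;

// Total bytes needed for the given number of full frame buffers.
constexpr std::size_t frameBufferBytes(std::size_t bufferCount) {
  return kWidth * kHeight * kBytesPerPixel * bufferCount;
}

static_assert(frameBufferBytes(1) == 61440, "one buffer is 61,440 bytes");
```

Under those assumptions, double buffering needs 122,880 bytes and triple buffering needs 184,320 bytes, before the application has allocated anything of its own.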

I thought I had covered all the bases and was feeling pretty good about things… but I had a blind spot that Emily Velasco spotted immediately.