See World(s) Online

One of the longest-tenured items on my “To-Do” list of explorations is to get the hang of the Google Earth API and learn how to create a web app around it. It was an exciting web technology back when Google seemed to be moving Google Earth from a standalone application to a web-based solution. Unfortunately, its web architecture was based on browser plug-ins, which eventually led to its death.

It made sense for Google Earth functionality to be folded into Google Maps, but that seemed to be a slow process of assimilation. It never occurred to me that there were other alternatives out there until I stumbled across a talk about NASA’s World Wind project. (A hands-on talk, too, with a sample project to play with.) The “Web World Wind” component of the project is a WebGL library for geospatial applications, which makes me excited about its potential for fun projects.

The Java edition of World Wind has (or at least used to have) functionality beyond our planet Earth. There were ways to have it display data sets from our moon or from Mars. Sadly, the web edition has yet to pick up that functionality.

JPL does currently expose a lot of Mars information in a web-browser-accessible form on the Mars Trek site. According to the speaker of the talk, it was not built on World Wind; he believes it was built on Cesium, another WebGL library for global data visualization.

I thought there was only Google Earth, and now I know there are at least two other alternatives. Happiness.

The speaker of the talk is currently working in the JPL Ops Lab on the OnSight project, helping planetary scientists collaborate on Mars research using Microsoft’s Hololens for virtual presence on Mars. That sounds like an awesome job.

Fusion 360 vs. Onshape: Raspberry Pi

And now for something completely silly: let’s look at how our two competing hobbyist-friendly CAD offerings fare on the hobbyist-friendly single-board computer, the Raspberry Pi.

(Spoiler: both failed.)

Raspberry Pi

I have on hand a Raspberry Pi 3 Model B, featuring a far more powerful CPU than the original Pi, which finally made the platform usable for basic computing tasks.

When the Raspberry Pi Foundation updated its Raspbian operating system with PIXEL, it switched the default web browser from Epiphany to Chromium, the open-source project underlying Google’s Chrome browser. Bringing in a mainstream HTML engine resulted in far superior compatibility with a wider range of web sites and support for many of the latest web standards, including WebGL, which is what we’ll be playing with today.

Autodesk Fusion 360

Fusion 360 is a native desktop application compiled for Windows and macOS, so we obviously couldn’t run that on the Pi. However, there is a web component: Fusion 360 projects can be shared on the Autodesk 360 collaboration service. From there, the CAD model can be viewed in a web browser via WebGL on platforms other than Windows and macOS.

While such files can be viewed on a desktop machine running Ubuntu and Chromium, a Raspberry Pi 3 running Chromium is not up to the task. Only about half of the menu bar and navigation controls are rendered correctly, and in the area of the screen where the actual model data should be, we get only a few nonsensical rectangles.

Onshape

Before this experiment I had occasionally worked on my Onshape projects on my desktop running Ubuntu and Chromium, so I had thought the web-based Onshape would have an advantage in Raspberry Pi Chromium. It did, just not usefully so.

In contrast to A360’s partial menu rendering, all of Onshape’s menu UI elements rendered correctly. Unfortunately, the actual CAD model was absent in the Raspberry Pi Chromium environment as well: we got the “Loading…” circle, but it was never replaced by the CAD model.

Conclusion

Sorry, everyone, you can’t build a web-based CAD workstation with a $35 Raspberry Pi 3.

You can, however, use these WebGL sites as a stress test of the Raspberry Pi. I had three different ways of powering my Pi and this experiment proved enlightening.

  1. A Belkin-branded 12V to 5V USB power adapter: This one delivered good steady voltage at light load, but when the workload spiked to 100% the voltage dropped low enough for the Pi to brown out and reset.
  2. A cheap Harbor Freight 12V to 5V USB adapter: This one never delivered good voltage. Even at light load, the Pi would occasionally flash the low-voltage warning icon, though never dropping low enough to trigger a reboot. When the workload spiked to 100%, the voltage was still poor, but it also never dropped enough to trigger a reset. Hurray for consistent mediocrity!
  3. A wall-outlet AC to 5V DC power unit (specifically advertised to support the Raspberry Pi): This one worked as advertised – no low-voltage warnings and no resets.

Static Web Site Hosting with Amazon S3 and Route 53

Web application frameworks have the current spotlight, which is why I started learning Ruby on Rails to get an idea of what the fuss was about. But a big framework isn’t always the right tool for the job. Sometimes it’s just a set of static files to be served upon request, no server-side smarts necessary.

This was where I found myself when I wanted to put up a little web site to document my #rxbb8 project. I just wanted to document the design & build process, and I already had registered the domain rxbb8.com. The HTML content was simple enough to create directly in a text editor and styled with CSS from the Materialize library.

After I got a basic 1.0 version of my hand-crafted site, I uploaded the HTML (and associated images) to an Amazon S3 bucket. It takes only a few clicks to make files in an S3 bucket web-accessible via a long, cumbersome URL on an Amazon AWS domain: http://rxbb8.com.s3-website-us-west-2.amazonaws.com. Since I wanted this content to be accessible via the rxbb8.com domain I had already registered, I started reading up on the AWS service named, in geek-humor style, Route 53. (DNS service runs on port 53.)

Route 53 is designed to handle the challenges of huge web properties, distributing workload across many computers in many regions. (No single computer could handle all global traffic for, say, netflix.com.) The challenge for a novice like myself is figuring out how to pull the one tool I need out of this huge, complex Swiss army knife.

Fortunately this use case is popular enough for Amazon to have written a dedicated developer guide for it. Unfortunately, that page doesn’t have all the details. The writer helpfully points the reader to other reference articles, but those pages revert to talking about complex deployments, and again it takes effort to distill the simple basics out of the big feature list.

If you get distracted or lost, stay focused on this CliffsNotes version:

  1. Go into the Route 53 dashboard and create a Hosted Zone for the domain.
  2. In that Hosted Zone, AWS has created two record sets by default. One of them is the NS type; write down the name servers it lists.
  3. Go to your domain registrar and tell them to point the domain’s name servers to the AWS name servers listed in step 2.
  4. Create an S3 storage bucket for the site and enable static website hosting.
  5. Create a new Record Set in the Route 53 Hosted Zone. Set “Alias” to “Yes” and point the alias target to the S3 storage bucket from step 4.

Repeat steps 4 and 5 for each sub-domain that needs to be hosted. (The AWS documentation creates example.com and repeats steps 4–5 for www.example.com.)
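For anyone who prefers code over console clicks, here is a hedged sketch of step 5 using the aws-sdk Ruby gem. The two zone IDs are placeholders: one comes from the Hosted Zone created in step 1, and the other is the fixed zone ID Amazon publishes for each region’s S3 website endpoint.

```ruby
require 'aws-sdk'

# Hedged sketch of step 5: create an alias record pointing the apex
# domain at the S3 website endpoint. Zone IDs below are placeholders.
route53 = Aws::Route53::Client.new(region: 'us-west-2')
route53.change_resource_record_sets(
  hosted_zone_id: 'Z_MY_HOSTED_ZONE',   # from the Hosted Zone in step 1
  change_batch: {
    changes: [{
      action: 'CREATE',
      resource_record_set: {
        name: 'rxbb8.com.',
        type: 'A',
        alias_target: {                 # the console's "Alias: Yes"
          dns_name: 's3-website-us-west-2.amazonaws.com.',
          hosted_zone_id: 'Z_S3_WEBSITE_ENDPOINT',  # region-fixed, from AWS docs
          evaluate_target_health: false
        }
      }
    }]
  }
)
```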

And then… wait.

The update in step 3 needs time to propagate to name servers across the internet. My registrar said it may take up to 24 hours. In my case, I started getting intermittent results within 2 hours, but it took more than 12 hours before everything stabilized to the new settings.

But it was worth the effort to see version 1.0 of my created-from-scratch static web site up and running on my domain! And since it’s such a small and simple site with little traffic, it will cost me only a few pennies per month to host in this manner.


Let the App… Materialize!

After I got Google sign-in working well enough for my Rails practice web app, the first order of business was to build the basic skeleton. This was a great exercise in taking the pieces I learned in the Ruby on Rails Tutorial sample app and building something of my own design.

The initial pass implemented basic functionality but didn’t look very appealing. I had focused on the Rails server-side code and left the client side as simple plain HTML that would have been state-of-the-art in… maybe 1992?

Let’s make it look like something that belongs in 2017.

The Rails Tutorial sample app used Bootstrap to improve the appearance and functionality of the client-side interface. I decided to take this opportunity to learn something new instead of doing the same thing. Since I’m using Google Sign-In in this app, I decided to apply Google’s design concepts to my client-side appearance as well.

The web being the web, I knew I wouldn’t have to start from scratch. I knew about Google’s own Material Design Lite and thought it would be a good candidate, before I learned it had been retired in favor of its successor, Material Components for the web. One of the touted advantages of the successor was improved integration with different web platforms. Sadly, Rails was not among the platforms with ready-to-go examples.

I looked around for an existing project helping Rails apps adopt Google’s design language, and that’s when I found Materialize: a library that shares many usage patterns with Bootstrap. The style sheets are even written in Sass, native to default Rails apps, making for easy integration. Somebody has done that work and published it as the Ruby gem materialize-sass, so all I had to do was add a single line to use Materialize in my app.
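For the record, that single line goes in the Gemfile; per the gem’s README, an @import line in the application stylesheet then pulls in the styles.

```ruby
# Gemfile: pull in Materialize's Sass port. After `bundle install`,
# add `@import "materialize";` to app/assets/stylesheets/application.scss.
gem 'materialize-sass'
```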

Of course I still had to put in the effort to revise all the view files in my web app to pick up Materialize styling and features. That took a few days, and the reward is a practice web app that no longer looks so amateurish.

The Cost for Security

In the seemingly never-ending bad news of security breaches, a recurring theme is “they knew how to prevent this, but they didn’t.” The editorializing usually condemns the people involved as penny-pinching misers who care more about their operating costs than about their customers.

The accusations may or may not be true; it’s hard to tell without the other side of the story. What’s unarguably true is that security has a cost. Performing encryption obviously takes more work than not doing any! But how big is that cost? Reports range wildly, anywhere from less than 5% to over 50%, and it likely depends on the specifics of the situation.

I really had no idea of the cost until I stumbled across the topic in the course of my own Rails self-education project.

I had designed my Rails project with an eye towards security. The Google ID login token is validated against Google certificates, and the resulting ID is salted and hashed for storage. The code changes for this added security were deceptively minor, but they triggered huge amounts of work behind the scenes!

I started on this investigation because I noticed my Rails test suite ran quite slowly. Running the test suite for the Rails Tutorial sample app, the test framework chewed through ~120 assertions per second. My own project’s test suite ran at a snail’s pace of ~12 assertions/second, 10% of the speed. What was slowing things down so much? A few hours of experimentation and investigation pointed the finger at the encryption measures.

Obviously security is good for the production environment and should not be altered there. However, for development and test purposes, I could weaken it because there is no actual user data to protect. After I changed the code to bypass some work and reduce complexity elsewhere, my test suite speed rose to the expected >100 assertions/sec.
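My change was in the same spirit as what the Rails Tutorial sample app does for its password digests: drop the hash cost to the minimum outside of production. Hartl’s version looks like this:

```ruby
# From the Rails Tutorial sample app's User model: use BCrypt's minimum
# cost factor in test (where ActiveModel::SecurePassword.min_cost is set
# to true) and the full-strength default cost everywhere else.
def User.digest(string)
  cost = ActiveModel::SecurePassword.min_cost ? BCrypt::Engine::MIN_COST :
                                                BCrypt::Engine.cost
  BCrypt::Password.create(string, cost: cost)
end
```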

Granted, this is only an amateur at work, and I’m probably making other mistakes that do security inefficiently. But as a firsthand lesson that “Security Has A Cost,” finding a 1000% performance penalty is eye-opening.

For a small practice app like mine, where I expect only a handful of users, this is not a problem. But for a high-traffic site, having to pay ten times the cost could be the difference between making and breaking a business.

While I still don’t agree with the decisions that led up to security breaches, at least now I have a better idea of the other side of the story.

Protecting User Identity

Recently, web site security breaches have been a frequent topic of mainstream news. The technology is evolving, but this chapter of it has quite a ways to go yet. Learning web frameworks gives me an opportunity to understand the mechanics from a web site developer’s perspective.

For my project I decided to use the Google Identity Platform and let Google engineers take care of identification and authentication. By configuring the Google service to retrieve only basic information, my site never sees the vast majority of personally identifiable information: it never sees the password, name, e-mail address, etc.

All my site ever receives is a string, the Google ID, which it uses to identify a user account. Following the security adage of “what I don’t know, I can’t spill,” I thought this was a pretty good setup: the only thing I know is the Google ID, so I can’t spill anything else.

Which led to the next question: what’s the worst that can happen with the ID?

I had thought the ID was something Google generated specifically for my web site, or more precisely, for my site’s Client ID. I no longer believe so. A little experimentation (aided by a change in Client ID for the security issue previously documented) leads me to believe the Google ID may be global across all Google services.

This means that if a hacker manages to get a copy of my site’s database of Google IDs, they can cross-reference it against databases from other compromised web sites, potentially assembling a larger picture out of small pieces of info.

While I can’t stop using the Google ID (I have to have something to identify a user), I can make it more difficult for a hacker to cross-reference my database. I’ve just committed a change to hash the ID before it is stored in the database, salted with a value that is unique per deployed instance of the app.
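A minimal sketch of the idea; the names here are hypothetical, and my actual column and salt handling differ in detail:

```ruby
require 'digest'

# Hypothetical sketch: GOOGLE_ID_SALT is a per-deployment secret (for
# example, set via an environment variable). Only the digest is stored;
# the raw Google ID never touches the database.
def hashed_google_id(google_id)
  Digest::SHA256.hexdigest(ENV.fetch('GOOGLE_ID_SALT') + google_id)
end

# Lookup on sign-in: hash the incoming ID the same way, then search.
user = User.find_by(google_id_digest: hashed_google_id(incoming_id))
```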

Now for a hacker to penetrate the identity of my user, they must do all of the following:

  1. Obtain a copy of the database.
  2. Obtain the hashing salt used by the specific instance of the app which generated that database.
  3. Already have the user’s Google ID, since they won’t get the original ID out of my database of hashed values.

None of these are impossible, but together they demand a lot more effort than it would otherwise have taken.

I think this is a worthwhile addition.

Adventures in Server-Side Authentication

The latest chapter comes courtesy of the Google Identity Platform. For my next Rails app project, I decided to venture away from the password-based authentication engine outlined in the Hartl Ruby on Rails Tutorial sample app. I had seen the “Sign in with Google” button on several web sites (like Codecademy) and decided to see it from the other side: users of my next Rails project will sign in with Google!

The client-side code was straightforward, following the directions in the Google documentation. The HTML is literally copy-and-paste; the JavaScript needed some reworking to translate into CoffeeScript for the standard Rails asset pipeline, but it wasn’t terribly hard.

The server side was less straightforward.

I started with the guide Authenticate with a Backend Server, which had links to the Google API client library for (almost) all of the server-side technologies, including Ruby. The guide page itself included examples of using the client library to validate the ID token in Java, Node.js, PHP, and Python. The lack of a Ruby example would prove problematic, because each flavor of the client library seems to follow different conventions and use different names for the same functionality.

The Java library has a dedicated GoogleIdTokenVerifier class for the purpose. The Node.js library has a GoogleAuth.OAuth2 class with a verifyIdToken method. PHP has a Google_Client class with a verifyIdToken method. And to round out the set, the Python library has oauth2client.verify_id_token.

Different, but all in a similar vein of “verify,” “id,” and “token,” so I searched the Ruby Google API client library documentation for those keywords in a name. After a few fruitless hours, I concluded what I wanted wasn’t there.

Where to next? I went to the library’s GitHub page for clues. I had to wade through a lot of material irrelevant to the immediate task, because the large library covers the entire surface of Google services.

I thought I had hit the jackpot when I found reference to the Google Auth Library for Ruby. It’s intended to handle all authentication work for the big client library, with the target completion date of Q2 2015. (Hmm…) Surely it would be here!

It was not.

After too many wrong turns, I looked at Signet in detail. It has an OAuth2::Client class, which sounded very similar to the other libraries, but it has no “verify” method, so every time I saw a reference to Signet I decided to look elsewhere. Once I finally read into the details of Signet::OAuth2::Client, I figured out that it has a decoded_id_token method that can optionally verify the token.

So it had the verification feature but the keyword “verify” itself wasn’t in the name, throwing off my search multiple times.
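From what I pieced together, the server-side check takes roughly this shape. This is a hedged sketch: certificate fetching and error handling are elided, and GOOGLE_CLIENT_ID and google_certificate are placeholders.

```ruby
require 'signet/oauth_2/client'

# Hedged sketch: wrap the ID token posted by the sign-in button in a
# Signet client, then decode it. Supplying a public key (or a keyfinder
# block) is what makes decoded_id_token verify the signature; without
# one, it merely decodes the payload.
client = Signet::OAuth2::Client.new(client_id: GOOGLE_CLIENT_ID)
client.update_token!(id_token: params[:id_token])
payload = client.decoded_id_token(google_certificate)  # cert fetched from Google, elided
raise 'token not meant for this app' unless payload['aud'] == GOOGLE_CLIENT_ID
google_id = payload['sub']  # the stable Google ID used to identify the user
```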

Gah.

Nothing to do now but to take some deep breaths, clear out the pent-up frustration, and keep on working…

Simple Online Digital Photo Frame

The CarrierWave Playground project was created for experimentation with image upload. As intended, it helped me learn things about CarrierWave such as creating versions of the image scaled to different resolutions and extracting EXIF image metadata.

Obviously the image has to be displayed to prove that the upload was successful. I hadn’t intended to spend much time on the display side of things, but I started playing with the HTML and kept going. Logic was added for the browser to report its window size, so the optimal image could be sent and scaled to fit. I had a button to reload the page, and it was fairly simple to change it from “reload the current page” to “navigate to another page.” Adding a JavaScript timer to execute this navigation… and voila! I had myself a rudimentary digital photo frame web app that loads and displays images in sequence.

It’s fun but fairly crude. Brainstorming the possibilities, I imagine the following stages of sophistication:

Stage 1 – Basic: Where I am now, simple JavaScript that performs page navigation on a timer.

Problem: the page blinks and abruptly shifts as the new page is loaded. To avoid the abrupt shift, we have to eliminate the page switch.

Stage 2 – Add AJAX: Instead of a page navigation, perform an XMLHttpRequest to the server asking for information on the next image. Load that image in the background and, once complete, perform a smooth transition (fade out/fade in/etc.) from one image to the next.

Problem: Visual experience is at the mercy of the web browser, which probably has an address bar and other UI on screen. Also, the user’s screen will quickly go dark due to power saving features. To reliably solve both, I will need app-level access.

Stage 3 – Vendor-specific wrapper: Every OS platform has a way to allow web site authors an express lane into the app world. Microsoft offers the Windows App Studio. Apple has iOS web applications. Google has Android Web Apps.

Unknown: The JavaScript timer is a polling model; do we gain anything by moving to a server-push model? If so, that means…

Stage 4 – WebSocket: Photo updates are triggered by the server. Since I’m on Rails, the most obvious answer is to do this via WebSockets using Action Cable.
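Stage 4 in Action Cable terms might look something like this minimal sketch; the channel name and broadcast payload are hypothetical, not code from the actual project:

```ruby
# Hypothetical sketch of Stage 4: the server pushes each new photo over
# a WebSocket instead of the browser polling on a timer.
class PhotoFrameChannel < ApplicationCable::Channel
  def subscribed
    stream_from "photo_frame"
  end
end

# Elsewhere on the server, whenever it decides the next photo is due:
ActionCable.server.broadcast("photo_frame", url: next_image_url)
```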

Looking at the list, I think I can tackle Stage 2 pretty soon if not immediately.

Stages 3 and 4 are more advanced and I’ll hold off for later.

EXIF fun with CarrierWave uploader

To play with the CarrierWave uploader gem, I created a new Rails project just for experimentation. I had thought about doing this as part of the Hartl Rails Tutorial sample app but ultimately decided to keep things as bare-bones as I could.

When trying to understand a new system, a debugger is a developer’s best friend, and part of this exercise was getting my feet wet using a debugger to poke around a running Rails app. The Hartl Rails Tutorial text introduced the byebug gem but only minimally covered its usage. The official Rails Guides had more information, which helped me get going, as did various users’ write-ups of their own cheat sheets.

I decided to dig into image metadata. Basic information such as width and height was available as img[:width] and img[:height], but where were the others? With the help of byebug I found them in the img.data hash. Most of the photography-specific EXIF metadata is in there as well, though somebody interested only in that subset can access it via the img.exif hash.

As an exercise I decided to try to pull out the original date. Time stamps on digital photography files have always been a headache. Most computer operating systems track “Created Date” and “Modified Date,” but those are relative to the computer and not reliable for photo organization purposes. Photo editing introduces another twist: if an image is created from a photograph, the time stamp records the day of the edit operation, not when the original photo was taken.

Which is how I ended up looking at the EXIF “DateTime,” “DateTimeOriginal,” and “DateTimeDigitized” fields to take the earliest date of the three (when present). Then I ran into another can of worms: time zones. The EXIF time stamps have no time zone information, but the Ruby DateTime object type does. Now my time stamps are interpreted as UTC when they aren’t. Since EXIF doesn’t carry time zone data (that I can find), I decided to leave that problem to be tackled another day.
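Here is roughly what that earliest-date logic looks like, sketched with MiniMagick (the image processor CarrierWave commonly uses); which tags are present varies by camera:

```ruby
require 'mini_magick'
require 'date'

image = MiniMagick::Image.open('photo.jpg')

# Collect whichever of the three EXIF date tags are present...
tags = %w[DateTime DateTimeOriginal DateTimeDigitized]
stamps = tags.map { |tag| image.exif[tag] }.compact

# ...and take the earliest. EXIF stores "YYYY:MM:DD HH:MM:SS" with no
# time zone, so DateTime ends up assuming UTC whether that's true or not.
taken_at = stamps.map { |s| DateTime.strptime(s, '%Y:%m:%d %H:%M:%S') }.min
```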

Behavior Driven Development

My new concept of the day: Behavior Driven Development. As this beginner understands the concept, the ideal is that the plain-English customer demands on the software are formalized just enough to become part of automated testing. In hindsight, it’s a perfectly logical extension of Test-Driven Development, which began by treating QA demands on software as the horse instead of the cart. I think BDD can be a pretty fantastic concept, but I haven’t seen enough to decide if I like the current state of the art in execution.

I stumbled into this entirely by accident. As a follow-up to the Rails Tutorial project, I took a closer look at one corner of the sample app. The image upload feature used a gem called CarrierWave to do most of the work. In the context of the tutorial, CarrierWave was a magic black box, pulled in and used without much explanation. I wanted to better understand the features (and limitations) of CarrierWave for use (or not) in my own projects.

As is typical of open-source projects, the documentation that exists is relatively thin and occasionally backed by the disclaimer “for more details, see source code.” I prefer better documentation up front but I thought: whatever, I’m a programmer, I can handle code spelunking. It should be a good exercise anyway.

Since I was exploring, I decided to poke my head into the first (alphabetically sorted) directory: /features/. And I was immediately puzzled by the files I read. The language is too formal to be conversational English for human beings, but too informal to be any programming language I knew. Some amount of Google-assisted research led me to the web site for Cucumber, the BDD tool used by the developers of CarrierWave.
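For illustration (not CarrierWave’s actual feature files): each of those formal-but-not-quite-English lines lives in a .feature file, and Cucumber maps it onto a Ruby “step definition” by regular expression. A hypothetical pair:

```ruby
# Hypothetical step definition: Cucumber matches a feature-file line like
#   When I upload the file "portrait.jpg"
# against this regex and runs the block (the helpers here are Capybara's).
When(/^I upload the file "([^"]*)"$/) do |filename|
  attach_file('file', File.join('features', 'fixtures', filename))
  click_button('Upload')
end
```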

That journey was fun, illuminating, and I haven’t even learned anything about CarrierWave itself yet!

Dipping toes in AWS via Rails Tutorial Sample App

Amazon Web Services is a big, big ball of yarn. For somebody just getting started, the list of AWS products is quite intimidating. That’s not any fault of Amazon; it’s just the nature of building out such a comprehensive system. Fortunately, Amazon is not blind to the fact that people can get overwhelmed, and has put admirable effort into a gentle introduction via its Getting Started resources: a series of (relatively) simple guided tours through select parts of the AWS domain.

At the end of it all, though, a developer has to roll up their sleeves and dive in. The question then is: where? In previous times, I couldn’t make up my mind and got stuck. This time around, I have a starting point: Michael Hartl’s Ruby on Rails Tutorial sample app, which wants to store images on Amazon S3 (Simple Storage Service).

Let’s make it happen.

One option is to blaze the simplest, most direct path to get rolling, but I resisted. The example I found on stackoverflow.com granted the Rails app full access to storage using my AWS root credentials. Functionally speaking that would work, but it is a very bad idea from a security-practices standpoint.

So I took a detour through Amazon IAM (Identity and Access Management). I wanted to learn how to do a properly scoped security access scheme for the Rails sample app, rather than giving it the Golden Key to the entire kingdom. Unfortunately, since IAM is used to manage access to all AWS properties, it is a pretty big ball of yarn itself.

Eventually I found my on-ramp to AWS: a section in the S3 documentation that discussed access control via IAM. Since I control both the Rails app and my own AWS account, I was able to skip a lot of the cross-account management on this first learning pass, boiling things down to the basics of implementing IAM best practices for my Rails app’s S3 access.

After a few bumps in the exploration effort, here’s what I ended up with.


Root account: This has access to everything, so we want to use it as little as possible to minimize the risk of compromise. Log in to this account just long enough to activate multi-factor authentication and create an IAM user account with “AdministratorAccess” privileges. Log out of the management console as root, then log back in under the new admin account to do everything else.

Admin account: This account is still very powerful so it is still worth protecting. But if it should be compromised, it can at least be shut down without closing the whole Amazon account. (If root is compromised, all bets are off.) Use this account to set up the remaining items.

Storage bucket: While logged in as the admin account, go to the S3 service dashboard and create a new storage bucket.

Access policy: Go to the IAM dashboard and create a new S3 access policy. The “resource” of the policy is the storage bucket we just created. The “actions” we allow are the minimum set needed for the Rails sample app, listed below, and no more.

  1. PutObject – allows the Rails app to upload the image file itself.
  2. PutObjectAcl – allows the Rails app to change the access permission on the image object, making the image publicly visible to the world. This is required for use in the src attribute of an HTML <img> tag in the Rails app.
  3. DeleteObject – when a micropost is deleted, the app needs this permission to delete the corresponding image as well.
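Written out, the policy document looks roughly like this; it is shown as the Ruby-hash equivalent of the JSON pasted into the IAM policy editor, and the bucket name is a placeholder:

```ruby
# Hedged sketch of the IAM policy document as a Ruby hash mirroring the
# JSON. "my-sample-app-bucket" is a placeholder bucket name.
policy = {
  "Version" => "2012-10-17",
  "Statement" => [
    {
      "Effect"   => "Allow",
      "Action"   => ["s3:PutObject", "s3:PutObjectAcl", "s3:DeleteObject"],
      "Resource" => "arn:aws:s3:::my-sample-app-bucket/*"
    }
  ]
}
```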

Access group: From the IAM dashboard, create a new access group. Under the group’s “Permissions” list, attach the access policy we just created. Now any account that is a member of the group has enough access to the storage bucket to run the Rails sample app.

User: From the IAM dashboard, create a new user account to be used by the Rails app. Add this new user to the access group we just created, so it is part of the group and can use the access policy we created. (And no more.)

This new user, granted only a low level of access, will not need a password since we’ll never log in to the Amazon management console with it. But we will need to generate an access key and secret key for the app.

Once all of the above is done, we have everything we need to give Heroku for the Rails Tutorial sample app: an S3 storage bucket name, plus the access key and secret key of the low-level user account we created to access that bucket.
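On the app side, the Rails Tutorial wires these three values into CarrierWave roughly like so, reading them from environment variables set via heroku config:set:

```ruby
# config/initializers/carrier_wave.rb (roughly per the Rails Tutorial):
# in production, uploads go to S3 using the IAM user's keys from above.
if Rails.env.production?
  CarrierWave.configure do |config|
    config.fog_credentials = {
      provider:              'AWS',
      aws_access_key_id:     ENV['S3_ACCESS_KEY'],
      aws_secret_access_key: ENV['S3_SECRET_KEY']
    }
    config.fog_directory = ENV['S3_BUCKET']
  end
end
```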


While this is far more complex than the stackoverflow.com answer, it is more secure. Plus, it was a good exercise in the major bits and pieces of an AWS access control system.

The above steps ensure that, should the Rails sample app be compromised in any way, the hacker has only the permissions we granted to the app and no more. The hacker could put new images in the S3 bucket and make them visible, or delete existing images, but they couldn’t do anything else in that bucket.

And most importantly, the hacker has no access to any other part of my AWS account.

Rails Tutorial (Take 2)

With all the fun and excitement around 3D printing, I’ve let my Ruby on Rails education lapse. I want to dive back in, but it’s been long enough that I felt I needed a review. Also, during my time away, the Ruby on Rails team released version 5, and Michael Hartl’s Ruby on Rails Tutorial was updated accordingly.

Independent of the Rails 5 updates, it was well worth my time to go through the book again. On the second run, I understood some things that didn’t make sense before. It was also good to look at the first do-nothing “hello app” and the second automated-scaffold “toy app” with a little more Rails knowledge under my belt. The book is structured so the beginner doesn’t have to understand the mechanics of the hello or toy apps, but readers with a bit of understanding will get something out of them.

The release notes for the update mentioned that a few sections were rearranged for better pacing and structure, and that more exercises were added for readers to check their progress. Both are incremental improvements that I appreciated, but neither was especially earth-shattering.

Action Cable, one of the big signature features of Rails 5, was not rolled into the book. Hartl is handling that in a separate tutorial, Learn Enough Action Cable to Be Dangerous, which I will go through at some point in the near future.

Towards the end of the book, Hartl introduced an optional advanced concept: using Amazon Web Services to store user image uploads. I skipped that section the first time through, and decided to dive into it this time.

I quickly found myself in a deep rabbit hole. Amazon Web Services has many moving parts designed for a wide range of audiences and it’s a challenge to get started without being overwhelmed.

Which is where I am now. Lots more exploration ahead!

“Ruby on Rails Tutorial” notes

Towards the end of the Getting Started with Rails guide, there was a link to the Ruby on Rails Tutorial. I had missed it the first time I went through the Getting Started guide, but when I reviewed the guide a second time (things made a lot more sense) I noticed it and decided to check it out.

I’m very glad that I did!

For the most part, this book (available as a paper edition, but the author has made it freely readable on its web site) was exactly what I had hoped to find for Ruby on Rails. The book goes through the process of making a Rails app three times, in exactly the progression I wanted:

First pass: A simple “hello world” that does nothing much, but lets the student experience all the standard overhead around creating and deploying a Rails app. It’s barely a Rails lesson at all: it’s really a lesson on the surrounding infrastructure before digging into the meat of learning Rails.

Second pass: A simple “toy app” that does a lot… but the student is rushed through without understanding what’s going on behind the scenes. It’s really a demonstration of what’s possible via Rails helpers and scaffolding, and less about learning Rails. It reminds me of the Codecademy Ruby on Rails “class”: a lot of incomprehensible fancy flash. However, unlike Codecademy, which left me asking “now what?”, the Rails Tutorial follows up.

Third pass: Over 80% of the book is spent building the “sample app.” Starting from a site that serves a few static pages, it grows, bit by bit, into a mini clone of Twitter. It covers creation and maintenance of user accounts, authentication, posting 140-character messages, and following and un-following users: all the core bits we associate with Twitter. (Which, the author asserts, was also originally built on Ruby on Rails.)

The Codecademy Rails class left me confused and feeling like I’ve wasted my time. (About a day.)

The Rails Guides left me with a basic level of comprehension of what goes on in a Rails app. My time (about half a week) was well spent, but I didn’t feel like I could build anything on my own yet.

After the Ruby on Rails Tutorial (which took me over a week) I feel like I have all the basic tools I need to start playing in the Rails playground. I feel like I have an idea how to get started, how to experiment, understand what the experiment does behind the scenes, and debug whatever goes wrong while I experiment.

And with that, it’s time to practice using my new tools: Let’s build something!

Cache is King


C is an old familiar friend, so it is not part of my “new toolbox” push, but I went back to it for a bit of a refresher, for old times’ sake. The exercise is also an old friend: solving the 15-puzzle. The sliding tile puzzle is a problem space I studied a lot in college while looking for interesting things around heuristic search.

For nostalgia’s sake, I rewrote a textbook puzzle solver in C using the iterative-deepening A* (IDA*) algorithm with the Manhattan Distance heuristic. It rubbed off some rust and also let me see how much faster modern computers are. It used to be that most puzzles would take minutes and the worst case would take over a week. Now most puzzles are solved in seconds, and the worst case topped out at “merely” a few tens of hours.
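As a refresher, the Manhattan Distance heuristic fits in a few lines. Here is a sketch in Ruby for readability (my actual solver is in C):

```ruby
# Manhattan Distance for the 15-puzzle: each tile's horizontal plus
# vertical distance from its goal cell, summed over all tiles.
# board is a 16-element array in row-major order, 0 for the blank.
def manhattan_distance(board)
  board.each_with_index.sum do |tile, index|
    next 0 if tile.zero?            # the blank doesn't count
    goal = tile - 1                 # tile 1 belongs at index 0, and so on
    (index % 4 - goal % 4).abs + (index / 4 - goal / 4).abs
  end
end
```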

Looking to further improve performance, I looked online for advances in heuristics research since the time I graduated, and found several. I decided to implement one of them, “Walking Distance,” devised and named by Ken’ichiro Takahashi.

From the perspective of algorithmic effectiveness, Walking Distance is a tremendous improvement over Manhattan Distance. It is a far more accurate estimate of solution length. Solving the sliding tile puzzle with the Walking Distance eliminated over 90% of duplicated work within IDA*.

On paper, then, Walking Distance should be many orders of magnitude faster… but my implementation was not. Surprised, I dug into what was going on, and I think I know the answer: CPU cache. The Manhattan Distance code and lookup table would easily fit within the 256KB L2 cache of my Intel processor. (They might even fit in L1.) The Walking Distance data structures would not, spilling into the much slower L3 cache, or possibly even main memory. It also takes more logical operations to perform a table lookup with Walking Distance, but I believe that matters less than where the lookup tables live.

In any case: with my implementation and running on my computer, it takes about 225 processor cycles to examine a node with Manhattan Distance. In contrast, a Walking Distance node averages over 81 thousand cycles. That’s 363 times longer!

Fortunately, the author was not blind to this. While building the Walking Distance lookup table, Takahashi also built a table that tracks how one lookup state transitions to another in response to a tile move. This means the full Walking Distance calculation is performed only at startup; after that, updates are very fast via the transition link table, effectively a cache of the Walking Distance computation.

Takahashi also incorporated the Inversion Distance heuristic as a supplement: sometimes the inversion count is higher than the walking distance, and we can use whichever is higher. Like Walking Distance, it comes with a set of optimizations so updates are faster than a full recalculation.

Lastly, I realized that I had neglected to compile with the most aggressive optimization settings. With them, the Manhattan Distance implementation dropped from ~225 cycles down to ~75 cycles per node.

Walking Distance improved much more drastically. By looking up the transition table cache, the per-node average dropped from 81 thousand cycles to ~207. With fully optimized code, that dropped further to ~52 cycles per node. Fewer cycles per node, combined with having to explore fewer than 10% of the nodes, makes Walking Distance a huge winner over Manhattan Distance. One test case that took tens of hours with Manhattan Distance takes tens of minutes with Walking Distance.

That was a fun exercise in low level C programming, a good change of pace from the high-level web tools.

For the curious, the code for this exercise is up on GitHub, under the /C/ subdirectory.

Codecademy “Learn Sass” notes

While learning Ruby on Rails, one of the things I put on my “look into this later” list was Sass. I knew it was related to CSS but didn’t know the details; I had just noticed that when the Rails generator created a controller, it created a .scss file under the stylesheets directory.

So when I received an email from Codecademy notifying me that they had a new class on Sass, the “look into this later” became “let’s look into it now.”

Unlike Ruby on Rails, Sass is not a huge, complicated system. It solves a fairly specific set of problems: CSS typically grows unwieldy as a project grows, and Sass introduces some very nice concepts to keep CSS information organized. After banging my head on lots of walls with Ruby on Rails, it was refreshing to tackle a smaller-scope project and be able to understand what’s going on. The Codecademy format is well suited to teaching a smaller-scoped concept like Sass.

I was also mildly amused to learn that Sass is apparently written in Ruby. I don’t think the implementation language particularly matters, but it’s amusing to see Ruby applied in an entirely different way from Rails. The bonus is that, should I try to debug or extend Sass myself, I wouldn’t be starting from scratch when looking over its source code.

Being a fresh course, the Codecademy class had a few minor problems that still need to be ironed out. The flashcard example was supposed to flip on mouse hover… it never did anything for me. Too bad, because I think the effect would have been interesting.

I haven’t gotten far enough with Rails to think about making my web app pretty, but when I do, I know how to keep my style sheets manageable with Sass.

Codecademy PHP notes


I just zipped through the PHP course in the Codecademy “Language Skills” section.

As is typical of beginner-friendly Codecademy language skill courses, it starts from the beginning with variables, flow control, etc. If the student is already aware of such basic concepts, this can get very tiresome.

This PHP class flew through the basic topics, which is great for experienced programmers already familiar with the concepts who just want to see how PHP differs from other languages. The downside is that a true beginner wouldn’t receive much help and would probably be lost.

I thought the class would then get into more difficult topics, or areas specific to PHP and web development, and… it didn’t. After covering the basics of declaring classes and class inheritance, the class stops. Even though the introduction mentions that PHP web apps can contact databases and other server-side resources, this class doesn’t cover any of the meat and potatoes of writing web applications with PHP.

Overall, the class is a superficial skim of the surface of PHP. Good for somebody who just wants a quick view of PHP basics, and I appreciated it for that, but bad for somebody who actually wants to figure out how to do useful things with PHP.

I’ll have to look elsewhere for that.

Loopiness

Unrelated to the skimpy curriculum is a problem with the interactive learning environment. In other Codecademy courses, the student writes the code and presses a button for it to be executed and evaluated. In this class, evaluation happens constantly, even before the button is pressed. The upside is that feedback on errors is instant, with no waiting until the end. The downside is that it’s easy to make it go into an infinite loop.

Example: if this was the goal:

$abc = 1;
while ($abc < 10)

then just before we type “< 10”, the state would be:

$abc = 1;
while ($abc)

This is an infinite loop that sends the evaluation code spinning uselessly, so when the student actually presses the button later, the gear just spins.

Workaround: copy the newly typed code to the clipboard, refresh the browser to reset and reload the page, then paste the code back in.

Minor Derailment Due To Infrastructure

One of the reasons I put my Node.js education on hold and started with Ruby on Rails is my existing account at Dreamhost, whose least expensive shared hosting plan does not support Node.js applications. It does support Ruby on Rails, PHP, and a few others, so I started learning Ruby on Rails instead.

The officially supported version of Ruby (and the associated Ruby on Rails) is very old, but their customer support wiki assured me it could be updated via RVM. However, it wasn’t until I had paid money and gotten into the control panel that I learned RVM is not supported on their shared hosting plan.

RVM Requires VPS

At this point I feel like the victim of a bait-and-switch…

So if I want to work with a non-ancient version of Ruby on Rails (and I do), I must upgrade to a different plan. Their dedicated server option is out of the question due to expense, so it’s a choice between their managed Virtual Private Server option and a raw virtual machine via DreamCompute.

In either case, I needn’t have paused my study of Node.js, because it would work on these more expensive plans. Still, Ruby is a much more pleasant language than JavaScript, and Rails is a much better-integrated stack than the free-wheeling Node.js, so it wasn’t all loss.

Before I plunk down more money, though, I think I should look into PHP. It was one of the alternatives to Ruby when I learned Node.js wasn’t supported on Dreamhost shared hosting, and it is the server-side technology available on Dreamhost shared hosting, fully managed and kept up to date. Or at least I think it is! Maybe I’ll learn differently as I get into it… again.

Dreamhost offers a 97-day satisfaction guarantee. I can probably use that to get off of shared hosting and move on to a VPS. It’s also a chance to find out if their customer service department is any good.

UPDATE 1: Dreamhost allowed me to cancel my hosting plan and refunded my money, zero fuss. Two clicks on the web control panel (plus two more to confirm) and the refund was done. This is pretty fantastic.

UPDATE 2: I found Heroku, a PaaS that caters to developers working in Rails and related web technologies. (It started with Ruby on Rails and expanded from there.) For trial and experimentation purposes, there is a free tier of Heroku I can use, and I shall.

RailsGuides on Active Record


There’s not a whole lot to say about the individual courses, so I’m going to cram a few into a single post here.

  • Active Record Basics: A good overview of Active Record, expanding upon the concepts introduced in the Getting Started guide. To a Rails novice, the conventions around naming and schema are fascinating (see the sketch after this list). The goal is clearly “do what the user meant,” and the magic sure looks impressive up front. But I know from experience that when the magic fails and the user has to deal with the raw guts, it can be unpleasant.
  • Active Record Migrations: This is where I started feeling like I could use more guidance. It feels like the intended audience is already well versed in database concepts… which I am not.
  • Active Record Validations: This one feels like a fairly straightforward view of performing data checks in Ruby code instead of SQL. It even goes into some detail on when you’d want to validate in Ruby versus directly in the database. (Not that I necessarily understood all of it…)
  • Active Record Callbacks: Life cycle of an Active Record and when/where I can place some code of my own to look at the latest data. I don’t think this will really sink in until I see a situation where I need it.
  • Active Record Associations: I started getting in over my head here. I think I would have done better if I knew more about designing database schemas, but I do not. As a result, the associations feel like a toolbox full of tools I don’t yet know how to apply to problems.
  • Active Record Query Interface: If somebody can think in SQL, this section explains how Rails features map to SQL queries behind the scenes and vice versa. Sadly, I’m too rusty in SQL for this section to be particularly enlightening.
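The sketch promised above: a hypothetical model showing what the naming conventions buy. No mapping code appears anywhere; the class name alone binds it to a table.

```ruby
# Convention over configuration: a class named User automatically maps
# to a "users" table, and its columns become attributes. One line each
# declares an association and a Ruby-side validation.
class User < ApplicationRecord
  has_many :microposts            # expects a "microposts" table with a user_id column
  validates :name, presence: true
end
```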

For me, the general theme is: good, except for the parts that expect me to have more SQL database knowledge than I actually do. I plan to come back and review this information later, but between now and then, it’d be a good idea to brush up on my SQL.

RailsGuides “Testing Rails Applications” notes


In professional software development, the adage is “if it hasn’t been tested, it doesn’t work.” It’s easy to write some code that I believe does what I want; it might even pass a few smoke tests. But if it hasn’t been examined with some rigor, it is worthless. Time and time again, test suites have proven their value by exposing problems in my code. With their help I knew what I needed to fix to make things better.

Yet for all this importance, testing techniques and methodologies are rarely covered – or even mentioned – in most curricula on software development. This is why I was pleasantly surprised to find the section Testing Rails Applications on the list of RailsGuides.

I’m still too much of a Rails beginner to understand all of the information in the lesson, so I would definitely have to come back later, but it is enough for me to get a rough idea of the capabilities of the built-in testing framework.

I was most amused to see that the built-in generator scripts for Rails models and controllers create the test scripts at the same time they create the code – a not-so-subtle reminder for the developer to go and write some tests. At that point, the source code files and the test files are both sitting empty and ready to go. We could start writing the source code – but we could just as easily start writing the test files first.

Write the test, see it fail, write the code, see it pass.
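In Rails’ built-in Minitest framework, that cycle looks something like this hypothetical model test, which fails until the User model gains a presence validation on name:

```ruby
require 'test_helper'

class UserTest < ActiveSupport::TestCase
  # Written first, this fails; adding `validates :name, presence: true`
  # to the User model makes it pass.
  test "name should be present" do
    user = User.new(name: "   ")
    assert_not user.valid?
  end
end
```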

At least, that’s the theory of test-driven development. I don’t know if I’ll approach Rails development in this manner, but I do appreciate how Rails removes so many of the barriers to entry.

RailsGuides “Securing Rails Applications” notes


As I enter the world of web development, I’m very wary of the security pitfalls of the domain. It is a completely different set of security threats than what I used to worry about: different attack vectors that need to be defended against or mitigated with different techniques.

I had known about the Open Web Application Security Project (OWASP), but I can’t say I’ve reached proficiency in all the concepts there. At this point I can articulate most of the items on the OWASP Top 10 list, and if I come across a description of an attack using one of these top-ten techniques, I can follow along.

But that is far, far short of the skill level where I could spot the security issues ahead of time. That is yet to come. With expectations set suitably low, I went through the Securing Rails Applications RailsGuide to see how much I could understand. And surprisingly… quite a lot! Whoever wrote this guide was able to explain many of these issues in a way I can understand. I really appreciate the effort they put into breaking the problems down.

With every exploit, the guide also describes mitigations for Rails applications. Not knowing a whole lot about Rails just yet, I found most of the mitigation descriptions made little sense. So while I understood more than I expected to, I’ll still have to come back again.

Right now I have no confidence that I could tell a secure Rails app from an insecure one. There are a whole lot of very clever people trying to find ways to break security boundaries. Whenever I read about an exploit, I usually end up shaking my head: “I never would have thought to try that.”

Since there’s no way to guard against a problem you never thought of, I need to build up my web programming toolbox. I want to get from “I never would have thought to try that” to “ah, that’s a clever thing to try” to “no problem, I’m pretty sure I’ve taken care of it.”

One step at a time.