Notes on Codecademy “Build Deep Learning Models with TensorFlow”

Once I upgraded to a Codecademy Pro membership, I started taking courses from its Python catalog with the goal of building a foundation to understand deep learning neural networks. Aside from a few scenic detours, most of my course choices were intended to build upon each other to fulfill what I consider prerequisites for a Codecademy “Skill Path”: Build Deep Learning Models with TensorFlow

This was the first “Skill Path” I took, and I wasn’t quite sure what to expect, as Codecademy implied they are different from the courses I took before. But once I got into this “skill path”… it felt pretty much like another course. Just a longer one, with more sessions. It picked up where the “Learn the Basics of Machine Learning” course left off with perceptrons, and dove deeper into neural networks.

In contrast to earlier courses that taught various concepts by using them to solve regression problems, this course spent more time on classification problems. We are still using scikit-learn a lot, but as promised by the title we’re also using TensorFlow. Note the course work mostly stayed in the Keras subset of TensorFlow 2 API. Keras used to be a separate library for making it easier to work with TensorFlow version 1, but it has since been merged into TensorFlow version 2 as part of the big revamp between versions.

I want to call attention to an item linked as “additional resources” for the skill path: a book titled “Deep Learning with Python” by François Chollet. (Author, or at least one of the primary people, behind Keras.) Following various links associated with the title, I found that there’s since been a second edition and the first chapter of the book is available to read online for free! I loved reading this chapter, which managed to condense a lot of background on deep learning into a concise history of the field. If the rest of the book is as good as the first chapter, I will learn a lot. The only reason I haven’t bought the book (yet) is that, based on the index, the book doesn’t get into unsupervised reinforcement learning like the type I want to put into my robot projects.

Back to the Codecademy course…. err, skill path: we get a lot of hands-on exercises using Keras to build TensorFlow models and train them on data for various types of problems. This is great, but I felt there was a significant gap in the material. I appreciated learning that different loss functions and optimizers will be used for regression versus classification problems, and we put them to work in their respective domains. But we were merely told which function to use for each exercise; the course doesn’t go into why they were chosen for the problem. I had hoped that the Keras documentation Optimizers Overview page would describe relative strengths and weaknesses of each optimizer, but it was merely a list of optimizers by name. I feel like such a comparison chart must exist somewhere, but it’s not here.
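
To make the regression-versus-classification distinction a little more concrete, here is a minimal pure-Python sketch of two losses that are commonly paired with those problem types: mean squared error for regression and categorical cross-entropy for classification. The function names and sample numbers are my own illustration, not taken from the course, and Keras ships its own implementations of both losses.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average squared difference; a common loss for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Regression: outputs are plain numbers, so distance from the target is the penalty.
mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])

# Classification: outputs are probabilities, and the penalty depends on how much
# probability was assigned to the correct class (index 1 in this made-up example).
confident_right = categorical_cross_entropy([0, 1, 0], [0.05, 0.90, 0.05])
confident_wrong = categorical_cross_entropy([0, 1, 0], [0.90, 0.05, 0.05])
```

A confidently wrong classification is penalized far more heavily than a confidently right one, which is at least part of the intuition for why cross-entropy suits classification while squared error suits predicting a quantity.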

I didn’t quite finish this skill path. I lost motivation to finish the “Portfolio Project” portion of the skill path where we are directed to create a forest cover classification model. My motivation for deep learning lies in reinforcement learning, not classification or regression problems, so my attention has wandered elsewhere. At this point I believe I’ve exhausted all the immediately applicable resources on Codecademy, as there is no further deep learning material, nor is there anything on reinforcement learning. So I bid a grateful farewell to Codecademy for teaching me many important basics over the past few months and started looking elsewhere.

Notes on Codecademy Intermediate Python Courses

I thought Codecademy’s course “Getting Started Off Platform for Data Science” really deserved more focus than it did when I initially browsed the catalog, regretting that I saw it at the end of my perusal of beginner friendly Python courses. But life moves on. I started going through some intermediate courses with an eye on future studies in machine learning. Here are some notes:

  • Learn Recursion with Python I took purely for fun and curiosity with no expectation of applicability to modern machine learning. In school I learned recursion with Lisp, a language ideally suited for the task. Python wasn’t as good of a fit for the subject, but it was alright. Lisp was also the darling of artificial intelligence research for a while, but I guess the focus has since evolved.
  • Learn Data Visualization with Python gave me more depth on two popular Python graphing libraries: Matplotlib and Seaborn. These are both libraries with lots of functionality so “more depth” is still only a brief overview. Still, I anticipate skills here to be useful in the future and not just in machine learning adventures.
  • Learn Statistics with NumPy was expected to be a direct follow-up to the beginner-friendly Statistics with Python course, but it was not a direct sequel and there’s more overlap than I thought there’d be. This course is shorter, with less coverage of statistics but more about NumPy. Going in, I had parsed the course title as “(Learn Statistics) with NumPy”, but after taking the course I think it’s more accurate to read it as “Learn (Statistics with NumPy)”.
  • Linear Regression in Python is a small but important step up the foothills on the way to climbing the mountain of machine learning. Finding the best line to fit a set of data teaches important concepts like loss functions. And doing it on a 2D plot of points gives us an intuitive grasp of what the process looks like before we start adding variables and increasing the number of dimensions involved. Many concepts are described and we get exercises using the scikit-learn library which implements those algorithms.
  • Learn the Basics of Machine Learning was the obvious follow-up, diving deeper into machine learning fundamentals. All of my old friends are here: Pandas, NumPy, scikit-learn, and more. It’s a huge party of Python libraries! I see this course as a survey of major themes in machine learning, of which neural networks was only a part. It describes a broader context which I believe is a good thing to have in the back of my head. I hope it helps me avoid the trap of trying to use neural nets to solve everything a.k.a. “When I get a shiny new hammer everything looks like a nail”.
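
The line-fitting idea from the linear regression course can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the sample points are made up, the helper names are hypothetical, and it uses the textbook closed-form least-squares solution rather than anything specific to the course. In the course exercises, scikit-learn does this work for us.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = m*x + b to 2D points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def squared_loss(xs, ys, m, b):
    """Loss function: total squared vertical distance from points to the line."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Made-up points that roughly follow a straight line.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

m, b = fit_line(xs, ys)
best = squared_loss(xs, ys, m, b)
# Nudging the slope away from the fitted value can only increase the loss.
worse = squared_loss(xs, ys, m + 0.5, b)
```

The "best" line is simply the one with the smallest loss, and that framing carries over once more variables and dimensions are involved.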

Several months after I started reorienting myself with Python 3, I felt like I had the foundation I needed to start digging into the current state of the art of deep learning research. I have no illusions about being able to contribute anything, I’m just trying to learn enough to apply what I can read in papers. My next step is to learn to build a deep learning model.

Notes on Codecademy “Getting Started Off Platform for Data Science”

I like Codecademy’s format of having a bit of information that is followed immediately by an opportunity to try it myself. I like learn-by-doing as a beginner, even if the teaching/learning environment can be limited at times. But one thing that I didn’t like was the fact that if I am to put my Python knowledge to use, I would have to venture outside of the learning environment, and Codecademy didn’t use to provide information on how.

The Learn Python 3 course made an effort to help students work outside of the Codecademy environment with “Off-Platform Projects”. These came in the form of Jupyter notebooks that I could download, and a page with some instructions on how to use them: a link to Codecademy’s command line course, a link to instructions for installing Python on my own computer, and a link on installing Jupyter notebooks. It’s a bit scattered.

What I didn’t know at the time was that Codecademy had already assembled an entire course covering these points. Getting Started Off Platform for Data Science is an orientation for everyone as we eventually venture off Codecademy’s learning platform. It starts with an introduction to the command line, then Python development tools like Jupyter Notebooks and other IDEs, wrapping up with an introduction to GitHub. This is great! Why didn’t they put more emphasis on this earlier? I think it would have been super helpful to beginners.

Though admittedly, I didn’t follow those installation instructions anyway. Python isn’t very good about library version management and the community has sidestepped the issue by using virtual environments to keep Python libraries separated in different per-project worlds. I’ve used venv and Anaconda to do this, and recently I’ve also started playing with Docker containers. For my own trip through Codecademy’s off-platform projects using Jupyter notebooks, I ran Jupyter Lab using their jupyter/datascience-notebook Docker image. That turned out to be sheer overkill and I probably could have just used the much lighter-weight jupyter/base-notebook image.
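
For the venv option mentioned above, a per-project environment can also be created from Python itself using the standard library's `venv` module. This is a minimal sketch, not a recommended workflow: the `.venv` directory name is just a common convention, the helper function is hypothetical, and it writes into a temporary directory so that it leaves nothing behind.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir):
    """Create an isolated virtual environment at <project_dir>/.venv."""
    env_dir = Path(project_dir) / ".venv"
    # with_pip=False keeps this sketch quick by skipping the pip bootstrap;
    # a real project would usually want with_pip=True.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Demonstrate in a throwaway directory (an assumption made for this sketch).
with tempfile.TemporaryDirectory() as scratch:
    env_dir = create_project_env(scratch)
    # pyvenv.cfg is the marker file that identifies a virtual environment.
    has_config = (env_dir / "pyvenv.cfg").exists()
```

Libraries installed into one such environment stay out of every other project's environment, which is the "different per-project worlds" idea.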

In hindsight I think it would have been useful for me to review Getting Started Off Platform for Data Science before I started reorienting myself with Python. I wouldn’t have followed it by the letter, but it had information that would have been useful beforehand. But as fate had it, it became the final course I took in the beginner-friendly section before I started trying intermediate courses.

Codecademy Beginner Friendly Python Fields

Once Codecademy got me reoriented with the Python programming language, I looked at some of their other beginner-friendly courses under the Python umbrella. I wanted to get some practice using Python, but I didn’t want to go through exercises for the sake of exercises. I wanted to make some effort at keeping things focused on my ultimate goal of learning about modern advances in machine learning.

  1. Learn Data Analysis with Pandas was my first choice, because I recognized “Pandas” as the name of a popular Python library for preparing data for machine learning, which made it relevant to the direction I am aiming for. The course title has “Data Analysis” and not “Machine Learning” but that was fine because it was only an introduction to the library. Not enough to get into field-specific knowledge, but more than enough to teach me Pandas vocabulary so I could navigate Pandas references and find my own answers in the future.
  2. How to Clean Data with Python followed up with more examples of Pandas in action. Again the course is nominally focused on data analytics, but all the same concepts apply to cleaning data before feeding it into machine learning algorithms.
  3. Exploratory Data Analysis in Python is a longer course with more ways to apply Pandas, including a machine learning specific section. Relative to other courses, this one is heavy on reading and light on hands-on practice, a consequence of the more general nature of the topic. And finally, this course let me dip my toes in another popular Python library I wanted to learn: NumPy.
  4. Learn Statistics with Python was how I dove into NumPy waters. After barely skating by some statistics and number crunching in the previous course, I wanted a refresher in basic statistics. Alongside the refresher I also learned how to calculate common statistics using the NumPy library. And after the statistics calculations are done, we want to visualize them! Enter yet another popular Python library: matplotlib.
  5. Probability is the natural course to follow a refresher in basic statistics. They cover only the most basic and common applications of statistics and probability for data analysis; we’re on our own to explore in further depth outside of the class. I anticipate probability to play a role in machine learning, as some answers are going to be vague with room for interpretation. I foresee a poor (or misleadingly confident) grasp of probability will lead me astray.
  6. Differential Calculus was a course I poked my head into. I remembered it was quite a complex subject in school and was surprised Codecademy claimed anyone could learn it in two hours. It turns out the course should be more accurately titled “an introduction to numpy.gradient()”. Which… yes, it is a numerical application of differential calculus but it is definitely not the entirety of differential calculus. I guess it follows the trend of these courses: overly simplified titles that skim the basics of a few things. Teach just enough for us to learn more on our own later.
  7. Linear Algebra starts to get into Python code that has direct relevance to machine learning. I know linear regression is a starting point and I knew I needed an introduction to linear algebra before I could grasp how linear regression algorithms work.
  8. Learn How to Get Started with Natural Language Processing was a disappointment to me, but it was not the fault of the course. It’s just that the machine learning systems in this field aren’t usually reinforcement learning systems, which is the subfield of machine learning that interests me most. At least the course was short, and taught me enough so I know to skip other Codecademy natural language courses for myself.
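
As an aside on the Differential Calculus item, the numerical differentiation idea behind numpy.gradient() can be sketched without NumPy. This is my own simplified illustration for evenly spaced samples, using central differences in the interior and one-sided differences at the two ends; the function name is hypothetical and this is not NumPy's actual implementation.

```python
def numerical_gradient(values, spacing=1.0):
    """Estimate the slope at each sample of an evenly spaced sequence."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to estimate a slope")
    grad = [0.0] * n
    # One-sided differences at the boundaries, where only one neighbor exists.
    grad[0] = (values[1] - values[0]) / spacing
    grad[-1] = (values[-1] - values[-2]) / spacing
    # Central differences in the interior, using the neighbors on both sides.
    for i in range(1, n - 1):
        grad[i] = (values[i + 1] - values[i - 1]) / (2 * spacing)
    return grad


# Sample y = x**2 at x = 0, 1, 2, 3, 4; the exact derivative is 2*x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]
slopes = numerical_gradient(ys, spacing=1.0)
```

For this parabola the interior estimates match the exact derivative, while the two end points are off because they only have one neighbor to work with.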

The final Codecademy “Beginner friendly” Python course I took was titled “Getting Started Off Platform for Data Science.” I don’t think Codecademy put enough emphasis on this one.

Getting Reacquainted with Python via Codecademy

A few years ago I started learning Python and applied that knowledge to write control software for SGVHAK rover. I haven’t done very much with Python since, and my skills have become rusty. Since Python is very popular in modern machine learning research, a field that I am interested in exploring, I knew I had to get back to studying Python eventually.

I remember that I enjoyed learning Python from Codecademy, so I returned to see what courses had been added since my visit years ago. The Codecademy Python catalog has definitely grown, and I was not surprised to see most of it is only accessible to the paid Pro tier. If I want to make a serious run at this, I’ll have to pay up. Fortunately, like a lot of digital content on the internet, it’s not terribly difficult to find discounts for Codecademy Pro. Armed with one of these discount venues, I upgraded to the Pro tier and got to work. Here are some notes on a few introductory courses:

  • Learn Python 2 was where I started before, because SGVHAK rover used RoboClaw motor controllers and their Python library at the time was not compatible with Python 3. I couldn’t finish the course earlier because it was a mix of free and Pro content, and I wasn’t a Codecademy Pro subscriber at the time. I’m not terribly interested in finishing this course now. Python 2 was officially history as of January 1st, 2020. The only reason I might revisit this course is if I tackle the challenge of working in an old Python 2 codebase.
  • Right now I’m more interested in the future, so for my refresher course I started with Learn Python 3. This course has no prerequisites and starts at the very beginning with printing Hello World to the console and building up from there. I found the progression reasonable with one glaring exception: At the end of the course there were some coding challenges, and the one regarding Python classes required students to create base classes and derived classes. Problem: class inheritance was never covered in course instructions! I don’t think they expected students to learn this on their own. It feels like an instruction chapter got moved to the intermediate course, but its corresponding exercise was left in place. Other than that, the class was pretty good.
  • Inheritance and other related concepts weren’t covered until the “Object-Oriented Programming” section of Learn Intermediate Python 3, which didn’t have as smooth or logical of a progression. It felt more like a grab-bag of intermediate concepts that they decided to cut out of the beginner course. This class was not terrible, but it did diminish the advantage of learning through Codecademy versus reading bits and pieces on my own. Still, I learned a lot of useful bits about Python that I hadn’t known before. I’m glad I spent time here.
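
For anyone who hits the same coding challenge before reaching the intermediate course, here is a minimal sketch of a base class and a derived class. The class names, attributes, and strings are hypothetical examples of my own, not the ones used in the Codecademy exercise.

```python
class Vehicle:
    """Base class: state and behavior shared by every vehicle."""

    def __init__(self, name, wheel_count):
        self.name = name
        self.wheel_count = wheel_count

    def describe(self):
        return f"{self.name} rolls on {self.wheel_count} wheels"


class Rover(Vehicle):
    """Derived class: inherits from Vehicle and adds its own detail."""

    def __init__(self, name, motor_controller):
        # Reuse the base-class initializer, then add rover-specific state.
        super().__init__(name, wheel_count=6)
        self.motor_controller = motor_controller

    def describe(self):
        # Extend, rather than replace, the inherited description.
        return f"{super().describe()}, driven by {self.motor_controller}"


rover = Rover("Test rover", "a hypothetical motor controller")
summary = rover.describe()
```

The derived class gets everything the base class defines for free, and `super()` is how it calls back into the base class when it wants to build on that behavior instead of discarding it.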

With some Python basics down — some I knew from before and some that were new to me — I poked around other beginner-friendly Codecademy Python courses.