Once Codecademy got me reoriented with the Python programming language, I looked at some of their other beginner-friendly courses under the Python umbrella. I wanted to get some practice using Python, but I didn’t want to go through exercises for the sake of exercises. I wanted to make some effort at keeping things focused on my ultimate goal of learning about modern advances in machine learning.
- Learn Data Analysis with Pandas was my first choice, because I recognized “Pandas” as the name of a popular Python library for preparing data for machine learning. Making it relevant to the direction I am aiming for. The course title has “Data Analysis” and not “Machine Learning” but that was fine because it was only an introduction to the library. Not enough to get into field-specific knowledge, but more than enough to teach me Pandas vocabulary so I could navigate Pandas references and find my own answers in the future.
- How to Clean Data with Python followed up with more examples of Pandas in action. Again the course is nominally focused for data analytics but all the same concepts apply to cleaning data before feeding into machine learning algorithms.
- Exploratory Data Analysis in Python is a longer course with more ways to apply Pandas, including a machine learning specific section. Relative to other courses, this one is heavy on reading and light on hands-on practice, a consequence of the more general nature of the topic. And finally, this course let me dip my toes in another popular Python library I wanted to learn: NumPy.
- Learn Statistics with Python was how I dove into NumPy waters. After barely skating by some statics and number crunching in the previous course, I wanted a refresher in basic statistics. Alongside the refresher I also learn how to calculate common statistics using the NumPy library. And after the statistics calculations are done, we want to visualize them! Enter yet another popular Python library: matplotlib.
- Probability is the natural course to follow a refresher in basic statistics. They cover only the most basic and common applications of statistics and probability for data analysis, we’re on our own to explore in further depth outside of the class. I anticipate probability to play a role in machine learning, as some answers are going to be vague with room for interpretation. I foresee a poor (or misleadingly confident) grasp of probability will lead me astray.
- Differential Calculus was a course I poked my head into. I remembered it was quite a complex subject in school and was surprised Codecademy claimed anyone could learn it in two hours. It turns out the course should be more accurately titled “an introduction to
numpy.gradient()“. Which… yes, it is a numerical application of differential calculus but it is definitely not the entirety of differential calculus. I guess it follows the trend of these courses: overly simplfied titles that skim the basics of a few things. Teach just enough for us to learn more on our own later.
- Linear Algebra starts to get into Python code that has direct relevance to machine learning. I know linear regression is a starting point and I knew I needed an introduction to linear algebra before I could grasp how linear regression algorithms work.
- Learn How to Get Started with Natural Language Processing was a disappointment to me, but it was not the fault of the course. It’s just that the machine learning systems in this field aren’t usually reinforcement learning systems. Which was the subfield of machine learning that most interested me. At least the course was short, and taught me enough so I know to skip other Codecademy natural language courses for myself.