Playing With Google Cloud Datalab

This weekend, I played around with the newly released Google Cloud Datalab. I learned how to use BigQuery and also compared Google Charts with Pandas+Matplotlib plots, since you can do both in Datalab. I had a few frustrations with it because the documentation isn’t great, and also sometimes it would silently time out and it…

The Setup (usesthis.com) API

There’s a really interesting site, usesthis.com (AKA “The Setup”), which interviews people and lists all of the gear that they use, including software. I found out that they have an API (documented here), and I wanted to use my new API skills in Python to test it out! This one returns JSON, unlike the NPR…

Data Science Practice – Classifying Heart Disease

This post details a casual exploratory project I did over a few days to teach myself more about classifiers. I downloaded the Heart Disease dataset from the UCI Machine Learning Repository and thought of a few different ways to approach classifying the provided data.

“MANUAL” APPROACH USING EXCEL

So first I started out by…

Machine Learning Project 4

Immediately after I turned in Project 3, I started on Project 4, the final project in my Machine Learning grad class. The professor gave us a few options, but we could also propose our own. One of the options was learning how to implement Random Forest (an ensemble learning method that combines many decision trees) and analyzing a given data set, so I proposed using Random Forest on University Advancement (Development/Fundraising) data I got from my “day job”. The professor approved it, so I started learning about Random Forest Classification.
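
For anyone who hasn't seen the idea before, here is a minimal sketch of what "an ensemble of many decision trees" means, in plain Python. It is not the code from the project: the toy data is invented, the names `fit_stump`, `fit_forest`, and `predict` are hypothetical, and each "tree" is only a one-split stump to keep things short. The two ingredients it does show are the usual ones: every tree is trained on a bootstrap sample of the rows with a randomly chosen feature, and the forest predicts by majority vote.

```python
import random
from collections import Counter

# Invented toy data (NOT the fundraising data): two numeric features, labels 0/1.
ROWS = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 9), (9, 8), (8, 8), (9, 9)]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_stump(rows, labels, feature):
    """Fit a one-split tree: pick the threshold on `feature` with the fewest errors."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        # Each side of the split predicts its most common label.
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw len(rows) row indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        feature = rng.randrange(n_features)
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest


def predict(forest, row):
    """Majority vote over the predictions of all stumps in the forest."""
    votes = Counter(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]


forest = fit_forest(ROWS, LABELS)
print(predict(forest, (1, 1)), predict(forest, (9, 9)))
```

A real Random Forest grows full decision trees and samples a random subset of features at every split, so treat this as an illustration of the bagging-plus-voting idea rather than a substitute for a library implementation.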