A Challenge to Data Scientists

As data scientists, we are aware that bias exists in the world. We read up on stories about how cognitive biases can affect decision-making. We know that, for instance, a resume with a white-sounding name will receive a different response than the same resume with a black-sounding name, and that writers of performance reviews use different language to describe contributions by women and men in the workplace. We read stories in the news about ageism in healthcare and racism in mortgage lending.
Data scientists are problem solvers at heart, and we love our data and our algorithms that sometimes seem to work like magic, so we may be inclined to try to solve these problems stemming from human bias by turning the decisions over to machines. Most people seem to believe that machines are less biased and more pure in their decision-making – that the data tells the truth, that the machines won’t discriminate.

“Becoming a Data Scientist” Learning Club?

I have been thinking about doing a “Becoming a Data Scientist” podcast for a long time, at least since April. The podcast would include interviews focused on how people working in various data-science-related jobs got to where they are today (how did they “become a data scientist”?). I’m getting closer to taking the dive and getting it started.

I had an idea today that would take it a step further. Imagine how book clubs work where you pick a book, go off and read it, then gather occasionally to discuss and record your thoughts. Except instead of a book club, it’s a data science learning club!

BPDM’s interview with….. me!

An organization based in Puerto Rico called “Broadening Participation in Data Mining” (BPDM) interviewed me over the weekend, and it’s online now! Without further ado… Thanks to Orlando and Herbierto for having me on! (P.S. I did put up the post about Data Sources on DataSciGuide.)

Books for Data Science Beginners, and Data Sources

I just wanted to note here on Becoming A Data Scientist that I recently wrote two posts over on Data Sci Guide that are getting some attention: “Books to Read if You Might Be Interested in Data Science” and “Data Sources & APIs for Data Science Projects”. Enjoy!

Playing With Google Cloud Datalab

This weekend, I played around with the newly released Google Cloud Datalab. I learned how to use BigQuery and also played around with Google Charts vs Pandas+Matplotlib plots, since you can do both in Datalab. I had a few frustrations with it because the documentation isn’t great, and also sometimes it would silently time out and it…

Becoming A Data Scientist Flipboard Magazine

I love finding and sharing good articles about data-science-related topics on Twitter, but I know not everyone is on Twitter, and tweets also get lost quickly in the timeline and are easy to miss. So, I’ve started sharing the best articles via a Flipboard magazine as well! Check it out! https://flipboard.com/@becomingdatasci/becoming-a-data-scientist-5ktft1lky

How To Use Twitter to Learn Data Science (or anything)

When I decided that I wanted to become a data scientist, I started following some data scientists on Twitter to see what they talked about and what was going on in the “industry”. Then I saw them pointing one another at resources and answering each other’s questions, and I realized I had only seen the tip of the iceberg of “Data Science Twitter”. That’s when I created a new Twitter account.

DataSciGuide Contest

Want a way to help people who are learning data science, and also get a chance to win a $40 Amazon Gift Card? Review a data science blog, podcast, course, or other content at DataSciGuide! Here’s more info: http://www.datasciguide.com/review-stuff-and-win-a-40-amazon-gift-card/

Human Name Variations in Databases

I normally write about my adventures learning data science here, but my expertise for years has been database design and reporting, and I have some knowledge to contribute to a discussion that I thought I’d document here. A conversation on Twitter today about how people’s names are stored in databases, with stories of frustration from people who have had terrible customer/patient experiences because of “unusual” names, made me want to write about this topic.