Explorations

 
 

December 27 2019

Earlier this month, I had a blast leading a machine learning workshop at an R Ladies Philly meetup. After an introduction to machine learning, we used a beer review dataset to predict the alcohol concentration of beer using the caret R package. We even dabbled in text analysis. Everyone was awesome and asked many excellent questions throughout! I used RStudio Cloud to facilitate this workshop to …

Read More…

 
 
 

November 27 2019

In the last Quaker Strong challenge in the spring (March Madness edition), I was competing with several friends and enjoyed seeing how they were doing with the challenge. This time, with 46 registered participants, I figured it might be fun to write a few lines of code and make some fun visualizations out of the logged progress. Code and details can be found here. To protect the …

Read More…

 
 
 

November 5 2019

Tree-based Pipeline Optimization Tool (TPOT) is an automated machine learning tool that helps data scientists find the optimal model pipeline for their prediction problem. Using genetic programming (GP), TPOT explores different pipelines (sequences of feature selectors, model classifiers, etc.) and recommends the one with the best cross-validated score after a specified number of generations. Here …

Read More…
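
To make the idea in the TPOT excerpt above concrete, here is a toy sketch of searching over pipelines across generations and keeping the one with the best cross-validated score. It is not TPOT's code or API, and it uses simple mutation plus selection rather than true genetic programming. Everything in it is an assumption made for illustration: the synthetic data, the "pipeline" defined as a feature mask plus a k-nearest-neighbours setting, and all names (`make_data`, `cv_score`, `evolve`, `K_CHOICES`).

```python
import random

K_CHOICES = [1, 3, 5, 7, 9]  # hypothetical candidate values of k (odd, so votes never tie)

def make_data(n=120, seed=0):
    """Synthetic binary data: features 0-1 are informative, features 2-4 are noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(2)]
        noise = [rng.gauss(0.0, 3.0) for _ in range(3)]
        rows.append((informative + noise, label))
    return rows

def knn_predict(train, x, mask, k):
    """Majority vote among the k nearest training rows, using only masked-in features."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi, m in zip(a, b, mask) if m)
    nearest = sorted(train, key=lambda row: dist(row[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def cv_score(data, pipeline, folds=4):
    """Cross-validated accuracy of a (feature mask, k) pipeline."""
    mask, k = pipeline
    correct = 0
    for f in range(folds):
        test = data[f::folds]  # rows whose index is congruent to f modulo folds
        train = [row for i, row in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, mask, k) == y for x, y in test)
    return correct / len(data)

def random_pipeline(rng):
    return (tuple(rng.random() < 0.5 for _ in range(5)), rng.choice(K_CHOICES))

def mutate(pipeline, rng):
    """Flip one feature in the mask and occasionally resample k."""
    mask, k = pipeline
    mask = list(mask)
    i = rng.randrange(len(mask))
    mask[i] = not mask[i]
    if rng.random() < 0.3:
        k = rng.choice(K_CHOICES)
    return (tuple(mask), k)

def evolve(data, generations=5, population_size=8, seed=1):
    """Keep the better half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [random_pipeline(rng) for _ in range(population_size)]
    history = []  # best cross-validated score seen in each generation
    for _ in range(generations):
        scored = sorted(population, key=lambda p: cv_score(data, p), reverse=True)
        history.append(cv_score(data, scored[0]))
        survivors = scored[: population_size // 2]
        population = survivors + [mutate(rng.choice(survivors), rng) for _ in survivors]
    best = max(population, key=lambda p: cv_score(data, p))
    return best, cv_score(data, best), history

if __name__ == "__main__":
    best, score, history = evolve(make_data())
    print("best pipeline (feature mask, k):", best)
    print("cross-validated accuracy:", round(score, 3))
    print("best score per generation:", history)
```

Because the best pipelines always survive into the next generation, the per-generation best score can only stay level or improve. The real tool searches a far richer space of preprocessing and modelling steps.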

 
 
 

October 31 2019

Last week, I attended a series of presentations followed by a panel discussion on open science at Penn's Van Pelt-Dietrich Library during Open Access Week. The panel featured (from left to right) Ted Satterthwaite, Jennifer Stiso, and Daniel Himmelstein, all of whom initiated enlightening discussions around different aspects of open science. Photo credit: Rebecca Miller. Jennifer Stiso …

Read More…

 
 
 

October 1 2019

I recently stumbled upon this article by Gervasio Piñeiro and colleagues analyzing the method of model evaluation via plotting observed and predicted \(y\). The authors argue that, in plotting predicted against observed values, observed should be placed on the \(y\)-axis and predicted on the \(x\)-axis. Because this article is unfortunately behind a paywall, I’m going to show the quick simulation I have …

Read More…
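
For readers who want to see why the axis choice in the excerpt above matters, here is a minimal, independent sketch. It is not the simulation from the post or from the article. It assumes an unbiased model, so that each observed value equals the predicted value plus independent Gaussian noise. The names `ols` and `simulate` and all parameter values are hypothetical.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def simulate(n=2000, noise_sd=1.0, seed=42):
    """Assumed setup: observed = predicted + Gaussian noise (an unbiased model)."""
    rng = random.Random(seed)
    predicted = [rng.uniform(0, 10) for _ in range(n)]
    observed = [p + rng.gauss(0, noise_sd) for p in predicted]
    return predicted, observed

predicted, observed = simulate()

# Observed on the y-axis, predicted on the x-axis
op_intercept, op_slope = ols(predicted, observed)

# Predicted on the y-axis, observed on the x-axis
po_intercept, po_slope = ols(observed, predicted)

print(f"observed ~ predicted: intercept={op_intercept:.3f}, slope={op_slope:.3f}")
print(f"predicted ~ observed: intercept={po_intercept:.3f}, slope={po_slope:.3f}")
```

Under this assumption, regressing observed on predicted should give a slope near 1 and an intercept near 0. Swapping the axes shrinks the slope toward \(\mathrm{Var}(\text{predicted}) / \mathrm{Var}(\text{observed})\), which is below 1 even though the model is unbiased.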