Hello and welcome! I’m Trang Le. I’m a postdoctoral fellow with Jason Moore at the Computational Genetics Lab, University of Pennsylvania. I enjoy developing machine learning methods for analyses of biomedical data, including neuroimaging (functional/structural MRI), transcriptomics and genotypes. Most of the datasets I work with are high dimensional (i.e., have many predictors/features), so I spend most of my time building feature selection algorithms for these data. I trade my bias toward the nearest-neighbor concept for lower variance of my methods and better generalizability. When I’m not knee-deep in data, I run, dance and seasonally ski.

Explorations

It took me 30 minutes to figure this out. I hope it takes you less. Earlier today, I submitted a manuscript to the GECCO Hot-Off-the-Press track. The submission process was pretty straightforward until I hit Submit and encountered this error: […] All fonts must be embedded in the PDF. […] Googling this error led me to fiddle around with Acrobat Reader, try different TeX engines and …


A few days after nonessential businesses closed due to the COVID-19 pandemic, the streets and trails of Philadelphia are filled with runners. It’s nice that a lot of people have reverted to this basic form of exercise (welcome to the club!), but it physically pains us regular runners to see you run in jeans and cotton t-shirts. If running for you is an outdoor family activity and the goal is …


Earlier this month, I had a blast leading a machine learning workshop at an R Ladies Philly meetup. After an introduction to machine learning, we used a beer review dataset to predict the alcohol concentration of beer using the caret R package. We even dabbled in text analysis. Everyone was awesome and asked many excellent questions throughout! I used RStudio Cloud to facilitate this workshop to …


The Quaker Strong challenge

November 27, 2019

In the last Quaker Strong challenge in the spring (March Madness edition), I competed with several friends and enjoyed seeing how they were doing with the challenge. This time, with 46 registered participants, I figured it might be fun to write a few lines of code and make some fun visualizations out of the logged progress. Code and details can be found here. To protect the …


 

Recent Works

  • Fundamentals of AI guest lecturer, University of Pennsylvania, Mar 30, 2020      
  • Detect network interactions and control for confounders and multiple testing, Rocky Mountain Bioinformatics Conference, Dec 6, 2019      
  • Trang T Le, Weixuan Fu and Jason H Moore (2019) Scaling tree-based automated machine learning to biomedical big data with a feature set selector. doi:10.1093/bioinformatics/btz470
  • treeheatr: Heatmap-integrated decision tree visualizations (2020)      
  • Multilocus risk scores: rethinking genetic risk scores to account for epistasis, International Joint Conference on Biomedical Engineering Systems and Technologies, Feb 25, 2020      
  • Multilocus risk scores, Rocky Mountain Bioinformatics Conference, Dec 6, 2019      
  • Machine learning workshop, R Ladies Philly, Dec 2, 2019      
  • Multilocus risk scores, Penn Genetics Retreat, Sep 4, 2019      
  • npdr: Select features with nearest-neighbor concepts (2019)