Hello and welcome! I’m Trang Le. I’m a postdoctoral fellow with Jason Moore at the Computational Genetics Lab, University of Pennsylvania. I enjoy developing machine learning methods for analyzing biomedical data, including neuroimaging (functional/structural MRI), transcriptomic and genotype data. Most of the datasets I work with are high dimensional (i.e., have many predictors/features), so I spend most of my time building feature selection algorithms for these data. I trade my bias toward the nearest-neighbor concept for lower variance of my methods and better generalizability. When I’m not knee-deep in data, I run, dance and seasonally ski.

Explorations

In recent weeks, I have read a number of articles about racism in the U.S., mostly drawn from people's personal experiences. Until very recently, I still looked at racism from a very narrow angle: the individual one. Trang Quạ writes: “this discrimination is deeply rooted in the subconscious of the individuals in the community”. I agree. But the problem of race in the U.S. is much larger than any individual …

Read More…

It took me 30 minutes to figure this out. I hope it takes you less. Earlier today, I submitted a manuscript to the GECCO Hot-Off-the-Press track. The submission process was pretty straightforward until I hit Submit and encountered this error: […] All fonts must be embedded in the PDF. […] Googling this error led me to fiddle around with Acrobat Reader, try different TeX engines and …

Read More…

A few days after nonessential businesses closed due to the COVID-19 pandemic, the streets and trails of Philadelphia are filled with runners. While it’s nice that a lot of people have reverted to this basic form of exercise (welcome to the club!), it pains us regular runners physically when we see you run in jeans and cotton t-shirts. If running for you is an outdoor family activity and the goal is …

Read More…

Earlier this month, I had a blast leading a machine learning workshop at an R Ladies Philly meetup. After an introduction to machine learning, we used a beer review dataset to predict the alcohol concentration of beer using the caret R package. We even dabbled in text analysis. Everyone was awesome and asked many excellent questions throughout! I used RStudio Cloud to facilitate this workshop to …

Read More…

Recent Works

  • On visualization: Take a sad chart and make it better, R Ladies Philly, Dec 8, 2020      
  • pmlb-r: an R interface to PMLB (2020)      
  • treeheatr: an R package for interpretable decision tree visualizations. R/Medicine Virtual Conference, Aug 28, 2020      
  • Analysis of ISCB honorees and keynotes reveals disparities. Conference on Intelligent Systems for Molecular Biology, Jun 30, 2020      
  • Scaling tree-based automated machine learning to biomedical big data with a feature set selector. Genetic and Evolutionary Computation Conference, Jun 10, 2020
  • pmlb: Penn Machine Learning Benchmarks (PMLB) (2020)      
  • treeheatr: Heatmap-integrated decision tree visualizations (2020)      
  • Trang T. Le, Daniel S. Himmelstein, Ariel A. Hippen Anderson, Matthew R. Gazzara and Casey S. Greene (2020, preprint). Analysis of ISCB honorees and keynotes reveals disparities. doi:10.1101/2020.04.14.927251