Hello and welcome! I’m Trang Le. I’m a postdoctoral fellow with Jason Moore at the Computational Genetics Lab, University of Pennsylvania. I enjoy developing machine learning methods for analyzing biomedical data, including neuroimaging (functional/structural MRI), transcriptomic and genotype data. Most of the datasets I work with are high dimensional (i.e., have many predictors/features), so I spend most of my time building feature selection algorithms for these data. I trade my bias toward the nearest-neighbor concept for lower variance of my methods and better generalizability. When I’m not knee-deep in data, I run, dance and seasonally ski.

Explorations

In recent weeks, I have read a number of articles about racism in America, mostly drawn from individuals’ personal experiences. Until very recently, I still looked at this racism from a very narrow angle: the individual one. Trang Quạ writes: “this prejudice is deeply ingrained in the subconscious of individuals in the community”. I agree. But the problem of race in America is larger than each individual, very …


It took me 30 minutes to figure this out. I hope it takes you less. Earlier today, I submitted a manuscript to the GECCO Hot-Off-the-Press track. The submission process was pretty straightforward, until I hit Submit and encountered this error: […] All fonts must be embedded in the PDF. […] Googling this error led me to fiddle around with Acrobat Reader, try different TeX engines and …


A few days after nonessential businesses closed due to the COVID-19 pandemic, the streets and trails of Philadelphia are filled with runners. While it’s nice that a lot of people have reverted to this basic form of exercise (welcome to the club!), it pains us regular runners physically when we see you run in jeans and cotton t-shirts. If running for you is an outdoor family activity and the goal is …


Earlier this month, I had a blast leading a machine learning workshop at an R Ladies Philly meetup. After an introduction to machine learning, we used a beer review dataset to predict the alcohol concentration of beer using the caret R package. We even dabbled in text analysis. Everyone was awesome and asked many excellent questions throughout! I used RStudio Cloud to facilitate this workshop to …


 

Recent Works

  • Mathematical analysis of the spread of HIV/AIDS using SEAIT model. Research Colloquium, University of Tulsa, 2013      
  • STIR feature selection. Pacific Symposium on Biocomputing, Jan 5, 2019      
  • Scaling tree-based automated machine learning to biomedical big data with a feature set selector. Genetic and Evolutionary Computation Conference, Jun 10, 2020
  • Optimized random forest classification accuracy. International Conference on Integral Methods in Science and Engineering, Jul 25, 2016      
  • On visualization: Take a sad chart and make it better. R Ladies Philly, Dec 8, 2020
  • Integrative network analysis for major depressive disorder. Organization for Human Brain Mapping Meeting, Jun 28, 2017      
  • Statistical Inference Relief (STIR) feature selection. Mid-Atlantic Bioinformatics Conference, Oct 29, 2018      
  • Multilocus risk scores: rethinking genetic risk scores to account for epistasis. International Joint Conference on Biomedical Engineering Systems and Technologies, Feb 25, 2020