Trang Le

Explorations

10 things I love about Julia

December 28 2021

Those who know me know that my love for the Julia language has grown quite a bit in the past couple of years. Still, I had trouble finding a project that would allow me to work more in this nifty language until I saw Jasmine Hughes solve all of last year's Advent of Code (AOC) puzzles in Julia. I know of Jasmine through the AOC RLadies leaderboard! See Jasmine’s valuable advice on “how to get …

9 things I can't tech without

March 6 2021

As I work closely with graduate students (whom I dearly call my apprentices), I share with them some tools I picked up along the way that help boost my productivity and just make my life easier in general. Often, they come back with “WOW!!! XYZ has been really helpful! I wish I knew about it 5 years ago.” So do I. Well, some of these tools might have been in their early stages or might not even have existed …

How does the US government systematize racism? (Chapter 0)

June 20 2020

In recent weeks, I have read a number of pieces about racism in the US, mostly drawn from individuals' personal experiences. Until very recently, I still looked at this racism from a very narrow angle: the individual one. Trang Quạ writes: “this discrimination is deeply ingrained in the subconscious of the individuals in the community”. I agree. But the problem of race in the US is larger than any individual, very …

All fonts must be embedded

May 3 2020

It took me 30 minutes to figure this out. I hope it takes you less. Earlier today, I submitted a manuscript to the GECCO Hot-Off-the-Press track. The submission process was pretty straightforward, until I hit Submit and encountered this error: […] All fonts must be embedded in the PDF. […] Googling this error led me to fiddle around with Acrobat Reader, try different TeX engines and …


Recent Works

treeheatr and pmlbr: visualizing decision trees on benchmark datasets. R-Ladies Johannesburg, Sep 14, 2021
The presentation of the decision tree with data represented as a heatmap is a new visualization that uncovers the tree’s performance, the data’s correlation structure, and the importance of each feature in predicting the outcome. Implemented in an easily installed package with a detailed vignette, treeheatr can be a useful teaching tool to enhance students’ understanding of a simple decision tree model before diving into more complex tree-based machine learning methods. We will apply decision tree models and visualize them on multiple benchmark machine learning datasets in pmlbr.
Take a bad chart and make it better. IMS, Aug 30, 2021
Every good chart started out as a bad one. In this talk, we will discuss a few basic visualization principles that help improve our charts and refine our data stories.
Take a bad chart and make it better. Cleveland R User Group, Aug 25, 2021
Every good chart started out as a bad one. In this talk, we will discuss a few basic visualization principles and some ggplot tips and tricks that help improve our charts and refine our data stories. Some familiarity with R and ggplot will be useful but not required - novice R users are encouraged to attend. Trang Le is a postdoctoral fellow with Jason Moore at the Computational Genetics Lab, University of Pennsylvania. She enjoys developing machine learning methods for rigorous analyses of a wide array of biomedical data. Most recently, her work focuses on investigating the long-term effect of neurological conditions in COVID-19 patients. She’s the author and maintainer of multiple R packages.
On visualization: Take a sad chart and make it better. R-Ladies Philly, Dec 8, 2020
Every good chart started out as a bad one. In this workshop, we discuss 1) basic visualization principles, and 2) resources, tools, tips, and tricks that help improve our charts and refine our data stories. Some familiarity with R and ggplot will be useful but not required - novice R users are encouraged to attend.
tdapseudotime: Implements the temporal phenotyping via topological data analysis. (2020)
Visualizing decision trees on benchmark datasets. R-Ladies Miami, Nov 19, 2020
regens: REGENS (REcombinatory Genome ENumeration of Subpopulations) (2020)
An open-source Python package 📦 that simulates whole genomes from real genomic segments. REGENS recombines these segments in a way that simulates completely new individuals while simultaneously preserving the input genomes’ linkage disequilibrium (LD) pattern with extremely high fidelity. REGENS can also simulate mono-allelic and epistatic single nucleotide variant (SNV) effects on a continuous or binary phenotype without perturbing the simulated LD pattern.
pmlbr: an R interface to PMLB (2020)
An R interface to the Penn Machine Learning Benchmarks data repository, a large collection of curated benchmark datasets for evaluating and comparing supervised machine learning algorithms.