
PROJECTS
Research on Diabetes in California
2025.1 - 2025.4
In this research project, we build our own database from scratch by collecting, cleaning, and joining data from multiple credible sources. We then analyze the data to investigate factors that may be correlated with diabetes-related mortality, such as county of residence, hospitalization rates, and complications or types of diabetes.

Key Skill Set: SQL, Pandas

Credit Card Default Prediction
2024.5 - 2025.6
How can we estimate the odds that a customer will default on their credit card bill? Using customers’ demographic information and past payment records, I trained several machine learning models to predict whether they would default, then compared the models’ performance using a selected metric.

Key Skill Set: Scikit-learn, Matplotlib, NumPy, Pandas

Medical Cost Prediction in the US
2023.9 - 2023.12
People from different backgrounds pay varying amounts in medical fees each year in the US. Focusing on regression techniques, we fit a full additive multiple linear regression (MLR) model, with and without ridge and lasso penalties, to predict these costs.

Key Skill Set: R, Data Analysis Flow, Hyperparameter Optimization

FeedMe
2023.1 - 2023.4
If you're looking for great restaurants around your city to visit with friends or family, or just to explore on your own, try FeedMe, software I developed myself! You’ll be able to view each restaurant’s signature dishes and add your favourites to a personal collection that you can save at any time.

Key Skill Set: Java, Software Design Flow, UI Design

