Student at Rice University. Class of 2021
Studying Statistics and Computer Science.
Interactive map of Houston plotting region, income, construction, and demolition features, with a script that geocodes addresses via Maps API
Currently building model to predict gentrification in each zip code, using boosted random forests
Made in R Shiny, with Leaflet
Machine learning model to detect political leanings from online comments and indicate predictive phrases, with crawler to source 250,000 political comments
Achieved F1 score of 0.850, with accuracy of 79.1%
Made in Python
Online applet for exploring Rice University course evaluation data for the 2017 to 2018 academic year.
Includes interactive visualizations of course review ratings.
Made in R Shiny
Technical Skills
Languages
Technologies
Python - scikit-learn, NLTK, NumPy, Matplotlib
R - h2o, leaflet, shiny
SQL - MS SQL Server
Java
Amazon Web Services
PySpark
Apache Hadoop
Git
Rice University Data Science Research | Research Assistant
Summer 2018 | Houston, Texas
• Text-mined natural language features (sentiment, PoS tagging, embeddings), evaluated as predictors of unreliable news
• Automated pipeline to transform article data into network structure of similarity scores
• Implemented clustering (k-means, spectral, mixture) for classification of articles and detection of community structure
• Achieved 85.8% classification accuracy with an F1 score of 0.845 on articles
National Defense Medical Center | Assistant Programmer
Summer 2016 | Taipei, Taiwan
• Built test cases for a browser-based distributed system for sequence alignment
• Implemented Smith-Waterman and Needleman-Wunsch algorithms for 41% speed improvement over initial version
Regression, classification, kernels, clustering, random forests, ensemble learning
Databases (SQL), optimization, text mining, distributed computing (AWS EMR, Spark), TensorFlow
Concurrency, networking, memory allocation and management, linking, exceptions
Houston, Texas
1-(781)-951-4028
Hsiao.WeiLin1999, at Gmail