Wei-Lin Hsiao

Student at Rice University, Class of 2021.
Studying Statistics and Computer Science.

GitHub

Resume

LinkedIn


Personal Projects

Houston Map - Gentrification

Interactive map of Houston plotting region, income, construction, and demolition features, with a script that geocodes addresses using the Maps API

Currently building model to predict gentrification in each zip code, using boosted random forests 

Made in R Shiny, with Leaflet

Political Leaning Detector

Machine learning model to detect political leanings from online comments and indicate predictive phrases, with crawler to source 250,000 political comments

Achieved F1 score of 0.850, with accuracy of 79.1%

Made in Python

Rice University - Course Reviews

Online applet for exploring Rice University course evaluation data from 2017 to 2018.

Includes interactive visualizations of course review ratings.

Made in R Shiny


Technical Skills


Languages

Python - scikit-learn, NLTK, NumPy, matplotlib
R - h2o, leaflet, shiny
SQL - MS SQL Server
Java

Technologies

Amazon Web Services
PySpark
Apache Hadoop
Git


Work Experience

Rice University Data Science Research | Research Assistant
Summer 2018 | Houston, Texas 

    Text-mined natural language features (sentiment, PoS tagging, embeddings), evaluated as predictors of unreliable news
    Automated pipeline to transform article data into network structure of similarity scores
    Implemented clustering (k-means, spectral, mixture) for classification of articles and detection of community structure
    Achieved 85.8% classification accuracy with an F1 score of 0.845 on articles

National Defense Medical Center | Assistant Programmer
Summer 2016 | Taipei, Taiwan 

    Built test cases for a browser-based distributed system for sequence alignment
    Implemented Smith-Waterman and Needleman-Wunsch algorithms for 41% speed improvement over initial version


Selected Coursework

STAT 413: Stat. Machine Learning

Regression, classification, kernels, clustering, random forests, ensemble learning

COMP 330: Tools & Models – D.S.

Databases (SQL), optimization, text mining, distributed computing (AWS EMR, Spark), TensorFlow

COMP 321: Intro. to Comp. Systems

Concurrency, networking, memory allocation and management, linking, exceptions


COMP 322: Principles of Parallel Programming

STAT 410: Linear Regression

MATH 354: Honors Linear Algebra

STAT 411: Advanced Statistical Methods

Based in...

Houston, Texas

Phone Number

1-(781)-951-4028

Email

Hsiao.WeiLin1999, at Gmail