Hi! I'm Jason Fang, a data scientist who graduated from the University of Maryland, College Park.
With both a technical (Master of Science in Marketing Analytics) and
a non-technical (Bachelor of Science in Marketing) academic background,
I always focus on creating value rather than just creating models.
Here are several tools that I usually use to solve business problems:
Python: Create machine learning, deep learning and statistical models
SQL: Find key metrics, analyze business problems, join tables
Tableau: Visualize findings, build dashboards, create calculated fields
Excel: Analyze small datasets, create pivot tables, generate reports
PowerPoint: Present my results to technical and non-technical audiences
PySpark: Analyze large datasets, use Spark MLlib and Spark Streaming
AWS: Use S3, Redshift, QuickSight, EC2, IAM, Route 53, CloudFront
I draw on many kinds of models for prediction. Deep learning sounds more advanced, but sometimes
machine learning models, statistical models, and ensemble learning perform better. The right choice
always depends on the case that I need to analyze.
Here are some models that I usually use to solve business problems:
Regression: Simple, Multiple, Polynomial Linear Regression, Logistic Regression
Classification: KNN, SVM, Naive Bayes, Decision Tree Classification
Clustering: K-Means, Hierarchical Clustering, Self-Organizing Maps
Ensemble Learning: Random Forest, Gradient Boosting, XGBoost
Dimensionality Reduction: PCA, EFA, SOM, Multidimensional Scaling
Deep Learning: ANN, RNN, CNN
Time Series: ARIMA, RNN, Facebook Prophet
Statistical Analysis: T-test, Chi-squared Test, ANOVA
A/B Testing: Thompson Sampling, Upper Confidence Bound
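
As a rough illustration of the Thompson Sampling entry above, here is a minimal Python sketch of how it can shift traffic toward the better variant during a test. Everything specific in it is an assumption made for the simulation: the function name `thompson_sampling`, the three conversion rates, the number of rounds, and the uniform Beta(1, 1) prior. It is a toy simulation, not code or results from a real project.

```python
import random


def thompson_sampling(true_rates, n_rounds=5000, seed=0):
    """Simulate Thompson Sampling on a Bernoulli bandit (e.g. conversion rates).

    true_rates are hypothetical conversion rates used only to simulate users.
    Returns (pulls, successes) per variant.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms  # observed conversions per variant
    failures = [0] * n_arms   # observed non-conversions per variant

    for _ in range(n_rounds):
        # Draw one sample from each variant's Beta posterior (uniform prior).
        samples = [
            rng.betavariate(1 + successes[i], 1 + failures[i])
            for i in range(n_arms)
        ]
        # Show the variant whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda i: samples[i])

        # Simulate whether this user converts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    pulls = [s + f for s, f in zip(successes, failures)]
    return pulls, successes


# Made-up conversion rates for variants A, B, and C.
pulls, successes = thompson_sampling([0.05, 0.10, 0.20])
print("Times each variant was shown:", pulls)
print("Conversions per variant:", successes)
```

Unlike a fixed 50/50 split, this approach sends more traffic to the stronger variant as evidence accumulates, so fewer users see the weaker options during the test.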