I majored in Data Science and Statistics and minored in Economics at Minerva Schools at the Keck Graduate Institute. Throughout my college education, I traveled, worked, and studied in seven countries around the world: 🇺🇸 🇰🇷 🇮🇳 🇩🇪 🇦🇷 🇬🇧 🇹🇼. Ever since I was young, I have enjoyed solving problems and building things. Writing software gives me the same kind of enjoyment I had when I built my own catapult as a nine-year-old kid.
Passions: Computer Science, Mathematics, Physics, Robotics, Economics, Finance
- Causal Inference
- Machine Learning
- Bayesian Inference
- Computational Modeling
- Econometrics
Causal Inference (using R, Python)
Using statistics to establish a causal relationship empirically and to estimate the treatment effect.
- Statistical matching
  - Ensuring that the treatment and control groups are similar pre-treatment
  - Minimizing the effect of unaccounted-for variables (i.e., confounding variables)
- Counterfactuals
  - Estimating the outcome a unit would have had if it had been assigned to the opposite group
  - Estimating the treatment effect
- Multicollinearity, endogeneity
  - Accounting for explanatory variables that are correlated with each other (multicollinearity) or with the error term (endogeneity)
- Hypothesis testing
  - RCTs, ANOVA, t-test, Fisher's test of independence, $\chi^{2}$ test, p-values
Machine Learning (using Python)
Collecting and processing data to build a range of machine learning models.
- Classification
  - Logistic regression, KNN, SVM, random forest, gradient boosting, neural networks
- Regression
  - Linear & multiple regression, random forest, lasso & ridge regression, neural networks
- Clustering
  - K-means clustering, fuzzy clustering, hierarchical clustering, density-based clustering
- Dimensionality reduction
  - Principal component analysis, singular value decomposition, linear discriminant analysis
- Deep Learning
  - Feedforward, convolutional & recurrent neural networks, multi-layer perceptron
- Natural Language Processing
  - Tries, named-entity recognition, sentiment analysis, text summarization, topic modelling
Bayesian Inference (using Python, Stan)
Using Bayes' theorem to combine a prior (often a conjugate one) with a likelihood and obtain the posterior distribution over a model parameter.
- Statistical distributions (conjugate prior - likelihood)
  - Beta-Bernoulli / Beta-Binomial distribution
  - Gamma-Poisson distribution
  - Dirichlet-Categorical / Dirichlet-Multinomial distribution
  - Inverse Gamma-Normal distribution
  - Gamma-Exponential distribution
- Generative models
  - Directed graphical models
  - Factor graphs
  - Message passing
  - Sum product algorithm
  - Expectation propagation
- More
  - Using Bayes' theorem to infer probabilities
  - Using Stan to compute samples from a given distribution
  - Selecting appropriate test statistics
  - Calculating confidence intervals
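To illustrate the conjugate pairs listed above, here is a minimal sketch of a Beta-Binomial update in plain Python. The function names, the uniform Beta(1, 1) prior, and the data (7 successes, 3 failures) are made up for illustration and do not come from a specific project.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior and a binomial likelihood, the
    posterior is Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical example: uniform prior, then 7 successes and 3 failures.
post_alpha, post_beta = beta_binomial_update(1.0, 1.0, successes=7, failures=3)
posterior_mean = beta_mean(post_alpha, post_beta)
print(post_alpha, post_beta, round(posterior_mean, 3))
```

Because the prior is conjugate, the update is just addition of counts, so no sampling is needed for this particular model.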
Computational Modeling (using Python)
Building computer programs that simulate real-life events and draw conclusions
- Cellular Automata
  - Renormalization, percolation, coarse graining
  - The Game of Life, the Ising model, forest fires, and traffic flow
- Graph Theory
  - Directed and undirected graphs
  - Dijkstra's algorithm, BFS, DFS, random walk
  - Topology / networks
  - Spread of illnesses, information, political and social self-organization
- Algorithms
  - Monte Carlo simulation
  - Markov Chain Monte Carlo (MCMC)
  - Metropolis-Hastings algorithm
  - Gradient descent
  - PageRank
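As a small example of the MCMC entries above, here is a sketch of a random-walk Metropolis sampler, the symmetric-proposal special case of Metropolis-Hastings. The target (a standard normal), the step size, the seed, and all function names are assumptions chosen for illustration.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a standard normal (assumed target)."""
    return -0.5 * x * x


def metropolis_hastings(log_density, n_samples, step=1.0, start=0.0, seed=0):
    """Random-walk Metropolis sampler with a Gaussian proposal."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # The proposal is symmetric, so the acceptance ratio reduces to
        # the ratio of target densities, capped at 1.
        accept_prob = math.exp(min(0.0, log_density(proposal) - log_density(x)))
        if rng.random() < accept_prob:
            x = proposal
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


samples = metropolis_hastings(log_target, n_samples=20000)
kept = samples[1000:]  # drop an arbitrary burn-in period
sample_mean = sum(kept) / len(kept)
sample_var = sum((s - sample_mean) ** 2 for s in kept) / len(kept)
print(round(sample_mean, 2), round(sample_var, 2))
```

The sample mean and variance should land near 0 and 1. Consecutive draws are correlated, so the chain needs more iterations than independent sampling would.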
Econometrics (using Stata, R)
Building statistically sound economic models, establishing causality, and drawing appropriate conclusions from data
- Causal Inference and Hypothesis Testing
  - Type I and type II errors
  - One-tailed and two-tailed t-tests
  - ANOVA, Fisher's test of independence, Chi-squared test, p-values
  - Total, regression, and error sums of squares (SST, SSR, SSE)
- Fitness of Model
  - Matching (propensity scores, genetic matching, etc.)
  - Heteroskedasticity, confounding variables
  - Multicollinearity, endogeneity
- Econometric Models
  - Regression discontinuity design (RDD)
  - Instrumental variables (IV)
  - Difference in differences (DD)
  - Synthetic controls
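As an example of the difference-in-differences entry above, here is a minimal sketch of the basic two-group, two-period estimate in plain Python. The outcome values and the function name are invented for illustration, and the estimate can only be read causally under the parallel-trends assumption.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences estimate.

    Subtracts the control group's change over time from the treated
    group's change, which removes the trend both groups share.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up outcomes for three units per group, before and after treatment.
treated_pre = [10.0, 12.0, 11.0]
treated_post = [15.0, 17.0, 16.0]
control_pre = [9.0, 10.0, 11.0]
control_post = [11.0, 12.0, 13.0]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

In practice the same quantity is usually estimated as the interaction coefficient in a regression, which also gives standard errors.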