Projects

Salesforce

Describes a basic data architecture that can be used to manage your personal finances on (free!) Trailhead orgs. As of 5/20/21, this project is a work in progress.

Data Architecture

Supervised Learning

Trains and optimizes several supervised machine learning models on the Census Income Data Set.

Details the criteria by which an appropriate model was selected. Iteratively tunes that final model to optimize performance using sklearn’s GridSearchCV. Visualizes changes in performance with changes in model complexity. Demonstrates the data exploration, preprocessing and visualization workflow necessary to prepare the data prior to training the model. Examines the relative importance of the features in the original dataset.
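The tuning step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the synthetic data stands in for the preprocessed census features, and the decision tree, the `max_depth` grid, and the accuracy scoring are all assumptions chosen to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed dataset (assumption, not the census data).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical grid: tree depth is used here as a simple proxy for model complexity.
param_grid = {"max_depth": [1, 2, 4, 8]}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="accuracy",
    cv=3,
)
search.fit(X_train, y_train)

# One mean cross-validated score per candidate depth; these are the values
# one would plot to see how performance changes with complexity.
mean_scores = search.cv_results_["mean_test_score"]
best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)

print(best_params, round(test_accuracy, 3))
```

Plotting `mean_scores` against the depths in `param_grid` gives a simple complexity curve of the kind the project description mentions.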

Deep Learning

Demonstrates training of a PyTorch deep learning model that reached 78.4% accuracy on an image-labeling test. Then converts both the training and inference functions of the model to run as a standalone Python command-line application.
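One way to expose training and inference as a command-line application is a single entry point with two subcommands. The sketch below uses only the standard library's `argparse`; the program name, subcommand names, arguments, and defaults are hypothetical, and the PyTorch training and prediction calls are left as placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with hypothetical 'train' and 'predict' subcommands."""
    parser = argparse.ArgumentParser(
        prog="classifier", description="Train or run an image classifier."
    )
    subparsers = parser.add_subparsers(dest="command", required=True)

    train = subparsers.add_parser("train", help="train a model and save a checkpoint")
    train.add_argument("data_dir", help="directory containing the training images")
    train.add_argument("--epochs", type=int, default=5)
    train.add_argument("--learning-rate", type=float, default=0.001)

    predict = subparsers.add_parser("predict", help="label a single image")
    predict.add_argument("image_path", help="image to classify")
    predict.add_argument("checkpoint", help="saved model checkpoint")
    predict.add_argument("--top-k", type=int, default=1)
    return parser


def main(argv=None) -> str:
    """Parse arguments and dispatch; returns a summary string for illustration."""
    args = build_parser().parse_args(argv)
    if args.command == "train":
        # Placeholder: the PyTorch training loop would be called here.
        return f"train on {args.data_dir} for {args.epochs} epochs"
    # Placeholder: checkpoint loading and inference would be called here.
    return f"predict top {args.top_k} for {args.image_path}"


# Passing the argument list explicitly mimics a shell invocation.
print(main(["train", "flowers", "--epochs", "3"]))
print(main(["predict", "img.jpg", "ckpt.pth", "--top-k", "5"]))
```

Keeping the parser in its own function makes the argument handling easy to test separately from the model code.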

Unsupervised Learning

Applies principal component analysis and clustering to a large dataset to better understand the characteristics of a company’s customer base.
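The general shape of such a pipeline is: scale the features, reduce dimensionality with PCA, then cluster in the reduced space. The sketch below assumes scikit-learn and uses synthetic data in place of the customer dataset; the use of k-means, the number of components, and the number of clusters are illustrative assumptions, not the project's choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for customer records: two groups with different feature means.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(100, 8))
group_b = rng.normal(loc=4.0, scale=1.0, size=(100, 8))
X = np.vstack([group_a, group_b])

# Standardize so no single feature dominates the principal components.
X_scaled = StandardScaler().fit_transform(X)

# Keep a few components (3 is an arbitrary choice for this sketch).
pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X_scaled)

# Cluster in the reduced space (k-means with 2 clusters is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_reduced)

print(X_reduced.shape)
print(pca.explained_variance_ratio_.round(2))
print(np.bincount(labels))
```

Inspecting `pca.components_` and the cluster centers mapped back to the original features is one way to describe what distinguishes each customer segment.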

Visualization

Cleans and consolidates multiple, similar datasets, mapping the data to common categorical variables. Then visualizes the result as several stacked bar plots built according to visualization best practices using Matplotlib. The raw data are the Kaggle Machine Learning and Data Science survey results for 2017, 2018, and 2019.

Demonstrates creation of a best-practices visual using Matplotlib and real-world weather data.

Using Matplotlib to Create a Best-Practices Visual

Demonstrates one method to dynamically color a plot based upon an input parameter.

Dynamically-Colored Plots in Matplotlib
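As a rough illustration of the idea, the sketch below colors each bar according to how far its value lies from a threshold supplied as an input parameter. The `bar_colors` helper, the `coolwarm` colormap, and the sample values are all hypothetical; the project itself may use a different method.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so no display is required

import matplotlib.pyplot as plt
from matplotlib.colors import Normalize


def bar_colors(values, threshold, cmap_name="coolwarm"):
    """Return one RGBA color per value, based on its distance from threshold."""
    # Center the color scale on the threshold; fall back to 1.0 if all values equal it.
    span = max(abs(v - threshold) for v in values) or 1.0
    norm = Normalize(vmin=threshold - span, vmax=threshold + span)
    cmap = plt.get_cmap(cmap_name)
    return [cmap(norm(v)) for v in values]


values = [10.0, 20.0, 30.0, 40.0]
threshold = 25.0  # the input parameter that drives the coloring

fig, ax = plt.subplots()
ax.bar(range(len(values)), values, color=bar_colors(values, threshold))
ax.axhline(threshold, color="gray", linestyle="--")
plt.close(fig)
```

With this colormap, bars below the threshold shade toward blue and bars above it toward red; recomputing `bar_colors` with a new threshold and redrawing updates the plot.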

Demonstrates manipulation and reshaping of real-world data, followed by creation of basic visuals to support the discussion of insights in a master’s thesis.

Real-World Histograms and Bar Charts

Binary Classifier

Outlines creation of a predictive model that determines whether credit card applicants should be approved or rejected based on various creditworthiness metrics.