Career Profile

Experienced data professional, always looking for new and creative ways to operate. The last couple of years have seen a transition from heavy use of Python for statistical work to data engineering, all to help round out my knowledge and capabilities.

Experience

Data Scientist

April 2022 - Present
BI Worldwide

First data science function within our data group, making sense of our unstructured data alongside our structured data to better support client needs and tap into small, incremental improvements in performance.

  • … more to come

Data Engineer

December 2020 - April 2022
Charter Communications

Our team is responsible for quantifying the user experience across our self-service product portals.

SQL HiveQL Presto AWS Terraform Git

  • Calculate the user experience on Charter's account portals by measuring API responses, page loads, and other custom metrics (terms), depending on the feature whose performance we're measuring. These terms are scaled between 0 and 1, then weight-averaged and summed, giving product developers and owners a concise, relatable number to compare against past performance and other features directly.

  • Charter captures over 160M events per day, roughly 140M of which are relevant to our pipelines, requiring forethought on appropriate index-field usage and other efficient data operations: temp tables, CTEs, AWS Athena (Presto), or Hive (ad hoc or standalone clusters).

  • Maintain ETL pipelines that run HiveQL on AWS EMR clusters through scheduled AWS coordinators, ensuring our pre-aggregations complete before further aggregations start.

  • Pipeline AWS infrastructure is defined in Terraform modules, letting us worry only about job-specific alterations while best practices around tagging, permissions, fleet management, etc. stay up to date through inheritance of common modules managed by our AWS platform team.

  • Templated our pipelines to allow single-day runs or backfilling with the addition of one parameter, requiring updates to the GitLab CI/CD pipeline, the shell script, and the HiveQL. This enhancement saves roughly 20-40 minutes on average every time a backfill for reprocessing is required.
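The scaled, weighted scoring described above can be sketched as follows. The metric names, scaling bounds, and weights here are illustrative stand-ins, not Charter's actual terms:

```python
# Illustrative sketch of a 0-1 scaled, weighted experience score.
# Metric names, bounds, and weights are hypothetical examples.

def scale(value, worst, best):
    """Min-max scale a raw metric to [0, 1], clamped; 1.0 means best."""
    scaled = (value - worst) / (best - worst)
    return max(0.0, min(1.0, scaled))

def experience_score(terms, weights):
    """Weighted average of scaled terms; weights are normalized to sum to 1."""
    total_weight = sum(weights.values())
    return sum(terms[name] * w / total_weight for name, w in weights.items())

# Example: two raw metrics, scaled then combined into one score.
terms = {
    "api_success_rate": scale(0.98, worst=0.90, best=1.00),
    "page_load_seconds": scale(2.0, worst=8.0, best=0.5),
}
weights = {"api_success_rate": 0.6, "page_load_seconds": 0.4}
score = experience_score(terms, weights)
```

Because every term lands in [0, 1] and the weights are normalized, the final score is also in [0, 1], which is what makes releases comparable over time.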

Career skills upgrade

August 2020 - December 2020

Time off to focus on career development & gain AWS Solutions Architect certification.

Decision Scientist

December 2018 - August 2020
Ibotta, Inc

Our team provided analytical and technical support for our client success teams, delivering automated, multifaceted campaign results via a multitude of tools including SQL, Spark, Python, and Apache Airflow.

Python PySpark Airflow Presto Git test/holdout audiences

  • Authored a PySpark/Airflow ETL pipeline to calculate program results, storing them in S3 for nightly ingestion into our Snowflake datastore feeding Looker dashboards and explores.

  • Technical lead on our A/B testing pipeline, built in object-oriented PySpark and Airflow, measuring the statistical effectiveness of the Ibotta platform.

  • Served as the query-optimization liaison for our team, consulting on efficient SQL structure and operations to reduce run time and compute resources.

  • Automated the power analysis used to determine optimal control sizes, stratifying the groups to ensure similar behavior across treatment and control.

  • Ported repeated data-analysis scripts into our internal Python package for reproducibility and version control.

  • Addressed recurring support requests by adding functionality to our internal Python module for the rest of the team to use.
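The control-sizing and stratification work above can be illustrated with a hedged pure-Python sketch: per-group sample size via the normal approximation for a two-proportion test, and a per-stratum holdout split. The conversion rates, strata, and holdout fraction are hypothetical, not Ibotta's actual parameters:

```python
# Hedged sketch: power analysis for control sizing (two-proportion z-test,
# normal approximation) plus a stratified holdout split. All numbers below
# are illustrative.
import math
import random
from collections import defaultdict
from statistics import NormalDist

def required_n_per_group(p_treat, p_ctrl, alpha=0.05, power=0.8):
    """Per-group sample size to detect p_treat vs p_ctrl (two-sided test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_beta = z.inv_cdf(power)            # quantile for the desired power
    variance = p_treat * (1 - p_treat) + p_ctrl * (1 - p_ctrl)
    effect = p_treat - p_ctrl
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

def stratified_holdout(users, stratum_of, holdout_frac=0.1, seed=42):
    """Pick a control set within each stratum so behavior mixes stay similar."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for user in users:
        by_stratum[stratum_of(user)].append(user)
    control = set()
    for members in by_stratum.values():
        rng.shuffle(members)
        control.update(members[: round(len(members) * holdout_frac)])
    return control

# Example: detect a 12% vs 10% rate; hold out 10% within each of two strata.
n_needed = required_n_per_group(0.12, 0.10)
control = stratified_holdout(range(1000), stratum_of=lambda u: u % 2)
```

Stratifying the split rather than sampling uniformly keeps the treatment and control groups comparable on the behavior used to define the strata.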

Finance Data Analyst

August 2017 - December 2018
Johns Manville

I provided a technical backbone for our data projects with SQL support, advanced Tableau, and Python on our team within the finance department. Our team operated in close contact with the CFO as the central analytics hub, providing advanced data and analytics support to Johns Manville finance.

SQL Tableau Python requirements-gathering

  • Introduced Python to our team with ETL pipelines feeding our MSSQL database.

  • Built a Keras (TensorFlow backend) model in Python to estimate tax liabilities, bringing potential errors to the attention of our tax department and reducing the possibility of incurring consultant fees.

  • Managed analytics projects to completion, from commercial sales analytics to internal tax reconciliation.

Data Analyst

March 2015 - August 2017
BI Worldwide

Part of a team that provided multifaceted analytic support across multiple channels for a large, national telecom client. We were on top of our descriptive-analytics game and worked to apply a more systematic approach to analysis, extracting deeper information from our data.

SQL RStats Tableau PowerPivot requirements-gathering data-communication

  • Mined our hosted blog and ran sentiment analyses in R to shed light on open-ended participant feedback.

  • Developed a logistic regression model to estimate the probability of referrals converting to sales, and experimented with clustering our program population to delve into promotion results and explain why some groups perform better than others.

  • Integrated R into some of our analyses for both internal and external stakeholders, with final reporting rendered in HTML via the knitr and flexdashboard packages, allowing a great deal of interactivity with the data for stakeholders without Tableau licenses.
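The referral model above was built in R; as a language-neutral sketch of the underlying idea, here is a minimal logistic regression fit by gradient descent in pure Python. The single feature and labels are entirely made up for illustration:

```python
# Minimal sketch of the referral-conversion idea: logistic regression fit
# by batch gradient descent. The feature and labels are toy data, not the
# original R model or its inputs.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (w[0] is the intercept) by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi                      # gradient of log loss
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def predict_proba(w, xi):
    """Probability of the positive class (e.g. a referral selling)."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

# Toy data: one feature (say, prior engagement), binary "sold" label.
X = [[0], [1], [2], [3], [4], [5]]
y = [0, 0, 0, 1, 1, 1]
w = fit_logistic(X, y)
```

After fitting, `predict_proba` returns a probability in (0, 1), which is what makes the model's output directly usable for ranking referrals.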

Project Coordinator

September 2014 - March 2015
BI Worldwide

Our team operated the incentive programs for numerous automobile clients through an in-house-developed web app, back-end databases and procedures, reporting, and client support.

  • Learned SQL on the job and ported all of our Excel reports to SQL, saving our Monday mornings.

  • Supported Business Analysts by fielding participant inquiries in a timely manner.

  • Developed and executed the day-to-day sales reporting distributed to the client.

  • Completed documentation of database and website procedures.

Data Analyst

June 2013 - September 2014
AmerisourceBergen

  • Designed, developed, implemented, and tested new processes and dashboards that improved data analysis and tracking, reduced cost, and improved process effectiveness and accuracy.

  • Directly oversaw 10+ of these analysis processes, monitoring our performance on compliance contracts valued at millions in quarterly revenue.

  • Consulted with end users on their requirements to drive report design, development, and testing.

  • Communicated with our pharmaceutical buyers on actionable steps towards maximizing our inventory position whilst minimizing penalties associated with erratic purchasing activity.

  • Conducted mathematical analysis of inventory/service level data to ensure maximum reward payment from our contracted suppliers.

Projects

Side projects worked on in my free time

MotoGP Data - Dorna publishes MotoGP results lap by lap in highly formatted PDFs. This project extracts that data into a usable format for analysis.
Site Response Monitoring - A Python script to test bandwidth to popular content providers and log results to Google Sheets.
Neural Network in pure numpy - Udacity project - A vanilla neural network implemented in pure numpy to predict bike-sharing demand.
NLP - Machine Translation - Udacity project - A sequence-to-sequence model translating English text to French.
Sudoku Solver - Udacity project - An intelligent agent that solves Sudoku puzzles.
Isolation Agent - Udacity project - An agent utilizing alpha-beta pruning to play Isolation.