Data Projects

Baseball Salaries:

Investigates trends in MLB player pay over the past thirty years, specifically the relationship between the highest-paid players and their teams' appearances at the World Series. The Python/Jupyter program begins by loading two CSV files into Pandas DataFrames, merging and streamlining them, then uses NumPy for trend modeling.
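The merge-then-model flow can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the two inline DataFrames stand in for the CSV files, and every column name (`year`, `team`, `salary`, `world_series`) and value is a made-up assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the two CSV files (normally loaded with pd.read_csv).
salaries = pd.DataFrame({
    "year": [1990, 1990, 1991, 1991, 1992, 1992],
    "team": ["A", "B", "A", "B", "A", "B"],
    "salary": [1.0, 2.0, 1.5, 2.5, 2.0, 3.0],  # millions of dollars, invented
})
teams = pd.DataFrame({
    "year": [1990, 1990, 1991, 1991, 1992, 1992],
    "team": ["A", "B", "A", "B", "A", "B"],
    "world_series": [False, True, False, True, True, False],
})

# Merge on the shared keys so each salary row carries its team's season result.
merged = salaries.merge(teams, on=["year", "team"], how="inner")

# Highest salary per year, then a straight-line trend fitted with NumPy.
top_salary = merged.groupby("year")["salary"].max()
years = top_salary.index.to_numpy(dtype=float)
x = years - years.min()  # shift years to start at 0 for a well-conditioned fit
slope, intercept = np.polyfit(x, top_salary.to_numpy(), deg=1)

# Share of seasons in which the top earner's team reached the World Series.
top_rows = merged.loc[merged.groupby("year")["salary"].idxmax()]
share_in_series = top_rows["world_series"].mean()

print(f"top salary grows by about {slope:.2f}M per year")
print(f"top earner's team reached the World Series in {share_in_series:.0%} of seasons")
```

A degree-1 `np.polyfit` is only one possible choice of trend model; the original notebook may use a different fit.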

Economic Development:

Python and Pandas analysis of which sovereign nations have grown most consistently and solidly (in GDP) over the past three decades. The data required significant cleaning before being presented visually in Matplotlib (left video). In the second part of the project (right video), I grouped the nations by continent and plotted the standard deviation of each. I then formulated a geopolitical hypothesis for the discrepancies.
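One way the continent-level comparison could look is sketched below. It assumes the standard deviation is taken over year-on-year GDP growth rates, which the description above does not spell out, and it uses invented countries, continents, and figures.

```python
import pandas as pd

# Hypothetical long-format GDP table; names and numbers are illustrative only.
gdp = pd.DataFrame({
    "country": ["X", "X", "X", "Y", "Y", "Y", "Z", "Z", "Z"],
    "continent": ["North"] * 6 + ["South"] * 3,
    "year": [2000, 2001, 2002] * 3,
    "gdp": [100.0, 110.0, 121.0, 100.0, 100.0, 100.0, 50.0, 100.0, 100.0],
})

# Year-on-year growth within each country (the first year has no prior value).
gdp = gdp.sort_values(["country", "year"])
gdp["growth"] = gdp.groupby("country")["gdp"].pct_change()

# Sample standard deviation of growth per continent: lower means steadier growth.
spread = gdp.dropna(subset=["growth"]).groupby("continent")["growth"].std()
print(spread)
```

Calling `spread.plot(kind="bar")` would then produce a simple Matplotlib bar chart of the result.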

Global Climate Change:

Aggregates temperature data from a Pandas DataFrame for various world cities, then color-codes the average change and plots this information using Basemap (left video). In the second part of the project (right video), I built a function that dynamically extrapolates the future temperature of each city. (In case you needed a starker warning about climate change, the entire planet will be boiling within twenty thousand years, even if you assume only the moderate linear trend applied here.)
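A linear extrapolation of this kind might look like the sketch below. The function name `extrapolate_temperature`, its parameters, and the sample readings are all hypothetical; the project's own function may be structured differently.

```python
import numpy as np

def extrapolate_temperature(years, temps, target_year):
    """Fit a straight line to (years, temps) and evaluate it at target_year."""
    years = np.asarray(years, dtype=float)
    temps = np.asarray(temps, dtype=float)
    center = years.mean()  # center the years to keep the fit well conditioned
    slope, intercept = np.polyfit(years - center, temps, deg=1)
    return slope * (target_year - center) + intercept

# Invented readings for one city, in degrees Celsius.
years = [2000, 2005, 2010, 2015, 2020]
temps = [15.0, 15.1, 15.2, 15.3, 15.4]

projected = extrapolate_temperature(years, temps, 2050)
print(f"projected 2050 average: {projected:.1f} C")
```

Applying the same function to each city's rows (for example inside a loop over `DataFrame.groupby("city")`) would give one projection per city.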

Domestic Dirty Energy Consumption:

Draws on US Census data for 50 states to construct trends in the type of power used. Calculates the trend of clean (e.g. wind) versus dirty (e.g. coal) energy over the past five years, then charts this information in Matplotlib (left video). The second part of the project (right video) identifies the single strongest trend, including type of energy, for each state.
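Picking the strongest trend per state could be done along these lines. This sketch assumes "strongest" means the largest absolute slope of a straight-line fit, and the column names, years, and consumption figures are invented rather than taken from the Census data.

```python
import numpy as np
import pandas as pd

# Hypothetical five-year table of consumption by state and energy source.
energy = pd.DataFrame({
    "state": ["TX"] * 10 + ["WV"] * 10,
    "source": (["wind"] * 5 + ["coal"] * 5) * 2,
    "year": [2016, 2017, 2018, 2019, 2020] * 4,
    "consumption": [10.0, 14.0, 18.0, 22.0, 26.0,   # TX wind
                    30.0, 29.0, 28.0, 27.0, 26.0,   # TX coal
                    1.0, 1.5, 2.0, 2.5, 3.0,        # WV wind
                    40.0, 37.0, 34.0, 31.0, 28.0],  # WV coal
})

# Fit one linear trend per (state, source) pair and keep its slope.
rows = []
for (state, source), grp in energy.groupby(["state", "source"]):
    x = grp["year"] - grp["year"].mean()
    slope = np.polyfit(x, grp["consumption"], deg=1)[0]
    rows.append({"state": state, "source": source, "slope": slope})
slopes = pd.DataFrame(rows)

# For each state, keep the row whose slope has the largest magnitude.
strongest = slopes.loc[slopes["slope"].abs().groupby(slopes["state"]).idxmax()]
print(strongest)
```

Using the absolute value means a steep decline (such as falling coal use) counts as a strong trend just as much as a steep rise.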