Regen Data Bootcamp

This is a WIP curriculum for a regen data bootcamp. We know everyone is very busy, so we want to design something that is easy to manage, requires little time commitment, and taps into people’s intrinsic motivation to learn on their own. We take inspiration from https://speedrunethereum.com/: the learners help build the next content for the course! Most learning will be async.

Latest Updates

2024-04-18: Pandas Sync

  • Hosted by Umar: https://regenlearnings.xyz/data-bootcamp
  • The format will be a quick overview of the first three Kaggle courses, then some time to work through them together 👏🏽
  • Even if you haven’t started it’s not too late to jump in!
  • Now is a great time to catch up before we kick into the next phase with the bootcamp session on Thurs 4/24

2024-03-28: Pandas Sync

  • 💪 Most group members have finished up the first two Kaggle mini-courses
  • 🤓 Some also checked out the DataCamp materials (and had good experiences!) - very interactive
  • 🐼 We are now ready to start doing stuff with pandas
  • 🎯 Goal for the next 2 weeks will be for everyone to start doing some things with pandas AND sharing some of their work (no matter how silly or sloppy you might think it is)
  • 🙌 Share your notebooks / code snippets / colabs here and we will try to give some feedback
  • 📆 We will sync up again on a video call in two weeks at the same time!

2024-03-07: Cheetahs Sync

2024-02-16: Intro Call

We held the first Data Bootcamp Intro Call on 2024-02-16. Here is a link to what we covered.

The main outcome is that we have two learning tracks: Pandas (more beginner) and Foxes (more advanced). Umar, David, Humpty, Sardius, Doge, and Carl are helping facilitate.

The Pandas group is larger and focused on helping people learn the basics of Python and SQL and apply them to regen data.

The first Pandas learning challenge (prepared by DistributedDoge) is available here. We are asking you to do two courses on Kaggle by 2024-03-22.

The Foxes group is smaller and focused on getting analysis work done on some specific projects. Each member has their own learning challenge.

Sync up by joining the appropriate channels in the RegenLearnings.xyz Telegram.

Note: most groups meet every 2 weeks to review progress and the rest is fully async.

Learning Tracks

🐼 Panda Level

I want to start at the beginning. I know how to write formulas in Google Sheets and Notion databases, but I am new or extremely rusty at Python. I want to learn the basics of loading, transforming, and visualizing data. I want to learn the basics of reading blockchain data via a block explorer and querying that data with basic SQL queries. By the end of this course, I should be able to fire up a Jupyter Notebook, load a CSV file, run some functions on the data, and make a pretty chart.
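To make the end goal concrete, here is a minimal sketch of the "load a CSV, run some functions" step in pandas. The column names and sample rows are made up for illustration, and the CSV is read from an in-memory string so the snippet runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
CSV_TEXT = """project,round,amount_usd
Alpha,1,1200.50
Beta,1,300.00
Alpha,2,800.00
Gamma,2,150.25
"""


def total_by_project(csv_source):
    """Load a CSV and return total funding per project, largest first."""
    df = pd.read_csv(csv_source)
    totals = df.groupby("project")["amount_usd"].sum()
    return totals.sort_values(ascending=False)


# pd.read_csv accepts a file path or any file-like object.
totals = total_by_project(io.StringIO(CSV_TEXT))
print(totals)
```

In a notebook with matplotlib installed, `totals.plot(kind="bar")` would be one way to turn the result into a first chart.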

Topics will cover:

  • Discussing the meaning of Impact Data Science
  • Setting up an easy coding environment on your computer
  • Python fundamentals (variables, data structures, functions)
  • Loading dataframes from CSV files
  • Exploratory data analysis in Pandas
  • Reading a block explorer/understanding transactions
  • Forming basic SQL queries
  • Making simple (not yet beautiful) charts
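As a taste of the SQL topic above, the sketch below runs a basic aggregate query against an in-memory SQLite database using Python's built-in `sqlite3` module. The `transactions` table, its columns, and the rows are hypothetical; real blockchain data platforms use their own schemas.

```python
import sqlite3

# Hypothetical rows: (tx_hash, from_address, to_address, value_eth)
ROWS = [
    ("0xaa", "0x111", "0x999", 1.5),
    ("0xbb", "0x222", "0x999", 0.5),
    ("0xcc", "0x111", "0x888", 2.0),
    ("0xdd", "0x333", "0x999", 0.25),
]


def top_recipients(rows):
    """Return (address, tx_count, total_eth) per recipient, largest total first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions "
        "(tx_hash TEXT, from_address TEXT, to_address TEXT, value_eth REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT to_address, COUNT(*) AS tx_count, SUM(value_eth) AS total_eth
        FROM transactions
        GROUP BY to_address
        ORDER BY total_eth DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


recipients = top_recipients(ROWS)
for address, tx_count, total_eth in recipients:
    print(address, tx_count, total_eth)
```

The same `SELECT ... GROUP BY ... ORDER BY` pattern carries over to most SQL dialects, though table and column names will differ.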

🦊 Fox Level

I want more data. I know how to work with dataframes, but my issue is extracting and cleaning data from different sources. I want to learn how to work with APIs, write crawlers, and join data that comes from different sources on the internet. By the end of this course, I should be able to write a GraphQL query to fetch data from Ethereum Attestation Service, normalize blobs of metadata, and ping the Etherscan API to get additional info about each attestation.
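"Normalizing blobs of metadata" usually means turning nested JSON into flat rows that fit in a dataframe. The sketch below shows one possible approach using only the standard library. The record shape (`id`, `data.name`, `data.links.site`) is invented for illustration and is not the actual Ethereum Attestation Service schema.

```python
import json


def flatten(blob, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in blob.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def normalize(raw_records):
    """Parse JSON strings and flatten each one, skipping malformed records."""
    rows = []
    for raw in raw_records:
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # in real code, log the bad record instead of dropping it silently
        if isinstance(parsed, dict):
            rows.append(flatten(parsed))
    return rows


# Hypothetical metadata blobs, including one that is not valid JSON.
RAW = [
    '{"id": "0x01", "data": {"name": "Alpha", "links": {"site": "https://example.org"}}}',
    '{"id": "0x02", "data": {"name": "Beta"}}',
    "not valid json",
]

rows = normalize(RAW)
for row in rows:
    print(row)
```

Each flattened row can then be joined with data from other sources on a shared key such as `id`.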

Topics will cover:

  • Working with popular web3 APIs (Dune, Flipside, Covalent)
  • Making requests and getting data back
  • Writing scripts that can index data from multiple sources
  • Logging and error handling (sorry, but shit will break a lot)
  • Getting ChatGPT to write SQL / GraphQL for you
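Since things will break, a small retry helper with logging goes a long way. Below is a minimal sketch, assuming a fetch function that sometimes raises. `flaky_fetch` is a hypothetical stand-in for a real API call, and the delays are kept tiny so the example runs quickly.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("indexer")


def with_retries(fetch, attempts=3, base_delay=0.01):
    """Call fetch(); on failure, log a warning and retry with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:  # in real code, catch narrower exception types
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # out of retries: let the caller see the error
            time.sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical flaky data source: fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary outage")
    return {"status": "ok"}


result = with_retries(flaky_fetch)
print(result, "after", calls["count"], "calls")
```

Re-raising on the final attempt is a deliberate choice here: a script that indexes several sources should fail loudly rather than continue with missing data.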

Future Tracks

🐆 Cheetah Level

I want to share insights. I know how to get and process data, but I want to make beautiful charts and interactive data visualizations. I want to learn the different charts and visualization libraries, and I want control over every little styling detail. By the end of this course, I should be able to make an interactive Sankey diagram with Plotly, a beautiful heatmap with Datashader, a scatter plot with Seaborn, and a lightweight web app with Streamlit.
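Much of the work in a Sankey diagram is reshaping the data: Plotly's `go.Sankey` takes a list of node labels plus links expressed as integer indices into that list. The sketch below does only that reshaping step in plain Python. The flow names and values are made up for illustration.

```python
def sankey_inputs(flows):
    """Turn (source, target, value) tuples into node labels and indexed links."""
    labels = []
    index = {}

    def node_id(name):
        # Assign each distinct node name the next free integer index.
        if name not in index:
            index[name] = len(labels)
            labels.append(name)
        return index[name]

    link = {"source": [], "target": [], "value": []}
    for source, target, value in flows:
        link["source"].append(node_id(source))
        link["target"].append(node_id(target))
        link["value"].append(value)
    return {"label": labels}, link


# Hypothetical funding flows.
FLOWS = [
    ("Donors", "Round 1", 100),
    ("Donors", "Round 2", 50),
    ("Round 1", "Project A", 100),
]

node, link = sankey_inputs(FLOWS)
print(node)
print(link)
```

With Plotly installed, these two dicts could be passed along as `go.Figure(go.Sankey(node=node, link=link))`, where `go` is `plotly.graph_objects`.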

Topics will cover:

  • Making beautiful charts
  • Getting the small stuff right when it comes to styling
  • Working with Python charting libraries
  • Learning enough JavaScript and HTML/CSS to be a minor threat
  • Deploying data apps and charting widgets
  • Storytelling with data