About the workshop

We will look at the basic concepts of data pipelines, as well as practical use cases, using Python and libraries like pandas, matplotlib, and TensorFlow.

About you:

  • Some experience using the command line
  • Intermediate Python knowledge / use
  • Able to apply what we learn and adapt it to your own use cases
  • Interested in data and systems
  • Aspiring or current data engineer
  • Some knowledge about systems and databases (enough to be dangerous)

Our focus for the day

  • Greater understanding of how to build data pipelines using Python and libraries in the Python scientific ecosystem
  • Focus on concepts (rather than complex implementations)
  • Practical knowledge application
  • Create the building blocks needed for your day-to-day work

Keeping on track

You will find 🚦 across the tutorial examples. These mark the practical or hands-on portions of the tutorial, and we will use them to check how folks are doing during the workshop (if you are following along in person).

Additional tutorial (PyCon US)

For another (much longer) tutorial integrating MySQL and Twitter stream data, check out https://github.com/trallard/airflow-tutorial

Also, in the upcoming months I have planned: