Data Analysis with Pandas for Scientists & Engineers
Build an end-to-end data analysis workbook that can be used as a basic template for other data analysis projects. Apply your understanding of major data analysis concepts through exercises with popular Python libraries like NumPy, Pandas, and XArray.
Course Overview
Data Analysis with Pandas for Scientists and Engineers is organized in two parts. The first part presents a typical data analysis workflow and the foundational ideas behind each step. Here, students are introduced to the major concepts and given short exercises to practice them with NumPy, Pandas, and XArray.
The second part of the course is a practicum in which the workflow is used in conjunction with Pandas to work through a small data analysis project from beginning to end. Here, each session will remind students of the main workflow, teach how Pandas approaches that specific step, and then allow the students to put what they have learned into practice. In the end, the students will have built an end-to-end data analysis workbook that can be used as a basic template for other data analysis projects.
Prerequisite
This course requires basic proficiency with Python and the scientific Python stack. Some practical experience with Jupyter Notebooks, NumPy (ndarrays), Pandas (DataFrames), and scientific visualization in Python using Matplotlib is essential to working with the code and concepts presented in this course.
If you have taken Enthought’s Python Foundations for Scientists and Engineers, you have the requisite background knowledge for this course.
Lectures
Data Analysis Workflow – Introduction, Examples of Use Cases
Data Sources – Finding & Storing Data, Web Scraping, Databases, Formats
Preparing Data – Tidy Data, Missingness, Filling Gaps
Exploring Data – Summary Statistics, Visualization
Analysis & Modeling – Analysis & Modeling Use Cases
Pandas Practicum I – Practicum with Various Data Sources
Pandas Practicum II – Reshape, Pivot, Join, Merge
Pandas Practicum III – Dates & Times, Text Data, Categorical Data
Pandas Practicum IV – Multi-Level Indexing, Computations, Chaining
Pandas Practicum V – Automation, Building Analysis Notebooks
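As a rough illustration of how the workflow steps listed above might fit together in Pandas, here is a minimal sketch. It is not taken from the course materials: the sample data, the column names, and the helper functions load, prepare, explore, and reshape are all hypothetical, chosen only to show one step per stage and how the steps can be chained.

```python
import io

import pandas as pd

# Hypothetical raw data with one missing reading (not from the course).
RAW_CSV = """time,sensor,temperature
2023-01-01 00:00,a,20.0
2023-01-01 01:00,a,
2023-01-01 02:00,a,22.0
2023-01-01 00:00,b,18.0
2023-01-01 01:00,b,19.0
2023-01-01 02:00,b,20.0
"""


def load(source):
    # Data sources: read a CSV and parse the timestamp column.
    return pd.read_csv(source, parse_dates=["time"])


def prepare(df):
    # Preparing data: fill gaps by linear interpolation within each sensor.
    out = df.sort_values(["sensor", "time"]).copy()
    out["temperature"] = out.groupby("sensor")["temperature"].transform(
        lambda s: s.interpolate()
    )
    return out


def explore(df):
    # Exploring data: summary statistics per sensor.
    return df.groupby("sensor")["temperature"].agg(["mean", "min", "max"])


def reshape(df):
    # Reshaping: one row per timestamp, one column per sensor.
    return df.pivot(index="time", columns="sensor", values="temperature")


# Chaining: each step takes a DataFrame and returns a new one.
tidy = load(io.StringIO(RAW_CSV)).pipe(prepare)
summary = explore(tidy)
wide = reshape(tidy)

print(summary)
print(wide)
```

Keeping each stage as a small function that accepts and returns a DataFrame is one way to make a notebook reusable as a template: a different project can swap in its own load or prepare step without touching the rest.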
Packages
numpy, pandas, xarray
Updated December 2023