Docker Lab | Beginner | 2.5 hours
Data Cleaning & EDA with Pandas
Clean a messy real-world dataset, handle missing values, detect outliers, and create visualizations to extract business insights.
Part of Data Science (Week 2)
What You'll Build
A cleaned dataset from a messy CSV with documented data quality checks, imputation strategies, and an EDA notebook with 10+ visualizations and business insights.
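To make the deliverable concrete, the following is a minimal sketch of the kind of checks involved: a missing-value report, median imputation, and IQR-based outlier flagging. It is not the lab's actual solution. The toy `orders` table, its column names, and the helper functions `missing_report`, `impute_median`, and `iqr_outliers` are all hypothetical, and median imputation and the 1.5 x IQR rule are just one reasonable choice among several.

```python
import numpy as np
import pandas as pd


def missing_report(df: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column."""
    return pd.DataFrame({
        "n_missing": df.isna().sum(),
        "pct_missing": (df.isna().mean() * 100).round(1),
    })


def impute_median(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return a copy with NaNs in `column` replaced by that column's median."""
    out = df.copy()
    out[column] = out[column].fillna(out[column].median())
    return out


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Boolean mask: True where a value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Hypothetical toy data standing in for the lab's messy CSV.
orders = pd.DataFrame({
    "order_value": [20.0, 22.0, np.nan, 25.0, 21.0, 500.0, 23.0],
    "region": ["N", "S", "S", None, "N", "N", "S"],
})

report = missing_report(orders)                  # data quality check
cleaned = impute_median(orders, "order_value")   # imputation strategy
cleaned["is_outlier"] = iqr_outliers(cleaned["order_value"])  # outlier flag

print(report)
print(cleaned)
```

The median is used here rather than the mean because a single extreme value (500.0 in the toy data) would pull the mean far from typical orders. Flagging outliers in a separate column, instead of dropping the rows, keeps the decision visible and reversible in the notebook.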
Tools Used
- Python
- Pandas
- Matplotlib
- Seaborn
- Jupyter
Skills Practiced
- Data wrangling
- Exploratory data analysis
- Statistical visualization
Prerequisites
- Basic Python
Why This Matters in Real Jobs
By commonly cited estimates, data scientists spend 60-80% of their time on data cleaning. Interviewers often test this skill by handing candidates a messy dataset and expecting them to justify each cleaning decision with statistical reasoning.