Mid-Term Project – DATS 2102: Data Visualization for Data Science

A project assignment for the first half of the course (Weeks 1–6).


🎯 Objectives

By mid-semester, you will:

  • Apply foundational data visualization techniques (Weeks 1–6).
  • Demonstrate mastery of:
      • environment setup & reproducible notebooks,
      • tidy data principles & visual encodings,
      • distributions & variation,
      • wrangling with pandas,
      • perception-based design principles,
      • fair and effective comparisons.
  • Produce a mini data story using 2–3 datasets.

📖 Project Description

Select a real-world dataset (from provided sources or external datasets of interest). Using the tools and concepts learned in the first six weeks, create a narrative notebook that:

  1. Introduces the dataset and research question(s).
  2. Cleans, reshapes, and prepares the data for visualization, demonstrating core pandas wrangling: selection/filtering, sorting, grouping + aggregation, joins/merges, and tidy reshaping.
  3. Produces at least 6–8 visualizations, including:
      • At least one distribution plot (histogram/KDE/boxplot/ECDF).
      • At least one comparison plot (dot plot, slope chart, or small multiples).
      • At least one of your own visualizations revised and improved by reflecting on perception principles, showing how thoughtful design choices enhance clarity and fairness.
      • At least one visualization with clear text/labels/annotations.
  4. Applies best practices for choice of color, scales, and labeling.
  5. Provides a written narrative explaining insights, choices, and design considerations.
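
As a rough illustration of the wrangling steps named in the requirements (tidy reshaping, selection/filtering, joins/merges, grouping + aggregation, and sorting), here is a minimal pandas sketch. The tables, column names, and the passing threshold are all hypothetical and invented for this example; your own dataset will call for different choices.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
scores = pd.DataFrame({
    "student": ["A", "B", "C", "D"],
    "section": ["s1", "s1", "s2", "s2"],
    "quiz1": [8, 6, 9, 7],
    "quiz2": [7, 9, 10, 5],
})
sections = pd.DataFrame({
    "section": ["s1", "s2"],
    "instructor": ["Lee", "Kim"],
})

# Tidy reshaping: wide quiz columns become one row per student per quiz.
tidy = scores.melt(
    id_vars=["student", "section"],
    var_name="quiz",
    value_name="score",
)

# Selection/filtering: keep rows at or above an assumed threshold of 7.
passing = tidy[tidy["score"] >= 7]

# Join/merge: attach instructor names from the lookup table.
merged = passing.merge(sections, on="section", how="left")

# Grouping + aggregation, then sorting by the aggregated value.
summary = (
    merged.groupby("instructor", as_index=False)
    .agg(mean_score=("score", "mean"), n=("score", "count"))
    .sort_values("mean_score", ascending=False)
    .reset_index(drop=True)
)
print(summary)
```

Keeping each step in its own named variable makes it easier to explain the pipeline in the surrounding markdown cells and to inspect intermediate results.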

📦 Deliverables

  • Jupyter Notebook with all code, markdown explanations, and charts.
  • Rendered HTML file (via Quarto).
  • A short reflective essay (300–500 words) addressing:
      • What challenges did you face in cleaning/visualizing the data?
      • How did perception/design principles guide your choices?
      • Which visualization best communicates your main insight, and why?

📊 Suggested Datasets


πŸ—“οΈ Timeline

  • Final Submission (Deadline: October 26): Completed notebook, HTML export, and reflection.

🧾 Grading Rubric (20 pts total)

  • Data Wrangling & Preparation (4 pts): Appropriate cleaning, filtering, and reshaping.
  • Variety of Visualizations (5 pts): Includes required chart types; demonstrates range.
  • Application of Principles (4 pts): Perception, scales, baselines, labeling.
  • Narrative & Reflection (4 pts): Clear storyline; thoughtful discussion of design choices.
  • Technical Quality (3 pts): The notebook runs cleanly, is reproducible, and is well-organized.
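
One possible way to support the reproducibility criterion is to fix random seeds and record package versions near the top of the notebook. The sketch below is only a suggestion, not a course requirement: the seed value, the helper name `environment_report`, and the package list are assumptions made for illustration.

```python
import platform
import random
from importlib import metadata

SEED = 2102  # hypothetical seed value; any fixed integer works


def environment_report(packages):
    """Return a dict of the Python version and installed package versions."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


# Re-seeding before each draw shows that a fixed seed repeats the same values.
random.seed(SEED)
first_draw = [random.randint(1, 100) for _ in range(3)]
random.seed(SEED)
second_draw = [random.randint(1, 100) for _ in range(3)]

print(environment_report(["pandas", "matplotlib"]))
print(first_draw == second_draw)
```

Restarting the kernel and running all cells from top to bottom before submitting is a simple complementary check that the notebook runs cleanly.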

✅ Submission Checklist

Before submitting, make sure:

-