January 22, 2025

Global Datathon Workshop #2: Dataset Preprocessing and Preparation

About This Video

In this workshop, instructors cover dataset preprocessing and cleaning. They discuss how to encode categorical variables and merge the categorical columns with the functional connectivity matrices based on participant IDs. They also address exploratory data analysis and handling NaN values. The workshop equips participants with the technical Python skills to clean and prepare the dataset for machine learning.
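The steps described above can be sketched in a few lines of pandas. This is a minimal illustration, not the workshop notebook's code: the tables, the column names (`participant_id`, `site`, `handedness`, `conn_0_1`, `conn_0_2`), and the choice of mode and median imputation are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical categorical metadata (column names are assumptions).
categorical = pd.DataFrame({
    "participant_id": ["p01", "p02", "p03"],
    "site": ["A", "B", np.nan],
    "handedness": ["right", "left", "right"],
})

# Hypothetical flattened functional connectivity features, one row per participant.
connectivity = pd.DataFrame({
    "participant_id": ["p01", "p02", "p03"],
    "conn_0_1": [0.12, np.nan, 0.40],
    "conn_0_2": [0.30, 0.25, 0.10],
})

# Quick exploratory check: count missing values per column.
print(categorical.isna().sum())
print(connectivity.isna().sum())

# Fill missing categories with the most frequent value (one possible strategy).
cat_cols = ["site", "handedness"]
for col in cat_cols:
    categorical[col] = categorical[col].fillna(categorical[col].mode()[0])

# One-hot encode the categorical columns.
encoded = pd.get_dummies(categorical, columns=cat_cols, dtype=int)

# Merge on participant ID; validate raises if an ID is duplicated on either side.
merged = encoded.merge(
    connectivity, on="participant_id", how="inner", validate="one_to_one"
)

# Fill remaining numeric NaN values with each column's median (one possible strategy).
numeric_cols = merged.select_dtypes(include="number").columns
merged[numeric_cols] = merged[numeric_cols].fillna(merged[numeric_cols].median())
```

An inner merge keeps only participants present in both tables. Whether to impute or drop rows with missing values depends on the dataset, so the imputation choices here are placeholders.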

This workshop is the second of three Global Challenge workshops presented by WiDS Interns Caterina Ponti and Kylie Cancilla and WiDS Ambassador Liana Mendoza. They are all students at the University of San Francisco.

View the slides for this workshop.

View the notebook for this workshop.


In This Video
Student, WiDS Intern

Data Science Student at the University of San Francisco and WiDS Intern.

Student, WiDS Intern

I am Kylie Cancilla, a Data Science major at the University of San Francisco. I am working with Women in Data Science this year to help provide resources for participants in the 2025 Global Datathon Challenge.

Student, WiDS Ambassador

Liana is a Data Science major with a minor in Neuroscience who is driven by data, advocacy, and inclusive innovation. Her goals revolve around understanding how to empower healthcare through data-driven initiatives. Her previous experiences are varied and include graphic design and project management. She is also a firm advocate of gender sensitivity and women's empowerment, leading and participating in organizations that push for equality.