Global Datathon Workshop #2: Dataset Preprocessing and Preparation
About This Video
In this workshop, the instructors cover dataset preprocessing and cleaning. They show how to encode categorical variables and how to merge the categorical columns with the functional connectivity matrices on participant ID. They also cover exploratory data analysis and handling NaN values. The workshop equips participants with the Python skills needed to clean and prepare the dataset for machine learning.
This workshop is the second of three Global Challenge workshops presented by WiDS Interns Caterina Ponti and Kylie Cancilla and WiDS Ambassador Liana Mendoza, all students at the University of San Francisco.
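The preprocessing steps named in the description (checking for missing values, encoding categorical variables, merging on participant ID, and handling NaN values) might look roughly like the following pandas sketch. It is not the instructors' code: the toy tables, the column names (`participant_id`, `site`, `handedness`, `conn_0_1`, `conn_0_2`), the "unknown" fill label, and the choice of median imputation are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the datathon tables; every column name here is hypothetical.
categorical = pd.DataFrame({
    "participant_id": ["p1", "p2", "p3"],
    "site": ["A", "B", "A"],
    "handedness": ["right", np.nan, "left"],
})
connectivity = pd.DataFrame({
    "participant_id": ["p1", "p2", "p3"],
    "conn_0_1": [0.12, np.nan, 0.40],
    "conn_0_2": [0.30, 0.25, 0.10],
})

# Exploratory step: count missing values per column before changing anything.
missing_counts = categorical.isna().sum()

# Give missing categories an explicit label so they get their own dummy column.
cat_cols = ["site", "handedness"]
categorical[cat_cols] = categorical[cat_cols].fillna("unknown")

# One-hot encode the categorical columns (one possible encoding, assumed here).
encoded = pd.get_dummies(categorical, columns=cat_cols, dtype=int)

# Merge with the connectivity features on participant ID.
# validate="one_to_one" raises an error if an ID is duplicated in either table.
merged = encoded.merge(
    connectivity, on="participant_id", how="inner", validate="one_to_one"
)

# Fill remaining numeric NaN values with each column's median (an assumption;
# dropping rows or using another imputation method may suit the data better).
numeric_cols = connectivity.columns.drop("participant_id")
merged[numeric_cols] = merged[numeric_cols].fillna(merged[numeric_cols].median())

print(missing_counts)
print(merged)
```

An inner join keeps only participants present in both tables; a left join would keep every row of the encoded table and leave NaN values wherever connectivity data is missing.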
In This Video

Caterina Ponti
Student, WiDS Intern
Data Science Student at the University of San Francisco and WiDS Intern.

Kylie Cancilla
Student, WiDS Intern
I am Kylie Cancilla, a Data Science major at the University of San Francisco. I am working with Women in Data Science this year to help provide resources for participants in the 2025 Global Datathon Challenge.

Liana Mendoza
Student, WiDS Ambassador
Liana is a Data Science major with a minor in Neuroscience who is driven by data, advocacy, and inclusive innovation. Her goals center on empowering healthcare through data-driven initiatives. Her previous experience is varied and includes graphic design and project management. She is also a firm advocate of gender sensitivity and women's empowerment, leading and participating in organizations that push for equality.