Data Preprocessing and Transformation in Machine Learning: Real World Data Use Case (WiDS Datathon 2024)

The quality of data directly influences the performance and accuracy of machine learning models. Data preprocessing and transformation shape raw data into a format that machine learning algorithms can use effectively. Using a real-world dataset from the WiDS Datathon 2024 challenge, this workshop covers the fundamental concepts and demonstrates different techniques of data preprocessing and transformation for machine learning tasks.

Participants will get an overview of data preprocessing, including data cleaning, handling missing values, feature scaling, and feature engineering. Through hands-on exercises and practical examples, attendees will learn to use popular Python libraries such as pandas, NumPy, and scikit-learn to preprocess and transform real-world data effectively.
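
To give a feel for the topics listed above, here is a minimal sketch of the four steps (cleaning, handling missing values, feature engineering, feature scaling) on a tiny made-up table. The column names and values are hypothetical and are not taken from the WiDS Datathon 2024 dataset, and the choices shown (median and mode imputation, standardization, one-hot encoding) are common defaults rather than the workshop's actual method.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; column names are illustrative only.
df = pd.DataFrame({
    "age": [34.0, np.nan, 51.0, 46.0],
    "bmi": [22.5, 30.1, np.nan, 27.8],
    "region": ["West", "South", np.nan, "West"],
})

numeric_cols = ["age", "bmi"]

# Data cleaning: drop exact duplicate rows (none in this toy table).
df = df.drop_duplicates().reset_index(drop=True)

# Feature engineering: record where bmi was missing before it is filled in.
df["bmi_missing"] = df["bmi"].isna().astype(int)

# Missing values: median for numeric columns, most frequent value for the category.
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
df["region"] = df["region"].fillna(df["region"].mode()[0])

# Feature scaling: standardize numeric columns to zero mean and unit variance.
scaler = StandardScaler()
df[numeric_cols] = scaler.fit_transform(df[numeric_cols])

# Encode the categorical column as one indicator column per category.
features = pd.get_dummies(df, columns=["region"])
```

In a real project, the imputation values and the scaler would normally be fitted on the training split only and then applied to validation and test data, so that no information leaks from held-out rows.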

Workshop Instructors

Kelly Hoang

Associate Director, Data Scientist, Gilead Sciences