April 10, 2024

Data Preprocessing and Transformation in Machine Learning: WiDS Datathon 2024

About This Video

In machine learning, the quality of the data directly influences the performance and accuracy of models. Data preprocessing and transformation play a pivotal role in shaping raw data into a format suitable for machine learning algorithms. Using a real-world dataset from the WiDS Datathon 2024 challenge, this workshop covers the fundamental concepts and demonstrates different techniques of data preprocessing and transformation for machine learning tasks.

Participants will get an overview of data preprocessing, including data cleaning, handling missing values, feature scaling, and feature engineering. Through hands-on exercises and practical examples, attendees will learn to use popular Python libraries such as pandas, NumPy, and scikit-learn to preprocess and transform real-world data effectively.
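As a rough illustration of the four steps named above (cleaning, missing values, scaling, feature engineering), here is a minimal sketch using pandas, NumPy, and scikit-learn. It is not taken from the workshop notebook: the toy data, the column names (`age`, `bmi`, `region`, `bmi_missing`), and the choice of median and most-frequent imputation are all assumptions made for the example, and the WiDS Datathon 2024 dataset is not used here.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data (NOT the WiDS Datathon 2024 dataset).
df = pd.DataFrame({
    "age": [34, 51, np.nan, 47, 62, 47],
    "bmi": [22.1, np.nan, 30.4, 27.8, 25.0, 27.8],
    "region": ["West", "South", np.nan, "West", "Midwest", "West"],
})

# Data cleaning: drop exact duplicate rows (the last row repeats the fourth).
df = df.drop_duplicates().reset_index(drop=True)

# Feature engineering (assumed example): flag rows where bmi was missing,
# so that information survives the imputation step below.
df["bmi_missing"] = df["bmi"].isna().astype(int)

# Numeric columns: fill gaps with the median, then scale to zero mean
# and unit variance.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer(
    [
        ("num", numeric, ["age", "bmi"]),
        ("cat", categorical, ["region"]),
        ("flag", "passthrough", ["bmi_missing"]),
    ],
    sparse_threshold=0.0,  # always return a dense NumPy array
)

X = preprocess.fit_transform(df)
print(X.shape)
```

Wrapping the steps in a `ColumnTransformer` means the imputation and scaling statistics are learned only from the data passed to `fit_transform`, so the same fitted object can later be applied to a held-out set with `transform`.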

Google Colab Notebook for this workshop

Slides for this workshop

In This Video
Associate Director, Data Scientist, Gilead Sciences

Kelly completed her PhD in Information Sciences, with a focus on Natural Language Processing, Machine Learning, and their applications in the biomedical domain, at the University of Illinois at Urbana-Champaign. She currently works as a data scientist in the Oncology group of the Advanced Analytics department at Gilead Sciences, where they develop AI/ML products to support clinical trial development and operations.