
WiDS Datathon++ 2024


This challenge, available only to colleges and universities, can be offered as a project in advanced undergraduate or graduate courses. It can also serve as a capstone course, the basis of an honors thesis, independent research, directed research, or a standalone 3-unit course.

Challenge description:

Students will predict the time to treatment for women diagnosed with metastatic triple negative breast cancer (metastatic TNBC). Metastatic TNBC is considered the most aggressive form of TNBC, and women diagnosed with it are among those in the most urgent need of timely treatment. Students will develop a model that predicts, from a patient's characteristics, how many days pass before she receives the first treatment for her cancer diagnosis. Differences in the wait time for treatment are a good proxy for disparities in healthcare access.
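As a starting point, it can help to know how well a trivial model does before building anything more elaborate. The sketch below predicts the same number of days for every patient (the median of the training wait times) and scores that guess. The wait times are made-up numbers, not challenge data, and the use of root mean squared error (RMSE) is an assumption, since the description above does not name an error metric.

```python
from math import sqrt
from statistics import median


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical wait times in days (illustrative values only).
train_days = [12, 30, 45, 7, 60, 21]
valid_days = [14, 35, 28]

# Baseline: predict the training median for every validation patient.
baseline = median(train_days)
baseline_predictions = [baseline] * len(valid_days)
baseline_rmse = rmse(valid_days, baseline_predictions)

print("baseline prediction (days):", baseline)
print("baseline RMSE (days):", round(baseline_rmse, 2))
```

A model built on patient characteristics is only worth keeping if its validation error is clearly lower than this baseline.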

Students are also encouraged to go beyond the predictive challenge and engage in discussions in order to generate additional insights and to understand the real-world implications of the results:

  • Which social, economic, and geographic factors show the strongest correlation with patients' time to treatment (e.g., race/ethnicity, location, income, or something else)?
  • Can you develop some graphs/dashboards that show the analytics of those factors? What stories can you tell with these data visualizations?
  • Are there interactions between the variables and/or interesting clusters? What might be the implications of these patterns on access to healthcare?
  • What are different approaches to handling missing values (NAs) in the data? What are the pros and cons of these approaches?
  • What additional information can you gain by applying natural language processing (NLP) to the text columns of the data? What can you observe about laterality (right or left breast) or the affected quadrant of the breast?
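For the question about missing values, the sketch below contrasts two common approaches: dropping incomplete rows and filling gaps with the median while recording which values were filled. The records and the `age` and `income` field names are hypothetical and are not taken from the challenge data.

```python
from statistics import median

# Hypothetical patient records; None marks a missing value.
patients = [
    {"age": 54, "income": 52000.0},
    {"age": 61, "income": None},
    {"age": 47, "income": 38000.0},
    {"age": 70, "income": None},
    {"age": 58, "income": 61000.0},
]


def drop_missing(rows, column):
    """Approach 1: keep only rows where the column is present."""
    return [row for row in rows if row[column] is not None]


def impute_median(rows, column):
    """Approach 2: fill gaps with the observed median and flag filled rows."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill_value = median(observed)
    result = []
    for row in rows:
        was_missing = row[column] is None
        new_row = dict(row)  # copy so the input rows stay unchanged
        new_row[column] = fill_value if was_missing else row[column]
        new_row[column + "_was_missing"] = was_missing
        result.append(new_row)
    return result


complete_cases = drop_missing(patients, "income")
imputed = impute_median(patients, "income")

print("rows kept after dropping:", len(complete_cases))
print("rows kept after imputing:", len(imputed))
```

Dropping rows is simple but discards patients, which can bias results if values are missing more often for some groups. Median imputation keeps every patient but shrinks the column's variance; the added indicator field lets a model learn whether missingness itself carries signal.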

The challenge will be open September 18, 2023 – June 30, 2024.

Learn about this year’s WiDS Datathon challenges.