Keynote: Can Data Science and AI Deliver on its Promise for Improving Public Health? | Manisha Desai
Manisha Desai, Kim and Ping Li Professor of Medicine and Biomedical Data Science, Stanford University presents a Keynote Talk at the 2024 WiDS Worldwide, Stanford conference.
WiDS 2024 Fireside Chat with Ellen Pao, CEO, Project Include moderated by Tina Tang, Head of Marketing, WiDS Worldwide.
In this podcast episode, Margot interviews Telle Whitney, a highly accomplished woman in the tech industry. Telle is best known for her 15 years as CEO of the Anita Borg Institute for Women and Technology, also known as AnitaB.
Leda Braga is founder and CEO of Systematica Investments, a hedge fund known for using data science-driven models to support its investment strategies. She explains how systematic investment management is data science applied to investment and how she believes it is the future of the financial industry.
Tahu Kukutai, a professor at the University of Waikato in New Zealand and a Māori woman, is leading data sovereignty initiatives that advocate for indigenous data ownership, guardianship, and governance.
Allison Koenecke, currently a postdoc at Microsoft Research and soon to be assistant professor at Cornell, discusses her decision to pursue a career in academia focused on algorithmic fairness and causal inference in public health.
Karina Edmonds, Global Head of Academies and University Alliances at SAP, has spent her career building bridges between business and academia. She is passionate about promoting fairness in data science by bringing more young people, women, and underrepresented groups into the field.
Fatima Abu Salem, a professor at the American University of Beirut, talks about how the conflict in the region has shaped her life journey and motivates her to apply data science for the public good to address challenges in Lebanon.
Karen Hao, Senior Editor at MIT Technology Review, discusses her experiences covering the latest research and social impacts of AI, and ethics washing in the tech industry.
Kristian Lum, a statistician and former research assistant professor at the University of Pennsylvania, describes how following her interests has led her on an ever-changing career path across business, public service, and academia.
Ya Xu, head of LinkedIn’s global data science team, explains how the company takes responsibility for data privacy and creating economic opportunities for all of its members.
Susan Athey, Economics of Technology Professor at the Stanford Graduate School of Business, brings an economist’s expertise and perspective to machine learning and data science.
Women face many roadblocks to careers in data science and other STEM disciplines. One Stanford professor is out to change perceptions and realities for women in these fields.
Timnit Gebru spoke to us when she was a research scientist and technical co-lead of Google’s Ethical Artificial Intelligence Team. In this episode, Timnit explains the importance of advocating for diversity, inclusion and ethics in AI.
Christiane Kamdem and Lama Moussawi discuss the importance of role models, mentors and giving back to empower women and girls to pursue data science careers.
Natalie Evans Harris, a leader on ethical and responsible use of data, explains how building trust through a shared vision and data “code of ethics” is essential to promote both innovation and privacy.
Srinija Srinivasan, Co-Founder, Loove opens WiDS Stanford 2023.
Biography:
Born in India and raised in Lawrence, Kansas, Srinija Srinivasan followed her siblings to college in California. Having studied artificial intelligence at Stanford and worked at a large-scale AI project after graduating, Srinija joined Yahoo! in 1995 as their fifth employee and self-titled Ontological Yahoo. She served as Vice President, Editor-in-Chief at Yahoo! for over 15 years, where her work centered on the human experience, from the categorization system of the Yahoo! Directory to editorial and policy issues globally. During that time she also chaired the board of non-profit SFJAZZ, and these experiences together inspired her to co-found Loove, a music venture exploring how commerce and technology can be guided by artistic values rather than letting our culture be led by market values. She’s a board member of the On Being Project and a vice chair of Stanford University’s Board of Trustees. She lives in Palo Alto, CA and Brooklyn, NY.
Trina Reynolds-Tyler, Data Director, Invisible Institute presents the Technical Vision Talk “(DIS) Proportionate Impacts of Policing in Chicago”. Through public data records requests the Invisible Institute received an unprecedented amount of data related to misconduct records of the Chicago Police Department. Beneath the Surface analyzed these records to uncover patterns of gender-based violence at the hands of police. A volunteer team of over 200 community members generated training data for Judy, our nickname for the algorithm which then parsed through narratives of complaints in more than 27,000 misconduct records between 2011 and 2015. We were then able to run a targeted search and identify a range of testimony representing shared experiences, connecting people across time and space. But what proportion is significant enough to constitute evidence of a deeper issue? How does the universe of information we use to define the numerator or denominator impact our willingness to deepen our questions? Where do we draw the line between significance and meaningfulness when using data science to understand policing in America?
Biography:
Trina Reynolds-Tyler is the Data Director at the Invisible Institute, an abolitionist, and a native of south side Chicago. She leads Beneath the Surface, a project employing machine learning to identify gender based violence at the hands of Chicago police. Trina works to document how communities unable to depend on the police are creating safety and accountability outside of the carceral state. As a data scientist, she centers the practice of narrative justice in her inquiries.
Trina organizes with Not Me We, and is serving on a University of Chicago council attempting to measure the institution’s impact on the south side population. She developed the skills to use data science for real world problems as a Pozen Center for Human Rights intern with the Human Rights Data Analysis Group (HRDAG), and was a Pearson Institute Fellow. Trina holds a masters degree in public policy from the University of Chicago.
—
Panel: Putting our values into practice in data science work
Moderator:
Megan Price, Executive Director, Human Rights Data Analysis Group (HRDAG). As the Executive Director of the Human Rights Data Analysis Group, Megan drives the organization’s overarching strategy, leads scientific projects, and presents HRDAG’s work to diverse audiences. Her scientific work includes analyzing documents from the National Police Archive in Guatemala and contributing analyses submitted as evidence in multiple court cases in Guatemala. Her work in Syria includes collaborating with the Office of the United Nations High Commissioner of Human Rights (OHCHR) and Amnesty International on several analyses of conflict-related deaths in that country. In 2022 she was named a Fellow in the American Statistical Association.
Panelists:
Jennifer Pan is a Professor of Communication and Senior Fellow at the Freeman Spogli Institute at Stanford University. Her research resides at the intersection of political communication and authoritarian politics. Using large-scale datasets on political activity in China and other authoritarian countries, her work answers questions about how autocrats perpetuate their rule; how political censorship, propaganda, and information manipulation work in the digital age; and how preferences and behaviors are shaped as a result. Her papers have appeared in peer-reviewed publications such as Science, the American Political Science Review, the American Journal of Political Science, and Journal of Politics. She graduated from Princeton University, summa cum laude, and received her Ph.D. from Harvard University’s Department of Government.
Trina Reynolds-Tyler, Data Director, Invisible Institute, an abolitionist, and a native of south side Chicago. She leads Beneath the Surface, a project employing machine learning to identify gender based violence at the hands of Chicago police. Trina works to document how communities unable to depend on the police are creating safety and accountability outside of the carceral state. As a data scientist, she centers the practice of narrative justice in her inquiries.
Trina organizes with Not Me We, and is serving on a University of Chicago council attempting to measure the institution’s impact on the south side population. She developed the skills to use data science for real world problems as a Pozen Center for Human Rights intern with the Human Rights Data Analysis Group (HRDAG), and was a Pearson Institute Fellow. Trina holds a masters degree in public policy from the University of Chicago.
Wendy Ku, Computer Vision Tech Lead, Senior Data Scientist, Getty Images presents the Technical Vision Talk “ML through a wide-angle lens: Real World Successes and Lessons Learned in Deploying ML Models”. Image search has been a well-established problem area across industries, with a wide range of applications including e-commerce, social media and search engines. As we collectively create and consume more visual content, image search capabilities are becoming increasingly more important. In recent years, multiple large-scale image-text models have been released, reinventing the performance of image-text understanding tasks. However, applying these generalized models out-of-the-box often results in less than desired performance. In practice, deploying and maintaining an image search system presents a different set of challenges.
Wondering what else is involved in a machine learning solution besides training and deployment? Or how real world model evaluations differ from Kaggle scoreboards? This talk will cover the less discussed journey of bringing language and image-text models to production.
Biography:
Wendy is a Senior Data Scientist at Getty Images, where she develops multilingual and visual-language representation models to improve users’ search experience. She leads Getty Images’ efforts on diagnosing bias and improving fairness in machine learning systems. Prior to joining Getty Images, Wendy was involved in product and operations optimization projects in cybersecurity, consumer finance and restaurant companies. When she’s not working, Wendy enjoys working on her art and running.
—
Irene Dankwa-Mullan, Chief Health Equity Officer at Merative & Affiliate Professor at GWU Milken Institute School of Public Health presents Technical Vision Talk “Harnessing AI and Data Science for Health Equity within Communities”. A robust data science agenda can help support communities in their interventions to achieve health equity, and measure progress toward ensuring quality and optimal health for all. However, there are challenges for data science in promoting community-engaged interventions addressing health disparities. This talk will provide a background on the role of data science in promoting a vision for a productive health AI ecosystem of research, technology development and implementation to improve community health and advance health equity.
Biography:
Irene Dankwa-Mullan is an affiliate professor in the Department of Health Policy and Management, Milken Institute School of Public Health at The George Washington University. She is a nationally recognized industry physician, scientist, thought leader, and author with over 20 years of diverse leadership experience in primary care, healthcare, businesses, and the community. She also serves in a strategic advisory role for various health technology start-ups. Irene most recently served as Chief Health Equity Officer at IBM Watson Health and provided leadership for the data and evidence strategy for implementation of technology and clinical decision-support solutions. She was previously Deputy Director for extramural scientific programs at the National Institute. Irene has published widely on health equity, community and public health and building AI technologies for social good.
—
Panel: Data democratization: a powerful means for creating sustainable and equitable communities
Moderator:
Michela Taufer is an ACM Distinguished Scientist and holds the Dongarra Professorship in High-Performance Computing in the Department of Electrical Engineering and Computer Science at the University of Tennessee Knoxville (UTK). She earned her undergraduate degree (Laurea) in Computer Engineering from the University of Padova (Italy) and her doctoral degree (Ph.D.) in Computer Science from the Swiss Federal Institute of Technology or ETH (Switzerland). From 2003 to 2004, she was a La Jolla Interfaces in Science Training Program (LJIS) Postdoctoral Fellow at the University of California San Diego (UCSD) and The Scripps Research Institute (TSRI), where she worked on interdisciplinary projects in computer systems and computational chemistry.
Michela is well-known for her work in establishing trustworthy scientific discoveries on heterogeneous cyberinfrastructures. Throughout her career, she has put the principle of trustworthiness into practice. She has promoted scientific computing for the general population through volunteer computing, defined accurate scientific applications on accelerators and GPUs, and developed in situ analysis methods for scientific workflows on converging HPC and Cloud platforms. She has been serving as the principal investigator of several NSF collaborative projects. She has significant experience in mentoring a diverse population of students on interdisciplinary research and establishing long-lasting workforce development.
Panelists:
Priya Donti, Co-Founder and Executive Director, Climate Change AI (CCAI). Climate Change AI is a global non-profit initiative to catalyze impactful work at the intersection of climate change and machine learning, which she is currently running through the Cornell Tech Runway Startup Postdoc Program. She will also join MIT EECS as an Assistant Professor in Fall 2023. Her research focuses on developing physics-informed machine learning methods for forecasting, optimization, and control in high-renewables power grids. Priya received her Ph.D. in Computer Science and Public Policy from Carnegie Mellon University, and is a recipient of the MIT Technology Review’s 2021 “35 Innovators Under 35” award, the ACM SIGEnergy Doctoral Dissertation Award, the Siebel Scholarship, the U.S. Department of Energy Computational Science Graduate Fellowship, and best paper awards at ICML (honorable mention), ACM e-Energy (runner-up), PECI, the Duke Energy Data Analytics Symposium, and the NeurIPS workshop on AI for Social Good.
Julia Stewart Lowndes, Director, Openscapes is a marine ecologist working at the intersection of actionable environmental science, data science, and open science. Julia’s main focus is mentoring teams to develop technical and leadership mindsets and skills for data-intensive research, grounded in climate solutions, inclusion, and kindness. She founded Openscapes in 2018 as a Mozilla Fellow and Senior Fellow at the National Center for Ecological Analysis and Synthesis (NCEAS) at the University of California Santa Barbara (UCSB), having earned her PhD from Stanford University in 2012 studying drivers and impacts of Humboldt squid in a changing climate.
Nikki Tulley, Doctoral Student, University of Arizona; Indigenous Researcher, NASA Ames Research Center. Nikki is from the Navajo Nation (NN), an Indigenous Nation located in the United States. The work and research Nikki does is influenced by her upbringing. Born and raised on the NN Reservation, she has seen firsthand the impacts of water access and water quality challenges rural communities face. The NN has wicked water problems related to anthropogenic activities and climate change. Now, as an Indigenous Scientist, she recognizes the opportunity to braid traditional ecological knowledge and western science together to address water challenges. Taking a step beyond braiding the two knowledge systems together, she has begun to use Earth Observation satellite imagery to tell a story of the changes being monitored from space and those observed from the landscapes. Nikki’s passion is empowering communities through data access and capacity building. She believes that community involvement in research can significantly aid in seeking solutions for resilient and sustainable communities.
Megan Price, Executive Director, Human Rights Data Analysis Group (HRDAG) presents the Technical Vision Talk “What is the Cost of Being Wrong?” Machine learning models are a versatile tool in a statistician’s analytical toolbox. As George Box is credited with saying, “All models are wrong, some are useful.” How can we identify the contexts when machine learning models are most useful? How can we identify the contexts where they pose the most risk for harm? These questions will be answered using examples from work by the Human Rights Data Analysis Group.
Biography:
As the Executive Director of the Human Rights Data Analysis Group, Megan drives the organization’s overarching strategy, leads scientific projects, and presents HRDAG’s work to diverse audiences. Her scientific work includes analyzing documents from the National Police Archive in Guatemala and contributing analyses submitted as evidence in multiple court cases in Guatemala. Her work in Syria includes collaborating with the Office of the United Nations High Commissioner of Human Rights (OHCHR) and Amnesty International on several analyses of conflict-related deaths in that country. In 2022 she was named a Fellow in the American Statistical Association.
—
Julia Stewart Lowndes, Director, Openscapes presents Technical Vision Talk “Openscapes: Supporting Kinder Science for Future Us”. At Openscapes, we believe open science can accelerate interoperable, data-driven solutions and increase diversity, equity, inclusion, and belonging in research and beyond. Our main activity is mentoring environmental and Earth science teams in open science, and connecting and elevating these researchers both through tech like R, Python, Quarto, and JupyterHubs and communities like RLadies, Black Women in Ecology Evolution, and Marine Science, Ladies of Landsat, and NASA. We will share stories and approaches about open science as a daily practice – better science for future us – and welcome you to join the movement.
Biography:
Julia Stewart Lowndes, PhD, is a marine ecologist working at the intersection of actionable environmental science, data science, and open science. Julia’s main focus is mentoring teams to develop technical and leadership mindsets and skills for data-intensive research, grounded in climate solutions, inclusion, and kindness. She founded Openscapes in 2018 as a Mozilla Fellow and Senior Fellow at the National Center for Ecological Analysis and Synthesis (NCEAS) at the University of California Santa Barbara (UCSB), having earned her PhD from Stanford University in 2012 studying drivers and impacts of Humboldt squid in a changing climate.
—
Gayatree Ganu, Vice President, Data Science, Facebook presents Keynote Address “Put the horse before the cart: Why ‘users first’ is important for a good monetization strategy”.
Meta has over 3B users on our platform engaging with our different products and services. Meta also makes over $100B annually through advertising. There is a strong connection between user engagement on our platform and how we build a sustainable business. Our mission statement for ads at Meta is “Make meaningful connections between people and businesses”. Connecting users to monetization or ads is an important part of Meta’s long term success. In this talk I will describe the frameworks to connect user engagement and revenue potential, allowing us to focus our products and services. We will also discuss how high quality and relevant ads can actually bring more engagement to our platform, making it a win-win situation. We will cover a lot of fun and challenging data science topics from weighted metrics, producer-consumer experimental setups, counterfactuals, incrementality, all at an extraordinary scale of 3B users and $100B!
Biography:
Gayatree Ganu leads the Engagement Ecosystem and Monetization Data Science teams at Facebook. The Engagement Ecosystem team’s mission is to inform Facebook’s strategy through better understanding and forecasting the health of the app. The Monetization team’s mission is to give everyone a voice and to champion economic prosperity. Gayatree leads a Data Science team with a diverse portfolio spanning modeling and machine learning, product optimizations of user experience, and strategic innovations. Gayatree has a PhD in Computer Science in Search and Recommendations from Rutgers University. She joined Facebook (now Meta) in 2013 and has worked on several problems and product areas through the last 10 years.
Gayatree believes deeply in fairness and equality in opportunity and is passionate about bringing more representation and providing sustained support to women and under-represented minorities in Tech. She leads recruiting for all Data Science roles at Meta, and is helping build an organization that values diverse perspectives as well as strong technical and analytical skills.
—
Kathryn Hymes, Lead of Product and Innovation, Médecins Sans Frontières-USA presents the Technical Vision Talk “Productizing Data for Humanitarian Aid Applications”. In humanitarian efforts focused on delivering medical interventions in low-resource settings, there are many opportunities for data science to improve decision-making and produce valuable insights, both on the ground and in long-term operations. This talk will focus on product approaches to data that support insights for long-term engagement with some of the work of Médecins Sans Frontières, a global aid organization focused on public health.
Biography:
Kathryn Hymes is a technologist, computational linguist, and game designer. She currently serves as the lead of product and innovation at Médecins Sans Frontières-USA. She leads a humanitarian tech team building new products rooted in modern engineering practice to aid in MSF’s global work. Previously she was the head of international product expansion at Slack and an advisor at Airtable. She is a fellow at the Berkman Klein Center for Internet and Society with a focus on how playful design can contribute to a better digital life. Kathryn is a co-founder of Thorny Games (https://thornygames.com/), an award-winning design studio that regularly collaborates with universities, nonprofits and museums to apply playful design to hard problems. Her writing has appeared in The Atlantic, Wired, and The New York Times. Kathryn holds an MS in Computational and Mathematical Engineering from Stanford, an MA in Linguistics from Stanford, and a BS in Math from UCLA.
—
Hear stories of women in data science from around the Globe!
Becki Cook: Brisbane, Australia
Staying Connected Through Community Outreach
Philomena Mbura: Nairobi, Kenya
Finding Work/Life Balance
Amanda Milberg: Colorado, USA
Building a Supportive Network
Alexandre Lapene, Tech Advisor, Data Science, Total Energies & Myriam Fayad, Product & Value Manager, Total Energies talk with Lisa Martin at WiDS 2023 at Stanford University.
Shir Meir Lador talks with Lisa Martin & Hannah Freitag at WiDS 2023 at Stanford University.
Kelly Hoang, Data Scientist, Gilead talks with Lisa Martin & Tracy Zhang at WiDS 2023 at Stanford University.
Irene Dankwa-Mullan, Chief Medical Officer, Marti Health talks with Lisa Martin & Tracy Zhang at WiDS 2023 at Stanford University.
What key principles of design and data viz do you need to know to create effective and clear graphs? This talk will cover preattentive attributes, Gestalt principles, and principles of color use. It will provide the key concepts from design and data viz research that you need to know to communicate data effectively. The talk will include examples to demonstrate applying the concepts and comparing data viz effectiveness.
This workshop was conducted by Jenn Schilling, Founder of Schilling Data Studio.
The integrated use of data science and machine learning in healthcare has grown in popularity in recent years, with many applications becoming ingrained in our healthcare systems. Recent advancements in the digitalization of healthcare data, together with the production of masses of data both from operational activities in healthcare settings and at the patient level from sensors, scans, etc., have enabled many more applications and much more research.
In this session we will discuss data science applications in the healthcare industry as well as some of the ethics and considerations required when delivering Data Science solutions in the industry.
This workshop was conducted by Mrs Emily Godson (née Wheaton), Data Scientist / Big Data Mining – Senior at Hitachi Vantara.
Linear regression is a fundamental tool in statistics and data science for modeling the relationship between different parameters. It can be used for prediction, forecasting and error reduction by fitting a predictive model between a response variable and a collection of explanatory variables based on an observed data set. Through linear regression analysis, we can quantify the strength of the linear relationship between the response and different explanatory variables, and we can identify parameters that may contain redundant information.
This workshop introduces the basics of simple and multiple linear regression. We will present both mathematical theory and applications in the context of real data sets — ranging from survey results collected by the US National Center for Health Statistics (NHANES), to real estate listings in Sacramento, CA. After the talk, the R code used will be provided, so attendees can revisit examples of how to apply this foundational modeling method.
This workshop was conducted by Laura Lyman, Instructor of Mathematics, Statistics, and Computer Science (MSCS) at Macalester College
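As a rough illustration of the simple linear regression described above, here is a minimal Python sketch that fits one response variable against one explanatory variable with the closed-form least-squares formulas and reports R² as a measure of the strength of the linear relationship. It is not the workshop's R code; the function name and the tiny size/price data set are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (slope, intercept, r_squared) for the least-squares line y ~ slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, intercept, r_squared


# Hypothetical data: house size (hundreds of sq ft) vs. price (thousands of USD).
sizes = [10.0, 12.0, 15.0, 18.0, 20.0]
prices = [200.0, 235.0, 290.0, 355.0, 390.0]
slope, intercept, r2 = fit_simple_linear_regression(sizes, prices)
print(f"price ~ {slope:.2f} * size + {intercept:.2f} (R^2 = {r2:.3f})")
```

Multiple linear regression follows the same idea with several explanatory variables, at which point a matrix formulation is usually more convenient than these scalar formulas.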
The 6th Annual Women in Data Science (WiDS) Datathon launches in January 2023, in the lead up to the WiDS conferences in March 2023. In this year’s datathon challenges participants…
Precision medicine aims to learn from data how to match the right treatment to the right person at the right time. One common goal in precision medicine is the estimation of optimal dynamic treatment regimens (DTRs), sequences of decision rules that recommend treatments to patients in a way that, if followed, would optimize outcomes for each individual and overall, in the targeted population. In this presentation, we will describe how the precision medicine framework formalizes sequential clinical decision-making and briefly review a subset of the most popular strategies for learning optimal dynamic treatment regimes. We will then invite the workshop group to ideate and discuss the critical opportunities and challenges for the translation of DTRs to clinical and community care, the role of stakeholder engagement and cross-disciplinary collaboration, and considerations for evaluating DTRs in practice.
This workshop was conducted by Nikki Freeman and Anna Kahkoska from the University of North Carolina at Chapel Hill.
Slides and resources used in this workshop: https://bit.ly/precision_medicine_slides
The usage of machine learning (ML) has been growing exponentially. Its significant power in generalization and a large amount of available data make machine learning indispensable. In parallel, humanity is focused more than ever on space exploration, developing cutting-edge Earth Observation (EO) technology. Have you ever wondered how these two can be combined?
One domain that can benefit greatly from this combination is agriculture. With climate change and population rise, maintaining natural ecosystems while enhancing agricultural productivity and supporting farmers is of primary importance. In this sense, ML and EO technologies are the key enablers in developing actionable recommendations for farmers and policymakers to achieve resilient agriculture. In this workshop, we discuss the usage of ML for EO-related applications, focusing on agriculture and ecosystem services. We will present two applications of how ML bridges the gap between scientific knowledge and actionable advice for farmers and policymakers. The first application will consist of a predictive ML model related to the occurrence of pests in cotton fields. The second application will showcase the combination of a geographical model and an ML algorithm to identify the local-specific contribution of agricultural management to ecosystem services. For both applications, there will be live demonstrations using Python and R. By the end of this workshop, we hope you will be acquainted with establishing the link between machine learning, earth observation, and sustainable agriculture. We wish you a fruitful exploration of this field and hope to have provided you with the necessary tools to start your journey!
This workshop was conducted by Roxanne Suzette Lorilla and Ornela Nanushi from the National Observatory of Athens.
Slides and materials used in this workshop: https://bit.ly/agroecological_applica…
Can drones help prevent natural disasters? Wildfires have become highly destructive in recent years, ravaging the environment and human lives. In this hands-on workshop, build a wildfire detection system with autonomous drones. Explore cutting-edge methods to detect fire outbreaks and predict their direction of spread. Gain skills in simulation and AI that you can apply to life-saving problems.
This workshop was conducted by Shweta Singh, Sheeba Ransing and Arushi Kapurwan from Mathworks.
Resources used for this workshop can be accessed on Github: https://bit.ly/wids_catching_fire
Slides for this workshop: https://bit.ly/3DvenVR
AI and machine learning are increasingly being used across industry and government to make decisions impacting many parts of our lives. These technologies could determine who gets a job interview, what products are advertised to different audiences, or what government resources are allocated to different populations. Bias can become embedded in the development of AI systems either through the data and/or the development and evaluation of algorithms. This can result in inaccurate predictions that can significantly impact people’s lives. Several WiDS talks describe how bias in machine learning can impact everything from online ads to search recommendations to bus routes.
This workshop is targeted toward those who are new to coding. This presentation will teach an individual how to analyze their personal Spotify data, create visualizations and prepare their data to be used in business processes. This demonstration will use Python so a new coder will understand foundational coding syntax that can be used in other languages.
This workshop was conducted by Nicole Crosdale, a Graduate student at the University of Florida.
Resources and slides for this workshop: https://bit.ly/spotify_resources
You’ve heard it before – Python vs MATLAB vs R but in reality, programming languages are often used together! In this hands-on workshop, you’ll learn how to use MATLAB and Python together with practical examples. Specifically, you’ll learn how to: – Call Python libraries from MATLAB – Call user-defined Python commands, scripts, and modules – Manage and convert data between languages – Package MATLAB algorithms to be called from Python
This workshop was conducted by Heather Gorr, Senior Product Marketing Manager, MATLAB and Grace Woolson, Student Competitions Technical Evangelist – Data Science at Mathworks.
Resources and slides for this workshop: https://bit.ly/matlab_python_slides
In this workshop, I would like to share my journey transitioning from an electrical engineer focusing on ultra-low power integrated circuit design to an AI Solution Architect. Through specific examples of how the two fields connect, I will discuss the fundamentals of deep learning and data-driven hardware design. I will start with my experience in the semiconductor industry designing application-specific and data-dependent hardware for IoT systems and then discuss how this experience led to my career in AI specializing in areas including high-performance computing, edge computing, and more recently, federated learning.
I hope the attendees will not only find the technical content informative but also see how a growth mindset truly helped me find my career passion. Having a broad knowledge of the eco-system that supports AI applications – such as the hardware stack, hardware level optimization, and application-specific hardware design – can be very helpful to understanding and choosing the right platform for operational AI. I also hope to use this opportunity to connect with fellow AI/hardware enthusiasts in WiDS.
This workshop was conducted by Chu Lahlou, AI Specialized Cloud Solution Architect at Microsoft.
The least squares method is one of the most widely used techniques in data science and is used to fit a linear model to data. In this workshop, we will study least squares problems from a linear algebraic perspective and discuss the techniques to solve them.
This workshop assumes that you have a basic understanding of linear algebra including concepts such as matrices, rank, range space, orthogonality, and matrix decompositions (Cholesky, QR, SVD).
This workshop was conducted by Abeynaya Gnanasekaran, a Senior Research Engineer at Raytheon Technologies Research Center.
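As a minimal sketch of the linear algebraic view described above, the following solves a tiny least squares problem for a straight-line model through the normal equations. The data points and the function name `fit_line` are illustrative assumptions, not workshop material; for ill-conditioned problems a QR- or SVD-based solver is generally the safer choice.

```python
def fit_line(xs, ys):
    """Least squares fit of y ~ c0 + c1*x via the 2x2 normal equations."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Normal equations: [[n, sx], [sx, sxx]] @ [c0, c1] = [sy, sxy]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("x values must not all be identical")
    c0 = (sy * sxx - sx * sxy) / det
    c1 = (n * sxy - sx * sy) / det
    return c0, c1


# Illustrative data lying exactly on y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
c0, c1 = fit_line(xs, ys)
print(c0, c1)
```

Because the sample points lie exactly on a line, the fitted intercept and slope recover that line; with noisy data the same code returns the coefficients that minimize the sum of squared residuals.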
Learn how you can apply AI in your field without extensive knowledge in programming. This hands-on session includes a quick recap on the fundamentals of AI and two exercises where you will learn how to classify human activities using MATLAB® interactive tools and apps:
– Accessing and preprocessing data acquired from a mobile device
– Classifying the labeled data using two apps: The Classification Learner app and the Deep Network Designer app
At the end of the workshop, you will be able to design and train different machine learning and deep learning models without extensive programming knowledge. In addition, you will also learn how to automatically generate code from the interactive workflow. This will not only help you to reuse the models without manually going through all the steps but also to learn programming or advance your coding skills.
This workshop was conducted by Gaby Arellano Bello and Neha Sardesai, Senior Application Engineers in Education at MathWorks.
Access resources for this workshop: https://bit.ly/low_code_ai_resources
Responsible AI is reaching new heights these days. Companies have started exploring Explainable AI as a means to explain results better to senior leadership and increase their trust in AI algorithms. This workshop gives an overview of the area, explains why it matters today, and presents some practical techniques that you can use to implement it. As a bonus, it will also cover some industry use cases and the limitations of these techniques. Join me in unboxing this black box!
This workshop was conducted by Supreet Kaur, Assistant Vice President at Morgan Stanley.
Slides for this workshop: https://bit.ly/explainableai_slides
This workshop aims to enable young data scientists to start their first ML project. It helps them understand the process from gathering data to building their ML model. Building an ML model is easy, but building it the correct way is a lot harder than it seems.
This workshop was conducted by Manogna Mantripragada, Data Scientist at Greenlink Analytics.
Access resources for this workshop: https://bit.ly/energy_burden_analysis…
During the workshop, we show a simple exploratory data analysis using Deepnote. We will focus on personal data from a Camino de Santiago pilgrimage, which we retrieved through the Strava API, and show you how to get the same kind of data from your own device. Using this data, we explain the theory behind Exploratory Data Analysis and show some use cases.
This workshop was conducted by Tereza Vaňková and Alleanna Clark of Deepnote.
Resources used in this workshop:
– https://bit.ly/deepnote_notebook
– https://bit.ly/deepnote_slides
Best practices in data visualization and dashboard design are numerous and sometimes contradictory, but a straightforward method to apply design thinking to creating dashboards is effective and universally applicable. This session will cover the details of design thinking and how it can be applied to dashboard development to create impactful dashboards that meet user needs and provide valuable insights.
This workshop was conducted by Jenn Schilling, Senior Research Analyst at the University of Arizona.
Exploring Hidden Markov Models | Julia Christina Costacurta
Hidden Markov Models (HMMs) are used to describe and analyze sequential data in a wide range of fields, including handwriting recognition, protein folding, and computational finance. In this workshop, we will cover the basics of how HMMs are defined, why we might want to use one, and how to implement an HMM in Python. This workshop might be of particular interest to attendees from May 25’s “Intro to Markov Chains and Bayesian Inference” session. Introductory background in probability, statistics, and linear algebra is assumed.
This workshop was conducted by Julia Christina Costacurta, PhD Candidate at Stanford University
Useful resources for this workshop:
– https://bit.ly/hmm_presentation
– https://bit.ly/hmm_tutorial_notebook
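As a rough illustration of how an HMM can be implemented in Python, the sketch below computes the likelihood of an observation sequence with the forward algorithm. The two-state model, its probabilities, and the function name `forward_likelihood` are made-up assumptions for illustration and are not taken from the workshop notebook.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs) under an HMM using the forward algorithm."""
    n_states = len(start)
    # alpha[i] = P(observations so far, current hidden state = i)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][o]
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical model: 2 hidden states, 2 possible observation symbols
start = [0.6, 0.4]                   # initial state probabilities
trans = [[0.7, 0.3], [0.4, 0.6]]     # trans[i][j] = P(next state j | state i)
emit = [[0.9, 0.1], [0.2, 0.8]]      # emit[i][o] = P(symbol o | state i)

likelihood = forward_likelihood([0, 1], start, trans, emit)
print(likelihood)
```

For long sequences the running probabilities become very small, so practical implementations usually rescale `alpha` at each step or work in log space.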
Make answering ‘what if’ analysis questions a whole lot easier by learning about state-of-the-art, end-to-end applied frameworks for causal inference.
We will cover:
– Microsoft’s “DoWhy” package for causal impact in Python (DoWhy, an end-to-end library for causal inference; documentation at microsoft.github.io)
– Bayesian Causal Impact in R
– MLE Causal Impact in Python
– Bonus: AA Testing, when to use and why it matters
We will apply these models to understand the impact of a marketing rewards campaign, as well as the impact of a product/feature upgrade.
This workshop was conducted by Jennifer Vlasiu, Data Science & Big Data Instructor at York University
Useful resources for this workshop:
– https://bit.ly/github_casual_impact
Image classification is a task in the Computer Vision domain that takes in an image as input and outputs a label for that image. Deep learning is the most effective modern method for modeling this task. In this interactive workshop, we will walk through a Jupyter Notebook that gives an overview of how to perform multi-class image classification in Python using the PyTorch library. The intention is to give the audience a broad overview of the classification task and inspire participants to explore the vast fields of visual recognition and computer vision at large.
This workshop was conducted by Cindy Gonzales, Data Science Team Lead for the Biosecurity and Data Science Applications Group at Lawrence Livermore National Laboratory
Useful resources for this workshop:
– https://bit.ly/deep_learning_files
– https://bit.ly/deep_learning_notebook
As data scientists, the ability to understand our models’ decisions is important, especially for models that could have a high impact on people’s lives. This may pose several challenges, as most models used in the industry are not inherently explainable. Today, the most popular explainability methods are SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-Agnostic Explanation). Each method offers convenient APIs, backed by solid mathematical foundations, but falls short in intuitiveness and actionability.
In this workshop/article, I will introduce a relatively new model explanation method – Counterfactual Explanations (CFs). CFs are explanations based on minimal changes to a model’s input features that lead the model to output a different (mostly opposite) predicted class. CFs have been shown to be more intuitive for humans to comprehend and to provide more actionable feedback than the traditional SHAP and LIME methods. I will review the challenges in this novel field (such as how to ensure that the CF proposes changes which are feasible), provide a bird’s-eye view of the latest research and give my perspective, based on my research in collaboration with Tel Aviv University, on the various aspects in which CFs can transform the way data science practitioners understand their ML models.
This workshop was conducted by Aviv Ben Arie, Data Science Manager at Intuit
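To make the idea of a counterfactual explanation concrete, here is a small sketch that brute-forces the smallest change to the inputs of a toy classifier that flips its prediction. The classifier, its coefficients, the feature names, and the search strategy are all hypothetical assumptions for illustration; they are not the method presented in the workshop, and real CF methods use far more efficient searches.

```python
from itertools import product


def predict(income, debt):
    """Hypothetical toy classifier: approve (1) when the score is positive."""
    return 1 if 2 * income - 3 * debt - 10 > 0 else 0


def nearest_counterfactual(income, debt, max_step=10):
    """Brute-force the smallest integer change (L1 distance) that flips the class."""
    original = predict(income, debt)
    best = None
    for d_income, d_debt in product(range(-max_step, max_step + 1), repeat=2):
        candidate = (income + d_income, debt + d_debt)
        if candidate[1] < 0:
            # Feasibility constraint: debt cannot be negative
            continue
        if predict(*candidate) != original:
            cost = abs(d_income) + abs(d_debt)
            if best is None or cost < best[0]:
                best = (cost, candidate)
    return best  # (cost, (income, debt)) or None if nothing flips the class


result = nearest_counterfactual(income=8, debt=3)
print(result)
```

The feasibility check mirrors the challenge mentioned above: a counterfactual is only useful if the proposed change is something the person could actually do.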
Research suggests that the human brain processes visualizations more readily than text, and data visualizations illustrate this well.
Data visualization is the last phase in the data life cycle. It is the art and science of making data easy to understand and consume for the end user. Data visualizations present clusters of data in an easy-to-understand layout, which is why they become essential for large amounts of complex data. An ideal data visualization shows the right amount of data, in the right order, in the right visual form, to convey the high-priority information to the right audience and for the right purpose. If the data is presented in too much detail, the consumer of that data might lose interest and miss the insight.
There are innumerable types of visual graphing techniques available for visualizing data. The right visualization arises from an understanding of the totality of the situation in context of the business domain’s functioning, consumers’ needs, nature of data, and the appropriate tools and techniques to present data. Ideal data visualization should tell a true, complete and simple story backed by data effectively, while keeping it insightful and engaging.
This workshop was conducted by Pariza Kamboj, Professor at Sarvajanik College of Engineering & Technology (SCET).
Useful resources for this workshop:
– Workshop #1: https://youtu.be/lRBuknaPRNI
– Jupyter code: https://bit.ly/jupyter_notebook2
– https://bit.ly/cars3_data
– https://bit.ly/execution_google_colab
– https://bit.ly/anaconda_installation_…
Markov chains are a special type of random process which can be used to model many natural processes. This workshop will be a gentle introduction to Markov chains, giving basic properties and many examples. The second part of the workshop will focus on one specific application of Markov chains to data science: Sampling from posterior distributions in Bayesian inference. Introductory background in probability, statistics, and linear algebra is assumed.
This workshop was conducted by Mackenzie Simper, PhD Student at Stanford University.
Slides for this workshop: https://bit.ly/markov_chains_ppt
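One basic property of Markov chains can be sketched in a few lines: repeatedly applying the transition matrix to any starting distribution converges, for well-behaved chains, to the stationary distribution. The two-state transition matrix and function names below are illustrative assumptions, not material from the slides.

```python
def step(dist, trans):
    """One step of the chain: new_dist[j] = sum_i dist[i] * trans[i][j]."""
    n = len(dist)
    return [sum(dist[i] * trans[i][j] for i in range(n)) for j in range(n)]


def stationary(trans, iters=200):
    """Approximate the stationary distribution by repeated stepping."""
    n = len(trans)
    dist = [1.0 / n] * n          # start from the uniform distribution
    for _ in range(iters):
        dist = step(dist, trans)
    return dist


# Hypothetical two-state chain; each row sums to 1
trans = [[0.9, 0.1],
         [0.5, 0.5]]
pi = stationary(trans)
print(pi)
```

The same convergence idea underlies Markov chain Monte Carlo, where a chain is constructed so that its stationary distribution is the posterior distribution one wants to sample from.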
A propensity model attempts to estimate the propensity (probability) of a behavior (e.g., conversion, churn, purchase, etc.) happening during a well-defined time period into the future based on historical data. It is a widely used technique by organizations or marketing teams for providing targeted messages, products or services to customers. This workshop shares an open-sourced package developed by Google, for building an end-to-end Propensity Modeling solution using datasets like GA360, Firebase or CRM and using the propensity predictions to design, activate and measure the impact of a media campaign. The package has enabled companies from e-commerce, retail, gaming, CPG and other industries to make accelerated data-driven marketing decisions.
This workshop was conducted by Lingling Xu, Bingjie Xu, Shalini Pochineni and Xi Li, data scientists on the Google APAC team.
Useful resources for this workshop:
– Workshop #1: https://youtu.be/rQhQca8RCuM
– https://bit.ly/propensity_modeling_pa…
– https://bit.ly/bigquery_export_schema
– https://bit.ly/ga_sample_dataset
– https://bit.ly/ml_windowing_pipeline
The WiDS Worldwide conference took place in March 2022, held in-person at Stanford University and online. The conference featured keynotes, technical talks, panel discussions, and more. You’ll want to experience the energy in the room, hearing from data science thought leaders and conference attendees.
Neural networks have been widely celebrated for their power to solve difficult problems across a number of domains. We explore an approach for leveraging this technology within a statistical model of customer choice. Conjoint-based choice models are used to support many high-value decisions at GM. In particular, we test whether using a neural network to model customer utility enables us to better capture non-compensatory behavior (i.e., decision rules where customers only consider products that meet acceptable criteria) in the context of conjoint tasks. We find the neural network can improve hold-out conjoint prediction accuracy for synthetic respondents exhibiting non-compensatory behavior only when trained on very large conjoint data sets. Given the limited amount of training data (conjoint responses) available in practice, a mixed logit choice model with a traditional linear utility function outperforms the choice model with the embedded neural network.
This workshop was conducted by Kathryn Schumacher, Staff Researcher in the Advanced Analytics Center of Expertise within General Motors’ Chief Data and Analytics Office.
A propensity model attempts to estimate the propensity (probability) of a behavior (e.g., conversion, churn, purchase, etc.) happening during a well-defined time period into the future based on historical data. It is a widely used technique by organizations or marketing teams for providing targeted messages, products or services to customers. This workshop shares an open-sourced package developed by Google, for building an end-to-end Propensity Modeling solution using datasets like GA360, Firebase or CRM and using the propensity predictions to design, activate and measure the impact of a media campaign. The package has enabled companies from e-commerce, retail, gaming, CPG and other industries to make accelerated data-driven marketing decisions.
This workshop was conducted by Lingling Xu, Bingjie Xu, Shalini Pochineni and Xi Li, data scientists on the Google APAC team.
Useful resources for this workshop:
– https://bit.ly/github_propensity_mode…
– https://bit.ly/bigquery_export_schema
– https://bit.ly/ga_sample_dataset
– https://bit.ly/ml_windowing_pipeline
This workshop focuses on basic to intermediate SQL. We will start with querying a database and using filters to clean the data, then move on to joining different tables. We will cover aggregate functions and the use of ‘CASE WHEN’ for better query performance, followed by subqueries and Common Table Expressions (CTEs) and a comparison between them. We will also look at window functions, including the lead and lag functions and the scenarios in which they can be used, and at pivot tables and when not to use them!
This workshop was conducted by Sreelaxmi Chakkadath, Data Science Master’s student at Indiana University Bloomington.
Useful resources for this workshop:
– PostgreSQL install link: https://www.postgresql.org/
– https://bit.ly/sql_workshop_script
– https://bit.ly/sql_workshop_codes
– https://bit.ly/sql_ppt_slides
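Several of the SQL features listed above (a CTE, an aggregate combined with ‘CASE WHEN’, and the LAG window function) can be tried in one short query. This is a sketch under stated assumptions: the `sales` table and its rows are invented, and SQLite (via Python's built-in `sqlite3` module, SQLite 3.25 or newer for window functions) is swapped in for the PostgreSQL setup used in the workshop so that the example runs without installing a database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (month INTEGER, region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        (1, 'east', 100), (1, 'west', 80),
        (2, 'east', 120), (2, 'west', 60),
        (3, 'east', 90),  (3, 'west', 110);
""")

query = """
WITH monthly AS (                       -- CTE: one row per month
    SELECT month,
           SUM(amount) AS total,
           SUM(CASE WHEN region = 'east' THEN amount ELSE 0 END) AS east_total
    FROM sales
    GROUP BY month
)
SELECT month,
       total,
       east_total,
       total - LAG(total) OVER (ORDER BY month) AS change_from_prev
FROM monthly
ORDER BY month;
"""

rows = conn.execute(query).fetchall()
conn.close()

for row in rows:
    print(row)
```

The first month has no previous row, so LAG returns NULL there and `change_from_prev` comes back as `None` in Python.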
Data science is evolving rapidly and is proving decisive in Enterprise Resource Planning (ERP). The datasets required to build an analytical model are collected from various sources such as government, academia, web scraping, APIs, databases, files, sensors and many more. We cannot use such real-world data directly for analysis because it is often inconsistent, incomplete, and likely to contain many errors. We often hear the phrase “garbage in, garbage out”. Dirty or messy data, riddled with inaccuracies and errors, results in a badly trained model, which in turn can lead to poor business decisions and can sometimes even be hazardous to the domain. Even a powerful algorithm fails to provide correct analysis when applied to bad data. Therefore, data must be curated, cleaned and refined before it is used in data science and in products based on data science. This is the job of “Data Preparation”, which includes two methods: Data Pre-processing and Data Wrangling. Most data scientists spend the majority of their time on data preparation.
This workshop was conducted by Pariza Kamboj, Professor at Sarvajanik College of Engineering & Technology (SCET).
Useful resources for this workshop:
– https://bit.ly/jupyter_code
– https://bit.ly/cars3_dataset
– https://bit.ly/execution_google_colab
– https://bit.ly/anaconda_installation_…
Maria Gargiulo, Statistician, Human Rights Data Analysis Group, talks with theCUBE’s Stephanie Chan for WiDS 2022
How can we make sense of the unseen world? Using AI, sensors & IoT for scene exploration | Mathworks
Have you wondered about being able to detect buried objects? Do you think your mobile device can be used to detect them? Metal is all around us, and it is often buried and out of sight. Metal detection is used in many places on Earth and is connected to a variety of applications, such as providing insight into land use, detecting historic artifacts, determining the presence of various devices, and more.
In our workshop, we will explore using your own mobile device as a metal detector in your local environment. During this workshop we will provide an overview of the basics of sensors, AI, and IoT which will be required for building a prototype of our application. We’ll do hands-on exercises where you will acquire data from sensors, obtain summary statistics on the acquired data, and train a human activity classifier to understand what was done while data was being collected. We will also have an engaged discussion regarding topics to be mindful of with respect to this application such as considerations regarding the collection and usage of location data. You will leave motivated and ready to use sensors, AI, and IoT in your own projects via MATLAB!
Workshop presenters:
– Louvere Walker-Hannon, Application Engineering Senior Team Lead, MathWorks
– Loren Shure, Consulting Application Engineer, MathWorks
– Sarah Mohamed, Senior Software Engineer, MathWorks
– Shruti Karulkar, Quality Engineering Manager, MathWorks
Michelle Rodriquez Serra, Professor and Researcher at Universidad del Pacifico delivers the virtual opening address at the WiDS Worldwide conference. Michelle has been a WiDS ambassador since 2017.
Debra Satz, Dean of the School of Humanities and Sciences, Stanford University, delivers the Opening Address at the WiDS Worldwide conference.
Debra is the Vernon R. and Lysbeth Warren Anderson Dean of the School of Humanities and Sciences at Stanford University, the Marta Sutton Weeks Professor of Ethics in Society, Professor of Philosophy, and, by courtesy, Political Science.
Cecilia Aragon, Professor, Human Centered Design & Engineering, University of Washington, presents a Keynote at the WiDS Worldwide conference.
Very often, the words ‘rigorous’ and ‘human-centered’ have been used as opposites in technical fields, with the implication that a focus on human aspects makes science ‘soft’ or ‘insufficiently technical’. Cecilia will argue in this talk that this is a false dichotomy.
While extraordinary advances in our ability to collect, analyze, and interpret vast amounts of data have been transforming the fundamental nature of data science, the human aspects of data science, including how to support scientific creativity and human insight, how to address ethical concerns, and the consideration of societal impacts, have been less studied. Yet these human issues are becoming increasingly vital to the future of data science. Cecilia will reflect on a 30-year career in data science in industry, government, and academia, discuss what it means for data science to be both rigorous and human-centered, and speculate upon future directions for data science.
Maria Gargiulo, Statistician, Human Rights Data Analysis Group, presents a Technical Vision Talk at the WiDS Worldwide conference.
Collecting data on human rights violations in conflict settings is difficult and dangerous, and the data that results is often incomplete on multiple levels. Some victims’ stories are never recorded, and those whose stories are documented may still be missing critical information about the victim, the perpetrator, or other contextual details about the violation. Furthermore, the data that is documented may not be statistically representative of the victim population as a whole. Drawing population-level inferences from this data without correcting for the missingness risks incorrectly answering questions about patterns of violence.
This talk will demonstrate how multiple systems estimation and multiple imputation can be used together to address both levels of missingness in order to draw population level inferences that are statistically valid and include a measure of uncertainty.
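The simplest special case of multiple systems estimation uses just two lists and the Lincoln-Petersen estimator: if two independent sources document the same population, the size of their overlap indicates how many cases neither source recorded. The sketch below uses purely synthetic record IDs and a hypothetical function name; it is not the speaker's method, which handles more lists, dependence between sources, and uncertainty.

```python
def lincoln_petersen(list_a, list_b):
    """Two-list population estimate: N ~ |A| * |B| / |A and B|."""
    a, b = set(list_a), set(list_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("lists must share at least one record")
    return len(a) * len(b) / overlap


# Synthetic record IDs, for illustration only
list_a = range(0, 60)      # 60 records documented by source A
list_b = range(40, 90)     # 50 records documented by source B, 20 shared with A

documented = len(set(list_a) | set(list_b))
estimate = lincoln_petersen(list_a, list_b)
print(documented, estimate)
```

The gap between the number of documented records and the estimate is the part of the population that appears on neither list, which is exactly the missingness that a raw count ignores.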
WiDS Worldwide panel: Data Science in Healthcare: Opportunities & Challenges
Moderated by Tina Hernandez Boussard, Associate Professor, Stanford University
Panelists:
– Sylvia K. Plevritis, Chair of Biomedical Data Science, Stanford University
– Tanveer Syeda-Mahmood, IBM Fellow, IBM Research Center
– Jinoos Yazdany, Chief of Rheumatology, Zuckerberg San Francisco General Hospital
Beyond Bias: Algorithmic Unfairness, Infrastructure and Genealogies of Data | Alex Hanna | WiDS 2022
Alex Hanna, Director of Research, DAIR Institute, presents a Technical Vision Talk at the WiDS Worldwide conference.
Problems of algorithmic bias are often framed in terms of lack of representative data or formal fairness optimization constraints to be applied to automated decision-making systems. However, these discussions sidestep deeper issues with data used in AI, including problematic categorizations and the extractive logics of crowd work and data mining.
In this talk Alex will make two interventions: first by reframing of data as a form of infrastructure, and as such, implicating politics and power in the construction of datasets; and secondly discussing the development of a research program around the genealogy of datasets used in machine learning and AI systems.
Tierra Bills, Assistant Professor of Civil and Environmental Engineering and Public Policy, UCLA, presents a Technical Vision Talk at the WiDS Worldwide conference.
Should regions invest in more buses on transit routes, or new bus routes to provide greater transportation accessibility for vulnerable communities? What mix of transportation improvements will offer the greatest boost in accessibility for travelers who most need it? Such questions can be addressed using travel demand analysis tools.
This presentation will summarize various biases in travel data that arise due to underrepresentation of vulnerable populations, how they may come to be, and how such biases can influence travel modeling outcomes.
WiDS Worldwide panel: Algorithms and Data for Equity
Moderated by Jenny Suckale, Associate Professor, Stanford University
Panelists:
– Tierra Bills, Assistant Professor of Civil and Environmental Engineering and Public Policy, UCLA
– Jessica Granderson, Director for Building Technology, White House Council on Environmental Quality
– Ling Jin, Research Scientist, Lawrence Berkeley National Laboratory
Vidya Setlur, Director of Tableau Research, Tableau, presents a Keynote at the WiDS Worldwide conference.
In this keynote, Vidya will discuss how natural language can be leveraged in various aspects of the analytical workflow ranging from smarter data transformations, visual encodings, autocompletion to supporting analytical intent, to conversational interfaces. With a better understanding of how users explore data in their flow of analysis, can people doing analysis be supported by more intelligent tools? In this keynote, we will explore this question.
Nadia Fawaz, Senior Staff Applied Research Scientist – Tech Lead Inclusive AI at Pinterest, presents a Technical Vision Talk at the WiDS Worldwide conference.
Through this tech talk one can gain knowledge of how machine learning technologies are paving the way for more inclusive inspirations in Search and in our augmented reality technology Try-On, and are also driving advances for more diverse recommendations across the platform. Developing inclusive AI in production requires an end-to-end iterative and collaborative approach.
WiDS 2022 Fireside Chat with Susan Wojcicki, CEO of YouTube, and Google Fellow Diane Tang. Moderated by Professor Margot Gerritsen.
Chiara Sabatti, Professor of Biomedical Data Science and Statistics at Stanford University, presents a Technical Vision Talk at the WiDS Worldwide conference.
In a world where large comprehensive datasets are readily available in digital form, scientists engage in data analysis before formulating precise hypotheses, with the goal of exploring and identifying tantalizing patterns. In this talk Professor Sabatti will help us review some classical approaches to quantifying the strength of evidence, identify some of their limitations, and explore novel proposals. We will underscore the connections between clear, precise reporting of scientific evidence and “social good”.
Denice Ross, U.S. Chief Data Scientist, White House Office of Science and Technology Policy presents a Tech Vision Talk at the WiDS Worldwide conference.
Listen to Denice, the first female Chief Data Scientist of the United States, discuss how data science will be critical for delivering many programs equitably, where they are needed most. Hear how you can help.
The WiDS Educational Outreach program aspires to take data science to secondary school students. Through the program we strive to educate and inspire young minds by facilitating relevant courses and paths to consider future careers involving data science, artificial intelligence (AI) and other related areas.
Watch this video to learn of the Education Outreach collaborations with schools around the world from Hyderabad, India to Dar es Salaam, Tanzania, and more.
Cecilia Aragon, Professor, Human Centered Design & Engineering, University of Washington, sits down with SiliconANGLE’s Lisa Martin as part of the WiDS Worldwide Conference.
John Furrier & Lisa Martin wrap up WiDS 2022 & the Women in Tech: International Women’s Day event from Stanford University.
Join us online on March 7, 2022, for the Women in Data Science (WiDS) Worldwide conference, a technical conference featuring outstanding women doing exceptional work in data science and related fields, in a wide variety of domains. Everyone is welcome and encouraged to attend. Broadcast live from Stanford University, 8am – 5pm PST.
Alex Hanna, Director of Research, The DAIR Institute, talks with Lisa Martin for WiDS 2022 at Stanford University.
Vidya Setlur, Director of Tableau Research, sits down with SiliconANGLE’s Lisa Martin.
Tahu Kukutai, a professor at the University of Waikato in New Zealand and a Māori woman, is leading data sovereignty initiatives that advocate for indigenous data ownership, guardianship, and governance. On a new WiDS Podcast episode Tahu delves into indigenous data sovereignty, as she walks us through her experiences and initiatives.
Data Science workflows typically entail using Machine Learning.
Machine Learning can provide insight into various datasets and can assist with automating various types of analysis.
In this workshop you will explore a process for getting started with implementing Machine Learning interactively to train a model to predict tsunami intensity and implement other relevant tasks.
This workshop was conducted by Louvere Walker-Hannon and Heather Gorr from MathWorks.
Karina Edmonds, Global Head of Academies and University Alliances at SAP, has spent her career building bridges between business and academia. On the WiDS Podcast she talks about her passion for promoting fairness in data science by bringing more young people, women, and underrepresented groups into the field.
In this third workshop in linear algebra, we will investigate the link between Principal Component Analysis and the Singular Value Decomposition. Along the way, we are introduced to several linear algebra concepts including linear regression, eigenvalues and eigenvectors, and the conditioning of a system. We will use shared Python scripts and several examples to demonstrate the ideas discussed.
This workshop builds on the previous 2 workshops in linear algebra (Part I and Part II), and we will assume that the linear algebra concepts introduced in those workshops are familiar to the audience. They include: vector algebra (including inner products, angle between vectors), matrix-vector multiplications, matrix-matrix multiplications, matrix-vector solves, singularity, and singular values.
Links:
1. Code is available for viewers to follow along: https://github.com/lalyman/lin-alg-wo…
2. The covariance matrix is defined for centered X, and the inequality n 1 given is strict.
This workshop was conducted by Laura Lyman, PhD student at Stanford University, ICME.
Want to learn more about trends like AI, IoT and wearable tech? In one hour, we will cut through the hype by building a “smart” fitness tracker using your own mobile device. We’ll do hands-on exercises: you’ll acquire data from sensors, design a step counter and train a human activity classifier. You will leave motivated and ready to use machine learning and sensors in your own projects!
This workshop was conducted by Louvere Walker-Hannon, Shruti Karulkar, & Sarah Mohamed from MathWorks.
How can sharing stories help us as a community? How do we learn how to find a story from the events of someone else’s life or our own? How can this relate to our own tendency as data scientists to connect the dots, to find meaning through patterns? Join us in this WiDS workshop on telling and sharing stories where we will address these questions and learn how our stories are important in shaping the community we want to see in Data Science.
This workshop was conducted by Izzy Aguiar, PhD student at Stanford University, ICME.
In this workshop, you will learn about the core concepts of Bayesian machine learning (BML) – how it differs from frequentist approaches, the building blocks of Bayesian inference, and what familiar ML techniques look like in a Bayesian set-up. You will also learn how to use various sampling techniques for Bayesian inference and why we need such techniques in the first place. The workshop will also provide links and materials to continue your Bayesian journey afterwards.
This workshop is meant as an introduction to select BML modules – we strongly recommend that you continue exploring Bayesian methods once you have taken this first step.
This workshop was conducted by Ashwini Chandrashekharaiah & Debanjana Banerjee at Walmart Global Tech.
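The contrast between frequentist and Bayesian estimates can be seen in the smallest possible example: estimating a success probability from a handful of trials. The sketch below uses the standard Beta-Binomial conjugate update; the prior, the counts, and the function names are illustrative assumptions rather than workshop material.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior plus binomial data gives a Beta posterior."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 7 successes and 3 failures
successes, failures = 7, 3

# Frequentist point estimate (maximum likelihood)
mle = successes / (successes + failures)

# Bayesian estimate starting from a uniform Beta(1, 1) prior
prior = (1.0, 1.0)
posterior = beta_binomial_update(*prior, successes=successes, failures=failures)
posterior_mean = beta_mean(*posterior)

print(mle, posterior, posterior_mean)
```

Here the posterior has a closed form, so no sampling is needed. Sampling techniques become necessary when the prior and likelihood are not conjugate and the posterior cannot be written down directly.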
Recommender systems play a major role in the e-commerce industry. They keep users engaged by recommending relevant content and play a significant role in driving digital revenue.
Following tremendous gains in computer vision and natural language processing with deep neural networks in the past decade, the recent years have seen a shift from traditional recommender systems to deep neural network architectures in research and industry.
In this workshop, we focus on the temporal domain from the perspective of both traditional recommender systems and deep neural networks. We first start with the classic latent factor model. We introduce temporal dynamics into the latent factor model and show how this improves performance. We then move on to sequential modelling using deep neural networks by presenting the state of the art in the field and discussing the advantages and disadvantages.
This workshop was conducted by Aleksandra Cerekovic & Selene Xu at Walmart Global Tech.
Welcome to the world of artificial intelligence (AI) and augmented reality (AR)! This workshop explains AI and AR via hands-on exercises where you will interact with your augmented world. You will learn about applications where the technologies of AI+AR are combined, their limitations, and their impacts in society. You’ll leave armed with code, inspiration, and an ethical framework for your own projects!
Artificial intelligence (AI) is used in a variety of industries for many applications. AI can be combined with other technologies to assist with understanding implications of certain aspects of applications. In this workshop, you explore how pose estimation results implemented using Deep Learning are impacted based on a location which is provided using augmented reality. These combined technologies provide insight into how poses could be interpreted differently based on a scene. This workshop also raises awareness regarding consequences of using AI for applications that are different from its originally intended use, which could lead to both technical and ethical challenges.
Specific topics that will be covered in this workshop are listed below:
• understand how AI and AR can be used for applications
• explore how to implement AI and AR
• discover what tools can be used to implement AI and AR
• review code that implements pose estimation using AI and changing background scenes using AR
• gain guidance regarding challenges to address societal impacts of the results from applications that use AI and AR
In addition to an overview of terminology and an understanding of the workflows for each topic, you will receive code that demonstrates how to implement these workflows with tools from MathWorks.
This workshop was conducted by Louvere Walker-Hannon, Shruti Karulkar, & Sarah Mohamed from MathWorks.
Data science is being applied in a growing number of domains that affect everyone’s lives, in healthcare, financial services, agriculture, resource management, and beyond. While data science has huge potential for good, there are also unintended consequences. Data scientists need to take steps to mitigate as many unintended consequences as they can using Responsible Data Science — a set of policies, procedures, and best practices to ensure algorithmic fairness, transparency, and explainability.
Natural language processing has direct real-world applications, from speech recognition to automatic text generation, from lexical semantics understanding to question answering. In just a decade, neural machine learning models became widespread, largely displacing statistical methods, which require elaborate feature engineering. Popular techniques include the use of word embeddings to capture semantic properties of words. In this workshop, we take you through the ever-changing journey of neural models while addressing their boons and banes.
The workshop will address the concepts of word embeddings, frequency-based and prediction-based embeddings, positional embeddings, multi-headed attention, and their application in an unsupervised context.
This workshop was conducted by Riyanka Bhowal, Senior Data Scientist at Walmart Global Tech.
To celebrate Juneteenth 2021, we revisit our pledges, report progress, and renew the commitments we made on Juneteenth last year, in the wake of George Floyd’s death. We reinforced our commitment to extend our outreach to the communities, organizations, universities, and schools that serve Black and underrepresented minority communities.
Karen Hao, Senior Editor at MIT Technology Review, discusses her experiences covering the latest research and social impacts of AI, and ethics washing in the tech industry, on a recent WiDS Podcast episode.
In this workshop, Dora Demszky, a Stanford PhD student, illustrates how natural language processing (NLP) can be used to answer social science questions. The workshop will focus on applying NLP to analyze the content of 15 US history textbooks used in Texas, to analyze the representation of historically marginalized people and groups.
The workshop is based on a paper (https://journals.sagepub.com/doi/pdf/…) that also has an associated toolkit, and it will provide examples of how this toolkit can be used using a Jupyter notebook that will be made available.
Want to learn more about trends like AI, IoT and wearable tech? In less than one hour, we will cut through the hype by building a “smart” fitness tracker using your own mobile device.
We’ll do hands-on exercises: you’ll acquire data from sensors, design a step counter and train a human activity classifier. You will leave motivated and ready to use machine learning and sensors in your own projects!
This workshop was conducted by Louvere Walker-Hannon, Shruti Karulkar, & Sarah Mohamed from MathWorks.
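For readers curious what a basic step counter involves, here is a minimal sketch (not the workshop's code, which uses MathWorks tools). It assumes accelerometer samples arrive as (x, y, z) tuples in m/s² and counts a step each time the acceleration magnitude rises above a threshold. The threshold of 11.0, the 50 Hz sampling rate, and the synthetic walking signal are all assumptions chosen for illustration; real sensor data would usually need filtering first.

```python
import math


def magnitude(sample):
    """Euclidean norm of one (x, y, z) accelerometer reading."""
    ax, ay, az = sample
    return math.sqrt(ax * ax + ay * ay + az * az)


def count_steps(samples, threshold=11.0):
    """Count upward crossings of `threshold` in the acceleration magnitude."""
    steps = 0
    above = False
    for sample in samples:
        m = magnitude(sample)
        if m > threshold and not above:
            steps += 1      # rising edge: treat as one step
            above = True
        elif m <= threshold:
            above = False   # re-arm once the signal drops back down
    return steps


def synthetic_walk(seconds=5, rate_hz=50, step_hz=2.0):
    """Hypothetical signal: gravity plus a sinusoidal bounce on the z axis."""
    n = seconds * rate_hz
    return [
        (0.0, 0.0, 9.8 + 3.0 * math.sin(2 * math.pi * step_hz * i / rate_hz))
        for i in range(n)
    ]


samples = synthetic_walk()
steps = count_steps(samples)
print("steps counted:", steps)
```

The `above` flag keeps a single stride from being counted more than once while the signal stays over the threshold.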
Prerequisite: We will assume that you are familiar with vector and matrix algebra.
This is the second workshop devoted to linear algebra, which forms the foundation of many algorithms in data science. In part I of the series we introduced vector and matrix algebra, and briefly looked at the intriguing and ever so useful Singular Value Decomposition (SVD). In this workshop, we will take a deeper look at the SVD. We will explain how it is derived, how it can be computed, and also how it is used.
This workshop is taught by Professor Margot Gerritsen and Stanford ICME PhD student, Laura Lyman.
Cecilia Aragon, a professor in the Department of Human Centered Design & Engineering at the University of Washington in Seattle, explains on a recent WiDS Podcast episode how overcoming her fears to become an aerobatic pilot propelled her forward in her career as a data scientist, professor, and author.
Emily Miller, Senior Data Scientist at Drivendata.org hosts a workshop on ‘Actionable Ethics for Data Scientists’ in which she illustrates the different types of ethical concerns that arise in the course of data science work, grounding these in concrete examples of times where things have gone wrong.
Eileen Martin, Assistant Professor at Virginia Tech hosts a workshop on ‘Why we love arrays for data science’ in which she walks through some of the basics of computer architecture and how it affects the performance of our codes for common data analysis techniques.
Madeleine Udell, Assistant Professor at Cornell hosts a workshop on ‘Automating Machine Learning’ in which she surveys interesting strategies for automated machine learning.
Sita Syal, Ph.D. Candidate of Mechanical Engineering at Stanford University hosts a workshop on ‘Design Thinking for Data Science Problems’.
Debanjana Banerjee, Data Scientist, and Sinduja Subramaniam, Staff Data Scientist, with Walmart host a workshop, ‘Evolution of Applied Recommender Systems’, where they take you through the whirlwind journey of the recommender system, from GroupLens in the 1990s, through Content Based Filtering, Matrix Factorization, and Hybrid Recommender Systems in the late 2000s, all the way to the deep learning-based recommenders of today. The workshop will address foundational concepts such as the user-item interaction matrix, user/item profiles, the cold-start problem, sparsity, and scalability, along with the mathematical formulation of different types of recommender systems, using applications in Retail.
Megan Price, Executive Director, and Maria Gargiulo, Statistician, with the Human Rights Data Analysis Group (HRDAG) host a workshop on ‘Data Processing and Statistical Models to Impute Missing Perpetrator Information’, where they show how methods from statistics and computer science help answer questions about mass violence using incomplete and unrepresentative datasets from the contexts in which HRDAG works, and how open-source tools are crucial to their analytical projects.
Have an opportunity to Meet-the-Speakers from WiDS Worldwide! Speaker Emily Fox, Distinguished Engineer at Apple and Professor at the University of Washington is interviewed by Joy Ku, Director, Communications & Engagement, Mobilize Center, Stanford University.
Have an opportunity to Meet-the-Speakers from WiDS Worldwide! Speaker Geetha Manjunath, Founder and CEO of Niramai is interviewed by Radhika Kannan, Staff Technical Program Manager, Intuit.
Have an opportunity to Meet-the-Speakers from WiDS Worldwide! Speaker Kalinda Griffiths, Scientia Lecturer at Centre for Big Data Research in Health, UNSW is interviewed by Margot Gerritsen, Professor at Stanford University.
Have an opportunity to Meet-the-Speakers from WiDS Worldwide! Speaker Dina Machuve, Lecturer and Researcher at Nelson Mandela African Institution of Science and Technology is interviewed by Mahadia Tunga, Co Founder and Director Data Science, Research and Capacity Development of Tanzania Data Lab.
Have an opportunity to Meet-the-Speakers from WiDS Worldwide! Speaker Fatima Abu Salem, Associate Professor at the American University of Beirut is interviewed by Lama Moussawi, Associate Dean for Research and Faculty Development at the American University of Beirut.
Have an opportunity to Meet-the-Speakers from WiDS Worldwide! Speaker Karina Edmonds, VP, Head of Academies and University Alliances at SAP is interviewed by Deepa Gautam-Nigge, Senior Director, Global Lead SAP Next-Gen Ecosystem, SAP.
Have an opportunity to Meet-the-Speakers from WiDS Worldwide! Speaker Gina Papush, Global Chief Data and Analytics Officer at Evernorth is interviewed by Lori Sherer, Partner at Bain & Co.
Best of WiDS features Ema Rie on her talk ‘The Fusion of Science and Fashion’ from WiDS Tokyo @ Yokohama City University, 2020!
Best of WiDS features Sanghamitra Bandyopadhyay in her ‘Fireside Chat’ from WiDS Bengaluru @ Intuit, 2020!
Best of WiDS features Timnit Gebru on her talk ‘Understanding the Limitations of AI: When Algorithms Fail’ from WiDS Stanford 2019!
Best of WiDS features Been Kim on her talk ‘Interpretability for Everyone’ from WiDS Stanford 2020!
Best of WiDS features Marzyeh Ghassemi on her talk ‘Improving Healthcare with Machine Learning’ from Stanford 2019!
Best of WiDS features Hila Gonin on her talk ‘Gender Bias in Words Embeddings’ from Tel Aviv 2019!
Panel discussion on ‘Diversity and Data Science Education’
Moderator: Talitha Washington, Professor of Mathematics, Clark Atlanta University and Director, Atlanta University Center
Panelists:
-Jo Boaler, Professor of Education, Stanford University
-Karina Edmonds, VP, Head of Academies and University Alliances, SAP
-Loreto Bravo, Data Science Institute Director, UDD
Best of WiDS features Latanya Sweeney on her talk ‘Data Science to Save the World’ from Stanford 2018!
Best of WiDS features Carla Viera on her talk ‘Artificial Intelligence’s Black Box’ from Sao Paulo 2020!
Best of WiDS features Leda Braga on her talk ‘When Data Science IS the Business’ from Stanford 2018!
Best of WiDS features Jessica Santos on her talk ‘Deep Learning on Medical Images’ from WiDS São Paulo 2019!
Best of WiDS features Madeleine Udell on her talk ‘Filling in Missing Data with Low Rank Models’ from WiDS Stanford 2019!
Emily Glassberg-Sands | Data Science for Unlocking Teaching & Learning at Scale | WiDS Stanford 2019
Best of WiDS features Emily Glassberg-Sands on her talk ‘Data Science for Unlocking Teaching & Learning at Scale’ from WiDS Stanford 2019!
Best of WiDS features Fanny Chevalier on her talk ‘Don’t Look. See! Are We Blinded by Data (Visualization)?’ from Stanford 2020!
Suzanne Weekes, Professor, Mathematical Sciences, Worcester Polytechnic Institute and Executive Director, SIAM (Society for Industrial and Applied Mathematics) starts the WiDS Worldwide conference with an Opening Address.
Tech Talk: The joys and perils of leveraging mechanistic models in health ML | Emily Fox | WiDS 2021
Emily Fox, Distinguished Engineer at Apple and Professor at the University of Washington, explores hybrid approaches that combine the domain knowledge of mechanistic models with the flexibility and expressivity of machine learning methods. She explores these ideas through two use cases: glucose forecasting in Type 1 diabetes and modeling the relationship between mobility and transmission in the COVID-19 pandemic.
Panel discussion on ‘The Democratization of Data’
Moderator: Margot Gerritsen, Professor at Stanford University
Panelists:
-Mary Gray, Senior Principal Researcher at Microsoft Research and Associate Professor, The School of Informatics, Computing, and Engineering at Indiana University
-Zhamak Dehghani, Director of Next Tech Incubation, Thoughtworks
-Amanda Obidike, Executive Director, STEMi Makers Africa
Learn more about WiDS ambassador and Professor at Yokohama City University, Yoko Ono, as you watch her go through a day in her life!
Kalinda Griffiths, Scientia Lecturer at Centre for Big Data Research in Health, UNSW Sydney discusses priority issues when identifying Indigenous people in the national data in Australia’s colonial context.
Danielle Jiang, Assistant Director, Monetary Authority of Singapore, discusses why AI systems need to be responsible, the “FEAT” principles (Fairness, Ethics, Accountability, and Transparency) and Project Veritas, and how the fairness of an AI-based credit lending system is assessed, both technically and practically.
Dina Machuve, Lecturer and Researcher, Nelson Mandela African Institution of Science and Technology, discusses how CNNs could help farmers better diagnose poultry diseases and improve livestock health in small- to medium-scale farming (crop and livestock), which accounts for 70% of the food production of the developing world and supports over 380 million farming households.
Fatima Abu Salem, Associate Professor at the American University of Beirut, reports on a series of works associated with the Syrian conflict, with help from data obtained from the Violations Documentation Center (VDC). Fatima presents on fake news detection, predicting primary health care demand by Syrian refugees in Lebanon, and understanding some notions of Syrian refugee mobility in Turkey, all seen as instigated by “peaks” in the Syrian war, revealed through the VDC. She also presents a brief overview of in-progress projects with a social impact, in application to smart irrigation, predicting birth defects in Lebanon using air pollution data, and quantifying anti-refugee bias across Lebanese news corpora.
Maria Schuld, Senior Researcher at Xanadu and the University of KwaZulu-Natal, provides an overview of quantum machine learning research and illustrates that quantum algorithms can be trained like neural nets, but look formally very similar to kernel methods.
Panel discussion on ‘Ethics and Responsible Data Science’
Moderator: Shir Meir Lador, Data Science Group Manager, Intuit
Panelists:
-Andrea Martin, Leader IBM Watson Center Munich & EMEA Client Centers, IBM Distinguished Engineer, IBM
-Monica Scannapieco, Head of the Division “Information and Application Architecture”, Italian National Institute of Statistics
-Nazareen Ebrahim, AI Ethics Officer, Socially Acceptable – South Africa
Daniela Braga, Founder and CEO of DefinedCrowd, discusses how her company approaches the process of developing AI, beginning early in the data collection phase. Daniela argues that we cannot safely build AI with the processes used to build software. We must build AI differently, because once it is released it will be very hard to control.
Hulya Emir-Farinas, Director of Data Science at FitBit discusses how machine learning is a key capability in making any solution smart and more personalized.
Kristian Lum, a statistician and research assistant professor at the University of Pennsylvania, describes how following her interests has led her on an ever-changing career path across business, public service, and academia, on a recent episode of the WiDS Podcast.
Manisha Desai is a professor of medicine (research) and of biomedical data science, and director of the Quantitative Sciences Unit at Stanford University. She is an expert in the design and analysis of clinical trials and epidemiologic studies across multiple diseases, including COVID-19. In a recent episode of the WiDS Podcast, she provides some insights into the challenges and progress of COVID-19 clinical trials.
To commemorate Juneteenth, we are reaching out to clearly state our actions and intentions in response to recent events. In the past weeks and months the US and the world have been reeling–first with a global pandemic, followed by the systemic injustice, inequality, and racism that the pandemic and most recent acts of police brutality exposed.
Women in Data Science (WiDS) is an annual conference at Stanford University that hopes to inspire and educate data scientists and support women in the field. I missed the opportunity to buy tickets because I was underground caving, but thankfully there were regional events planned to coincide, so I jumped at the opportunity to attend WiDS at UC Berkeley.
The Women in Data Science (WiDS) Conference at Stanford University is the hub of the global WiDS conference that includes 150+ regional events in 50+ countries worldwide. WiDS Stanford is a one-day technical conference that features amazing thought leaders in data science, machine learning, and artificial intelligence (AI) from academia, industry, non-profits, and government.
This video shows the highlights of the WiDS Stanford 2020 conference which featured keynotes, technical talks, a data ethics panel, and a career panel, as seen through the eyes of a Stanford undergraduate student. WiDS conference presenters covered a broad range of topics from multiple domains including data ethics and privacy, healthcare, data visualization, natural language processing, and more.
Ya Xu manages LinkedIn’s global team of data scientists, who run data science projects across the company’s products, sales, marketing, economics, infrastructure, and operations. On a recent episode of the WiDS Podcast, she says the company takes active responsibility for the data it collects to ensure fairness and protect privacy. The company is very proactive about maintaining its members’ trust, both in how it shares data externally and in how it leverages data to create opportunities.
This open-to-all, on-demand webinar explores challenges and opportunities from working with healthcare data, and discusses distinct issues around the technology and the clinical aspects of healthcare machine learning. The panel discusses privacy and compliance, reproducibility, data sensitivity, data complexity, and the end-to-end workflow of AI-based solutions that impact healthcare in the United States and globally.
Speakers:
– Vani Mandava, Director, Data Science, Microsoft Research
– Carly Eckert MD MPH, Director of Clinical Informatics, KenSci
– Leo Anthony Celi MD MS MPH, MIT, Beth Israel Deaconess Medical Center
– Marzyeh Ghassemi PhD, Assistant Professor, University of Toronto
– Meredith Lee PhD, Executive Director, West Big Data Innovation Hub
Download webinar slides: bit.ly/wids_datathon_webinar_slides
More information: widsconference.org/datathon
Fanny Chevalier, Assistant Professor at University of Toronto delivers a Technical Vision Talk at WiDS Stanford University on March 2, 2020:
We are constantly required to make decisions about the world we live in. But are we good judges of how things work and what is best to do in each situation? Dr. Chevalier’s talk will explore why we may not always make well-informed decisions, even with the best intentions, and even when our motivations are driven by careful examination of data. She will challenge the ways we leverage data for analysis and communication, and propose strategies that embrace the imperfect, subjective nature of human perception.
Moderated by Margot Gerritsen, WiDS Co-Director, Stanford University
Panelists:
– Aslihan Demirkaya, Research Scientist, Vianai Systems, Inc
– Lucy Bernholz, PhD, Senior Research Scholar, Stanford Center on Philanthropy + Civil Society
– Lynn Kirabo, PhD student, Carnegie Mellon University
Moderated by Martina Lauchengco, Operating Partner, Board Member, Costanoa Ventures
Panelists:
– Rukmini Iyer, Distinguished Engineer, Bing Advertising Marketplace & Serving, AI & Research, Microsoft
– Talithia Williams, Associate Dean, Associate Professor of Mathematics, Harvey Mudd
– Denice Ross, Senior Fellow, National Conference on Citizenship Fellow, Georgetown University
– Lillian Carrasquillo, Insights Manager, Spotify
Tsu-Jae King Liu, Dean of the College of Engineering at the University of California, Berkeley, delivers a Keynote presentation at WiDS Stanford University on March 2, 2020:
Today we live in a dynamic and unpredictable world that is increasingly dependent on engineered devices, processes and systems. A 2017 workforce report by the McKinsey Global Institute indicates that all workers will need to adapt as their occupations evolve with increasingly capable machines. In the age of artificial intelligence (AI) and data science, workers will spend more time on activities that require social and emotional skills, creativity, high-level cognitive capabilities and other skills that are relatively hard to automate.
There is growing evidence of the importance of a high emotional quotient (EQ) as a predictor of success and organizational performance. In this talk, Professor Liu will share insights gained from her personal career journey and describe initiatives being undertaken in the College of Engineering at the University of California, Berkeley to cultivate EQ in their students and to advance equity and inclusion, toward a brighter future for all.
Rama Akkiraju, IBM Fellow and Director of AI Operations at IBM, delivers a Technical Vision Talk at WiDS Stanford University on March 2, 2020:
AI applications are proliferating in consumer and business domains around the world. Have you ever wondered how Siri, Google Home, Google Maps or Amazon Echo speaks to users in different countries in their local languages? How does an automated customer support chat bot that you are speaking or texting with understand and speak your local language to resolve your problems? AI models that power these applications have to speak the language of the user and the language of the business for them to be useful and relevant. Polyglot AI is not magic, just as AI itself is not magic! It takes a lot of hard work to teach AI to understand and speak new languages. In this talk, I’ll take you through some of the behind-the-scenes hard work of building multilingual natural language processing systems that enable AI to speak multiple languages.
Been Kim, Research Scientist at Google Brain delivers a Technical Vision Talk at WiDS Stanford University on March 2, 2020:
In this talk, Been will reflect on some of the progress made in the field of interpretable machine learning, on where the field is going, and on what we need to be aware of and careful about as we make progress. With that perspective, she will then discuss some of her recent work: 1) sanity checking popular methods and 2) developing a more layperson-friendly interpretability method.
Ya Xu, Head of Data Science at LinkedIn delivers a Technical Vision Talk at WiDS Stanford University on March 2, 2020:
At LinkedIn, data plays an essential role in achieving our vision of creating economic opportunity for every member of the global workforce. It is critical that we are not just using data to create opportunities, but creating them responsibly. This goes beyond just complying with regulations. It starts with taking data privacy protection seriously with Differential Privacy, and avoiding unintended consequences in both our products and ML models to ensure fairness. In this talk, Ya will share perspectives from her experience addressing these challenges at LinkedIn.
Emily Glassberg Sands, Head of Data Science at Coursera delivers a Technical Vision Talk at WiDS Stanford University on March 2, 2020:
Coursera is the world’s largest platform for higher education, providing 50 million learners access to life-transforming skills and credentials. With the rich data generated as over 50 million learners engage on the platform, we have the unique opportunity to use data science and machine learning to unlock high-quality teaching and learning at scale. This talk will take you behind-the-scenes of some of our latest data products — from the personalized coaching that motivates and unblocks learners, to the algorithmic skill scores that track real-time progress against career goals, to the human-in-the-loop systems accelerating grading and student support. We’ll touch on the math, the product, the impact, and our own learnings along the way.
Talithia Williams, Host of NOVA Wonders PBS & Associate Professor of Mathematics, Harvey Mudd College | @Dr_TalithiaW sits down with Sonia Tagare for WiDS 2020 in Stanford, CA.
#WiDS2020 #WomenInTech #theCUBE
https://siliconangle.com/2020/03/05/i…
Harvey Mudd College professor highlights importance of personal health data, diversity in tech
There’s no doubt that the use of data is valuable for businesses. But it’s not just companies that can benefit from data insights.
Individuals can and should also collect their own body data and use it to have a better life, according to Talithia Williams, associate dean and associate professor of mathematics at Harvey Mudd College.
“We have so many devices that collect data automatically for us, and often we don’t pause long enough to actually look at that history,” she said. “It’s really challenging people to think about how they can use data that they collect about their bodies to help make better health decisions.”
Williams spoke with Sonia Tagare, host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the Women in Data Science conference in Stanford, California. They discussed the ways in which people can obtain their own data, the importance of including women of color in the technology industry and the privacy challenges related to using data for business purposes.
Understanding the information
The information that people can collect about themselves includes, for example, blood pressure, blood sugar, and temperature. But actively interpreting the data is just as important as collecting it, according to Williams.
“It’s not like if you take this data, you will be healthier or you will live to 100,” she added. “It’s really a matter of challenging people to own the data that they have and get excited about understanding it.”
Data is also important in enabling individuals to set goals to change their lifestyle practices.
“When I take my heart rate data or my pulse, I’m really trying to see if I can get lower than how it was before,” Williams said. “So, the push is really how my exercise and my diet are changing so that I can bring my resting heart rate down.”
Diversity in STEM fields
With a doctorate in statistics, in addition to her role as a professor, Williams is host of a PBS program called “NOVA Wonders,” which “follows researchers as they tackle unanswered questions about life and the cosmos.” She also wrote the book “Power in Numbers: The Rebel Women of Mathematics,” which aims to inspire women of color to work in technology-related industries.
“I really wanted to highlight sort of where we have been, but also where we are going and the amazing women that are doing work on it,” she explained.
It’s the responsibility of those in STEM fields to find ways to advocate for women and especially for women of color, according to Williams.
“Often it takes someone who’s already at the table to invite other people to the table,” she said. “I think the onus is more on people who occupy those spaces already to think about how they can be more intentional in bringing diversity.”
John Hoegger, Principal Data Scientist Manager, Microsoft sits down with Sonia Tagare at Stanford University for WiDS 2020.
#WiDS2020 #WomenInTech #theCUBE
https://siliconangle.com/2020/03/03/i…
Diversity in hiring process helps Microsoft fuel opportunity for women data scientists
It’s no secret that the representation of women in the technology workforce is lower than it should be.
For predictive analytics professionals, a Burtch Works study showed that women comprised 26% of the workforce in 2019, an increase over the previous year of only 2%.
Microsoft Corp. is taking its own steps to change that. Through active participation in major industry gatherings, such as WiDS 2020, and paying close attention to the hiring process, the company is looking to change workforce percentages.
“We make sure that we have women on every set of interviews,” said John Hoegger, principal data science manager at Microsoft. “What’s it like to be a woman on this team? If it’s all men, you can’t answer that question. I’ve now got a team of 30 data scientists and half of them are women.”
Hoegger spoke with Sonia Tagare, host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the Women in Data Science conference in Stanford, California. They discussed the growth of WiDS over the past four years and advice for women seeking to join companies as data scientists.
From conference to a movement
When the Microsoft manager discovered that the WiDS event had only one sponsor — WalMart Labs — in its inaugural year, he quickly decided that his company should become a supporter too. The organization has since expanded its portfolio of global events to include “datathons” and the development of role models for women data scientists.
“It’s amazing to see how this event has grown over the four years,” Hoegger said. “There’s all of these new regional events that have been set up every year. It’s turned from just a conference into a movement.”
Microsoft has hosted various WiDS events at its headquarters in Redmond, Washington, along with New York and Boston, according to Hoegger. Asked about what advice he would offer for women seeking positions in the data science field, Hoegger encouraged an open approach that would provide candidates with useful insight into the company culture.
“Go to those interviews and ask what it’s like to be a woman on the team,” Hoegger said. “You want to ensure that the team you join and company you join are inclusive and really value diversity in the workforce.”
Lillian Carrasquillo, Insights Manager, Spotify sits down with Sonia Tagare for WiDS at Stanford University.
#WiDS2020 #WomenInTech #theCUBE
https://siliconangle.com/2020/03/04/d…
Diversity helps make data models fairer, says Spotify insights manager
Gender inequality in the tech industry has gained many headlines of late. For good reason. Diversity is essential to bring new perspectives and help, for example, to reduce the unfair bias of data models, according to Lillian Carrasquillo (pictured), insights manager at Spotify Technology SA.
“Because you are different, your voice is needed even more,” Carrasquillo said. “Your voice matters, and I always ask: How can I highlight your voice more?”
Carrasquillo spoke with Sonia Tagare, host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the Women in Data Science conference in Stanford, California. They discussed Spotify’s way of dealing with diversity, and Carrasquillo’s advice for women entering the data science field.
From recruiting to micro decisions about inclusion
Spotify has a team focused on diversity, which encourages all managers to think about inclusiveness, from recruitment to micro decisions that can be routinely made to create an inclusive environment, according to Carrasquillo.
“It’s not just about diversity. It’s also about making people feel like this is where they should be,” she stated.
With the increasing use of machine learning and artificial intelligence, diversity, both of individuals and of educational background, is essential in the search for fairer models, according to Carrasquillo.
“I think that a strong, collaborative, and even on an individual level across disciplinary education is really the only way that we’re going to be able to make connections to understand what kind of second-order effects we’re having based on the decisions of parameters for a model,” Carrasquillo explained.
And she knows what she’s talking about. With a diverse education — she has a degree in industrial mathematics and went to a liberal arts college “on purpose” — Carrasquillo leads people with different experiences on her team, which is responsible for thinking about data and algorithms that help power the larger personalization experiences across Spotify.
“I personally manage a data scientist and a user researcher, and the three of us collaborate highly together across our disciplines,” she pointed out.
For women who are leaving college now and going into data science, her advice is that they follow their interests. Because there are many different types of technology problems to solve, women do not need to seek only a data scientist title, Carrasquillo added.
“You can follow your interest and use your data science skills in ways that might require a lot of collaboration or mixed methods, or work within a team where there are different types of expertise coming together to work on problems,” she concluded.
Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of the Women in Data Science conference.
Lucy Bernholz, Senior Research Scholar, Stanford University | @p2173 sits down with Sonia Tagare for WiDS 2020 at Stanford University.
#WiDS2020 #WomenInTech #theCUBE
https://siliconangle.com/2020/03/04/d…
Diverse teams help build less biased algorithms, says Stanford researcher
As powerful as the benefits of artificial intelligence are, using biased data and defective AI models can cause a lot of damage.
To address that growing issue, human values must be integrated into the entire data science process, according to Lucy Bernholz (pictured), senior research scholar and director of the Digital Civil Society Lab at Stanford University.
“[Values] shouldn’t be a separate topic of discussion,” she said. “We need this conversation about what we’re trying to build for, who we’re trying to protect, how we’re trying to recognize individual human agency, and that has to be built in throughout data science.”
Bernholz spoke with Sonia Tagare, host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the Women in Data Science conference in Stanford, California. They discussed the importance of values in data science, why it is necessary to have a diverse team to build and analyze algorithms, and the work being done by the Digital Civil Society Lab.
Breaking the bias cycle
All data is biased because it is people who collect it, according to Bernholz. “And we’re building the biases into the data science and then exporting those tools into bias systems,” she highlighted. “And guess what? Problems are getting worse. So, let’s stop doing that.”
When creating algorithms and analyzing them, data scientists need to make sure that they are considering all the different types of people in the data set and understanding those people in context, Bernholz explained.
“We know perfectly well that women of color face a different environment than white men; they don’t walk through the world in the same way,” she explained. “And it’s ridiculous to assume that your shopping algorithm isn’t going to affect that difference that they experience in the real world.”
It is also necessary to have different profiles of people involved in the creation of the algorithms, as well as in the management of the companies, who can make decisions about whether and how to use them, she added.
“We need a different set of teaching mechanisms where people are actually trained to consider from the beginning what’s the intended positive, what’s the intended negative, and what is some likely negatives, and then decide how far they go down that path,” Bernholz concluded.
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of the Women in Data Science conference:
Ya Xu, Head of Data Science, LinkedIn sits down with Sonia Tagare for WiDS 2020 in Stanford, CA.
#WiDS2020 #WomenInTech #theCUBE
https://siliconangle.com/2020/03/04/l…
LinkedIn pursues its vision to leverage data for global economic opportunity
It would be a mistake to characterize LinkedIn Corp. as merely a job or networking website.
With 660 million users, LinkedIn has the ability to leverage a tremendous amount of data in ways that go far beyond the latest job switch or promotion. The field of data science is helping it transform economic structures on a worldwide scale.
“Everybody can benefit from better data and better data access,” said Ya Xu (pictured), head of data science at LinkedIn. “We truly believe in the vision that we are working towards, which is creating economic opportunity for every member of the global workforce.”
Xu spoke with Sonia Tagare, host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the Women in Data Science conference in Stanford, California. They discussed using data responsibly, the importance of diversity, and advice for women seeking a career in the data science field.
Data privacy is fundamental
Speaking at the WiDS conference this week, Xu addressed ways that responsible data can create global opportunity. Data privacy and diversity are fundamental components of that strategy, according to Xu.
“The fundamental thing that we have to start with is to be able to preserve the privacy of our members,” Xu said. “If you have a diverse team that is a representation of the customers you are serving, then you are able to come up with better features that are able to serve the needs of a population. That’s just the right thing to do.”
In her role as a data science leader for LinkedIn, Xu has advice for other women who may be seeking to follow their own careers in the field.
“Just have that ‘can do’ attitude,” Xu said. “We’re not any less than a man, and there are certainly many strong and talented women that we have in the field. Don’t let people’s perceptions or biases around you bring you down.”
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of the Women in Data Science conference:
Talithia Williams, Host of NOVA Wonders PBS & Associate Professor of Mathematics, Harvey Mudd College | @Dr_TalithiaW sits down with Sonia Tagare for WiDS 2020 in Stanford, CA.
#WiDS2020 #WomenInTech #theCUBE
https://siliconangle.com/2020/03/05/i…
Harvey Mudd College professor highlights importance of personal health data, diversity in tech
There’s no doubt that the use of data is valuable for businesses. But it’s not just companies that can benefit from data insights.
Individuals can and should also collect their own body data and use it to have a better life, according to Talithia Williams (pictured), associate dean and associate professor of mathematics at Harvey Mudd College.
“We have so many devices that collect data automatically for us, and often we don’t pause long enough to actually look at that history,” she said. “It’s really challenging people to think about how they can use data that they collect about their bodies to help make better health decisions.”
Williams spoke with Sonia Tagare, host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the Women in Data Science conference in Stanford, California. They discussed the ways in which people can obtain their own data, the importance of including women of color in the technology industry and the privacy challenges related to using data for business purposes.
Understanding the information
The information that people can collect about themselves includes, for example, blood pressure, blood sugar, and temperature. But actively interpreting the data is just as important as collecting it, according to Williams.
“It’s not like if you take this data, you will be healthier or you will live to 100,” she added. “It’s really a matter of challenging people to own the data that they have and get excited about understanding it.”
Data is also important in enabling individuals to set goals to change their lifestyle practices.
“When I take my heart rate data or my pulse, I’m really trying to see if I can get lower than how it was before,” Williams said. “So, the push is really how my exercise and my diet are changing so that I can bring my resting heart rate down.”
Diversity in STEM fields
With a doctorate in statistics, in addition to her role as a professor, Williams is host of a PBS program called “NOVA Wonders,” which “follows researchers as they tackle unanswered questions about life and the cosmos.” She also wrote the book “Power in Numbers: The Rebel Women of Mathematics,” which aims to inspire women of color to work in technology-related industries.
“I really wanted to highlight sort of where we have been, but also where we are going and the amazing women that are doing work on it,” she explained.
It’s the responsibility of those in STEM fields to find ways to advocate for women and especially for women of color, according to Williams.
“Often it takes someone who’s already at the table to invite other people to the table,” she said. “I think the onus is more on people who occupy those spaces already to think about how they can be more intentional in bringing diversity.”
Join us for the fifth annual Women in Data Science (WiDS) Conference on March 2. This one-day, technical conference features outstanding women doing outstanding data science work in academia, industry, government, and non-profits. The conference broadcast features keynotes, technical talks, an ethics panel, and a career panel, interspersed with speaker interviews. For more information: https://widsconference.org
One year has passed since I discovered WiDS (Women in Data Science) and I have since become an advocate in Japan. The WiDS initiative aims to inspire and educate data scientists worldwide, regardless of gender, and to support women in the field. Since the 4th WiDS Tokyo@YCU workshop wrapped up, my motivation increased further. So in reflection, I decided to look back at the first time I encountered WiDS and chronicle my passion.
Timnit Gebru, a research scientist and technical co-lead of Google’s Ethical Artificial Intelligence Team, explains the importance of advocating for diversity, inclusion and ethics in AI on the Women in Data Science (WiDS) Podcast.
Talia Tron attended her first WiDS conference in Israel after hearing about it during a job interview with Intuit. Her experience at this conference sealed her decision to join Intuit…
Sarah Rice, a senior information architect, is thrilled with how WiDS provides role models and inspiration for young women entering the field of data science. While she personally had an early aptitude for math, her early college environment did not encourage women to pursue math and computer science. Instead she studied social sciences and went on to earn her Master’s degree in Library and Information Sciences. She now combines those two disciplines in her consulting work in user experience design.
Shir Meir Lador, data science team lead at Intuit in Israel, develops machine learning models for security, risk and fraud in products like Quickbooks, Turbo Tax and Mint. In addition to her job at Intuit, Lador is a WiDS ambassador in Israel, has her own podcast about data science, and is a co-founder of PyData Tel Aviv meetups.
Rania Ahmed says WiDS has provided her with the confidence, skills and a network to take charge of her data science career. She holds an M.A. in Urban Affairs from the University of San Francisco and a B.S. in Urban Planning from Cairo University in Egypt. Rania is an alumnus of the first Wellesley College’s cohort of Women in Public Service Project (WPSP) and has been a Stanford University Women in Data Science (WiDS) Ambassador since 2018. She is currently a Research Associate at Urban Strategies Council, where she uses data science to develop policy recommendations for the public good in the San Francisco Bay Area.
Natalie Michelle Evans Harris, a leader on ethical and responsible use of data, explains on the Women in Data Science (WiDS) Podcast how building trust through a shared vision and data “code of ethics” is essential to promote both innovation and privacy.
The Women in Data Science (WiDS) initiative aims to inspire and educate data scientists worldwide, regardless of gender, and support women in the field. Watch highlights from WiDS 2019 Stanford.
WiDS 2019 Career Panel moderated by Margot Gerritsen, Senior Associate Dean, Stanford University
Panelists:
– Natalie Evans Harris; Co-founder and Head of Strategy Initiatives, BrightHive Inc.
– Marzyeh Ghassemi; Assistant Professor, University of Toronto
– Emily Glassberg Sands; Head of Data Science and Data Engineering, Coursera
– Yinglian Xie; CEO and Co-Founder, DataVisor
Yoky Matsuoka, Vice President, Google Health in conversation with Lori Sherer, Partner, Bain & Company
Marzyeh Ghassemi, Assistant Professor, University of Toronto
Professor Marzyeh Ghassemi tackles part of this puzzle with machine learning. This talk will cover some of the novel technical opportunities for machine learning in health challenges, and the important progress to be made with careful application to domain.
Anima Anandkumar, Professor of Computing and Mathematical Sciences at CalTech and Director of Research in Machine Learning, NVIDIA.
Standard deep-learning algorithms are based on a function-fitting approach that does not exploit any domain knowledge or constraints. This makes them unsuitable in applications that have limited data or require safety or stability guarantees, such as robotics. By infusing structure and physics into deep-learning algorithms, we can overcome these limitations. There are several ways to do this. For instance, we use tensorized neural networks to encode multidimensional data and higher-order correlations. We infuse symbolic expressions into deep learning to obtain strong generalization. We utilize spectral normalization of neural networks to guarantee stability and apply it to stable landing of quadrotor drones. These instances demonstrate that building structure into ML algorithms can lead to significant gains.
Timnit Gebru, Research Scientist on the Ethical AI Team, Google
Automated decision making tools are currently used in high stakes scenarios. From natural language processing tools used to automatically determine one’s suitability for a job, to health diagnostic systems trained to determine a patient’s outcome, machine learning models are used to make decisions that can have serious consequences on people’s lives. In spite of the consequential nature of these use cases, vendors of such models are not required to perform specific tests showing the suitability of their models for a given task. Nor are they required to provide documentation describing the characteristics of their models, or disclose the results of algorithmic audits to ensure that certain groups are not unfairly treated.
I will show some examples to examine the dire consequences of basing decisions entirely on machine learning based systems, and discuss recent work on auditing and exposing the gender and skin tone bias found in commercial gender classification systems. I will end with the concept of an AI datasheet to standardize information for datasets and pre-trained models, in order to push the field as a whole towards transparency and accountability.
Madeleine Udell, Assistant Professor of Operations Research & Information Engineering, & Richard and Sybil Smith Sesquicentennial Fellow, Cornell University
Data scientists are often faced with the challenge of understanding a high dimensional data set organized as a table. These tables may have columns of different (sometimes, non-numeric) types, and often have many missing entries. In this talk, we discuss how to use low rank models to analyze these big messy data sets.
Low rank models perform well across a wide range of data science applications, including recommender systems, movie references, topic models, medical records, and genomics. In this talk, we introduce the mathematics of low rank models, demonstrate a few surprising applications of low rank models in data science, and present a simple mathematical explanation for their efficacy.
Margot Gerritsen, Karen Matthys, and Judy Logan, Co-Directors of ICME at Stanford University, open the WiDS 2019 Conference held at Stanford University on March 4, 2019.
Kavita Sangwan, Director, Technical Programs, Artificial Intelligence and Machine Learning, Intuit sits down with Lisa Martin at Stanford University for WiDS 2019.
#WiDS2019 #Intuit #theCUBE
https://siliconangle.com/2019/03/06/d…
Data and company: Intuit’s recipe for successful customer obsession
Many vendors are talking the customer-first talk lately. But who’s walking the walk? What does it take to win customers’ loyalty among so many digitally armed competitors?
The recipe combines technological and human ingredients, according to Kavita Sangwan (pictured), director of technical programs, artificial intelligence and machine learning, at Intuit Inc. For example, Intuit uses customer data to build products with AI and machine learning. This requires the business and financial software company to build trust with its customers.
“Intuit operates with the principle and the mindset that this is our customers’ data and we are their stewards,” Sangwan said. “So we make sure that we are one of the best stewards for their data.”
Their customers will be able to tell as soon as they use the resulting products. The result is a loop that reinforces Intuit’s bonds with its customers. “That’s what we reflect in our products: how we serve them, build intelligent products for them. And that’s how we start to gain trust from our customers,” she added.
Sangwan spoke with Lisa Martin (@LisaMartinTV), host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the Stanford Women in Data Science event in Stanford, California. They discussed Intuit’s technological and cultural methods for staying in touch with its customer base.
Pouring more than tech in the product pot
Intuit goes further than gathering data points from customers. It strives to maintain a team of product developers and others that understand customers on a human level. This is how they are able to deliver products they want and need.
“It’s very important for us to build a culture which reflects the values and the talents and the skills of our customers,” Sangwan said.
The company strives to put together a team with diverse skills to build smart products that are easy for end users to adopt. “It’s very important for us to operate in a team setting,” she stated. “A data scientist has to interact with a product manager, a data engineer, a business person, a legal person.”
The diverse skills and knowledge these team members bring are necessary to build its products, Sangwan concluded.
Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of the Stanford Women in Data Science event.
Kristina Draper, Technology Division Executive, Consumer Bank & Services Technology, Wells Fargo | @kristinadraper sits down with Lisa Martin at Stanford University for WiDS 2019.
#WiDS2019 #WellsFargo #theCUBE
https://siliconangle.com/2019/03/05/q…
Q&A: Wells Fargo aims for 100-percent data transparency in new era of consumer trust
The big data explosion has created transformative innovation opportunities for technology, as well as businesses across industries. As consumers better understand their piece in that data puzzle and the market begins to find its footing in a data-driven digital landscape, companies must adopt a responsibility around transparency to maintain trust and efficiency.
Greater visibility around data-driven processes can also lead to more comprehensive solutions through interdisciplinary collaboration, according to Kristina Draper (pictured), chief technology officer at Wells Fargo & Co.
Draper spoke with Lisa Martin (@LisaMartinTV), host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the Stanford Women in Data Science event in Stanford, California. They discussed the role data is playing in a new era of accountability at Wells Fargo, as well as how Draper is reaching beyond the financial industry for greater innovation opportunities.
[Editor’s note: The following answers have been condensed for clarity.]
Tell a little about your involvement in WiDS, as well as Wells Fargo’s involvement as a sponsor.
Draper: We believe so strongly that in the consumer bank space we have a tremendous opportunity and responsibility to understand how our customers interact with Wells Fargo, and that will require a discipline around data science. We had an opportunity this year to be an executive sponsor and jumped at it. I think we’ll continue to be at that sponsor level in future years.
You were recently named one of the 50 most powerful women in technology. What are some of the [ways] Wells Fargo is re-imagining data and trust? What have you seen of the evolution of females in technology and leadership roles?
Draper: The recognition [of] women in technology … is an opportunity to demonstrate that we should be very confident in the value that we bring as leaders, and that confidence as a woman is hard to come by. I think of my own personal career and the way that doors were opened for me along the way; often we are our own worst enemies. We second guess ourselves, we second guess our value, and we have to really work for that seat at the table.
My coming back to Wells was really … as a leader in technology. I felt I could make a real impact. When I think about what we can do as women leaders in technology and in data science, a lot of it is owning that accountability to leadership and paving the way for leaders behind us. There comes a part in a career, certainly mine, where you’re no longer thinking about the next job for yourself.
We’re in a consumer banking space and financial services, so there’s certainly a lot of places to innovate [and] think about how technology can help to serve a Wells Fargo customer. You need your bank throughout your entire life. Whether you are thinking about a home purchase, an auto purchase, college for your children, retirement, there’s so many big markers in life. And that’s where I get excited about not only the leadership role that I have now, but I have the opportunity to bring a team with me to contribute real value.
You have a pay-it-forward attitude. How are you using that to expand your team … to continue this big re-imagining that Wells Fargo as a business is undergoing?
Draper: WiDS is … a tremendous network opportunity. [I’m] so inspired about how they’re turning data science and really thinking about different problems [and] ways we can improve not only our lives, but the lives of future generations to come.
I come from a financial services background, but the problems that our future generations will face can’t be solved with just one lens. You can’t solve problems with just a financial services expertise or just a technical expertise. It’s the space in between art and science. It’s an ability to think across industry and apply solutions and innovation that have been brought forward through other industries, through other companies, through other academia, and thinking about how that could apply in solving the problems that we’re faced with in the financial services space.
If I turned some of the problems that we’re faced with upside down and thought about it with that perspective, and invited some collaboration to help solve problems, we might come up with a better answer.
How can financial services and the data that you deal with help customers?
…
Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of the Stanford Women in Data Science event.
Janet George, “Fellow” Chief Data Officer/Scientist/Big Data/Cognitive Computing, Western Digital sits down with Lisa Martin at Stanford University for WiDS 2019.
#WiDS2019 #WesternDigital #theCUBE
https://siliconangle.com/2019/03/07/q…
Q&A: How AI is cultivating a responsible community to better mankind
Artificial intelligence initiatives powered by big data are propelling businesses beyond the capacity of human labor. While AI tech offers an undeniable opportunity for innovation, it has also sparked a debate around potential misuse through the vast reach of programmed biases and other problematic behaviors.
The power of AI can be comprehensively harnessed for good by fostering diverse teams focused on ethical solutions and working in tandem with policymakers to ensure responsible scale, according to Janet George (pictured), fellow and chief data officer at WD, a Western Digital Company.
George spoke with Lisa Martin (@LisaMartinTV), host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the Stanford Women in Data Science event in Stanford, California. They discussed the range of possibilities in AI and how WD is leveraging the technology toward sustainability.
[Editor’s note: The following answers have been condensed for clarity.]
Tell us about Western Digital’s continued sponsorship and what makes this important to you.
George: Western Digital has recently transformed itself … and we are a data-driven … data-infrastructure company. This momentum of AI is a foundational shift in the way we do business. Businesses are realizing that they’re going to be in two categories, the ‘have’ and the ‘have not.’ In order to be in the have category, you have to embrace AI … data … [and] scale. You have to transform yourself to put yourself in a competitive position. That’s why Western Digital is here.
How has Western Digital transformed to harness AI for good?
George: We are not just a company that focuses on business for AI. One of the initiatives we are doing is AI for Good and … Data for Good … working with the UN. We’ve been focusing on trying to figure out the data that impacts climate change. Collecting data and providing infrastructure to stow massive amounts of species data in the environment that we’ve never actually collected before. Climate change is a huge area for us, education … [and] diversity. We’re using all of these areas as a launching pad for Data for Good and trying to use data … and AI to better mankind.
Now we have the data to put out massively predictive models that can help us understand what the change would look like 25 years from now and take corrective action. We know carbon emissions are causing very significant damage to our environment and there’s something we can do about it. Data is helping us do that. We have the infrastructure, economies of scale. We can build massive platforms that can stow this data and then we can analyze this data at scale. We have enough technology now to adapt to our ecosystem … and be better in the next 10 years.
What are your thoughts on data scientists taking something like a Hippocratic Oath to start owning accountability for the data that they’re working with?
George: We need a diversity of data scientists to have multiple models that are completely diverse, and we have to be very responsible when we start to create. Creators have to be responsible for their creation. Where we get into tricky areas are when you are the human creator of an AI model, and now the AI model has self-created because it has self-learned. Who owns the copyright to those when AI becomes the creator? The group of people that are responsible for creating the environment, creating the models, the question comes into how do we protect the authors, the users, the producers, and the new creators of the original piece of art.
You can use the creation for good or bad. The creation recreates itself, like AI learning, on its own with massive amounts of data after an original data scientist has created the model. Laws have to change; policies have to change. Innovation has to go, and at the same time, we have to be responsible about what we innovate.
Where are we as a society in starting to understand the different principles and practices that have to be implemented in order for proper management of data to enable innovation?
George: We’re debating the issues. We’re coming together as a community. We’re having discussions with experts. What are we seeing as the longevity of that AI model in a business setting, in a non-business setting? How does the AI perform? We are now able to see the sustained performance of the AI model.
…
Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of the Stanford Women in Data Science event.
Srujana Kaddevarmuth, Senior Manager, Data Science, Accenture & Women in Data Science Ambassador, Bengaluru | @Srujanadev sits down with Lisa Martin at Stanford University for WiDS 2019.
#WiDS2019 #Accenture #theCUBE
https://siliconangle.com/2019/03/11/w…
WiDS Datathon mixes up data science with collaborative teams
If only a data set and some pre-packaged data-analytics software were all it takes to solve real-world problems. The reality is that tools require hands to ply them. And just like a comprehensive data set is better than a limited one, a comprehensive set of skills helps people design better solutions.
“Looking at the problem from different perspectives and collaboration are the keys to be able to be successful in data science,” said Srujana Kaddevarmuth (pictured), data science and analytics executive at Accenture LLP and ambassador for the Women in Machine Learning & Data Science team in Bengaluru (formerly Bangalore).
Take a problem like deforestation from palm-oil plantations. Consider all the factors that might be involved: agriculture, climate, ecology, economics, politics, etc. What are the odds that one random data expert can ask all the right questions, pull together all the necessary data, and derive actionable insight? Probably not great.
This is the thinking behind collaborative data-science projects, like the Women in Data Science, or WiDS, Datathon. This year, it organized several teams to collaborate and use data and satellite imagery to analyze this particular problem.
Kaddevarmuth spoke with Lisa Martin (@LisaMartinTV), host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the Stanford Women in Data Science event in Stanford, California. They discussed this year’s Datathon and why collaboration results in better outcomes for data scientists.
From clueless to Kaggle code in three weeks
At the WiDS Bengaluru regional event, organizers set up a community workshop. The goal was to form teams to participate in the Datathon. They would submit the fruits of their endeavors to something called Kaggle, a platform for data-science projects and competitions. In India, Kaggle participation is very male heavy “despite that region having amazing female data scientists who are innovators in their space with multiple patents, publications and innovations to their credit,” Kaddevarmuth said.
WiDS paired mentors with participating teams to work together for three weeks. One team from the engineering division that was brand new to Kaggle learned new concepts, honed skills in deep learning and neural networks, and submitted original code to the Kaggle leaderboard.
“They were not the top-scoring team, but this entire experience of being able to collaborate, look at the problem from different perspectives, and be able to submit the code despite a lot of these challenges — and also navigate the platform in itself — was a decent achievement from my perspective,” Kaddevarmuth concluded.
Hear from Women in Data Science (WiDS) ambassadors worldwide about why they are bringing WiDS to their cohort, colleagues, or community.
Leda Braga, Chief Executive Officer at Systematica Investments, delivers a Keynote presentation at the WiDS 2018 Conference held at Stanford University.
Objective analysis of relevant data can improve the execution of most businesses. From the simple client feedback form through to production statistics, listening to the data helps. In the investment management industry, by contrast, data analysis IS the business. Investment management is information management and data science is not an aid to decision making, but rather the essence of it.
This talk will explore the reality of investment management, how recent developments in data and AI are shaping the fund management industry, and the challenges of dealing with financial data. In the context of the WiDS forum and its clear focus on diversity, trends such as ethical investing (or Socially Responsible Investing, SRI) will also be discussed.
A Career Panel moderated by Margot Gerritsen, with questions from the audience. Panelists include:
– Elena Grewal, Head of Data Science at Airbnb
– Bhavani Thuraisingham, Professor of Computer Science at University of Texas at Dallas
– Ziya Ma, VP Software and Services Group and Director of Big Data Technologies at Intel Corporation
– Jennifer Prendki, Head of Data Science at Atlassian
Margot Gerritsen, Director of ICME, delivers Closing Remarks at the WiDS 2018 Conference held at Stanford University on March 5, 2018
Latanya Sweeney, Professor of Government and Technology in Residence at Harvard University, delivers a Keynote presentation, Data Science to Save the World, at the WiDS 2018 Conference held at Stanford University on March 5, 2018.
Technology designers are the new policy makers. No one elected them and most people do not know their names, but the arbitrary decisions they make when producing the latest gadgets and online innovations dictate the code by which we conduct our daily lives and govern our countries. As technology progresses, every societal value and every state rule comes up for grabs and will likely be redefined by what technology enables or not. Data science allows us to do experiments to show how it all fits together or falls apart. Come to this talk and see how data science can help save the world.
Daniela Witten, Associate Professor of Statistics and Biostatistics at University of Washington, presents More Data, More (Statistical) Problems at the WiDS 2018 Conference held at Stanford University on March 5, 2018.
By now, virtually every field has become inundated with big data. We have been promised that this data will usher in a new era of previously unimaginable societal and scientific progress. While it is certainly true that more data brings with it incredible opportunities, it is also true that more data can bring new and previously unimaginable statistical challenges. I will talk about some of those statistical challenges, as well as statistical ways to solve them. Examples will be taken from biomedical research.
Experience the Women in Data Science (WiDS) Conference held at Stanford University on February 3, 2017. You’ll hear from speakers, students, and industry attendees about what they learned, and why they came.
Dr. Deborah Frincke leads the Research Directorate of the National Security Agency (NSA), the largest “in-house” research organization in the U.S. Intelligence Community. She also serves as the NSA Science Advisor and Innovation Champion, and is a recipient of the President’s Meritorious Rank Award. In her presentation, Dr. Frincke will discuss NSA’s unclassified research programs and describe how the Research Directorate supports national-level missions. She will provide key insights on data science challenges facing NSA and the nation.
Dr. Frincke talks about Mission-Oriented Research; how rock climbing is similar to sorting through messy data; and how adversarial machine learning is an area of active research.
The upstream oil & gas industry (i.e. the exploration for and production of hydrocarbons) needs to reap the benefits of new technology to improve efficiency. Making more effective use of increasing amounts of collected data is on the verge of transforming the business.
Transformation through data analytics is equally relevant on both the operational and financial sides of the business.
On the upstream operational side: for decades now, we have been inventing new and increasingly sophisticated tools (both hardware and software) to generate new data types that extend the boundaries of geoscience knowledge, and allow us to understand our hydrocarbon reservoirs in ever increasing detail. Historically, we have processed only a fraction of the data collected, but that is changing. Now, among the most important criteria governing the efficiency of oil and gas companies are the hugely increased volume of data collected but also the variety, velocity and veracity of information that can be extracted from that data. That’s data science! Data analytics as a discipline is now increasingly integrated within our upstream workflows in drilling, reservoir characterization and the actual production (extraction) of hydrocarbons in the most economically efficient ways possible. To this end, one goal is the development of an analytics platform that will perform a key role in increasing productivity through the simultaneous optimization of drilling planning and execution, the improvement of asset utilization and the overall reduction of non-productive time.
On the financial side: the oil and gas industry has a long history of being secretive and, as a result, judgment of the quality and accuracy of non-technical data has proved very difficult. In general, insufficient attention has been paid to addressing these challenges leading to unnecessary volatility in price movements through inadequate or conflicting data, and this volatility impacts decision-making within companies. In the information age, where markets react instantaneously to a multitude of data sources, it is time to understand better this key driver of our industry. Decision-enabling information is extremely critical to the efficient functioning of an industry that is driven by the signals coming from commercial markets. Understanding the quality and accuracy of that information through data science is a key enabler in filling a major gap currently preventing more effective management of oil and gas company assets.
Digital transformation, implying the transition from desktop to the cloud and mobile devices, easy access to information, new scalable online services and automated industrial workflows, is about to radically change the way we work in any industry (oil and gas, defense, transport, automotive, medicine, telecom, logistics, etc.). This is no longer a trend, but a reality clearly demonstrated by the world’s most valuable companies adopting expanded and enhanced data analytics in response to common drivers of operational efficiency, operational safety and accuracy of real-time decision-making. That’s the promise of Big Data, to really understand the systems that make up our technological industry. As you begin to understand the interactions of all the constituent components then you can build systems that are better and more effective at addressing the key industry drivers, irrespective of the industry. New technology is increasingly playing a huge new role. Data is the new oil!
Dr. Gottlib-Zeh describes how data science is transforming the oil and gas industry for better planning and efficiency, for both drilling and production.
Dr. Holmes shares a survey of the current challenges in the analyses of heterogeneous biological data. Combining networks, contingency tables and data from multiple omics domains provides the analysts with multiple choices. The result can be an erroneous p-value or a complicated workflow, both of which can be irreproducible. I will survey some of the recent approaches to this challenge.
Dr. Susan Holmes, Professor of Statistics, describes processes for analyzing large messy microbiome data sets, and the importance of reproducibility.
Lori Sherer, Partner, Bain & Co. and Caitlin Smallwood, VP Science and Algorithms, Netflix
Designing visualizations requires a problem-driven approach, beginning with a deep understanding of the need, and continuing with a close collaboration with domain experts to guide the design of algorithms, visual encodings, and interaction mechanisms. I’ll describe this process, and the role of visualization tools for deriving meaning and providing insights with complex, multivariate datasets.
Miriah Meyer, University of Utah
On November 2, 2015, 400 data scientists gathered at Stanford for the inaugural Women in Data Science (WiDS) Conference. This one-day technical conference boasted an all-women speaker lineup – which, according to conference host Margot Gerritsen, director of the Stanford Institute for Computational and Mathematical Engineering (ICME), defies statistics and probabilities in technological fields. World-class speakers including Caitlin Smallwood, vice president for science and algorithms at Netflix; Jennifer Chayes, Microsoft research scientist; and Fei-Fei Li, Stanford computer science professor, came together to inspire, educate and highlight the impact women are making in this field.
Director of the Stanford Institute for Computational and Mathematical Engineering (ICME), Margot Gerritsen discusses her commitment to encouraging companies and institutes to increase the number of women in computational math and STEM fields to 30% by 2030.
Panel Discussion: Career Paths in Data Science
Moderator: Kara Swisher, Re/code
Panelists:
Jennifer Tour Chayes, Microsoft Research
Aleksandra Korolova, USC
Shubha Nabar, Salesforce
Bin Yu, UC Berkeley