INFORMS 17th Workshop on Data Mining & Decision Analytics

Saturday, October 15

The Data Mining Society of INFORMS is organizing the 17th INFORMS Workshop on Data Mining and Decision Analytics in conjunction with the 2022 INFORMS Annual Meeting. You are cordially invited to join us and share your recent research with peers from data mining, decision analytics, and artificial intelligence.

To participate, authors must submit a full paper before the submission deadline. The workshop committee is also holding a best paper competition in both the theoretical and applied research tracks. All accepted papers are automatically considered for the best paper competition in their chosen track.

Click here to view the full DMDA Workshop schedule.

Registration

Students and retirees: $75
Professionals: $150

Topics of Interest

Include, but are not limited to:

Analytics in Social Media & Finance
Anomaly Detection
Bayesian Data Analytics
Causal Mining (Inference)
Data Science and Artificial Intelligence
Deep Learning
Emerging Data Analytics in Industrial Applications
Ethics and Security in Data Mining
Fairness in Machine Learning
Healthcare Analytics
Interpretable Data Mining
Large-scale Data Analytics and Big Data
Longitudinal Data Analysis
Network Analysis and Graph Mining
Privacy & Fairness in Data Science
Reinforcement Learning
Reliability & Maintenance
Simulation/Optimization in Data Analytics
Text Mining & Natural Language Processing
Visual Analytics
Web Analytics/Web Mining

Timeline

May 16: Paper submission begins
August 8: Paper submission closes
September 1: Final review decision
September 14: Workshop on Data Mining and Decision Analytics registration deadline

DM Workshop Co-chairs

Nathan Gaw, Air Force Institute of Technology
Eyyub Kibis, Montclair State University
Feng Liu, Stevens Institute of Technology

DM Workshop Management Committee

Paul Brooks, Virginia Commonwealth University
Matthew Lanham, Purdue University
Ramin Moghaddass, University of Miami
Asil Oztekin, University of Massachusetts Lowell
Cynthia Rudin, Duke University
George Runger, Arizona State University
Onur Seref, Virginia Tech
Durai Sundaramoorthi, Washington University

Paper Submission Guidelines

Maximum of 10 pages (including abstract, tables, figures, and references)
Single spacing and 11-point font with one-inch margins on all four sides
Papers must be submitted via the provided submission link (TBD). Late submissions will not be considered for review.
Copyright: The DM workshop will not retain the copyrights on the papers. Authors are free to submit their papers to other outlets.

Academic Keynote

Understanding How Dimension Reduction Tools Work

Cynthia Rudin, Professor of Computer Science, Duke University

Dimension reduction (DR) techniques such as t-SNE, UMAP, and TriMap have demonstrated impressive visualization performance on many real-world datasets. They are useful for understanding data and for trustworthy decision-making, particularly for biological data. One tension that has always faced these methods is the trade-off between preservation of global structure and preservation of local structure: past methods can handle either one or the other, but not both. In this work, our main goal is to understand what aspects of DR methods are important for preserving both local and global structure: it is difficult to design a better method without a true understanding of the choices we make in our algorithms and their empirical impact on the lower-dimensional embeddings they produce. Towards the goal of local structure preservation, we provide several useful design principles for DR loss functions based on our new understanding of the mechanisms behind successful DR methods. Towards the goal of global structure preservation, our analysis illuminates that the choice of which components to preserve is important. We leverage these insights to design a new algorithm for DR, called Pairwise Controlled Manifold Approximation Projection (PaCMAP), which preserves both local and global structure. Our work provides several unexpected insights into which design choices to make and which to avoid when constructing DR algorithms.

The following papers will be discussed:

Yingfan Wang, Haiyang Huang, Cynthia Rudin, and Yaron Shaposhnik. Understanding How Dimension Reduction Tools Work: An Empirical Approach to Deciphering t-SNE, UMAP, TriMAP, and PaCMAP for Data Visualization. Journal of Machine Learning Research (JMLR), 2021. https://jmlr.org/papers/v22/20-1061.html
Haiyang Huang, Yingfan Wang, Cynthia Rudin, and Edward P. Browne. Towards a Comprehensive Evaluation of Dimension Reduction Methods for Transcriptomic Data Visualization. Communications Biology (Nature), 2022. https://www.nature.com/articles/s42003-022-03628-x

About Cynthia Rudin

Cynthia Rudin is a professor of computer science, electrical and computer engineering, statistical science, mathematics, and biostatistics & bioinformatics at Duke University, and directs the Interpretable Machine Learning Lab. Previously, Prof. Rudin held positions at MIT, Columbia, and NYU. She holds an undergraduate degree from the University at Buffalo and a Ph.D. from Princeton University. She is the recipient of the 2022 Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity from the Association for the Advancement of Artificial Intelligence (AAAI). This award is the most prestigious award in the field of artificial intelligence. Like world-renowned recognitions such as the Nobel Prize and the Turing Award, it carries a monetary reward at the million-dollar level. Prof. Rudin is also a three-time winner of the INFORMS Innovative Applications in Analytics Award and a 2022 Guggenheim Fellow. In 2015, she was named one of the “Top 40 Under 40” by Poets and Quants and one of the 12 most impressive professors at MIT by Businessinsider.com. She is a fellow of the American Statistical Association, the Institute of Mathematical Statistics, and AAAI.

Prof. Rudin is past chair of both the INFORMS Data Mining Section and the Statistical Learning and Data Science Section of the American Statistical Association. She has also served on committees for DARPA, the National Institute of Justice, AAAI, and ACM SIGKDD. She has served on several committees for the National Academies of Sciences, Engineering and Medicine, including the Committee on Applied and Theoretical Statistics, the Committee on Law and Justice, the Committee on Analytic Research Foundations for the Next-Generation Electric Grid, and the Committee on Facial Recognition Technology. She has given keynote/invited talks at several conferences including KDD (twice), AISTATS, SDM, Machine Learning in Healthcare (MLHC), Fairness, Accountability and Transparency in Machine Learning (FAT-ML), ECML-PKDD, and the Nobel Conference. Her work has been featured in news outlets including the NY Times, Washington Post, Wall Street Journal, the Boston Globe, Businessweek, and NPR.

Industry Keynote

Forecasting 2.0 – New Ways to See Around Corners

Kirk Borne, Chief Science Officer, DataPrime, Inc.

Predictive modeling and predictive analytics are among the most common industry and business applications of data science and machine learning. I will first review some interesting (amusing and/or impactful) failure cases of traditional forecasting and predictive modeling. These include traditional autoregressive time series forecasting, which I refer to as “forecasting 1.0”. I will then introduce some approaches to predictive analytics that are different from standard forecasting 1.0. These novel “forecasting 2.0” methods are more contextual, exploiting the insights that come from external contextual data sources. Context-based methods are therefore more beneficial than autoregressive methods in the current data-intensive era in which data sources and data formats extend far beyond traditional time series. Contextual analytics approaches also enable opportunities for prescriptive analytics (causal analysis and causality discovery), which are similar to O.R., but again these go beyond traditional methods of optimization through the application of data from multiple diverse sensors, including the exploding growth of data sources in the IoT (Internet of Things). In this environment, I envision the IoT as the “Internet of Context” enabling “Forecasting-as-a-Service” (FaaS). Several examples and algorithm categories will be presented to illustrate diverse forecasting 2.0 applications.

About Kirk Borne

Dr. Kirk Borne is the Chief Science Officer at AI startup DataPrime Inc. and is the owner and founder of his own freelance consulting business, Data Leadership Group LLC. He is a career data professional, data science leader, and research astrophysicist. From 2015 to 2021, he was Principal Data Scientist, Data Science Fellow, and Executive Advisor at management consulting firm Booz Allen Hamilton. Previously, Kirk was professor of Astrophysics and Computational Science at George Mason University for 12 years, where he co-founded the world’s first data science undergraduate degree program, did research, and taught data science at the graduate and undergraduate levels. Before that, he spent 20 years supporting data systems activities for NASA space science missions, including a role as NASA’s Data Archive Project Scientist for the Hubble Telescope. He has a Ph.D. in astronomy from Caltech. He is an elected Fellow of the International Astrostatistics Association for his contributions to big data research in astronomy. In 2020, he was elected a Fellow of the American Astronomical Society for lifelong contributions to the field of astronomy. Since 2013, he has been identified as a top worldwide influencer on social media, promoting analytics, data science, machine learning, AI, and data literacy for all. He is currently an advisor to several businesses and educational institutions. Most recently, he has been exploring the synergies and innovation opportunities at the convergence of multiple emerging digital technologies: IoT, the metaverse, digital twins, intelligent edge, immersive realities, autonomous systems, and more!

Joint Panel Discussion with Quality, Statistics, and Reliability (QSR) Workshop: Fairness and Interpretability in AI/ML

Many black-box machine learning models are being used for high-stakes decisions in healthcare, manufacturing, social media, and various other fields. As a result, these models are highly susceptible to bias against particular demographic groups, and it is often difficult to understand why they make the predictions they do. This panel will cover recent topics and developments across a number of applications for which fairness and interpretability of machine learning models are crucial.

Panelists:

Dr. Cynthia Rudin is a professor of computer science and engineering at Duke University. She directs the Interpretable Machine Learning Lab, and her goal is to design predictive models that people can understand. Her lab applies machine learning in many areas, such as healthcare, criminal justice, and energy reliability. She holds degrees from the University at Buffalo and Princeton. She is the recipient of the 2022 Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity from the Association for the Advancement of Artificial Intelligence (the “Nobel Prize of AI”). She received a 2022 Guggenheim fellowship, and is a fellow of the American Statistical Association, the Institute of Mathematical Statistics, and the Association for the Advancement of Artificial Intelligence. Her work has been featured in many news outlets including the NY Times, Washington Post, Wall Street Journal, and Boston Globe.

Dr. Na Zou is currently a Corrie & Jim Furber ’64 assistant professor in Engineering Technology and Industrial Distribution at Texas A&M University. She was an Instructional Assistant Professor in Industrial and Systems Engineering at Texas A&M University from 2016 to 2020. She holds both a Ph.D. in Industrial Engineering and an MSE in Civil, Environmental and Sustainable Engineering from Arizona State University. Her research focuses on fair and interpretable machine learning, transfer learning, and network modeling and inference, and is supported by NSF and industrial sponsors. Her research projects have resulted in publications in prestigious journals such as Technometrics, IISE Transactions and ACM Transactions, including one Best Paper Finalist and one Best Student Paper Finalist in the INFORMS QSR section and two featured articles in ISE Magazine. She is a recipient of the IEEE Irv Kaufman Award and a Texas A&M Institute of Data Science Career Initiation Fellow.

Dr. Kinjal Basu is currently a Senior Staff Software Engineer in LinkedIn’s AI team, primarily focusing on Responsible AI, encompassing challenging problems in Fairness, Explainability and Privacy. He leads several efforts that can be applied to different product applications towards making LinkedIn a responsible and equitable platform. Over the years, Dr. Basu has worked on a variety of problems and product applications. His focus has ranged from developing prediction models for complex recommender systems powering News Feed Ranking and People You May Know (PYMK) to extremely large-scale optimization for complex matching and allocation problems. He has been the chief architect and designer for the AutoML library used internally by various teams such as Feed, Notifications, Ads and PYMK. Dr. Basu has also worked towards developing accurate causal estimates in the presence of network interference.

Poster Competition

Are you a student or practitioner working on applied work in the fields of data mining or data science? Click here for more information on the Poster Competition.

Previous Workshops

16th Virtual INFORMS Workshop on Data Mining and Decision Analytics

15th Virtual INFORMS Workshop on Data Mining and Decision Analytics

14th INFORMS Workshop on Data Mining and Decision Analytics

13th INFORMS Workshop on Data Mining and Decision Analytics