The Data Mining Society of INFORMS is organizing the 17th INFORMS Workshop on Data Mining and Decision Analytics in conjunction with the 2022 INFORMS Annual Meeting. You are cordially invited to join us and share your recent research with peers in data mining, decision analytics, and artificial intelligence.
To participate, authors must submit a full paper before the submission deadline. The workshop committee is also holding a best paper competition with separate tracks for theoretical and applied research. All accepted papers are automatically considered for the best paper competition in their chosen track.
Topics of Interest
Topics include, but are not limited to:
- Analytics in Social Media & Finance
- Anomaly Detection
- Bayesian Data Analytics
- Causal Mining (Inference)
- Data Science and Artificial Intelligence
- Deep Learning
- Emerging Data Analytics in Industrial Applications
- Ethics and Security in Data Mining
- Fairness in Machine Learning
- Healthcare Analytics
- Interpretable Data Mining
- Large-scale Data Analytics and Big Data
- Longitudinal Data Analysis
- Network Analysis and Graph Mining
- Privacy & Fairness in Data Science
- Reinforcement Learning
- Reliability & Maintenance
- Simulation/Optimization in Data Analytics
- Text Mining & Natural Language Processing
- Visual Analytics
- Web Analytics/Web Mining
May 16: Paper submission begins
August 8: Paper submission closes
September 1: Final review decision
September 14: Workshop on Data Mining and Decision Analytics registration deadline
DM Workshop Co-chairs
Nathan Gaw, Air Force Institute of Technology
Eyyub Kibis, Montclair State University
Feng Liu, Stevens Institute of Technology
Paul Brooks, Virginia Commonwealth University
Matthew Lanham, Purdue University
Ramin Moghaddass, University of Miami
Asil Oztekin, University of Massachusetts Lowell
Cynthia Rudin, Duke University
George Runger, Arizona State University
Onur Seref, Virginia Tech
Durai Sundaramoorthi, Washington University
Paper submission guidelines
- Maximum of 10 pages (including abstract, tables, figures, and references)
- Single spacing and 11-point font, with one-inch margins on all four sides
- Papers must be submitted via the provided submission link (TBD). Late submissions will not be considered for review.
- Copyright: The DM workshop will not retain copyright on the papers. Authors are free to submit their papers to other outlets.
Click here to view the full DMDA Workshop schedule.
Understanding How Dimension Reduction Tools Work
Cynthia Rudin, Professor of Computer Science, Duke University
Dimension reduction (DR) techniques such as t-SNE, UMAP, and TriMap have demonstrated impressive visualization performance on many real world datasets. They are useful for understanding data and trustworthy decision-making, particularly for biological data. One tension that has always faced these methods is the trade-off between preservation of global structure and preservation of local structure: past methods can either handle one or the other, but not both. In this work, our main goal is to understand what aspects of DR methods are important for preserving both local and global structure: it is difficult to design a better method without a true understanding of the choices we make in our algorithms and their empirical impact on the lower-dimensional embeddings they produce. Towards the goal of local structure preservation, we provide several useful design principles for DR loss functions based on our new understanding of the mechanisms behind successful DR methods. Towards the goal of global structure preservation, our analysis illuminates that the choice of which components to preserve is important. We leverage these insights to design a new algorithm for DR, called Pairwise Controlled Manifold Approximation Projection (PaCMAP), which preserves both local and global structure. Our work provides several unexpected insights into what design choices both to make and avoid when constructing DR algorithms.
The following papers will be discussed:
- Yingfan Wang, Haiyang Huang, Cynthia Rudin, and Yaron Shaposhnik. Understanding How Dimension Reduction Tools Work: An Empirical Approach to Deciphering t-SNE, UMAP, TriMap, and PaCMAP for Data Visualization. Journal of Machine Learning Research (JMLR), 2021. https://jmlr.org/papers/v22/20-1061.html
- Haiyang Huang, Yingfan Wang, Cynthia Rudin, and Edward P. Browne. Towards a Comprehensive Evaluation of Dimension Reduction Methods for Transcriptomic Data Visualization. Communications Biology (Nature), 2022. https://www.nature.com/articles/s42003-022-03628-x
About Cynthia Rudin
Cynthia Rudin is a professor of computer science, electrical and computer engineering, statistical science, mathematics, and biostatistics & bioinformatics at Duke University, and directs the Interpretable Machine Learning Lab. Previously, Prof. Rudin held positions at MIT, Columbia, and NYU. She holds an undergraduate degree from the University at Buffalo and a Ph.D. from Princeton University. She is the recipient of the 2022 Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity from the Association for the Advancement of Artificial Intelligence (AAAI). This is the most prestigious award in the field of artificial intelligence; comparable to world-renowned recognitions such as the Nobel Prize and the Turing Award, it carries a monetary reward at the million-dollar level. Prof. Rudin is also a three-time winner of the INFORMS Innovative Applications in Analytics Award and a 2022 Guggenheim Fellow. In 2015, she was named one of the “Top 40 Under 40” by Poets and Quants and one of the 12 most impressive professors at MIT by Businessinsider.com. She is a fellow of the American Statistical Association, the Institute of Mathematical Statistics, and AAAI.
Prof. Rudin is past chair of both the INFORMS Data Mining Section and the Statistical Learning and Data Science Section of the American Statistical Association. She has also served on committees for DARPA, the National Institute of Justice, AAAI, and ACM SIGKDD. She has served on several committees for the National Academies of Sciences, Engineering and Medicine, including the Committee on Applied and Theoretical Statistics, the Committee on Law and Justice, the Committee on Analytic Research Foundations for the Next-Generation Electric Grid, and the Committee on Facial Recognition Technology. She has given keynote/invited talks at several conferences including KDD (twice), AISTATS, SDM, Machine Learning in Healthcare (MLHC), Fairness, Accountability and Transparency in Machine Learning (FAT-ML), ECML-PKDD, and the Nobel Conference. Her work has been featured in news outlets including the NY Times, Washington Post, Wall Street Journal, the Boston Globe, Businessweek, and NPR.
Forecasting 2.0 – New Ways to See Around Corners
Kirk Borne, Chief Science Officer, DataPrime, Inc.
Predictive modeling and predictive analytics are among the most common industry and business applications of data science and machine learning. I will first review some interesting (amusing and/or impactful) failure cases of traditional forecasting and predictive modeling. These include traditional autoregressive time series forecasting, which I refer to as “forecasting 1.0”. I will then introduce some approaches to predictive analytics that are different from standard forecasting 1.0. These novel “forecasting 2.0” methods are more contextual, exploiting the insights that come from external contextual data sources. Context-based methods are therefore more beneficial than autoregressive methods in the current data-intensive era in which data sources and data formats extend far beyond traditional time series. Contextual analytics approaches also enable opportunities for prescriptive analytics (causal analysis and causality discovery), which are similar to O.R., but again these go beyond traditional methods of optimization through the application of data from multiple diverse sensors, including the exploding growth of data sources in the IoT (Internet of Things). In this environment, I envision the IoT as the “Internet of Context” enabling “Forecasting-as-a-Service” (FaaS). Several examples and algorithm categories will be presented to illustrate diverse forecasting 2.0 applications.
About Kirk Borne
Dr. Kirk Borne is the Chief Science Officer at AI startup DataPrime Inc and is the owner and founder of his own freelance consulting business, Data Leadership Group LLC. He is a career data professional, data science leader, and research astrophysicist. From 2015 to 2021, he was Principal Data Scientist, Data Science Fellow, and Executive Advisor at management consulting firm Booz Allen Hamilton. Previously, Kirk was professor of Astrophysics and Computational Science at George Mason University for 12 years, where he co-founded the world’s first data science undergraduate degree program, and where he did research and taught data science at the graduate and undergraduate levels. Before that, he spent 20 years supporting data systems activities for NASA space science missions, including a role as NASA’s Data Archive Project Scientist for the Hubble Telescope. He has a Ph.D. in astronomy from Caltech. He is an elected Fellow of the International Astrostatistics Association for his contributions to big data research in astronomy. In 2020, he was elected a Fellow of the American Astronomical Society for lifelong contributions to the field of astronomy. Since 2013, he has been identified as a top worldwide influencer on social media, promoting analytics, data science, machine learning, AI, and data literacy for all. He is currently an advisor to several businesses and educational institutions. Most recently, he has been exploring the synergies and innovation opportunities at the convergence of multiple emerging digital technologies: IoT, the metaverse, digital twins, intelligent edge, immersive realities, autonomous systems, and more!
Joint Panel Discussion with Quality, Statistics, and Reliability (QSR) Workshop: Fairness and Interpretability in AI/ML
Many black-box machine learning models are being used for high-stakes decisions in healthcare, manufacturing, social media, and various other fields. As a result, these decisions are highly susceptible to bias against particular demographic groups, and it is often difficult to understand why a model makes a given prediction. This panel will cover recent topics and developments across a number of applications for which fairness and interpretability of machine learning models are crucial.
Dr. Cynthia Rudin is a professor of computer science and engineering at Duke University. She directs the Interpretable Machine Learning Lab, and her goal is to design predictive models that people can understand. Her lab applies machine learning in many areas, such as healthcare, criminal justice, and energy reliability. She holds degrees from the University at Buffalo and Princeton. She is the recipient of the 2022 Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity from the Association for the Advancement of Artificial Intelligence (the “Nobel Prize of AI”). She received a 2022 Guggenheim fellowship, and is a fellow of the American Statistical Association, the Institute of Mathematical Statistics, and the Association for the Advancement of Artificial Intelligence. Her work has been featured in many news outlets including the NY Times, Washington Post, Wall Street Journal, and Boston Globe.
Are you a student or practitioner doing applied work in data mining or data science? Click here for more information on the Poster Competition.