 
 
 \usepackage{longtable,booktabs,array}
+\newcounter{none} % for unnumbered tables
 \usepackage{calc} % for calculating minipage widths
 % Correct order of tables after \paragraph or \subparagraph
 \usepackage{etoolbox}
@@ -291,6 +292,14 @@ \subsubsection{Unit I --- Experimental Data as a Learning Problem (Weeks
 Why ML failure modes are common in experimental science.
 \end{itemize}
 
+\textbf{Summary:} This unit introduces the transition from classical
+physics-based modeling to data-driven discovery in materials science. We
+explore the unique challenges of experimental materials data, including
+its multi-modal nature, high acquisition cost, and the fundamental
+Processing-Structure-Property-Performance (PSPP) relationships. Key
+concepts include data scales, measurement uncertainty, and the CRISP-DM
+process adapted for scientific workflows.
+
 \textbf{Exercise:}\\
 Inspect real microscopy and process datasets; identify sources of bias
 and noise.
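The bias-and-noise part of the exercise can be prototyped numerically. A minimal sketch on a synthetic micrograph (all values illustrative): pixel differencing suppresses the smooth signal, and a median absolute deviation gives a robust estimate of the noise level.

```python
import numpy as np

rng = np.random.default_rng(5)
# Synthetic "micrograph": smooth structure plus additive Gaussian noise
clean = np.outer(np.sin(np.linspace(0, 3, 256)), np.cos(np.linspace(0, 3, 256)))
noisy = clean + 0.05 * rng.normal(size=clean.shape)

# Horizontal pixel differences remove most of the smooth signal; the
# median absolute deviation (MAD) resists outliers such as hot pixels.
d = np.diff(noisy, axis=1).ravel()
mad = np.median(np.abs(d - np.median(d)))
# 1.4826 converts MAD to a Gaussian sigma; sqrt(2) undoes the differencing
sigma_hat = 1.4826 * mad / np.sqrt(2)   # should be close to the true 0.05
```

The same estimate applied to two instruments (or two imaging sessions) gives a quick check for systematic differences in noise level, one concrete source of dataset bias.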
@@ -315,6 +324,15 @@ \subsubsection{Unit I --- Experimental Data as a Learning Problem (Weeks
 Relation to MFML refresher on PCA and covariance.
 \end{itemize}
 
+\textbf{Summary:} This unit bridges the gap between the physical process
+of data acquisition and the mathematical tools used to describe it. We
+analyze how signals are formed in characterization tools and how
+physical constraints (resolution, noise, sampling) act as priors for
+learning. We then introduce Principal Component Analysis (PCA) and
+Singular Value Decomposition (SVD) as fundamental techniques for
+discovering low-dimensional structure in high-dimensional experimental
+datasets.
+
 \textbf{Exercise:}\\
 Fourier inspection of micrographs; effects of sampling and filtering.
 
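The PCA/SVD idea summarized above reduces to a few lines of numpy. A sketch on synthetic "spectra" that are mixtures of two latent components (all data artificial): the SVD of the centered data matrix reveals the low-dimensional structure.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic spectra: 200 samples mixing 2 latent components, plus noise,
# so the data matrix is approximately rank 2.
components = rng.normal(size=(2, 50))
weights = rng.uniform(size=(200, 2))
X = weights @ components + 0.01 * rng.normal(size=(200, 50))

Xc = X - X.mean(axis=0)            # center before PCA
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)    # variance ratio per component

scores = Xc @ Vt[:2].T             # project every spectrum onto 2 PCs
```

With this construction, the first two components capture nearly all the variance; on real spectra the decay of `explained` tells you how many physically distinct sources the dataset plausibly contains.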
@@ -336,6 +354,15 @@ \subsubsection{Unit I --- Experimental Data as a Learning Problem (Weeks
 Why ``good accuracy'' often means a broken pipeline.
 \end{itemize}
 
+\textbf{Summary:} This unit focuses on the most critical and often
+overlooked part of the ML pipeline: data integrity. We discuss
+systematic data cleaning and normalization techniques while highlighting
+the unique challenges of labeling experimental materials data, such as
+inter-annotator variance. A major focus is on \textbf{Data Leakage},
+specifically how spatial and physical correlations in materials samples
+can lead to deceptively high model performance. We introduce robust
+validation strategies to ensure models generalize to truly unseen data.
+
 \textbf{Exercise:}\\
 Construct a deliberately flawed ML pipeline and diagnose its failure.
 
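The leakage mechanism described above can be demonstrated directly. In the synthetic setup below (a hypothetical construction, not from any real dataset), labels carry no generalizable signal at all, yet a crop-level random split looks excellent because near-duplicate crops of the same specimen land on both sides of the split; a specimen-level split reveals chance performance.

```python
import numpy as np

rng = np.random.default_rng(42)
n_spec, crops = 40, 10
# Each specimen has a unique feature "fingerprint"; all crops from that
# specimen are near-duplicates. Labels are assigned per specimen at
# random, so there is NO generalizable signal -- only memorizable identity.
fingerprints = rng.normal(size=(n_spec, 8))
spec_labels = rng.integers(0, 2, size=n_spec)

X = np.repeat(fingerprints, crops, axis=0) + 0.05 * rng.normal(size=(n_spec * crops, 8))
y = np.repeat(spec_labels, crops)
groups = np.repeat(np.arange(n_spec), crops)

def nn1_accuracy(Xtr, ytr, Xte, yte):
    # 1-nearest-neighbour classifier: the simplest model that can leak
    d = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    return float((ytr[d.argmin(1)] == yte).mean())

# Random split: crops of the same specimen end up on both sides
idx = rng.permutation(len(X))
tr, te = idx[: len(X) // 2], idx[len(X) // 2:]
acc_random = nn1_accuracy(X[tr], y[tr], X[te], y[te])   # deceptively high

# Group split: whole specimens held out, so nothing can be memorized
test_mask = groups >= n_spec // 2
acc_group = nn1_accuracy(X[~test_mask], y[~test_mask],
                         X[test_mask], y[test_mask])    # near chance
```

The gap between `acc_random` and `acc_group` is exactly the deliberately flawed pipeline the exercise asks for.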
@@ -363,6 +390,16 @@ \subsubsection{Unit II --- Representation Learning for Microstructures
 Transition to learned representations.
 \end{itemize}
 
+\textbf{Summary:} This unit marks the transition from classical,
+hand-crafted microstructure quantification (like grain size and phase
+fractions) to the modern paradigm of \textbf{learned representations}.
+We first review traditional stereological metrics and their limitations
+in capturing complex structural nuances. We then introduce the
+foundational unit of modern ML: the \textbf{artificial neuron}. By
+understanding weights, biases, and non-linear activation functions, we
+build the framework for Multi-Layer Perceptrons (MLPs) that can
+automatically learn optimal features from materials data.
+
 \textbf{Exercise:}\\
 Compare classical features vs simple NN-based features for
 microstructure tasks.
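A two-neuron hidden layer is enough to see why the non-linear activation matters. The sketch below uses hand-set (not learned) weights to compute XOR, a function no purely linear model can represent; with the ReLU removed, the two layers collapse into one linear map and the construction fails.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)   # non-linear activation

# Inputs: all four binary pairs; target: XOR = [0, 1, 1, 0]
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hand-set weights and biases for a 2-neuron hidden layer:
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])     # both hidden neurons sum the two inputs
b1 = np.array([0.0, -1.0])      # ...but fire at different thresholds
w2 = np.array([1.0, -2.0])      # output layer combines them linearly
b2 = 0.0

h = relu(X @ W1 + b1)           # hidden activations: [s, relu(s - 1)]
y = h @ w2 + b2                 # network output: 0, 1, 1, 0
```

Training replaces the hand-set numbers with values found by gradient descent, but the representational role of weights, biases, and the non-linearity is exactly the one shown here.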
@@ -386,6 +423,16 @@ \subsubsection{Unit II --- Representation Learning for Microstructures
 Overfitting risks with small datasets.
 \end{itemize}
 
+\textbf{Summary:} This unit introduces \textbf{Convolutional Neural
+Networks (CNNs)}, the workhorse of modern computer vision, and applies
+them to materials characterization. We explore how convolutions allow
+networks to automatically learn hierarchical structure detectors---from
+simple edges to complex phase morphologies---while drastically reducing
+the number of parameters compared to standard MLPs. Through case studies
+in phase segmentation and defect detection, students learn the intuition
+behind filters, pooling, and the unique challenges of applying deep
+learning to high-resolution, noisy experimental micrographs.
+
 \textbf{Exercise:}\\
 Train a small CNN on microstructure images; analyze failure cases.
 
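The convolution-and-weight-sharing intuition fits in plain numpy. In the sketch below the Sobel filter is hand-set, whereas a CNN would learn such filters from data; the toy two-phase image is purely illustrative.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2D cross-correlation with one filter (no padding, stride 1)."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Toy two-phase "micrograph": phase boundary between columns 7 and 8
img = np.zeros((16, 16))
img[:, 8:] = 1.0

# Hand-set vertical-edge detector (Sobel); a CNN learns filters like this
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
resp = conv2d(img, sobel_x)     # strong response only at the boundary

# Parameter economy: this filter has 9 shared weights wherever it is
# applied; a dense layer mapping 16x16 -> 14x14 would need
# 16*16 * 14*14 = 50176 weights for the same output size.
```

The response map is zero inside each phase and large only where the window straddles the boundary, which is the mechanism behind learned edge and phase-boundary detectors.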
@@ -407,6 +454,16 @@ \subsubsection{Unit II --- Representation Learning for Microstructures
 When transfer learning helps---and when it does not.
 \end{itemize}
 
+\textbf{Summary:} This unit addresses the fundamental bottleneck of
+materials informatics: \textbf{Data Scarcity}. We explore how to build
+powerful deep learning models when only a few hundred labeled images or
+signals are available. The core focus is on \textbf{Transfer Learning},
+where we leverage knowledge from models pretrained on millions of
+natural images to accelerate learning and improve generalization on
+materials tasks. We also cover \textbf{Data Augmentation} strategies
+tailored for scientific data and discuss when and why transferring
+knowledge across different physical domains succeeds or fails.
+
 \textbf{Exercise:}\\
 Fine-tune a pretrained model; compare against training from scratch.
 
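One augmentation strategy tailored to scientific data can be sketched directly, under the assumption that the microstructure has no preferred orientation, so all eight symmetries of the square are equally valid observations:

```python
import numpy as np

def dihedral_augment(img):
    """All 8 symmetries of a square image (4 rotations x optional mirror).
    Physically justified only when the material has no preferred
    orientation; for textured samples some symmetries change the label."""
    views = []
    for k in range(4):
        r = np.rot90(img, k)
        views.append(r.copy())
        views.append(np.fliplr(r).copy())
    return views

rng = np.random.default_rng(1)
crop = rng.normal(size=(8, 8))        # stand-in for a micrograph crop
augmented = dihedral_augment(crop)    # 8 training samples from 1 label
```

Unlike generic photo augmentations (color jitter, large crops), these transforms preserve the physical content of the measurement, which is why they transfer safely to scientific images.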
@@ -432,6 +489,17 @@ \subsubsection{Unit III --- Learning from Processing Data (Weeks
 Relation to MFML concepts of generalization.
 \end{itemize}
 
+\textbf{Summary:} This unit explores the application of machine learning
+to \textbf{Time-Series Data}, specifically for monitoring and predicting
+materials processing outcomes. We introduce \textbf{Recurrent Neural
+Networks (RNNs)} and their advanced variants like \textbf{LSTMs}, which
+are designed to handle sequential dependencies. We discuss the critical
+preprocessing steps of signal smoothing and triggering required to
+handle noisy experimental logs. Through case studies in additive
+manufacturing and process stability, students learn how to build models
+that ``remember'' the processing history to predict future states and
+detect anomalies in real-time.
+
 \textbf{Exercise:}\\
 Predict a process outcome from time-series data using regression or
 simple RNNs.
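The smoothing-and-triggering step can be sketched with an exponentially weighted moving average; its single recursive state is the simplest form of the ``memory'' that an RNN hidden state generalizes. The process log below is synthetic, with an illustrative drift standing in for, e.g., gradual nozzle degradation.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
# Synthetic process log: stable sensor reading plus noise, with a slow
# drift starting at step 300
x = 1.0 + 0.05 * rng.normal(size=n)
x[300:] += 0.002 * np.arange(n - 300)

def ewma(x, alpha=0.05):
    """Exponentially weighted moving average: s[i] depends on the whole
    history through one recursive state, like a (linear) RNN cell."""
    s = np.empty_like(x)
    s[0] = x[0]
    for i in range(1, len(x)):
        s[i] = alpha * x[i] + (1 - alpha) * s[i - 1]
    return s

smooth = ewma(x)
baseline = smooth[:300].mean()
# Trigger: first step where the smoothed signal leaves a fixed margin
trigger = int(np.argmax(smooth > baseline + 0.1))
```

The raw signal crosses the margin many times by noise alone; the smoothed signal triggers once, shortly after the drift begins, which is the behavior wanted for real-time anomaly detection.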
@@ -454,6 +522,16 @@ \subsubsection{Unit III --- Learning from Processing Data (Weeks
 Robustness as a design criterion.
 \end{itemize}
 
+\textbf{Summary:} This unit shifts the focus from model performance to
+\textbf{Model Reliability}. We explore the Bias-Variance tradeoff and
+the fundamental challenge of generalization---ensuring that an ML model
+works on new, unseen data from the factory floor. We introduce robust
+validation techniques like K-Fold and Stratified Cross-Validation to
+stabilize performance estimates on small materials datasets. A key focus
+is on \textbf{Process Robustness}, where we use sensitivity analysis to
+identify ``Process Windows''---regions in parameter space where material
+quality is maximized and insensitive to industrial noise.
+
 \textbf{Exercise:}\\
 Analyze model robustness under perturbed process conditions.
 
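K-fold cross-validation fits in a few lines when paired with an ordinary least-squares model; the process dataset below is synthetic and purely illustrative. Reporting the mean and spread across folds is what stabilizes the estimate on small datasets.

```python
import numpy as np

rng = np.random.default_rng(0)
# Small synthetic process dataset: 60 runs, 3 parameters, linear response
X = rng.normal(size=(60, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=60)

def k_fold_scores(X, y, k=5, seed=0):
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        te = folds[i]                                   # held-out fold
        tr = np.concatenate([f for j, f in enumerate(folds) if j != i])
        w, *_ = np.linalg.lstsq(X[tr], y[tr], rcond=None)  # fit on k-1 folds
        scores.append(float(np.mean((X[te] @ w - y[te]) ** 2)))
    return np.array(scores)

mse = k_fold_scores(X, y)
# Report mean +/- spread across folds instead of trusting one lucky split
estimate, spread = mse.mean(), mse.std()
```

A large `spread` relative to `estimate` is itself a diagnostic: it signals that the dataset is too small, or too heterogeneous, for any single train/test split to be trusted.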
@@ -475,6 +553,17 @@ \subsubsection{Unit III --- Learning from Processing Data (Weeks
 Physics-informed vs unconstrained regression.
 \end{itemize}
 
+\textbf{Summary:} This unit explores \textbf{Inverse Problems}---the
+cornerstone of materials design where we seek the processing parameters
+required to achieve a target microstructure or performance. We contrast
+these with causal forward problems and discuss why they are often
+ill-posed and multi-valued. We introduce \textbf{Physics-Informed
+Learning} as a way to solve these challenges by enriching models with
+physical transformations and constraints. Students learn how to build
+and interpret \textbf{Process Maps} and ``Process Corridors,'' using
+machine learning to visualize safe operating regions in complex
+experimental spaces.
+
 \textbf{Exercise:}\\
 Construct a simple ML-based process map; compare constrained vs
 unconstrained models.
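The forward-map/process-window idea can be sketched with a hypothetical analytic quality function standing in for a trained regressor (parameter names and numbers are illustrative, not from any real process); the same sketch shows why the inverse is multi-valued.

```python
import numpy as np

# Hypothetical forward model: predicted quality as a smooth function of
# laser power P and scan speed v (stand-in for a trained regressor)
def quality(P, v):
    return np.exp(-((P - 200.0) / 50.0) ** 2 - ((v - 0.8) / 0.3) ** 2)

# Forward direction: evaluate the model on a parameter grid
P, v = np.meshgrid(np.linspace(100, 300, 201), np.linspace(0.2, 1.4, 201))
Q = quality(P, v)
window = Q >= 0.8            # "process window": all settings meeting spec

# Inverse direction is multi-valued: at fixed speed, the same target
# quality is reached at two different powers, so inversion has no
# unique answer without extra constraints.
line = quality(np.linspace(100, 300, 201), 0.8)
crossings = int(np.count_nonzero(np.diff(np.sign(line - 0.5))))  # two roots
```

Plotting `window` over the (P, v) grid is the simplest ML-based process map; physics-informed constraints enter by restricting the grid, or the model, to physically admissible regions.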
@@ -501,6 +590,17 @@ \subsubsection{Unit IV --- Uncertainty, Surrogates, and Automation
 Using ML without destroying physical meaning.
 \end{itemize}
 
+\textbf{Summary:} This unit focuses on the processing of
+high-dimensional \textbf{Characterization Signals} (like XRD, EDS, and
+EELS) using unsupervised learning. We introduce \textbf{K-Means
+Clustering} and \textbf{t-SNE} for the automatic identification and
+visualization of phases in large experimental libraries. We then explore
+\textbf{Autoencoders}---neural networks that learn to compress complex
+spectra into a low-dimensional ``latent space.'' This allows for
+advanced denoising and feature extraction, enabling scientists to handle
+the massive data volumes produced by modern high-throughput
+characterization tools without losing physical insight.
+
 \textbf{Exercise:}\\
 Apply PCA/NMF to spectral datasets; interpret components physically.
 
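K-Means (Lloyd's algorithm) written out from scratch on synthetic two-phase spectra; the farthest-point initialization is a choice made here to keep the toy example deterministic, not part of the standard algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)
# Two synthetic "phases", each with a characteristic reference spectrum
base = rng.normal(size=(2, 40))
X = np.vstack([base[0] + 0.1 * rng.normal(size=(50, 40)),
               base[1] + 0.1 * rng.normal(size=(50, 40))])

def kmeans(X, k=2, iters=50):
    # Farthest-point initialization: deterministic, avoids empty clusters
    c = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - ci) ** 2).sum(1) for ci in c], axis=0)
        c.append(X[d.argmax()])
    C = np.array(c)
    for _ in range(iters):                       # Lloyd iterations:
        labels = ((X[:, None, :] - C[None]) ** 2).sum(-1).argmin(1)  # assign
        C = np.array([X[labels == j].mean(0) for j in range(k)])     # update
    return labels, C

labels, centroids = kmeans(X)   # recovers the two phases without labels
```

The recovered centroids approximate the two reference spectra, which is the sense in which clustering "identifies phases" in spectral libraries; an autoencoder would replace the hard assignment with a learned continuous latent space.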
@@ -647,7 +747,19 @@ \subsection{Lab Possibilities}\label{lab-possibilities}
 Multi-modal fusion of images, spectra, and process parameters.
 \end{itemize}
 
-\phantomsection\label{refs}
+\textbf{Summary:} This unit explores the cutting edge of
+\textbf{Autonomous Characterization}, where machine learning moves from
+passive data analysis to active instrument control. We introduce
+\textbf{Multi-Modal Data Fusion} techniques to combine information from
+diverse sensors like SEM images, EDS spectra, and process logs using
+Bayesian frameworks. We then discuss \textbf{Reinforcement Learning
+(RL)} as a tool for automating complex laboratory tasks, such as
+instrument tuning and process optimization. Through case studies in
+microscopy and industrial processing, students learn how to build
+integrated pipelines that can autonomously find, characterize, and
+decide on the next steps of an experiment.
+
+\protect\phantomsection\label{refs}
 \begin{CSLReferences}{1}{0}
 \bibitem[\citeproctext]{ref-sandfeld2024materials}
 Sandfeld, S. (2024). \emph{Materials data science: Introduction to data
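The simplest Bayesian multi-modal fusion rule is the precision-weighted combination of two Gaussian measurement estimates; the modality names and numbers below are purely illustrative.

```python
# Two noisy estimates of the same quantity (e.g. a phase fraction) from
# different modalities, each with its own uncertainty (std dev):
mu_img, s_img = 0.32, 0.05    # from image segmentation (illustrative)
mu_eds, s_eds = 0.28, 0.02    # from EDS quantification (illustrative)

# Bayesian fusion of two Gaussians = precision-weighted average:
w_img, w_eds = 1 / s_img**2, 1 / s_eds**2
mu_fused = (w_img * mu_img + w_eds * mu_eds) / (w_img + w_eds)
s_fused = (w_img + w_eds) ** -0.5   # always tighter than either input
```

The fused estimate is pulled toward the more precise modality, and its uncertainty is smaller than either input's, which is the quantitative payoff of fusing sensors instead of picking one.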