
Commit 0ea76eb

Author: Quarto GHA Workflow Runner
Commit message: Built site for gh-pages
1 parent a295081 commit 0ea76eb

18 files changed (+940, -448 lines)

.nojekyll

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-53315200
+8070aba6

_tex/index.tex

Lines changed: 113 additions & 1 deletion
@@ -65,6 +65,7 @@


 \usepackage{longtable,booktabs,array}
+\newcounter{none} % for unnumbered tables
 \usepackage{calc} % for calculating minipage widths
 % Correct order of tables after \paragraph or \subparagraph
 \usepackage{etoolbox}
@@ -291,6 +292,14 @@ \subsubsection{Unit I --- Experimental Data as a Learning Problem (Weeks
 Why ML failure modes are common in experimental science.
 \end{itemize}

+\textbf{Summary:} This unit introduces the transition from classical
+physics-based modeling to data-driven discovery in materials science. We
+explore the unique challenges of experimental materials data, including
+its multi-modal nature, high acquisition cost, and the fundamental
+Processing-Structure-Property-Performance (PSPP) relationships. Key
+concepts include data scales, measurement uncertainty, and the CRISP-DM
+process adapted for scientific workflows.
+
 \textbf{Exercise:}\\
 Inspect real microscopy and process datasets; identify sources of bias
 and noise.
@@ -315,6 +324,15 @@ \subsubsection{Unit I --- Experimental Data as a Learning Problem (Weeks
 Relation to MFML refresher on PCA and covariance.
 \end{itemize}

+\textbf{Summary:} This unit bridges the gap between the physical process
+of data acquisition and the mathematical tools used to describe it. We
+analyze how signals are formed in characterization tools and how
+physical constraints (resolution, noise, sampling) act as priors for
+learning. We then introduce Principal Component Analysis (PCA) and
+Singular Value Decomposition (SVD) as fundamental techniques for
+discovering low-dimensional structure in high-dimensional experimental
+datasets.
+
 \textbf{Exercise:}\\
 Fourier inspection of micrographs; effects of sampling and filtering.

@@ -336,6 +354,15 @@ \subsubsection{Unit I --- Experimental Data as a Learning Problem (Weeks
 Why ``good accuracy'' often means a broken pipeline.
 \end{itemize}

+\textbf{Summary:} This unit focuses on the most critical and often
+overlooked part of the ML pipeline: data integrity. We discuss
+systematic data cleaning and normalization techniques while highlighting
+the unique challenges of labeling experimental materials data, such as
+inter-annotator variance. A major focus is on \textbf{Data Leakage},
+specifically how spatial and physical correlations in materials samples
+can lead to deceptively high model performance. We introduce robust
+validation strategies to ensure models generalize to truly unseen data.
+
 \textbf{Exercise:}\\
 Construct a deliberately flawed ML pipeline and diagnose its failure.

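One leakage-safe validation strategy of the kind the summary above alludes to is splitting by physical specimen rather than by individual patch, so spatially correlated patches never straddle the train/test boundary. The group ids and counts here are hypothetical:

```python
# Group-wise split: hold out whole specimens, not random image patches,
# to avoid spatial leakage between train and test sets.
import numpy as np

rng = np.random.default_rng(1)
n_patches = 120
specimen_id = rng.integers(0, 6, size=n_patches)  # 6 physical samples

held_out = {4, 5}                                 # specimens reserved for test
test_mask = np.isin(specimen_id, list(held_out))
train_idx = np.where(~test_mask)[0]
test_idx = np.where(test_mask)[0]

# No specimen contributes patches to both sides of the split.
assert set(specimen_id[train_idx]).isdisjoint(set(specimen_id[test_idx]))
print(len(train_idx), "train patches,", len(test_idx), "test patches")
```

A purely random patch-level split would pass the usual size checks while silently leaking specimen-level correlations, which is how "deceptively high model performance" arises.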
@@ -363,6 +390,16 @@ \subsubsection{Unit II --- Representation Learning for Microstructures
 Transition to learned representations.
 \end{itemize}

+\textbf{Summary:} This unit marks the transition from classical,
+hand-crafted microstructure quantification (like grain size and phase
+fractions) to the modern paradigm of \textbf{learned representations}.
+We first review traditional stereological metrics and their limitations
+in capturing complex structural nuances. We then introduce the
+foundational unit of modern ML: the \textbf{artificial neuron}. By
+understanding weights, biases, and non-linear activation functions, we
+build the framework for Multi-Layer Perceptrons (MLPs) that can
+automatically learn optimal features from materials data.
+
 \textbf{Exercise:}\\
 Compare classical features vs simple NN-based features for
 microstructure tasks.
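The weights/bias/activation picture in the summary above reduces to a few lines of linear algebra. All sizes and values below are toys chosen for illustration:

```python
# A single artificial neuron and a tiny MLP forward pass.
import numpy as np

def relu(z):
    # Non-linear activation: pass positives, clamp negatives to zero.
    return np.maximum(0.0, z)

def neuron(x, w, b):
    # Weighted sum of inputs plus bias, through the non-linearity.
    return relu(np.dot(w, x) + b)

rng = np.random.default_rng(2)
x = rng.normal(size=4)                      # e.g. 4 hand-crafted features
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)

h = relu(W1 @ x + b1)                       # hidden layer of 8 neurons
y = W2 @ h + b2                             # linear output head

# The layer is just 8 neurons evaluated at once.
assert np.isclose(neuron(x, W1[0], b1[0]), h[0])
print(y.shape)
```

Stacking such layers and fitting `W1`, `b1`, `W2`, `b2` by gradient descent is what lets an MLP learn its own features instead of relying on stereological ones.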
@@ -386,6 +423,16 @@ \subsubsection{Unit II --- Representation Learning for Microstructures
 Overfitting risks with small datasets.
 \end{itemize}

+\textbf{Summary:} This unit introduces \textbf{Convolutional Neural
+Networks (CNNs)}, the workhorse of modern computer vision, and applies
+them to materials characterization. We explore how convolutions allow
+networks to automatically learn hierarchical structure detectors---from
+simple edges to complex phase morphologies---while drastically reducing
+the number of parameters compared to standard MLPs. Through case studies
+in phase segmentation and defect detection, students learn the intuition
+behind filters, pooling, and the unique challenges of applying deep
+learning to high-resolution, noisy experimental micrographs.
+
 \textbf{Exercise:}\\
 Train a small CNN on microstructure images; analyze failure cases.

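The filter-and-pooling intuition in the summary above can be made concrete by hand-rolling one convolutional layer. The 8x8 "micrograph" and the edge-detecting kernel are invented for illustration:

```python
# Hand-rolled 2D convolution and max pooling: what one CNN layer computes.
import numpy as np

def conv2d(img, kernel):
    # Valid (no-padding) cross-correlation of a kernel over an image.
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(img, s=2):
    # Keep the strongest response in each s-by-s window.
    h, w = img.shape[0] // s, img.shape[1] // s
    return img[:h*s, :w*s].reshape(h, s, w, s).max(axis=(1, 3))

micrograph = np.zeros((8, 8))
micrograph[:, 4:] = 1.0                 # a sharp two-phase boundary
edge_kernel = np.array([[-1.0, 1.0]])   # horizontal gradient filter

response = conv2d(micrograph, edge_kernel)  # fires only at the boundary
pooled = max_pool(response)
print(response.shape, pooled.shape)
```

The same 2-parameter kernel is reused at every position, which is the parameter saving relative to an MLP that the summary mentions; a trained CNN simply learns many such kernels.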
@@ -407,6 +454,16 @@ \subsubsection{Unit II --- Representation Learning for Microstructures
 When transfer learning helps---and when it does not.
 \end{itemize}

+\textbf{Summary:} This unit addresses the fundamental bottleneck of
+materials informatics: \textbf{Data Scarcity}. We explore how to build
+powerful deep learning models when only a few hundred labeled images or
+signals are available. The core focus is on \textbf{Transfer Learning},
+where we leverage knowledge from models pretrained on millions of
+natural images to accelerate learning and improve generalization on
+materials tasks. We also cover \textbf{Data Augmentation} strategies
+tailored for scientific data and discuss when and why transferring
+knowledge across different physical domains succeeds or fails.
+
 \textbf{Exercise:}\\
 Fine-tune a pretrained model; compare against training from scratch.

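Transfer learning as described above boils down to freezing a feature extractor and fitting only a small head. As a stand-in for real pretrained CNN features (an assumption made purely for this sketch), a fixed random projection plays the "frozen backbone" role:

```python
# Linear probe on frozen features: the minimal form of transfer learning.
import numpy as np

rng = np.random.default_rng(3)
n, d_raw, d_feat = 60, 100, 16                # small labeled dataset

frozen_W = rng.normal(size=(d_raw, d_feat))   # pretend-pretrained, never updated
X_raw = rng.normal(size=(n, d_raw))
y = X_raw[:, 0] + 0.05 * rng.normal(size=n)   # toy regression target

features = np.tanh(X_raw @ frozen_W)          # frozen feature extraction
head, *_ = np.linalg.lstsq(features, y, rcond=None)  # fit only the head
pred = features @ head
mse = float(np.mean((pred - y) ** 2))
print("probe MSE:", mse)
```

Only 16 head parameters are estimated from the 60 labels; with a genuinely pretrained backbone the frozen features would be far more informative than this random projection.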
@@ -432,6 +489,17 @@ \subsubsection{Unit III --- Learning from Processing Data (Weeks
 Relation to MFML concepts of generalization.
 \end{itemize}

+\textbf{Summary:} This unit explores the application of machine learning
+to \textbf{Time-Series Data}, specifically for monitoring and predicting
+materials processing outcomes. We introduce \textbf{Recurrent Neural
+Networks (RNNs)} and their advanced variants like \textbf{LSTMs}, which
+are designed to handle sequential dependencies. We discuss the critical
+preprocessing steps of signal smoothing and triggering required to
+handle noisy experimental logs. Through case studies in additive
+manufacturing and process stability, students learn how to build models
+that ``remember'' the processing history to predict future states and
+detect anomalies in real time.
+
 \textbf{Exercise:}\\
 Predict a process outcome from time-series data using regression or
 simple RNNs.
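The "remembering" mechanism in the summary above is just a state vector carried across time steps. A vanilla RNN cell is shown here as a simplified stand-in for the LSTMs named in the text; the sensor signal and all weights are random toys:

```python
# A vanilla RNN cell unrolled over a toy process signal.
import numpy as np

rng = np.random.default_rng(4)
d_in, d_h, T = 1, 8, 50                       # hypothetical sizes

Wx = rng.normal(scale=0.5, size=(d_h, d_in))  # input-to-hidden weights
Wh = rng.normal(scale=0.5, size=(d_h, d_h))   # hidden-to-hidden (memory)
b = np.zeros(d_h)

signal = np.sin(np.linspace(0, 6, T))[:, None]  # toy sensor log

h = np.zeros(d_h)
states = []
for t in range(T):
    # The new state mixes the current reading with the remembered history.
    h = np.tanh(Wx @ signal[t] + Wh @ h + b)
    states.append(h)
states = np.stack(states)
print(states.shape)
```

An LSTM replaces this single `tanh` update with gated updates that control what the state keeps and forgets, which is why it handles the long sequences typical of process logs better.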
@@ -454,6 +522,16 @@ \subsubsection{Unit III --- Learning from Processing Data (Weeks
 Robustness as a design criterion.
 \end{itemize}

+\textbf{Summary:} This unit shifts the focus from model performance to
+\textbf{Model Reliability}. We explore the Bias-Variance tradeoff and
+the fundamental challenge of generalization---ensuring that an ML model
+works on new, unseen data from the factory floor. We introduce robust
+validation techniques like K-Fold and Stratified Cross-Validation to
+stabilize performance estimates on small materials datasets. A key focus
+is on \textbf{Process Robustness}, where we use sensitivity analysis to
+identify ``Process Windows''---regions in parameter space where material
+quality is maximized and insensitive to industrial noise.
+
 \textbf{Exercise:}\\
 Analyze model robustness under perturbed process conditions.

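The K-Fold mechanics mentioned in the summary above can be built by hand in a few lines (a library such as scikit-learn would produce the same index structure); the dataset size and fold count are arbitrary:

```python
# Manual K-Fold index construction on a toy dataset.
import numpy as np

n, k = 50, 5
rng = np.random.default_rng(5)
idx = rng.permutation(n)          # shuffle once, then partition
folds = np.array_split(idx, k)

for i, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # Each round trains on k-1 folds and tests on the held-out one.
    assert len(train_idx) + len(test_idx) == n

# Across all rounds, every sample is tested exactly once.
all_test = np.sort(np.concatenate(folds))
print(k, "folds of sizes", [len(f) for f in folds])
```

Averaging the k test scores is what stabilizes the performance estimate on small datasets; stratification additionally balances class (or phase) proportions within each fold.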
@@ -475,6 +553,17 @@ \subsubsection{Unit III --- Learning from Processing Data (Weeks
 Physics-informed vs unconstrained regression.
 \end{itemize}

+\textbf{Summary:} This unit explores \textbf{Inverse Problems}---the
+cornerstone of materials design, where we seek the processing parameters
+required to achieve a target microstructure or performance. We contrast
+these with causal forward problems and discuss why they are often
+ill-posed and multi-valued. We introduce \textbf{Physics-Informed
+Learning} as a way to address these challenges by enriching models with
+physical transformations and constraints. Students learn how to build
+and interpret \textbf{Process Maps} and ``Process Corridors,'' using
+machine learning to visualize safe operating regions in complex
+experimental spaces.
+
 \textbf{Exercise:}\\
 Construct a simple ML-based process map; compare constrained vs
 unconstrained models.
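A process map of the kind the summary above describes is, at its simplest, a surrogate model evaluated on a parameter grid with a quality threshold. The quadratic surrogate, its optimum at (0.6, 0.4), and the 0.8 threshold are all invented for this sketch:

```python
# Toy process map: evaluate a surrogate on a grid, mark the process window.
import numpy as np

power = np.linspace(0.0, 1.0, 50)   # normalized process parameters
speed = np.linspace(0.0, 1.0, 50)
P, S = np.meshgrid(power, speed)

# Surrogate: quality peaks at (0.6, 0.4) and falls off quadratically.
quality = 1.0 - 4.0 * (P - 0.6) ** 2 - 4.0 * (S - 0.4) ** 2

window = quality > 0.8              # "process window": safe operating region
print("window fraction of parameter space:", window.mean())
```

In practice the surrogate would be a model fitted to experiments rather than a formula, and the inverse problem shows up immediately: every point inside the window achieves the target, so the map returns a region, not a single parameter set.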
@@ -501,6 +590,17 @@ \subsubsection{Unit IV --- Uncertainty, Surrogates, and Automation
 Using ML without destroying physical meaning.
 \end{itemize}

+\textbf{Summary:} This unit focuses on the processing of
+high-dimensional \textbf{Characterization Signals} (like XRD, EDS, and
+EELS) using unsupervised learning. We introduce \textbf{K-Means
+Clustering} and \textbf{t-SNE} for the automatic identification and
+visualization of phases in large experimental libraries. We then explore
+\textbf{Autoencoders}---neural networks that learn to compress complex
+spectra into a low-dimensional ``latent space.'' This allows for
+advanced denoising and feature extraction, enabling scientists to handle
+the massive data volumes produced by modern high-throughput
+characterization tools without losing physical insight.
+
 \textbf{Exercise:}\\
 Apply PCA/NMF to spectral datasets; interpret components physically.

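The K-Means phase identification described above is Lloyd's algorithm: alternate between assigning each spectrum to its nearest center and recomputing the centers. The two single-peak "phase" templates and the one-point-per-phase initialization are simplifications for this sketch (k-means++ initialization is typical in practice):

```python
# Lloyd's algorithm from scratch, grouping synthetic spectra into phases.
import numpy as np

rng = np.random.default_rng(6)
n_per, n_ch, k = 50, 64, 2

# Two distinct spectral templates (peaks at different channels) plus noise.
t1, t2 = np.zeros(n_ch), np.zeros(n_ch)
t1[10] = 5.0
t2[40] = 5.0
X = np.vstack([t1 + 0.2 * rng.normal(size=(n_per, n_ch)),
               t2 + 0.2 * rng.normal(size=(n_per, n_ch))])

centers = X[[0, n_per]].copy()    # init one center per suspected phase
for _ in range(20):
    # Assignment step: nearest center by Euclidean distance.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # Update step: each center moves to the mean of its cluster.
    centers = np.stack([X[labels == j].mean(axis=0) for j in range(k)])

print("cluster sizes:", np.bincount(labels))
```

An autoencoder attacks the same data volume differently: instead of hard cluster labels, it learns a continuous low-dimensional latent code per spectrum, which supports denoising as well as grouping.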
@@ -647,7 +747,19 @@ \subsection{Lab Possibilities}\label{lab-possibilities}
 Multi-modal fusion of images, spectra, and process parameters.
 \end{itemize}

-\phantomsection\label{refs}
+\textbf{Summary:} This unit explores the cutting edge of
+\textbf{Autonomous Characterization}, where machine learning moves from
+passive data analysis to active instrument control. We introduce
+\textbf{Multi-Modal Data Fusion} techniques to combine information from
+diverse sensors like SEM images, EDS spectra, and process logs using
+Bayesian frameworks. We then discuss \textbf{Reinforcement Learning
+(RL)} as a tool for automating complex laboratory tasks, such as
+instrument tuning and process optimization. Through case studies in
+microscopy and industrial processing, students learn how to build
+integrated pipelines that can autonomously find, characterize, and
+decide the next steps of an experiment.
+
+\protect\phantomsection\label{refs}
 \begin{CSLReferences}{1}{0}
 \bibitem[\citeproctext]{ref-sandfeld2024materials}
 Sandfeld, S. (2024). \emph{Materials data science: Introduction to data
-6.04 KB
Binary file not shown.

index-meca.zip

-613 KB
Binary file not shown.

index.docx

2.51 KB
Binary file not shown.
