Goal:
- Minimization of an energy function to achieve an optimal or stable state.
Method:
- Define an energy function that quantifies the "cost" or "stability" of a system’s state.
- Find the configuration of variables that minimizes this energy function.
Generalization:

$$E = \pm \alpha \sum_{i=1}^{N} \sum_{j=1}^{N} A_{ij} b_i b_j \pm \beta \sum_{i=1}^{N} C_i b_i$$

where:
- $E, \alpha, \beta \in \mathbb{R}$
- $A \in \mathbb{R}^{N \times N}$ (where $N$ is the number of variables)
- $b = [b_1, b_2, \ldots, b_N]^T \in \mathbb{R}^N$
- $C = [C_1, C_2, \ldots, C_N]^T \in \mathbb{R}^N$
Explanation:
- Pairwise interaction term ($A_{ij} b_i b_j$): models pairwise relationships or couplings between variables. The coefficient $\alpha$ adjusts the strength of these interactions.
- Linear term ($C_i b_i$): encodes external influences or biases on individual variables. The coefficient $\beta$ scales their effect.
- The minimization process reflects a trade-off between these terms, tailored to the specific context of each model, with the $\pm$ signs of $\alpha$ and $\beta$ allowing positive or negative contributions.
The general optimization form:

$$\min_{b} \; \pm \alpha \sum_{i,j} A_{ij} b_i b_j \pm \beta \sum_i C_i b_i$$

or in matrix-vector notation:

$$\min_{b} \; \pm \alpha\, b^T A b \pm \beta\, C^T b$$

leads to two distinct types of behavior:
The energy function

$$E(b) = \alpha\, b^T A b + \beta\, C^T b$$

is convex if:

- $A$ is positive semi-definite (PSD), i.e. $x^T A x \geq 0 \quad \forall x \in \mathbb{R}^N$
- $\alpha \geq 0$
In this case, there exist efficient algorithms (e.g. convex solvers, quadratic programming) to find the global minimum.
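As a minimal illustration of the convex case (the matrix and coefficients below are made up), setting the gradient $2\alpha A b + \beta C$ to zero gives the global minimum in closed form:

```python
import numpy as np

# Convex case sketch: minimize E(b) = alpha * b^T A b + beta * C^T b
# with A symmetric positive definite and alpha > 0. The gradient
# 2*alpha*A b + beta*C vanishes at b* = -(beta / (2*alpha)) * A^{-1} C,
# the unique global minimum. All numbers are illustrative assumptions.

alpha, beta = 1.0, 1.0
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])              # symmetric positive definite
C = np.array([1.0, -1.0])

def energy(b):
    return alpha * b @ A @ b + beta * C @ b

b_star = -(beta / (2 * alpha)) * np.linalg.solve(A, C)
print(b_star, energy(b_star))
```

For larger or constrained instances, a quadratic-programming solver would be used instead of the closed form.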
If either of the following holds:

- $A$ is not positive semi-definite (i.e., has negative eigenvalues), or
- $\alpha < 0$,

then the energy function is non-convex: it may have many local minima, and finding the global minimum is in general NP-hard.
In such cases, approximate or heuristic methods are commonly used:
- Monte Carlo simulations (e.g. simulated annealing)
- Gradient descent (may converge to local minima)
- Evolutionary algorithms, etc.
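The simulated-annealing idea can be sketched as follows; the problem size, couplings, and cooling schedule are all illustrative assumptions, and a final greedy pass ensures the result is at least a local minimum:

```python
import numpy as np

# Simulated annealing for the non-convex case: binary variables
# b_i in {-1, +1} and a random symmetric coupling matrix A that is,
# in general, indefinite.

rng = np.random.default_rng(0)
N = 12
A = rng.normal(size=(N, N))
A = (A + A.T) / 2                  # symmetric, generally indefinite

def energy(b):
    return b @ A @ b

b = rng.choice([-1, 1], size=N)
T = 2.0
for step in range(5000):
    i = rng.integers(N)
    b_new = b.copy()
    b_new[i] *= -1                 # propose a single spin flip
    dE = energy(b_new) - energy(b)
    if dE < 0 or rng.random() < np.exp(-dE / T):
        b = b_new                  # accept downhill, or uphill with prob. exp(-dE/T)
    T *= 0.999                     # geometric cooling schedule

# Greedy descent so no single flip can still lower the energy.
improved = True
while improved:
    improved = False
    for i in range(N):
        b_try = b.copy()
        b_try[i] *= -1
        if energy(b_try) < energy(b):
            b, improved = b_try, True

print(energy(b))
```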
1. Markowitz Model - Portfolio Optimization
Equation:

$$F = \zeta \sum_{i,j} C_{ij} n_i n_j - \sum_i R_i n_i$$

where:
- $F$: Objective function (risk minus return).
- $C_{ij}$: Covariance between assets $i$ and $j$, representing the risk contribution.
- $n_i$: Weight of asset $i$ (proportion of portfolio invested).
- $\zeta$: Risk tolerance parameter.
- $R_i$: Expected return of asset $i$.
Principle:
Minimizing $F$ balances the portfolio's risk (the covariance term) against its expected return (the linear term); the parameter $\zeta$ controls the trade-off between the two.
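A toy two-asset sketch (all numbers invented for illustration) that scans fully invested, long-only portfolios $n = (w, 1-w)$ for the lowest objective value:

```python
import numpy as np

# Markowitz objective F(n) = zeta * n^T C n - R^T n on two assets.
# Covariances, returns, and zeta below are made-up example values.
zeta = 0.5
C = np.array([[0.04, 0.01],
              [0.01, 0.09]])       # asset covariances (risk)
R = np.array([0.08, 0.12])         # expected returns

def F(n):
    return zeta * n @ C @ n - R @ n

# Grid search over n = (w, 1 - w), 0 <= w <= 1 (no short selling).
ws = np.linspace(0, 1, 101)
best_w = min(ws, key=lambda w: F(np.array([w, 1 - w])))
print(best_w)
```

In practice this constrained quadratic program would be handed to a convex solver rather than a grid search.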
2. Sherrington-Kirkpatrick Model - Spin Glasses
Equation:

$$H = -\sum_{i<j} J_{ij} S_i S_j - \sum_i h_i S_i$$

where:
- $H$: Hamiltonian - the total energy of the spin glass system.
- $J_{ij}$: Random interaction strength between spins $i$ and $j$ (e.g., Gaussian distributed with mean 0 and variance 1).
- $S_i \in \lbrace-1, +1\rbrace$: Spin variable at site $i$.
- $h_i$: External field at site $i$ (can be uniform $h$ in some formulations).
Principle: Minimizing $H$ finds the ground state of the spin glass; because the random couplings $J_{ij}$ create frustration, this is a hard combinatorial optimization problem.
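For a tiny instance the ground state can be found by brute force; the randomly generated couplings below (and $h_i = 0$) are an illustrative assumption:

```python
import itertools
import numpy as np

# Brute-force ground-state search for a small SK instance: N is small
# enough to enumerate all 2^N spin configurations.
rng = np.random.default_rng(1)
N = 8
J = rng.normal(size=(N, N))
J = np.triu(J, 1)                  # keep only i < j couplings

def H(S):
    return -S @ J @ S              # -sum_{i<j} J_ij S_i S_j  (h_i = 0)

ground = min((np.array(s) for s in itertools.product([-1, 1], repeat=N)),
             key=H)
print(H(ground))
```

Enumeration scales as $2^N$, which is exactly why heuristics like annealing are used for realistic sizes.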
3. Ising Model - Magnetic Systems
Equation:

$$H(\sigma) = -\sum_{\langle i,j \rangle} J_{ij} \sigma_i \sigma_j - \mu \sum_i h_i \sigma_i$$

where:
- $H(\sigma)$: Hamiltonian for spin configuration $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_N)$.
- $J_{ij}$: Coupling between neighboring spins $i$ and $j$.
- $\sigma_i \in \lbrace-1, +1\rbrace$: Spin at site $i$.
- $\mu$: Magnetic moment of a single spin (often set to 1 or absorbed into $h_i$).
- $h_i$: External magnetic field at site $i$.
Principle: Minimizing $H(\sigma)$ yields the lowest-energy spin configuration; with ferromagnetic couplings ($J_{ij} > 0$), neighboring spins prefer to align.
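A quick sanity check on a 1-D ferromagnetic chain ($J > 0$, no external field): the fully aligned configuration has lower energy than an alternating one.

```python
import numpy as np

# 1-D Ising chain with uniform nearest-neighbor coupling J > 0 and
# zero external field: H = -J * sum_i sigma_i * sigma_{i+1}.
J = 1.0

def H(sigma):
    sigma = np.asarray(sigma)
    return -J * np.sum(sigma[:-1] * sigma[1:])

aligned = np.ones(6)
mixed = np.array([1, -1, 1, -1, 1, -1])
print(H(aligned), H(mixed))   # -5.0  5.0 -> aligned is lower in energy
```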
4. Hopfield Model - Neural Network
Equation:

$$E = -\frac{1}{2} \sum_{i \neq j} w_{ij} v_i v_j + \sum_i \theta_i v_i$$

where:
- $E$: Energy function, represents the network's stability.
- $w_{ij}$: Weights of the learning rule between neurons $i$ and $j$ (symmetric, $w_{ij} = w_{ji}$).
- $v_i$: State of neuron $i$ (e.g., 0 or 1, or $\pm 1$ in some formulations).
- $\theta_i$: Activation threshold of neuron $i$.
Principle:
Minimizing $E$ drives the network toward stable states (attractors) that correspond to stored patterns, so pattern recall amounts to descending the energy landscape.
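A minimal recall sketch, assuming $\pm 1$ neurons, zero thresholds, and a single pattern stored with the Hebbian rule $w_{ij} = p_i p_j$ (the pattern itself is made up):

```python
import numpy as np

# Store one pattern p via the Hebbian rule (zero diagonal), then
# recover it from a corrupted probe by repeatedly setting each neuron
# to the sign of its input, which never increases the energy.
p = np.array([1, -1, 1, 1, -1, 1, -1, -1])
W = np.outer(p, p).astype(float)
np.fill_diagonal(W, 0.0)           # no self-connections

def E(v):
    return -0.5 * v @ W @ v        # thresholds theta_i = 0 here

v = p.copy()
v[0] *= -1                         # corrupt one neuron
for _ in range(5):                 # asynchronous updates
    for i in range(len(v)):
        v[i] = 1 if W[i] @ v >= 0 else -1

print(np.array_equal(v, p))        # True: the stored pattern is recalled
```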
5. Graph Cuts for Image Segmentation - Computer Vision
Equation:

$$E(L) = \sum_p D_p(L_p) + \sum_{(p,q) \in N} V_{pq}(L_p, L_q)$$

where:
- $E(L)$: Total energy for a labeling $L = \lbrace L_p \rbrace$.
- $L_p$: Label assigned to pixel $p$ (e.g. foreground or background).
- $D_p(L_p)$: Data term — cost of assigning label $L_p$ to pixel $p$; encourages data fidelity.
- $V_{pq}(L_p, L_q)$: Smoothness term — cost of label discontinuity between neighboring pixels $p$ and $q$; encourages label coherence.
- $N$: Set of adjacent pixel pairs (i.e. edges in the image grid graph).
Principle:
Minimizing $E(L)$ produces a segmentation that fits the image data while keeping labels coherent across neighboring pixels; for suitable smoothness terms, the minimum can be computed efficiently via max-flow/min-cut on the image graph.
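A tiny 1-D labeling sketch (intensities, label prototypes, and the smoothness weight are made-up numbers) showing that the energy favors a coherent segmentation over a noisy one:

```python
# E(L) = sum_p D_p(L_p) + sum_{(p,q) in N} V_pq(L_p, L_q) on a 1-D "image".
# Data term: squared distance of pixel intensity to a label prototype;
# smoothness term: a Potts penalty for neighbors with different labels.
pixels = [0.1, 0.2, 0.9, 0.8]      # pixel intensities
proto = {0: 0.0, 1: 1.0}           # label prototypes (0 = bg, 1 = fg)
lam = 0.5                          # smoothness weight

def energy(labels):
    data = sum((p - proto[l]) ** 2 for p, l in zip(pixels, labels))
    smooth = lam * sum(l1 != l2 for l1, l2 in zip(labels, labels[1:]))
    return data + smooth

coherent = [0, 0, 1, 1]
noisy = [0, 1, 0, 1]
print(energy(coherent), energy(noisy))   # coherent labeling costs less
```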