This report presents a comprehensive investigation into the functional programming foundations underlying PyTorch's computational model, with particular emphasis on reframing tensor operations through the lens of Haskell-style functional programming. We establish that PyTorch's Tensor type exhibits the fundamental characteristics of a free monad, particularly in its lazy evaluation modes and code-generation pipelines. This work bridges two seemingly disparate domains: imperative machine learning frameworks and pure functional programming theory.
The core contribution is a formal analysis demonstrating that PyTorch's design, when used in advanced scenarios involving automatic differentiation, symbolic computation, and metaprogramming, necessarily implements monadic composition patterns. We provide detailed Haskell encodings of tensor operations in both the initial (free monad) and final (tagless final) styles, rigorously compare their respective strengths and limitations, and survey related work in functional machine learning. Finally, we investigate why these powerful functional abstractions remain largely unacknowledged in the PyTorch community despite their ubiquity in the framework's architecture.
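To make the contrast concrete before the detailed development in Sections 4 and 5, the sketch below shows a toy tensor DSL in both styles. It is a minimal, self-contained illustration, not code from the report or from PyTorch; all names (`T`, `TensorF`, `Free`, `runEager`, `TensorSym`, `Eval`) are hypothetical.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- A placeholder tensor carrying flat data; a realistic encoding would
-- also track shape, dtype, and device.
newtype T = T [Double] deriving Show

-- Initial encoding: primitive operations as an instruction functor,
-- programs as a free monad over it.
data TensorF next
  = Add T T (T -> next)  -- elementwise addition; continuation receives the result
  | Mul T T (T -> next)  -- elementwise multiplication
  deriving Functor

data Free f a = Pure a | Roll (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Roll fa) = Roll (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Roll fg <*> x = Roll (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Roll fa >>= k = Roll (fmap (>>= k) fa)

-- Smart constructors lift the primitives into the monad.
addT, mulT :: T -> T -> Free TensorF T
addT x y = Roll (Add x y Pure)
mulT x y = Roll (Mul x y Pure)

-- One interpreter among many: eager evaluation. A pretty-printer or an
-- autograd tracer would traverse the same tree.
runEager :: Free TensorF a -> a
runEager (Pure a)                   = a
runEager (Roll (Add (T x) (T y) k)) = runEager (k (T (zipWith (+) x y)))
runEager (Roll (Mul (T x) (T y) k)) = runEager (k (T (zipWith (*) x y)))

-- Final (tagless) encoding: the same operations as a type class;
-- each interpreter is an instance, and no syntax tree is ever built.
class TensorSym repr where
  lit :: [Double] -> repr
  add :: repr -> repr -> repr
  mul :: repr -> repr -> repr

newtype Eval = Eval { runEval :: [Double] }

instance TensorSym Eval where
  lit = Eval
  add (Eval x) (Eval y) = Eval (zipWith (+) x y)
  mul (Eval x) (Eval y) = Eval (zipWith (*) x y)

main :: IO ()
main = do
  -- Initial style: (a + b) * (a + b), built with do-notation.
  let prog = do
        s <- addT (T [1,2]) (T [3,4])
        mulT s s
  print (runEager prog)  -- T [16.0,36.0]
  -- Final style: the interpreter is chosen by the type at the use site.
  print (runEval (mul (add (lit [1,2]) (lit [3,4]))
                      (add (lit [1,2]) (lit [3,4]))))  -- [16.0,36.0]
```

The design trade-off previewed here recurs throughout the report: the initial encoding reifies programs as data, so a single tree can be re-interpreted (evaluated, pretty-printed, differentiated), while the final encoding adds new interpreters as type-class instances without ever materializing a syntax tree.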
- 1.1 Motivation and Problem Statement
- 1.2 Scope and Contributions
- 1.3 Target Audience and Prerequisites
- 2.1 Monads and Monadic Laws
- 2.2 Free Monads and Initial Encoding
- 2.3 Final Encoding and Tagless Final Interpreters
- 2.4 Type-Indexed Operations and Dependent Types
- 2.5 Automatic Differentiation from a Functional Perspective
- 3.1 The Tensor Type as an Indexed Monad
- 3.2 Applicative Functors in Tensor Composition
- 3.3 Monadic Bind and Imperative Loop Semantics
- 3.4 Eager Evaluation vs. Lazy Evaluation: White Box vs. Black Box
- 3.5 The Autograd Graph as a Free Monad Structure
- 4.1 Defining the Tensor Free Monad in Haskell
- 4.2 Primitive Operations and the Free Monad Functor
- 4.3 Monadic Laws and Tensor Composition
- 4.4 Implementing Basic Operations (Addition, Multiplication, Transpose)
- 4.5 Automatic Differentiation in the Free Monad
- 4.6 Advanced Operations: Softmax and Dropout in Initial Encoding
- 4.7 Interpreters: From Free Monad to Computation
- 5.1 Principles of Final Encoding
- 5.2 Type Classes for Tensor Operations
- 5.3 Implementing Tensors via Final Encoding
- 5.4 Basic Operations in Final Encoding
- 5.5 Automatic Differentiation in Final Encoding
- 5.6 Advanced Operations: Softmax and Dropout in Final Encoding
- 5.7 Multiple Interpreters and Extensibility
- 6.1 Expressiveness and Compositionality
- 6.2 Performance Characteristics
- 6.3 Extensibility and Modularity
- 6.4 Handling Advanced Operations
- 6.5 Practical Considerations for ML Frameworks
- 6.6 Strengths and Weaknesses Summary
- 7.1 Catalog of Required Primitive Operations
- 7.2 Challenges in Representing Softmax
- 7.3 Challenges in Representing Dropout and Stochastic Operations
- 7.4 Shape Tracking and Type-Level Constraints
- 7.5 Memory Management and In-Place Operations
- 7.6 GPU Operations and Hardware Abstraction
- 8.1 Haskell-Based ML Frameworks
- 8.2 Functional Approaches to Automatic Differentiation
- 8.3 Symbolic Computation in Functional Languages
- 8.4 Academic Research on Functional ML
- 8.5 Category-Theoretic Perspectives on Neural Networks
- 9.1 Historical Development and Design Philosophy
- 9.2 Cultural Factors in the ML Community
- 9.3 Performance Optimization Priorities
- 9.4 Accessibility and Pedagogical Considerations
- 9.5 Language Features and Runtime Constraints in Python
- 10.1 TensorFlow's Graph Construction Model
- 10.2 JAX's Functional Transformation Approach
- 10.3 MXNet's Lazy Evaluation Design
- 10.4 Functional Frameworks: Glow, MLIR, and Circuit
- 10.5 Comparative Encoding Strategies
- 11.1 Implications for ML Framework Design
- 11.2 Opportunities for Formal Verification
- 11.3 Type Safety in Machine Learning
- 11.4 Bridging Functional and Imperative Paradigms
- 11.5 Recommendations for Framework Developers