Below is a deep, operational breakdown of AI / ML Theory hiring aligned exactly to your 2022→2025 curve, with job archetypes, keywords, and math skills that let you (a) identify real demand, and (b) design OSS that assists or replaces scarce theorists.
This is written from a “what actually breaks without theory” lens — not buzzwords.
What counts as “AI / ML Theory” (scope boundary)
These are roles where progress requires proofs, bounds, or structural understanding, not just experiments:
learning theory
optimization theory
information theory
high-dimensional geometry
statistical mechanics–style analysis
interpretability theory
robustness & generalization guarantees
If the output is “we can explain / bound / guarantee X”, it counts.
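To make "bound" concrete: the canonical shape of such an output is a generalization bound like the standard Rademacher bound from learning theory (stated here for a loss in [0, 1]; this formula is standard material, not taken from the source):

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% simultaneously for every h in the hypothesis class H:
L(h) \;\le\; \hat{L}_n(h) \;+\; 2\,\mathfrak{R}_n(H) \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}}
```

Here $L(h)$ is the true risk, $\hat{L}_n(h)$ the empirical risk, and $\mathfrak{R}_n(H)$ the empirical Rademacher complexity of the class. Roles in scope are the ones expected to produce, tighten, or apply statements of this kind.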
2022 — Baseline (~2,000 hires)
“Theory exists, but scale optimism dominates”
Dominant job archetypes
Research Scientist (Theory)
Machine Learning Theorist
Statistical Learning Researcher
Optimization Research Scientist
Theory teams are small, insulated, often pre-LLM-boom.
High-signal keywords (2022)
Learning theory
PAC learning
generalization bounds
VC dimension
Rademacher complexity
uniform convergence
Optimization
non-convex optimization
convergence guarantees
saddle points
gradient dynamics
Probability / stats
concentration inequalities
random matrices
asymptotic behavior
Math skill stack
probability theory (measure-level)
functional analysis
convex & non-convex optimization
classical learning theory
📌 Interpretation
Theory exists, but is not decision-critical yet.
2023 — Contraction (~1,700 hires, −15%)
“Scale works — do we still need theory?”
What happened
Model scaling succeeded faster than theory
Labs consolidated
Theory perceived as “non-blocking”
Who got cut
speculative theory hires
long-horizon foundational work
theory not tied to immediate product risk
Surviving role archetypes
Theoretical ML Researcher (Robustness / Safety)
Optimization Researcher (Training Stability)
Statistical Modeling Scientist
Keyword shift (2023)
Less abstraction, more relevance
training stability
loss landscape
scaling laws
empirical risk minimization limits
failure modes
Early warning signs
overfitting at scale
distribution shift
spurious correlations
Math skill stack
asymptotic analysis
stochastic processes
large-scale optimization theory
random matrix theory
📌 Interpretation
Theory is tolerated only where systems might fail.
2024 — Inflection (~2,300 hires, +35%)
“Why does this work — and when will it break?”
This is the panic year.
What broke
alignment failures
hallucinations
brittleness under distribution shift
scaling unpredictability
safety & regulation pressure
Suddenly, intuition is not enough.
Exploding job archetypes
AI Theory Research Scientist (Foundations)
Learning Theory Scientist (Generalization)
Robustness & Distribution Shift Researcher
Interpretability Theorist
Statistical Mechanics of Learning Researcher
High-signal keywords (2024)
These strongly correlate with pure math demand:
Generalization & structure
implicit bias
double descent
benign overfitting
inductive bias
margin theory
High-dimensional geometry
concentration of measure
random features
overparameterization
geometry of representations
Information theory
mutual information
information bottleneck
compression vs generalization
Interpretability theory
mechanistic interpretability
feature geometry
linear representations
causal structure
Math skill stack
high-dimensional probability
information theory
differential geometry (representations)
statistical mechanics methods
asymptotic regime analysis
📌 Interpretation
Theory becomes risk infrastructure, not curiosity.
2025 — Structural expansion (~2,900 hires, +26%)
“Theory is now required to scale safely”
This is where theory catches up to scale.
New role archetypes (very important)
AI Foundations Scientist
Learning Guarantees Researcher
Model Reliability & Guarantees Scientist
Alignment Theory Researcher
Theoretical Interpretability Scientist
These roles now gate deployment.
Keywords that scream “pure theory hire”
If you see these, it’s not applied ML:
Guarantees
provable robustness
worst-case bounds
certification
impossibility results
Limits
expressivity bounds
scaling limits
sample complexity
computational-statistical gaps
Causality & structure
causal representation learning
identifiability
invariance principles
Math skill stack (2025)
measure-theoretic probability
advanced learning theory
game theory (alignment, multi-agent)
control theory analogies
causal inference foundations
📌 Interpretation
Theory is no longer optional — it is deployment-critical.
Why this is a “classic post-paradigm theory surge”
This pattern has happened before:
1. New paradigm works empirically
2. Scale hides flaws
3. Failures appear
4. Theory is needed to:
explain
bound
control
regulate
AI is now in phase 4.
OSS opportunities that map directly to these jobs
If your goal is to assist or replace scarce theory roles, the highest-leverage OSS areas are:
A) Generalization & scaling analyzers
detect benign vs harmful overfitting
estimate effective capacity
approximate bounds from empirical stats
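A minimal sketch of the "approximate bounds from empirical stats" idea: a Monte-Carlo estimate of empirical Rademacher complexity over a finite set of hypotheses, using only their predictions on a fixed sample. The function name and input format are hypothetical, and real tooling would work with model families rather than enumerated prediction vectors.

```python
import random
import statistics

def empirical_rademacher(predictions, n_trials=200, seed=0):
    """Monte-Carlo estimate of empirical Rademacher complexity.

    predictions: a list of hypotheses, each given as a list of +/-1
    outputs on the same n sample points.  Estimates
    E_sigma[ max_h (1/n) sum_i sigma_i * h(x_i) ] by drawing random
    sign vectors sigma and recording the best correlation achieved.
    """
    rng = random.Random(seed)
    n = len(predictions[0])
    draws = []
    for _ in range(n_trials):
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        best = max(sum(s * p for s, p in zip(sigma, h)) / n
                   for h in predictions)
        draws.append(best)
    return statistics.mean(draws)
```

A class rich enough to fit every sign pattern scores near 1 (it can memorize noise, i.e. harmful overfitting is possible), while a rigid class scores near 0 — which is exactly the "effective capacity" signal such an analyzer would surface.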
B) Representation geometry tooling
measure linearity, anisotropy, concentration
detect feature collapse / brittleness
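One way such tooling could quantify anisotropy and collapse, sketched with power iteration on the representation covariance (the function name and the specific collapse criterion are assumptions, not an established API):

```python
import math
import random

def anisotropy(reps, iters=100, seed=0):
    """Share of covariance spectrum held by the top eigenvalue.

    reps: list of d-dimensional representation vectors (lists of floats).
    Returns lambda_max / trace(Cov), which lies in [1/d, 1]; values near 1
    mean almost all variance sits in one direction -- a feature-collapse
    warning sign.
    """
    n, d = len(reps), len(reps[0])
    mean = [sum(r[j] for r in reps) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in reps]

    def cov_mul(v):
        # Apply the covariance matrix to v without materializing d x d.
        out = [0.0] * d
        for row in centered:
            dot = sum(a * b for a, b in zip(row, v))
            for j in range(d):
                out[j] += dot * row[j] / n
        return out

    trace = sum(sum(c[j] ** 2 for c in centered) / n for j in range(d))
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iters):
        v = cov_mul(v)
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        v = [x / norm for x in v]
    lam = sum(a * b for a, b in zip(cov_mul(v), v))
    return lam / trace if trace else 0.0
```

Points spread isotropically in 2-D score 0.5; points collapsed onto a line score 1.0, flagging the brittleness this tooling is meant to detect.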
C) Robustness certificate generators
probabilistic robustness bounds
distribution shift stress-tests
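A probabilistic robustness bound of this kind can be sketched as noise-sampling plus a Hoeffding lower confidence bound on the agreement rate (the function and its parameters are illustrative; production certifiers such as randomized smoothing use tighter intervals):

```python
import math
import random

def probabilistic_robustness(classifier, x, sigma=0.1, n=1000,
                             delta=0.01, seed=0):
    """Lower confidence bound on P[classifier(x + noise) == classifier(x)].

    Perturbs x with i.i.d. Gaussian noise of scale sigma and returns a
    Hoeffding lower bound, valid with probability at least 1 - delta,
    on the probability that the prediction is unchanged.
    """
    rng = random.Random(seed)
    base = classifier(x)
    agree = 0
    for _ in range(n):
        noisy = [xi + rng.gauss(0, sigma) for xi in x]
        agree += classifier(noisy) == base
    p_hat = agree / n
    # Hoeffding: true rate >= p_hat - sqrt(ln(1/delta) / (2n)) w.p. 1 - delta.
    return max(0.0, p_hat - math.sqrt(math.log(1 / delta) / (2 * n)))
```

A point far from the decision boundary earns a certificate near 1; a point on the boundary earns roughly 0.5 minus the confidence slack — the kind of per-input guarantee these generators would emit.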
D) Assumption extractors
“what must be true for this to generalize?”
turns informal reasoning into explicit claims
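The core of such a tool could be as simple as a registry that pairs each named assumption with a falsifiable claim and an empirical proxy check; the `Assumption` / `AssumptionRegistry` design below is hypothetical, a sketch of the interface rather than an existing library.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

@dataclass
class Assumption:
    name: str                       # short handle, e.g. "bounded_inputs"
    claim: str                      # explicit, falsifiable statement
    check: Callable[[Any], bool]    # empirical proxy test on data

@dataclass
class AssumptionRegistry:
    assumptions: List[Assumption] = field(default_factory=list)

    def declare(self, name: str, claim: str,
                check: Callable[[Any], bool]) -> None:
        """Turn an informal belief into an explicit, checkable claim."""
        self.assumptions.append(Assumption(name, claim, check))

    def audit(self, data) -> List[str]:
        """Return the names of declared assumptions the data visibly violates."""
        return [a.name for a in self.assumptions if not a.check(data)]
```

Usage would look like declaring "all features lie in [-1, 1]" with a range check, then auditing incoming batches — any returned name is an implicit generalization assumption that just failed in the wild.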
One-line takeaway
2023 cut theory because scale worked.
2024 rehired theory because scale broke.
2025 institutionalizes theory because failure is expensive.