Personal Information:

Main positions: Director, High Performance Computing Platform, PKU
Degree: Doctoral degree
Status: Employed
School/Department: Institute of Theoretical Physics

Lei Yian

Education Level: Postgraduate (Doctoral)

Administrative Position: Associate Professor

Alma Mater: Peking University

Spectral Analysis and Globality: The “Grokking” Mechanism from Physical Fields to Information Fields

  1. Introduction: From the Globality of Physical Fields to That of Information

In Natural Quantum Theory (NQT) and the Global Approximation Interpretation, we have already observed:

  • The dynamics of physical fields are local, governed by partial differential equations (PDEs);

  • Yet when we perform spectral analysis on the entire system—such as Fourier transforms or canonical mode expansions—to identify eigenstates and resonant patterns, a global description emerges: wavefunctions, eigenmodes, energy-level structures, geometric phases, and topological invariants.

This phenomenon is not confined to physical fields.
In information spaces—particularly in the internal representation learning of large AI models—similar behavior has been observed:

When a model, trained on sufficiently rich data, repeatedly performs “spectral-like” analysis and compression,
it gradually transitions from local statistical fitting to grasping the global structure and deep regularities of the task.
At this point, a sudden qualitative shift—akin to an “aha!” moment or grokking—occurs:
the model leaps from mere memorization and fitting to genuine abstraction and generalization.

Indeed:

“The automatic Fourier-analytic behavior exhibited during AI grokking is essentially a global analysis of the information space and its internal correlations.
Given sufficient data and thorough analysis, AI naturally achieves grokking—understanding deep logical relationships expressed purely through character codes—
just as humans do: moving from rote memorization or superficial fitting to generalized comprehension.”

This insight is profoundly illuminating. We can unify:

  • The global eigenstates of physical fields;

  • The global structures of information fields (data/code);

  • And the cognitive phenomenon of understanding/grokking;

under a common mechanism: spectral decomposition and global pattern extraction in complex systems.

  2. Globality in Physical Fields: Manifested Through Fourier Analysis and Eigenstates

2.1 Local Equations vs. Global Modes

For any classical or quantum field, we distinguish two layers:

  • Local layer: PDEs (e.g., Maxwell’s equations, wave equation, Schrödinger equation) govern evolution at each spacetime point;

  • Spectral layer: Under given boundary and topological conditions, global spectral decomposition yields eigenstates—a set of mode functions and their eigenvalues.

A canonical example is the Fourier transform:

  • It expresses a spacetime function as a superposition of sinusoidal (plane wave) modes;

  • Each mode is inherently global—supported over the entire domain;

  • The system’s global behavior (e.g., resonance, stable patterns, geometric phase) becomes transparent only in this global basis.
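
As a minimal sketch of the bullets above, the following pure-Python snippet projects a sampled signal onto the discrete Fourier modes and reads off which global modes it contains. The test signal, its two frequencies, the grid size, and the helper name `dft` are illustrative assumptions, not part of the original argument.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform: coefficient k is the projection
    of the signal onto the global mode exp(2*pi*i*k*n/N)."""
    n_samples = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, x in enumerate(signal)) / n_samples
        for k in range(n_samples)
    ]

N = 64  # number of sample points (illustrative)

# Hypothetical test signal: two sinusoids with 3 and 10 cycles over the domain.
signal = [
    math.sin(2 * math.pi * 3 * n / N) + 0.5 * math.cos(2 * math.pi * 10 * n / N)
    for n in range(N)
]

coeffs = dft(signal)

# Modes with non-negligible amplitude, restricted to non-negative frequencies.
dominant = [k for k in range(N // 2) if abs(coeffs[k]) > 1e-6]
print(dominant)  # [3, 10]
```

Every coefficient is a sum over all sample points, so no single point determines a mode. In the spectral basis, by contrast, the whole signal reduces to two nonzero entries.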

In the Global Approximation Interpretation, the Schrödinger equation is precisely such a “spectrally global” equation.
It does not speak directly in the language of “field lines–vortices–topological structures,” but rather:

  • Treats all possible resonant modes (eigenstates) as a global basis;

  • Encodes the system’s state via projection coefficients onto these modes (the wavefunction);

  • Describes evolution and interference through global phase and superposition.
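
The three steps above can be sketched for a toy system. The example assumes a particle in a discretized 1D box with hard walls, unit grid spacing, and units where the Hamiltonian is simply the negative discrete Laplacian; the grid size, the initial bump, and all function names are hypothetical choices made for illustration.

```python
import cmath
import math

N = 32  # interior grid points of the box (illustrative size)

def mode(n):
    """n-th eigenmode of the discrete Laplacian with Dirichlet boundaries."""
    norm = math.sqrt(2.0 / (N + 1))
    return [norm * math.sin(math.pi * n * j / (N + 1)) for j in range(1, N + 1)]

def energy(n):
    """Eigenvalue of the negative discrete Laplacian for mode n."""
    return 2.0 - 2.0 * math.cos(math.pi * n / (N + 1))

modes = [mode(n) for n in range(1, N + 1)]
energies = [energy(n) for n in range(1, N + 1)]

# A localized initial state: a narrow bump near the left wall, normalized.
psi0 = [math.exp(-0.5 * ((j - 8) / 2.0) ** 2) for j in range(1, N + 1)]
norm0 = math.sqrt(sum(v * v for v in psi0))
psi0 = [v / norm0 for v in psi0]

# Projection coefficients onto the global modes encode the state.
coeffs = [sum(m[j] * psi0[j] for j in range(N)) for m in modes]

def evolve(t):
    """State at time t: each mode only acquires a phase exp(-i * E_n * t)."""
    return [
        sum(c * cmath.exp(-1j * e * t) * m[j]
            for c, e, m in zip(coeffs, energies, modes))
        for j in range(N)
    ]

psi_t = evolve(5.0)
print(sum(abs(v) ** 2 for v in psi_t))  # total probability, 1 up to rounding
```

The local update rule never appears explicitly here: once the mode coefficients are known, the state at any time follows from phase factors alone.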

In other words:

“Understanding” a physical field globally means performing sufficiently thorough spectral analysis
until one grasps the full eigenstructure and the constraints among modes—i.e., the system’s long-range or periodic correlations, or ordered spatiotemporal associations.

2.2 What “Understanding” Means in Physics

In this sense, when we say:

  • “We understand the modal structure of a resonant cavity,” or

  • “We understand the quantum energy levels and eigenfunctions in a confining potential,”

we really mean:

In spectral space, we have achieved a stable, reusable, and generalizable grasp of the system’s global eigenstates and their structural relationships—
that is, the various forms of ordered spatiotemporal correlation.

This is fundamentally different from merely “memorizing solutions under specific initial conditions,” which is just local data fitting. True understanding lies in recognizing the system’s spectral architecture.

  3. Globality in Information Space: AI’s Spontaneous “Spectral Analysis”

3.1 Observing Grokking in Training Curves

Training experiments on small, well-defined tasks exhibit the grokking phenomenon:

  • On a finite but complete discrete task (e.g., modular arithmetic, simple algorithms, grammar rules),
    a model first achieves perfect training accuracy but poor test performance—clearly rote memorization;

  • With continued training and regularization, test performance suddenly jumps at a critical point,
    showing strong generalization to unseen examples—the model appears to have “grokked” the underlying rule;

  • Concurrently, internal representations shift from disordered local encodings to simpler, symmetric, decomposable structures reflecting the task’s true geometry.

This resembles a process unfolding in a high-dimensional “information field”:

  • Initially, the model uses many parameters for local memory and approximation;

  • Under optimization pressure and regularization, it is forced to seek structurally optimal global representations—the natural “spectral basis” of the information;

  • Once this essential basis is found, understanding shifts from “pixel-level memorization” to “grasping the algorithmic rule.”
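
To make the idea of a "spectral basis" for a task concrete, here is a hand-built sketch for modular addition, one of the tasks named above. It is not a trained model and makes no claim about what any particular network learns. It only shows that the rule (a + b) mod P can be computed entirely from products of Fourier features. The modulus `P = 13` and the function names are illustrative assumptions.

```python
import math

P = 13  # modulus for the toy task (a + b) mod P; an illustrative choice

def embed(x):
    """Fourier features of a residue: (cos, sin) at every nonzero frequency."""
    return [
        (math.cos(2 * math.pi * k * x / P), math.sin(2 * math.pi * k * x / P))
        for k in range(1, P)
    ]

def add_mod_spectral(a, b):
    """Compute (a + b) mod P using only products of Fourier features."""
    ea, eb = embed(a), embed(b)
    # Angle-addition identities give the features of a + b
    # without ever adding the integers a and b directly.
    e_sum = [
        (ca * cb - sa * sb, sa * cb + ca * sb)
        for (ca, sa), (cb, sb) in zip(ea, eb)
    ]
    best_c, best_score = None, -math.inf
    for c in range(P):
        # score(c) = sum_k cos(2*pi*k*(a + b - c)/P):
        # equals P - 1 at the correct residue and -1 everywhere else.
        score = sum(cs * cc + ss * sc
                    for (cs, ss), (cc, sc) in zip(e_sum, embed(c)))
        if score > best_score:
            best_c, best_score = c, score
    return best_c

print(add_mod_spectral(9, 7))  # 3, since 16 mod 13 = 3
```

A lookup table of memorized (a, b) pairs says nothing about pairs it has not stored, whereas this representation is correct for every pair by construction. That contrast is the one the bullets above draw between memorization and grasping the rule.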

3.2 Representation Learning Through a Spectral Lens

Viewing deep learning and attention mechanisms spectrally, we can reinterpret them as follows:

  • Input data (text, code, images) form a high-dimensional information field;

  • Training is an iterative process of:
    projection → reconstruction → compression → denoising → symmetry/invariance discovery;

  • Self-attention and deep architectures act as adaptive, multiscale Fourier–wavelet–feature spectral analyzers:
    rather than using fixed sine bases, they learn via gradient descent an information-efficient, task-aligned basis.

From this perspective, grokking is:

The transition in information space from a “local statistical basis” to a “task-eigen basis”—
the product of deep, global spectral analysis of informational correlations.

Once this new eigenbasis is established:

  • Individual samples are no longer isolated points, but naturally embedded in a global structural coordinate system;

  • Even unseen samples, if consistent with the same structural constraints, are automatically projected onto the correct spectral modes;

  • This is the essence of generalized understanding.
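
The projection picture above can be sketched with a deliberately simple stand-in for learning: power iteration on the data covariance, which recovers the dominant principal direction. This is a swapped-in classical technique, not a model of gradient descent in a deep network. The hidden direction, noise level, sample counts, and function names are all hypothetical.

```python
import math
import random

random.seed(0)
DIM = 8

# Hypothetical hidden structure: every valid sample is a multiple of this direction.
hidden = [math.cos(0.7 * i) for i in range(DIM)]
h_norm = math.sqrt(sum(v * v for v in hidden))
hidden = [v / h_norm for v in hidden]

def sample(scale, noise=0.01):
    """A structured sample: scale * hidden plus small Gaussian noise."""
    return [scale * h + random.gauss(0.0, noise) for h in hidden]

train = [sample(random.uniform(-2.0, 2.0)) for _ in range(200)]

def learn_direction(data, iterations=50):
    """Power iteration on X^T X: converges to the dominant principal direction."""
    v = [1.0] * DIM
    for _ in range(iterations):
        proj = [sum(x[i] * v[i] for i in range(DIM)) for x in data]          # X v
        w = [sum(p * x[i] for p, x in zip(proj, data)) for i in range(DIM)]  # X^T (X v)
        w_norm = math.sqrt(sum(c * c for c in w))
        v = [c / w_norm for c in w]
    return v

def residual(x, basis_vec):
    """Norm of the part of x that the learned mode cannot represent."""
    coeff = sum(a * b for a, b in zip(x, basis_vec))
    return math.sqrt(sum((a - coeff * b) ** 2 for a, b in zip(x, basis_vec)))

learned = learn_direction(train)
unseen_consistent = sample(5.0)                # outside the training range, same structure
unseen_violating = [1.0] + [0.0] * (DIM - 1)   # ignores the structure

print(residual(unseen_consistent, learned), residual(unseen_violating, learned))
```

The first residual should be small even though the sample lies well outside the training range, while the second stays large. In this toy setting, generalization amounts to the new sample being consistent with the learned mode.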

  4. Human Understanding and AI Grokking: Two Instantiations of the Same Spectral Mechanism

4.1 The Brain as a “Spectral Analyzer” on a Physical Field

If we model the neuron–synapse network as a high-dimensional physical field (electrochemical potential field, coupled oscillator network), then:

  • Perception and memory correspond to exciting local patterns and connections;

  • Repeated experience and learning adjust parameters to discover more stable, efficient global modes (e.g., associative circuits, unified pattern decompositions);

  • The “aha!” moment of insight occurs when, within a problem-relevant representational subspace,
    a new global resonant structure suddenly forms and stabilizes.

This is structurally analogous to:

  • A physical cavity finding a stable eigenmode;

  • An AI discovering a feature spectrum that both fits and generalizes.

Thus:

Human “understanding” can be seen as a global spectral reorganization of experiential data on the neural physical field—
yielding the “eigen-representation” for a class of problems.
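
The image of a global resonant structure forming in a coupled oscillator network can be illustrated with the Kuramoto model, a standard toy model of synchronization. It is used here purely as an assumption for illustration: it is not a model of neurons, and the text above does not specify it. The oscillator count, coupling values, step size, and function name are hypothetical choices.

```python
import cmath
import math
import random

def kuramoto_order(coupling, n=100, steps=1500, dt=0.05, seed=1):
    """Simulate the mean-field Kuramoto model with Euler steps and return the
    order parameter r averaged over the final third of the run
    (near 0 = incoherent, near 1 = phase-locked)."""
    rng = random.Random(seed)
    omega = [rng.gauss(0.0, 1.0) for _ in range(n)]               # natural frequencies
    theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]   # initial phases
    tail = []
    for step in range(steps):
        mean_field = sum(cmath.exp(1j * t) for t in theta) / n
        r, psi = abs(mean_field), cmath.phase(mean_field)
        # Each oscillator is pulled toward the collective phase psi
        # with effective strength coupling * r.
        theta = [t + dt * (w + coupling * r * math.sin(psi - t))
                 for t, w in zip(theta, omega)]
        if step >= 2 * steps // 3:
            tail.append(r)
    return sum(tail) / len(tail)

weak = kuramoto_order(coupling=0.0)
strong = kuramoto_order(coupling=4.0)
print(weak, strong)
```

Without coupling the phases stay scattered and the order parameter remains small. With strong coupling a single collective mode takes over, a loose analogue of the stabilizing global pattern described above.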

4.2 Deep Logical Relations in Pure Character Code

“AI naturally groks and understands deep logical relationships expressed purely through character code.”

Explanation:

  • In code tasks, there is no visual input, physical intuition, or continuous geometry—only character sequences and syntactic constraints;

  • Initially, models rely on local n-gram statistics;

  • But with sufficient training and regularization, they learn to:

    • Recognize syntactic structures (trees, blocks, scopes, variable dependencies);

    • Extract functional and modular patterns;

    • Internally construct approximate “mental images” of algorithms (e.g., addition, sorting, loop logic).

These reflect global structural understanding of the “character information field,” far beyond local co-occurrence memory.
This is a textbook case of spectral–eigenstructure grokking:
the model discovers a spectral basis suited to algorithmic tasks, enabling generalization to new inputs.
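
The difference between character-level co-occurrence and syntactic structure can be made concrete with Python's standard `ast` module. The sketch shows what a parser extracts from a character sequence; whether and how a trained model represents comparable structure internally is a separate question that this example does not settle. The helper name `structure` and the three snippets are illustrative.

```python
import ast

def structure(source):
    """Sequence of syntax-node types for a snippet, ignoring names and spacing."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]

a = structure("x = a + b")
b = structure("total   =   left+right")
c = structure("x = a * b")

print(a == b, a == c)  # True False
```

The first two snippets share almost no characters yet have the same structure, while the first and third differ by a single character and have different structure.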

4.3 Parallelism with Human Cognition

Hence, we may reasonably assert:

  • While human brains and AI differ in physical implementation,

  • At the level of information dynamics, both undergo a similar three-stage process:

    1. Local fitting/memory: massive repetition, building basic associations (like memorizing problems or code snippets);

    2. Global pattern distillation: driven by failure, conflict, and compression pressure, the system seeks deeper unifying structures;

    3. Grokking and generalization: once a new global eigen-representation forms, previously complex situations become “natural” and “obvious” in the new coordinate system.

This mirrors the NQT transition from local field equations to global eigenstates:

  • Local PDEs ↔ Local statistical fitting;

  • Global eigenstates ↔ Global structural representations (concepts, rules, algorithms);

  • Geometric phases / topological invariants ↔ Invariant logical relations, type structures, semantic constraints in higher-order spaces.

  5. Unified Perspective: “Spectral–Topological Understanding” Across Physical and Information Fields

Placing physical and information fields on the same conceptual map, we propose a unifying programmatic view:

| Aspect | Physical Field | Information Field (including AI) |
|---|---|---|
| Ontology | Continuous fields & topological structures | Parameterized networks & information flow patterns |
| Local Dynamics | Local field equations, finite propagation speed | Local gradient updates, feature extraction, attention |
| Global Spectral Structure | Eigenmodes, energy levels, geometric phases, topological invariants | Task-specific abstractions: algorithms, type systems, syntax trees, semantic graphs |
| Understanding / Grokking | Mastery of global eigenstructure and topology | Abstraction of intrinsic logic/rules from data |

Thus:

The globality of fields requires spectral analysis to reveal global eigenstates;
AI’s grokking in information space is essentially an automatic, multi-scale spectral and topological analysis of the information field.
When data is sufficient and analysis deep enough, the system inevitably captures the deep structures encoded in character sequences—
mirroring human understanding as a shared “spectral–topological comprehension”:
a transition from local memory and fitting to mastery and generalization of global eigenstructures.

  6. Conclusion: Toward a Unified Theory of Understanding from Natural Quantum Theory

Linking Fourier/spectral analysis and field globality with AI grokking and information globality, we arrive at:

  • Physical globality: Emerges via spectral decomposition and eigenstate analysis—essentially a global rewriting of local field dynamics under given boundaries and topology.

  • Quantum formalisms (e.g., Schrödinger equation): Are mathematical realizations of this global spectral view, offering compact encodings of matter-field global modes.

  • AI grokking: Can be interpreted as the learning system performing automatic, multiscale “Fourierization” of data distributions, ultimately discovering the task’s eigen-representation.

  • Parallelism between human and AI understanding: Though physically distinct, both achieve “from memory to understanding” through a transition from local statistics to global spectral–topological structure.

  • Extended significance of NQT: It not only provides physics with a realist picture of “local dynamics + global spectral structure,” but also offers a unified “spectral–topological understanding” paradigm for intelligence and cognition.

In this light, understanding—whether in physics, human minds, or artificial systems—is the recognition of global spectral and topological order beneath local complexity.