LLM Relayering Technique "RYS" Generalizes Across Models, Hints at Universal Thinking Space

Research · 1 source · Mar 25

Summary

  • RYS layer-duplication boosts LLM benchmarks with no training or weight changes
  • Technique generalizes from Qwen2-72B to modern Qwen3.5-27B and other open models
  • Cross-lingual probes confirm LLM middle layers reason in a universal topic-focused space
  • Scanning code and new RYS model variants released publicly for community use

Details

1. Research

RYS: duplicate middle layers, no training required

Seven consecutive middle layers of Qwen2-72B were duplicated with no weight changes or fine-tuning; this alone produced the #1-ranked open model on the HuggingFace Open LLM Leaderboard in mid-2024. The configuration was discovered using hard math probes and EQ-Bench on two RTX 4090s.
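
At its core, the technique is a pure list operation over a model's decoder stack. A minimal sketch, assuming layers can be treated as a Python list (as in Hugging Face model surgery); the `relayer` helper and the start index 36 are illustrative, not the exact configuration from the post:

```python
def relayer(layers, start, count):
    # Duplicate `count` consecutive layers beginning at `start`,
    # RYS-style: no weight changes, the copies share the originals' weights.
    span = layers[start:start + count]
    return layers[:start + count] + span + layers[start + count:]

# Toy stack: Qwen2-72B has 80 decoder layers; duplicate 7 middle ones.
stack = [f"layer_{i}" for i in range(80)]
expanded = relayer(stack, start=36, count=7)  # start index is illustrative
print(len(expanded))  # 87
```

Because the duplicated span shares weights with the originals and no training follows, the expanded model costs nothing to build beyond the extra inference compute.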

2. Research

Generalizes to Qwen3.5-27B and modern models

Follow-up experiments on Qwen3.5-27B (released around Chinese New Year 2026) confirm relayering remains effective even in more compact models with more entangled functional anatomy — ruling out RYS as a Qwen2-72B fluke.

3. Tech Info

Three-phase LLM structure confirmed directly

Evan Maunder's cross-encoding experiment (English, Mandarin, Base64) showed cosine similarity of hidden states rapidly converging in early layers (encoding), near-perfect through the middle (format-agnostic reasoning), then diverging in final layers (decoding to surface form).
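
A minimal sketch of such a probe, assuming per-layer hidden states have already been extracted (e.g. with `output_hidden_states=True` in the `transformers` library); the mean-pooling step is an assumption for illustration, not necessarily Maunder's exact setup:

```python
import numpy as np

def layerwise_cosine(h_a, h_b):
    """Cosine similarity between mean-pooled hidden states of two prompts,
    one value per layer. h_a, h_b: lists of (seq_len, d_model) arrays,
    one per layer."""
    sims = []
    for a, b in zip(h_a, h_b):
        va, vb = a.mean(axis=0), b.mean(axis=0)  # mean-pool over tokens
        sims.append(float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb))))
    return sims
```

Run on the same content in English, Mandarin, and Base64, the described pattern would show up as similarities rising in early layers, plateauing near 1.0 through the middle, and falling again in the final layers.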

4. Insight

Universal thinking space across 8 languages

Extending to 8 languages × 8 topics, same-topic different-language pairs are more similar in middle layers than same-language different-topic pairs — the strongest direct empirical evidence yet that LLMs reason about meaning rather than surface form.
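
The probe's core comparison can be sketched as follows, given one middle-layer pooled vector per (language, topic) pair; the function name and the synthetic vectors in the test are stand-ins, not the study's data:

```python
import itertools
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def topic_vs_language(vecs):
    """vecs: {(language, topic): middle-layer pooled hidden-state vector}.
    Returns mean similarity of same-topic/different-language pairs and
    of same-language/different-topic pairs."""
    same_topic, same_lang = [], []
    for (k1, v1), (k2, v2) in itertools.combinations(vecs.items(), 2):
        (lang1, top1), (lang2, top2) = k1, k2
        if top1 == top2 and lang1 != lang2:
            same_topic.append(cosine(v1, v2))
        elif lang1 == lang2 and top1 != top2:
            same_lang.append(cosine(v1, v2))
    return float(np.mean(same_topic)), float(np.mean(same_lang))
```

The reported finding corresponds to the first mean exceeding the second when the vectors come from middle layers.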

5. Stat

3,024 candidates; 2M surrogate-scored configs

Rigorous methodology: beam search over 3,024 relayering candidates; a surrogate model was trained on those results and used to score 2 million possible configurations, followed by a unified validation sweep.
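
The surrogate step can be sketched as: evaluate a small candidate set directly, fit a cheap model on those (start, length) → score pairs, then rank a much larger grid with it. The linear surrogate, the feature choice, and all numbers below are assumptions for illustration; the post does not specify the surrogate's form:

```python
import numpy as np

def fit_surrogate(X, y):
    """Fit a least-squares linear surrogate mapping config features
    (here: duplication start index, span length) to a benchmark score."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # bias column
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda Xq: np.hstack([Xq, np.ones((len(Xq), 1))]) @ w

# Directly-evaluated candidates (toy scores, not real benchmark results).
X = np.array([[10, 4], [30, 6], [36, 7], [50, 3]], dtype=float)
y = np.array([0.61, 0.66, 0.70, 0.58])
score = fit_surrogate(X, y)

# Cheaply rank a large config grid; keep the top few for a validation sweep.
grid = np.array([(s, c) for s in range(80) for c in range(1, 11)], dtype=float)
top5 = grid[np.argsort(score(grid))[::-1][:5]]
```

The design point is the same as in the post: only a few thousand configurations need real benchmark runs, while millions can be triaged by the surrogate.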

6. Tech Info

Code and RYS variants released publicly

The author released the scanning code and new RYS model variants openly, enabling researchers with consumer-grade GPUs to replicate and extend the findings across other model families.

Findings from LLM Neuroanatomy II research post, March 2026

What This Means

For practitioners, RYS is a zero-cost technique — no gradient updates, no retraining — that can meaningfully boost model benchmark scores and now appears to be a general property of transformer architectures. For researchers, the cross-lingual cosine-similarity evidence is the most direct empirical support yet for a language-agnostic internal reasoning space in LLMs, with implications for interpretability, multilingual transfer, and fundamental understanding of how these models process meaning.
