Yuan He (何源)

Applied Scientist · Post-training · Amazon Rufus

About

I'm an Applied Scientist at Amazon Rufus, working on post-training foundation models into capable agents. Most of my time goes into building the environments these agents are trained and evaluated in — across coding, search, tool use, and other long-horizon tasks.

I'm also an active open-source contributor, mostly with CAMEL-AI.org on agentic workflows and data. Earlier in my PhD I built DeepOnto and a few smaller projects on knowledge engineering with LMs.

When I'm not at a terminal I like to play (badminton, piano, games), sing (pop music), and write (Chinese literature).

Blog

  • Strands-SGLang: Bridging Agent Scaffolding and RL Training (Jan 2026)

    Existing agent scaffolds like Strands-Agents make it easy to serve tool-using agents, but face a key challenge: they operate on text (usually an OpenAI-compatible endpoint) while RL training requires exact token IDs (token-in, token-out). This mismatch causes retokenization drift — the tokens used for computing logprobs and gradients no longer match the tokens that were actually generated — leading to effectively off-policy updates and unstable RL training. Strands-SGLang bridges this gap by extending Strands-Agents with SGLang's native endpoint while preserving the customizable agent loop…

  • How Adam Steers Gradient Descent (Aug 2025)

    Let's start from the most basic update rule. Suppose we want to minimize an objective $f(\theta)$. Vanilla gradient descent updates parameters by moving against the gradient: $\theta_{t+1} = \theta_t - \eta \nabla f(\theta_t)$, where $\eta$ is the learning rate. This rule is fully reactive: the step at time $t$ depends only on the current gradient. That can work, but it has a well-known failure mode in ill-conditioned landscapes (think "long narrow valleys")…

  • Approximating the Softmax Function (Jan 2021)

    The softmax function is widely used in the output layer of neural-network models for classification. In the binary case, it reduces to the familiar sigmoid mapping. Given a score (logit) vector $z = (z_1, \dots, z_K)$, the softmax probabilities are $p_i = e^{z_i} / \sum_{j=1}^{K} e^{z_j}$. In particular, for $K = 2$, $p_1 = \sigma(z_1 - z_2)$, where $\sigma(x) = 1 / (1 + e^{-x})$ is the sigmoid function. More generally, softmax can be viewed as normalizing positive weights obtained from log-scale inputs. If we write $z_i = \log w_i$ with $w_i > 0$, then…

Experience

Applied Scientist, Amazon Rufus
May 2025 — now
Visiting Researcher, CAMEL-AI.org
Dec 2024 — now
Research Associate, University of Oxford
Apr 2024 — Apr 2025

Open-source

Agent & RL

CAMEL [core]
One of the earliest open-source multi-agent frameworks for LLMs
Loong [core]
Synthesizing verifiable long-CoT data for reasoning RL training
strands-env [core]
Gym-like agent environment interface for agentic RL training and evaluation with Strands
strands-sglang [core]
On-policy agentic rollout infrastructure built on SGLang and Strands
slime [contrib]
RL post-training framework behind the GLM model family
OpenEnv [contrib]
Environment interface for agentic RL post-training (Meta PyTorch)

Knowledge Engineering

HiT [core]
Hierarchy representation learning with language models
DeepOnto [core]
Ontology engineering with language models
OAEI Bio-ML [core]
Biomedical ontology alignment benchmark

Service

Organizer & Program Chair

SEA Workshop
Scaling Environments for Agents — NeurIPS 2025
ELMKE Workshop
Evaluation of Language Models in Knowledge Engineering — EKAW 2024, ESWC 2025, ESWC 2026
OAEI Bio-ML Track
OAEI biomedical ontology alignment track — ISWC 2022–2024

Reviewer

Conferences
NeurIPS, ICLR, ICML, ARR (ACL, EMNLP, NAACL), AAAI, ECML PKDD, CIKM, ISWC, ESWC

Education

DPhil (PhD), Computer Science, University of Oxford
2020 — 2024
BSc (Hons), AI & Mathematics, University of Edinburgh
2016 — 2020

Awards

Excellent Open-sourced Tool (DeepOnto), OpenKG
2024
Best Resource Paper Runner-Up, CIKM [cert]
2023
Best Resource Paper Candidate, ISWC [nom]
2022
Best Research Report Award, ISWS Summer School [cert]
2022
PhD Scholarship, Samsung Research UK
2021
Joint Class Prize (Top 1, BSc AI & Maths), University of Edinburgh [cert]
2020

Amateur

Literature
Chinese novels, poetry, and prose [selected]
Music
Amateur composer, singer, and pianist [pieces]
Sports
Member of Oxford University Badminton Club (OuBaC) [photo]