The Alignment Waltz: Jointly Training Agents to Collaborate for Safety Paper • 2510.08240 • Published Oct 9
IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning Paper • 2509.22621 • Published Sep 26
Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment Article • Published Feb 11
Jointly Reinforcing Diversity and Quality in Language Model Generations Paper • 2509.02534 • Published Sep 2
Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback Paper • 2506.11930 • Published Jun 13
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18, 2024
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies Paper • 2402.12370 • Published Feb 19, 2024
Benchmarking Language Model Creativity: A Case Study on Code Generation Paper • 2407.09007 • Published Jul 12, 2024
Cognition Collection • Perception and abstraction: each modality is tokenized and embedded into vectors for the model to comprehend. • 200 items • Updated Apr 15
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning Paper • 2410.01044 • Published Oct 1, 2024