RLAIF

AI & ML interests

None defined yet.

Recent Activity

AngelRaychev

updated a dataset 2 days ago

RLAIF/webgpt

Viewer • Updated 2 days ago • 13.3k • 8

AngelRaychev

published a dataset 2 days ago

RLAIF/webgpt

Viewer • Updated 2 days ago • 13.3k • 8

AngelRaychev

updated a dataset 2 days ago

RLAIF/tldr

Viewer • Updated 2 days ago • 92.9k • 10

AngelRaychev

published a dataset 2 days ago

RLAIF/tldr

Viewer • Updated 2 days ago • 92.9k • 10

AngelRaychev

updated a dataset 3 days ago

RLAIF/ultrafeedback-binarized

Viewer • Updated 3 days ago • 63.5k • 17

AngelRaychev

published a dataset 4 days ago

RLAIF/ultrafeedback-binarized

Viewer • Updated 3 days ago • 63.5k • 17

AngelRaychev

updated a dataset about 1 month ago

RLAIF/gm_toy_example

Viewer • Updated Nov 1 • 1.1k • 26

AngelRaychev

published a dataset about 1 month ago

RLAIF/gm_toy_example

Viewer • Updated Nov 1 • 1.1k • 26

Asap7772

authored 3 papers 2 months ago

Personalized Preference Fine-tuning of Diffusion Models

Paper • 2501.06655 • Published Jan 11

Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models

Paper • 2502.17387 • Published Feb 24 • 7

RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems

Paper • 2510.02263 • Published Oct 2 • 8

nlile

authored a paper 6 months ago

Just Enough Thinking: Efficient Reasoning with Adaptive Length Penalties Reinforcement Learning

Paper • 2506.05256 • Published Jun 5 • 2

sea-snell

authored a paper 8 months ago

Learning Adaptive Parallel Reasoning with Language Models

Paper • 2504.15466 • Published Apr 21 • 44

nlile

authored a paper 9 months ago

Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models

Paper • 2502.17387 • Published Feb 24 • 7

Asap7772

authored a paper 9 months ago

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Paper • 2503.01307 • Published Mar 3 • 38

nlile

authored a paper 9 months ago

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Paper • 2503.01307 • Published Mar 3 • 38

violetxi

authored 2 papers 11 months ago

Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models

Paper • 2407.07086 • Published Jul 9, 2024

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 99

nlile

authored a paper 11 months ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 99

Asap7772

authored a paper 11 months ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 99

Team members 9
