- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25
Collections
Collections including paper arxiv:2401.10891
- Deconstructing Denoising Diffusion Models for Self-Supervised Learning
  Paper • 2401.14404 • Published • 18
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
  Paper • 2401.10891 • Published • 62
- R-Zero: Self-Evolving Reasoning LLM from Zero Data
  Paper • 2508.05004 • Published • 130

- Contrastive Learning for Many-to-many Multilingual Neural Machine Translation
  Paper • 2105.09501 • Published
- Cross-modal Contrastive Learning for Speech Translation
  Paper • 2205.02444 • Published
- ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
  Paper • 2210.03052 • Published
- Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning
  Paper • 2212.10240 • Published • 1

- Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
  Paper • 2312.06134 • Published • 3
- Efficient Monotonic Multihead Attention
  Paper • 2312.04515 • Published • 8
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 39
- Exploring Format Consistency for Instruction Tuning
  Paper • 2307.15504 • Published • 8