Long Sequences for LLM
YaRN: Efficient Context Window Extension of Large Language Models
Paper • 2309.00071 • Published Aug 31, 2023 • 77
LLM for Code
StarCoder 2 and The Stack v2: The Next Generation
Paper • 2402.19173 • Published Feb 29, 2024 • 151
Graph Neural Network
Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks
Paper • 2403.05185 • Published Mar 8, 2024 • 25
LLM Security
Stealing Part of a Production Language Model
Paper • 2403.06634 • Published Mar 11, 2024 • 91
Continual Training
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Paper • 2403.08763 • Published Mar 13, 2024 • 51
Model Merging
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch
Paper • 2406.14563 • Published Jun 20, 2024 • 30
Instruction Tuning
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
Paper • 2406.13542 • Published Jun 19, 2024 • 17
Attention in LLM
Simple linear attention language models balance the recall-throughput tradeoff
Paper • 2402.18668 • Published Feb 28, 2024 • 20
BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
Paper • 2403.09347 • Published Mar 14, 2024 • 22
LLM Benchmark
Design2Code: How Far Are We From Automating Front-End Engineering?
Paper • 2403.03163 • Published Mar 5, 2024 • 98
MoE
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Paper • 2403.07816 • Published Mar 12, 2024 • 44
General Purpose LLM
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Paper • 2403.05530 • Published Mar 8, 2024 • 66
Gemma: Open Models Based on Gemini Research and Technology
Paper • 2403.08295 • Published Mar 13, 2024 • 50
Pretraining
Instruction Pre-Training: Language Models are Supervised Multitask Learners
Paper • 2406.14491 • Published Jun 20, 2024 • 95
Chain of Thought
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities
Paper • 2406.14562 • Published Jun 20, 2024 • 28
Code Benchmark
REPOEXEC: Evaluate Code Generation with a Repository-Level Executable Benchmark
Paper • 2406.11927 • Published Jun 17, 2024 • 11