Mikhail Arkhipov

Mshn

AI & ML interests

Generative Models, Latent Variable Models

Recent Activity

reacted to danielhanchen's post with 🔥 4 days ago

Mistral's new Ministral 3 models can now be Run & Fine-tuned locally! (16GB RAM) Ministral 3 have vision support and the best-in-class performance for their sizes.
14B Instruct GGUF: https://huggingface.co/unsloth/Ministral-3-14B-Instruct-2512-GGUF
14B Reasoning GGUF: https://huggingface.co/unsloth/Ministral-3-14B-Reasoning-2512-GGUF
🐱 Step-by-step Guide: https://docs.unsloth.ai/new/ministral-3
All GGUFs, BnB, FP8 etc. variants uploads: https://huggingface.co/collections/unsloth/ministral-3

Organizations

None yet

upvoted a paper 4 months ago

Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning

Paper • 2508.03501 • Published Aug 5 • 59

upvoted a paper 8 months ago

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

Paper • 2503.19693 • Published Mar 25 • 76

upvoted a collection 9 months ago

OpenScholar_V1

The set of models, index, data associated with the paper "OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs". • 8 items • Updated Nov 22, 2024 • 42

upvoted a paper over 1 year ago

Long Code Arena: a Set of Benchmarks for Long-Context Code Models

Paper • 2406.11612 • Published Jun 17, 2024 • 25

upvoted a paper almost 2 years ago

Large Language Model Distillation Doesn't Need a Teacher

Paper • 2305.14864 • Published May 24, 2023 • 3