Just wanted to share something exciting I've been exploring—Qwen3-Omni and how it's transforming marketing workflows.
At Hawky.ai, we recently started experimenting with Qwen3 for analysis and optimization. So what makes it special?
Unlike traditional tools that look at text, images, or audio separately, Qwen3-Omni analyzes everything together. It handles 119 languages, processes 40-minute audio sequences, and understands both images and videos—all at once.
The cool part? It's 2-3x faster than similar models thanks to its MoE architecture.
Real applications I'm seeing:
Ad Analysis: It scores video ads by combining visual elements, audio tone, and text, giving 25% better CTR predictions than single-mode tools (a rough API sketch follows this list).
Campaign Localization: Drop in one ad, get 10 localized versions with native voiceovers in under a minute. Perfect for testing across markets.
Market Research: Feed it competitor content, podcasts, or UGC videos. It extracts actionable insights like "3-second hooks boost retention by 15%" and saves about 70% of analysis time.
Quality Checks: Automatically catches lip-sync errors and audio-visual mismatches.
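To make the ad-analysis step concrete, here is a minimal sketch of a combined image-plus-copy scoring call. It assumes Qwen3-Omni is served behind an OpenAI-compatible endpoint (for example, a local vLLM server); the endpoint URL, frame URL, and scoring prompt are illustrative assumptions, not a documented Hawky.ai pipeline.

```python
# Minimal sketch: score an ad keyframe + copy with Qwen3-Omni served behind an
# OpenAI-compatible API (e.g., vLLM). Endpoint, frame URL, and prompt are
# assumptions for illustration, not a production setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3-Omni-30B-A3B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/ad_keyframe.jpg"}},
            {"type": "text",
             "text": "Ad copy: 'Summer sale, 40% off today only.' "
                     "Rate hook strength, visual clarity, and message clarity "
                     "from 1-10 and return the scores as JSON."},
        ],
    }],
)
print(response.choices[0].message.content)
```

The same call can be extended with additional frames or an audio track once the serving stack supports those content types.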
Introducing Gliese-OCR-7B-Post1.0, a document content-structure retrieval VLM designed for content extraction (OCR) and summarization. This is the third model in the Camel Doc OCR VLM series, following Camel-Doc-OCR-062825. The new version fixes formal table reconstruction issues in both English and Chinese, achieving optimal performance for long-context inference. This model also shows significant improvements in LaTeX and Markdown rendering for OCR tasks.
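For readers who want to try it, here is a rough usage sketch, assuming the checkpoint works with the generic transformers image-text-to-text pipeline; the repo id (prithivMLmods/Gliese-OCR-7B-Post1.0), the sample image URL, and the prompt are assumptions, so check the model card for the exact loading code.

```python
# Rough sketch: document OCR to Markdown with Gliese-OCR-7B-Post1.0.
# Assumes compatibility with the generic "image-text-to-text" pipeline;
# repo id, image URL, and prompt are illustrative.
from transformers import pipeline

ocr = pipeline(
    "image-text-to-text",
    model="prithivMLmods/Gliese-OCR-7B-Post1.0",  # assumed repo id
    device_map="auto",
)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/scanned_page.png"},
        {"type": "text",
         "text": "Extract the full document content as Markdown, preserving "
                 "tables and rendering formulas in LaTeX."},
    ],
}]

result = ocr(text=messages, max_new_tokens=1024)
print(result[0]["generated_text"][-1]["content"])
```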
ModernBERT goes MULTILINGUAL! This is one of the most requested models I've seen: Johns Hopkins University's CLSP has trained mmBERT, a family of state-of-the-art massively multilingual encoders built on the ModernBERT architecture.
Model details:
- 2 model sizes: jhu-clsp/mmBERT-small and jhu-clsp/mmBERT-base
- Uses the ModernBERT architecture, but with the Gemma2 multilingual tokenizer (so: flash attention, alternating global/local attention, unpadding/sequence packing, etc.)
- Maximum sequence length of 8192 tokens, on the high end for encoders
- Trained on 1833 languages using DCLM, FineWeb2, and many more sources
- 3 training phases: 2.3T tokens pretraining on 60 languages, 600B tokens mid-training on 110 languages, and 100B tokens decay training on all 1833 languages
- Both models are MIT licensed, and the full datasets and intermediary checkpoints are also publicly released
Evaluation details:
- Very competitive with ModernBERT at equivalent sizes on English (GLUE, MTEB v2 English after finetuning)
- Consistently outperforms equivalently sized models on multilingual tasks (XTREME, classification, MTEB v2 Multilingual after finetuning)
- In short: beats commonly used multilingual base models like mDistilBERT, XLM-R (multilingual RoBERTa), multilingual MiniLM, etc.
- Additionally: the ModernBERT-based mmBERT is much faster than the alternatives due to its architectural benefits, easily up to 2x throughput in common scenarios
Based on these results, mmBERT should be the new go-to multilingual encoder base model at 300M parameters and below. Do note that the mmBERT models are "base" models, i.e. they're currently only trained for mask filling. They'll need to be finetuned for downstream tasks like semantic search, classification, clustering, etc.
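Since the released checkpoints are mask-filling models, trying them out is a one-liner. This is a minimal sketch assuming the standard transformers fill-mask pipeline works with the checkpoint; note that mmBERT uses the Gemma2 tokenizer, so the mask token is read from the tokenizer rather than hard-coded.

```python
# Minimal sketch: mask filling with mmBERT via the transformers fill-mask pipeline.
from transformers import pipeline

fill = pipeline("fill-mask", model="jhu-clsp/mmBERT-base")

# Read the mask token from the tokenizer instead of assuming "[MASK]",
# since mmBERT uses the Gemma2 multilingual tokenizer.
mask = fill.tokenizer.mask_token
for pred in fill(f"Paris is the capital of {mask}."):
    print(pred["token_str"], round(pred["score"], 3))
```

For downstream tasks (semantic search, classification, clustering), you would load the same checkpoint as a backbone and finetune it, e.g. with Sentence Transformers or a standard sequence-classification head.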
Models need feedback on what makes outputs "good" or "bad." Policy optimization (PO) turns preferences and rewards into actual training signals. This field is evolving quickly, moving far beyond classics like PPO and GRPO. So here is our overview of the 10 newest PO methods:
3. DCPO (Dynamic Clipping Policy Optimization) → DCPO: Dynamic Clipping Policy Optimization (2509.02333) Uses dynamic clipping, which adjusts the clipping bounds per token for better token exploration, and smooth reward standardization, which balances rewards across training steps and prevents wasted updates (a toy sketch of the clipping idea follows this list)
4. ARPO (Agentic Reinforced Policy Optimization) → Agentic Reinforced Policy Optimization (2507.19849) Optimizes multi-turn LLM agents that use external tools. It uses an entropy-based adaptive rollout to explore after tool use and an advantage attribution method to better assign credit across steps, leading to more efficient tool use with fewer resources (see the branching sketch after this list)
5. GRPO-RoC (Group Relative Policy Optimization with Resampling-on-Correct) → rStar2-Agent: Agentic Reasoning Technical Report (2508.20722) Oversamples rollouts, then resamples them to keep diverse mistakes and only the highest-quality correct answers. This reduces noise and yields stronger reasoning in a code environment (a resampling sketch follows below)
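To illustrate item 3, here is a toy sketch of per-token dynamic clipping in a PPO-style surrogate loss. The widening rule below (a larger clip range for lower-probability tokens) is an assumption for illustration, not the exact formulation from the DCPO paper.

```python
# Toy sketch of per-token dynamic clipping (the general idea behind DCPO).
# The widening rule is illustrative, not the paper's exact formula.
import torch

def dynamic_clip_loss(logp_new, logp_old, advantages, eps_base=0.2, alpha=0.1):
    ratio = torch.exp(logp_new - logp_old)      # importance ratio per token
    p_old = torch.exp(logp_old)                 # old-policy token probability
    eps = eps_base + alpha * (1.0 - p_old)      # wider clip range for rare tokens
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps)
    return -torch.mean(torch.minimum(ratio * advantages, clipped * advantages))
```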
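For item 4, a tiny sketch of what entropy-based adaptive rollout can look like: after a tool response, branch extra continuations only where the model is uncertain. The threshold and branch count are assumptions, not values from the ARPO paper.

```python
# Toy sketch: decide how many extra rollout branches to spawn after a tool call,
# based on the entropy of the next-token distribution. Threshold and branch
# count are illustrative assumptions.
import torch

def branches_after_tool_call(next_token_logits, threshold=2.5, extra_branches=3):
    probs = torch.softmax(next_token_logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-9)).sum().item()
    return extra_branches if entropy > threshold else 0  # explore more when uncertain
```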
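And for item 5, a toy version of the resample-on-correct filter: oversample a rollout group, keep incorrect rollouts for diverse error signal, and keep only the cleanest correct ones. The "quality" score and the half-and-half split are assumptions; the actual selection rule in rStar2-Agent may differ.

```python
# Toy sketch of Resample-on-Correct: downsample an oversampled rollout group,
# keeping diverse mistakes and only the highest-quality correct answers.
import random

def resample_on_correct(rollouts, group_size):
    # rollouts: list of dicts like {"text": str, "correct": bool, "quality": float}
    correct = sorted((r for r in rollouts if r["correct"]),
                     key=lambda r: r["quality"], reverse=True)
    incorrect = [r for r in rollouts if not r["correct"]]
    keep_correct = correct[: group_size // 2]                 # best correct only
    n_incorrect = min(len(incorrect), group_size - len(keep_correct))
    keep_incorrect = random.sample(incorrect, n_incorrect)    # diverse mistakes
    return keep_correct + keep_incorrect
```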