Bur

KSa

AI & ML interests

None yet

Organizations

None yet

liked a model 5 days ago

apple/starflow

Updated 5 days ago • 233

upvoted a paper 11 days ago

SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published 17 days ago • 105

liked a Space 22 days ago

Depth Anything 3

🏢

303

Generate depth maps from images using GPU acceleration

upvoted 2 papers about 1 month ago

DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion

Paper • 2510.20766 • Published Oct 23 • 34

Revisiting Multimodal Positional Encoding in Vision-Language Models

Paper • 2510.23095 • Published Oct 27 • 20

liked a Space 2 months ago

Maintain the unmaintainable

📚

Visualize connections between transformer models

upvoted 2 papers 2 months ago

Seedream 4.0: Toward Next-generation Multimodal Image Generation

Paper • 2509.20427 • Published Sep 24 • 80

Does FLUX Already Know How to Perform Physically Plausible Image Composition?

Paper • 2509.21278 • Published Sep 25 • 16

liked a model 3 months ago

trillionlabs/Tri-70B-Intermediate-Checkpoints

Updated Sep 10 • 51

upvoted 2 papers 4 months ago

"Does the cafe entrance look accessible? Where is the door?" Towards Geospatial AI Agents for Visual Inquiries

Paper • 2508.15752 • Published Aug 21 • 7

Snap-Snap: Taking Two Images to Reconstruct 3D Human Gaussians in Milliseconds

Paper • 2508.14892 • Published Aug 20 • 9

liked 2 models 4 months ago

lodestones/chroma-debug-development-only

Updated Oct 14 • 47

Arrexel/pattern-diffusion

Text-to-Image • Updated Aug 8 • 125 • 107

upvoted a paper 5 months ago

FoNE: Precise Single-Token Number Embeddings via Fourier Features

Paper • 2502.09741 • Published Feb 13 • 15

upvoted a paper 7 months ago

PrimitiveAnything: Human-Crafted 3D Primitive Assembly Generation with Auto-Regressive Transformer

Paper • 2505.04622 • Published May 7 • 27

liked a Space 7 months ago

PrimitiveAnything

🏆

Generate 3D primitive assembly from a model

upvoted a collection 7 months ago

OpenVision

Collection

27 items • Updated Aug 15 • 32

upvoted 3 papers 7 months ago

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Paper • 2505.04601 • Published May 7 • 29

TesserAct: Learning 4D Embodied World Models

Paper • 2504.20995 • Published Apr 29 • 22

The Leaderboard Illusion

Paper • 2504.20879 • Published Apr 29 • 72
