Aw yisss.
Yeah, the model's an absolute banger. I actually got spoiled by it, so it's kinda hard to go back to Cydonia D:
What GPUs do you guys have? And what quant? Cydonia to this one is a pretty big jump.
RX 5500 XT Nitro+ (8 GB) at the moment ;D (waiting for the RX 9070 XT Sapphire Pulse to come back in stock where I am atm)
Running Q5_K_M with 16k context on KoboldCpp Vulkan and zero layers offloaded to VRAM (got 64 GB of 5600 MHz DDR5 and a Ryzen 7 9800X3D). Funnily enough, speed actually decreases as I offload more layers to VRAM, the opposite of what you'd expect.
It takes about 15 mins to get the full 16k context processed (23-27 sec per 512-token batch), then it's about 1.2 t/s for generation. With ContextShift it's alright considering the quality I get.
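For anyone curious how that ~15 min figure falls out of the batch timings, here's a quick back-of-the-envelope (assuming a full 16,384-token prompt and the midpoint of the quoted 23-27 s range):

```python
# Rough sanity check of the prompt-processing time quoted above.
context_tokens = 16_384
batch_size = 512
secs_per_batch = (23 + 27) / 2  # midpoint of the quoted 23-27 s range

batches = context_tokens / batch_size   # 32 batches of 512 tokens
total_secs = batches * secs_per_batch   # ~800 s
print(f"~{total_secs / 60:.1f} min to process the full context")
```

That lands around 13-14 minutes, which matches the "about 15 mins" observation once you add model load time.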
Hi. Thanks for a really cool tune. I use a low quant, IQ3_M, on KoboldCpp (4090 + 32 GB RAM, no offload, 20k context with the KV cache quantized to 8-bit). So far it's the most coherent, slop-free, instruction-following model for me. Cydonia and the recent Qwen3 models don't even come close; only GLM-4-32B gets somewhere near this finetune.
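In case it helps anyone reproduce this setup, a rough sketch of the KoboldCpp launch I'd expect for it. The flag names are from memory and the model filename is a placeholder, so double-check everything against `python koboldcpp.py --help` on your install:

```shell
# Hypothetical launch for the setup above (verify flags with --help):
#   --gpulayers 0     -> "no offload": all layers stay in system RAM
#   --contextsize     -> ~20k context window
#   --quantkv 1       -> 8-bit KV cache (0 = f16, 2 = 4-bit)
python koboldcpp.py --model ./model-IQ3_M.gguf \
    --usecublas --gpulayers 0 --contextsize 20480 --quantkv 1
```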
Hi there!
4060 Ti 16 GB with 64 GB system RAM, Q5, similar experience to Nesaliti. Nemotron is a BEAST!
