Aider Polyglot results

by Fernanda24 - opened 21 days ago

Hi, just wanted to share the Aider Polyglot benchmark results I got for this model (with thinking disabled). It's nearly identical to the full GLM-4.6! Also, a question: any chance we will see a Kimi-K2 Thinking REAP? :)

no_think

```
  model: openai/glm-4.6-reap-268b-a32b-fp8
  edit_format: diff
  commit_hash: c74f5ef
  pass_rate_1: 11.6
  pass_rate_2: 39.1
  pass_num_1: 26
  pass_num_2: 88
  percent_cases_well_formed: 92.4
  error_outputs: 21
  num_malformed_responses: 20
  num_with_malformed_responses: 17
  user_asks: 39
  lazy_comments: 0
  syntax_errors: 0
  indentation_errors: 0
```
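For anyone who wants to sanity-check the numbers above, here is a rough sketch of how the percentages relate to the raw counts. The pasted output does not include the number of test cases, so `TOTAL_CASES = 225` is an assumption (the size of the full Aider Polyglot exercise set); the helper name `percent` is just for illustration and is not part of aider.

```python
# Assumption: the run covered the full 225-exercise Aider Polyglot set.
# The post does not state the test-case count, so treat this as a guess.
TOTAL_CASES = 225


def percent(count, total=TOTAL_CASES):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts copied from the results in the post.
pass_num_1 = 26
pass_num_2 = 88
num_with_malformed_responses = 17

print("pass_rate_1:", percent(pass_num_1))
print("pass_rate_2:", percent(pass_num_2))
# Cases with no malformed response at all.
print("percent_cases_well_formed:", percent(TOTAL_CASES - num_with_malformed_responses))
```

Under that assumption the three recomputed values line up with the reported `pass_rate_1`, `pass_rate_2`, and `percent_cases_well_formed`, which suggests the run did cover the whole set.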
