MoE A3B Fine-tune · 35B Total / 3B Active

Qwopus3.6-35B-A3B-v1

by Kyle Hessling · built on Jackrong's MoE fine-tune

Same 17-prompt suite as the Qwopus3.6-27B v1-preview eval: 5 agentic plus 1 nothink rerun, 5 web-design, and 6 canvas/WebGL. Run at Q5_K_M on a single RTX 5090 via llama.cpp. Headline: 162 tok/s avg, 129k tokens generated, 13.3 min runtime. The MoE is 2.6× faster than the dense 27B preview at a larger quant. The web-design outputs are some of the best one-shot HTML I've seen from any open model in this size class: verbose in a good way, with pages that feel complete on the first attempt where most models produce only surface-level results.
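As a quick sanity check, the headline figures can be cross-checked against each other. The sketch below assumes the 129k token count and the 161.9 tok/s average describe the same set of runs and that the rate was roughly steady; the function and variable names are illustrative and not taken from the eval harness. The dense-model rate it prints is only what the rounded 2.6× ratio implies, not a measured number.

```python
# Cross-check of the headline numbers quoted in the summary.
AVG_TOK_PER_S = 161.9    # reported average throughput
TOTAL_TOKENS = 129_000   # "129k tokens generated" (rounded in the source)
SPEEDUP_VS_27B = 2.6     # reported ratio vs the dense 27B preview (rounded)


def runtime_minutes(tokens: int, tok_per_s: float) -> float:
    """Generation time implied by a token count at a steady rate."""
    return tokens / tok_per_s / 60.0


implied_runtime = runtime_minutes(TOTAL_TOKENS, AVG_TOK_PER_S)
implied_dense_rate = AVG_TOK_PER_S / SPEEDUP_VS_27B  # assumption: derived, not measured

print(f"implied runtime: {implied_runtime:.1f} min")
print(f"implied 27B preview rate: ~{implied_dense_rate:.0f} tok/s")
```

The implied runtime lands on the reported 13.3 min, so the three headline numbers are at least consistent with one another.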

Read the full report · GGUF on Hugging Face · Compare: Qwopus3.6-27B preview eval · Follow @KyleHessling1

161.9 avg tok/s (×2.6 vs 27B preview)

14 shipped runs

106,688 completion tokens

~25 GB VRAM used

65K context window

Web design · open to preview

SaaS landing page · Prism: AI observability

75.9 KB · 23,839 tok · 151 s

Analytics dashboard · Light theme, emerald accent

37.5 KB · 14,029 tok · 87 s

Designer portfolio · Maya Chen: kinetic typography

27.5 KB · 9,144 tok · 56 s

Pricing page · 3 tiers + animated toggle + FAQ

50.1 KB · 13,862 tok · 86 s

Mobile app marketing · Stillwater: CSS-only iPhone mock

47.9 KB · 16,596 tok · 104 s

Canvas / WebGL · creative coding

3 of the 6 creative-canvas runs shipped clean and are shown below. The Mandelbulb shader, soft-body physics sandbox, and audio-reactive visualizer didn't render correctly on the first attempt. These are prompts that most one-shot models in this size class fail, and they're typically the kind of brief that needs a second turn to fix shader errors or collision math. I've left them out of the headline showcase here; the raw outputs are still in the repo if you want to inspect them.

Particle attractor · 3000-particle fluid swarm

10.6 KB · 4,152 tok · 25 s

Generative flowfield · Ink-line agents on simplex noise

19.3 KB · 6,934 tok · 42 s

Three.js crystal scene · Transmissive glass + bloom

16.1 KB · 5,674 tok · 35 s
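For anyone who wants to compare individual runs, per-run throughput can be recovered from the token counts and timings on the cards. This is a rough sketch: the figures are copied from the eight web-design and canvas cards on this page, the seconds are rounded on the cards so each rate is approximate, and the agentic runs are not listed here so they are not included. The names `RUNS` and `tok_per_s` are illustrative only.

```python
# (label, completion tokens, seconds) copied from the cards on this page.
RUNS = [
    ("SaaS landing page", 23_839, 151),
    ("Analytics dashboard", 14_029, 87),
    ("Designer portfolio", 9_144, 56),
    ("Pricing page", 13_862, 86),
    ("Mobile app marketing", 16_596, 104),
    ("Particle attractor", 4_152, 25),
    ("Generative flowfield", 6_934, 42),
    ("Three.js crystal scene", 5_674, 35),
]


def tok_per_s(tokens: int, seconds: float) -> float:
    """Approximate generation rate for one run."""
    return tokens / seconds


for label, tokens, seconds in RUNS:
    print(f"{label:<24} {tok_per_s(tokens, seconds):6.1f} tok/s")

# Aggregate over the listed runs only (token-weighted, not a mean of rates).
total_tokens = sum(tokens for _, tokens, _ in RUNS)
total_seconds = sum(seconds for _, _, seconds in RUNS)
aggregate = tok_per_s(total_tokens, total_seconds)
print(f"{'listed runs combined':<24} {aggregate:6.1f} tok/s")
```

Every listed run comes out within a few tok/s of the reported average, which suggests throughput stayed fairly flat from the shortest output to the longest.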