• 0 Posts
  • 51 Comments
Joined 3 years ago
Cake day: June 12th, 2023





  • AMD Strix is an APU optimized for AI. It is the cheapest option I am aware of for running bigger models at home: 2k for 56GB of VRAM, and less than 300W total power budget.

    One could run smaller models. But for the context sizes required for research work, that is nearly impossible.

    Also, external services such as OpenRouter can be used to run models hosted in the cloud.

    But for self-hosting, you need something that can run models needing at least 15GB of VRAM plus context. For comparison: our highly quantized model uses 20GB of VRAM. For our 4 slots we need another 20GB on top of that (around 5GB per slot of 256k tokens), making it 40GB.
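As a rough sketch of the arithmetic above (the function name is made up for illustration, and the numbers are just the ones quoted in this comment, not measurements):

```python
def vram_budget_gb(model_gb, slots, kv_gb_per_slot):
    """Estimate total VRAM: model weights plus one context (KV) buffer per slot."""
    return model_gb + slots * kv_gb_per_slot


# Figures quoted in the comment: 20GB model, 4 slots, about 5GB of context each.
total_gb = vram_budget_gb(model_gb=20, slots=4, kv_gb_per_slot=5)

# Headroom against the 56GB of VRAM mentioned for the Strix box.
headroom_gb = 56 - total_gb

print(f"total: {total_gb}GB, headroom: {headroom_gb}GB")
```

The same function shows why a smaller model alone does not help much: the context share grows with the number of slots, independent of the model size.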







  • For those who want to know more, rough setup:

    • llama-cpp rocmfp4 fork
    • currently custom quantized qwen3.6 35B A3B model, working on publishing
    • be3 embedding and reranker, also GPU
    • gemma4-e4b via FastFlowLM on NPU!
    • OpenWebUI and searxng as docker containers on a Pi currently

    We get 70-100 tok/s generation, with four slots of 256k context length each.

    We use a smaller board with “only” 64GB of shared LPDDR5X. The bottleneck is memory speed, so the rocmfp4 quants help a lot.

    As soon as I get my imatrix calibration right, I will publish the quantized versions.

    Most existing quantized models are broken. The authors did unsupported things (like taking an already quantized model and requantizing it), so you may get issues with coherence or sudden Chinese words in the output.

    That is not an issue with rocmfp4 but with vibe coders and agent psychosis.
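To illustrate why requantizing an already quantized model hurts, here is a toy sketch. It assumes a plain uniform rounding quantizer and made-up step sizes, which is far simpler than the real block-wise quant formats, but the effect is the same in kind: rounding twice can never land closer to the original weight than rounding once from the full-precision value.

```python
import random


def quantize(x, step):
    """Round x to the nearest multiple of step (toy uniform quantizer)."""
    return round(x / step) * step


def mean_sq_error(original, approx):
    """Mean squared error between two equally long lists of floats."""
    return sum((a - b) ** 2 for a, b in zip(original, approx)) / len(original)


random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(10_000)]

INTERMEDIATE_STEP = 0.1  # hypothetical grid of the already quantized source
TARGET_STEP = 0.25       # hypothetical grid of the final quant

# Quantize once, straight from the full-precision weights.
direct = [quantize(w, TARGET_STEP) for w in weights]

# Quantize an already quantized copy (the unsupported shortcut).
requantized = [quantize(quantize(w, INTERMEDIATE_STEP), TARGET_STEP) for w in weights]

mse_direct = mean_sq_error(weights, direct)
mse_requantized = mean_sq_error(weights, requantized)

print(f"direct: {mse_direct:.6f}  requantized: {mse_requantized:.6f}")
```

The requantized error comes out higher because some weights get pushed across a rounding boundary by the first pass and then snap to the wrong grid point in the second.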







  • That was an example. And as someone who works in sec, I know the benefits of a package manager.

    “I only need to trust brave”.

    I don’t get it: static linking, curl-to-bash pipes and userspace installs, and everybody thinks that is fine. But as someone who needs to write a security concept for Linux in the office so I can finally use it at work: no, that is not ok. That is shit.

    Rust on the desktop is also a nightmare, for example.

    No, I do not hate Arch. I hate the concepts and mindsets creeping into the Linux world.