andresilva.cc

NOTES

  • til

    The runner matters as much as the model

    Lately I've been running local LLMs through LM Studio, which is super fun. I started with the GGUF format and got great results with MoE (mixture-of-experts) models. They are way faster than dense models of similar size, since they only activate part of the network per token.
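
    To make "only activate part of the network per token" concrete, here is a minimal sketch of top-k expert routing, one common MoE scheme. It is a toy under stated assumptions, not how any particular model or runner implements it: the names `route_token` and `moe_layer`, the scalar "experts", and the choice of `top_k=2` are all made up for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k):
    """Pick the top_k experts for one token.

    Returns (expert_index, weight) pairs; the weights are the router
    probabilities renormalized over the chosen experts only.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(x, experts, router_logits, top_k=2):
    """Run only the selected experts and mix their outputs.

    Returns (output, number_of_experts_actually_run).
    """
    routes = route_token(router_logits, top_k)
    output = sum(weight * experts[i](x) for i, weight in routes)
    return output, len(routes)


# Hypothetical setup: 8 tiny "experts", each just scaling its input.
experts = [lambda x, scale=scale: scale * x for scale in range(8)]
logits = [0.0, 0.0, 5.0, 0.0, 0.0, 5.0, 0.0, 0.0]
output, experts_run = moe_layer(2.0, experts, logits, top_k=2)
```

    In this toy, only 2 of the 8 experts run for the token, so most of the "network" is skipped on each step. That is the intuition behind the speedup; real models route per layer and use full feed-forward blocks as experts.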

    Then I tried Apple's own format, MLX. It's supposed to be faster on Apple Silicon, which is what I'm on. But running the same prompt in LM Studio, MLX came out slower than GGUF. Weird, right?

    Digging in, I discovered that LM Studio's MLX engine has known rough edges, including a KV-caching bug that re-prefills the whole prompt on every request (#1319). So I wasn't really testing MLX there; I was testing a rough runner.
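
    Why re-prefilling hurts can be shown with a toy model of prefix reuse. This is only an illustration of the idea, not LM Studio's or MLX's actual code: `ToyPrefixCache`, its `enabled` flag, and the integer "tokens" are hypothetical.

```python
def common_prefix_len(a, b):
    """Length of the shared leading run of two token lists."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class ToyPrefixCache:
    """Counts how many prompt tokens need prefill on each request.

    With enabled=True, tokens shared with the previous request are
    treated as already cached. With enabled=False, every request
    pays for the whole prompt again.
    """

    def __init__(self, enabled=True):
        self.enabled = enabled
        self.cached_tokens = []

    def prefill(self, prompt_tokens):
        reused = 0
        if self.enabled:
            reused = common_prefix_len(self.cached_tokens, prompt_tokens)
        self.cached_tokens = list(prompt_tokens)
        return len(prompt_tokens) - reused


# A two-turn chat: the second prompt extends the first one.
turn_1 = [1, 2, 3, 4]
turn_2 = [1, 2, 3, 4, 5, 6]

working = ToyPrefixCache(enabled=True)
working_cost = [working.prefill(turn_1), working.prefill(turn_2)]

broken = ToyPrefixCache(enabled=False)
broken_cost = [broken.prefill(turn_1), broken.prefill(turn_2)]
```

    With reuse, the second turn only processes the 2 new tokens; without it, all 6 are processed again. In a long conversation that gap grows with every turn, which is enough to skew any speed comparison.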

    So I tried oMLX, a macOS-native MLX server built for Apple Silicon. Re-ran the exact same prompt:

    Stack               Tok/sec
    GGUF (LM Studio)    ~84
    MLX (LM Studio)     ~70
    MLX (oMLX)          ~95

    MLX done right wins. The lesson: before you benchmark models, make sure you're benchmarking them fairly. The runner matters as much as the model.
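
    One way to keep such a comparison fair is to time every stack with the same harness. The sketch below is not how the numbers above were measured; it assumes a hypothetical `generate(prompt)` callable that wraps whichever runner you are testing and returns the number of tokens it generated.

```python
import time


def tokens_per_second(generate, prompt, runs=3, clock=time.perf_counter):
    """Median end-to-end tokens/sec over several runs of one prompt.

    `generate` is a hypothetical callable: it takes the prompt and
    returns how many tokens were generated. The timing covers the
    whole call, so it includes prompt processing as well as decoding.
    """
    rates = []
    for _ in range(runs):
        start = clock()
        n_tokens = generate(prompt)
        elapsed = clock() - start
        rates.append(n_tokens / elapsed)
    rates.sort()
    return rates[len(rates) // 2]


def compare(runners, prompt, runs=3):
    """Measure each named runner with the same prompt and run count."""
    return {
        name: tokens_per_second(generate, prompt, runs=runs)
        for name, generate in runners.items()
    }
```

    Using the same prompt, the same number of runs, and the median rather than a single run removes some of the noise. It will not hide a runner-level problem like a broken cache, but it makes one easier to spot.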

  • take

    Just one more prompt

    Before AI, it was easier to stop working at, let's say, 5 PM, and live your personal life, because it took real focus and effort to develop anything. Now with AI, it's so easy to ship stuff that my free time gets constantly interrupted. Just one more prompt, one more feature, just a quick look to see how it is progressing.

    We, passionate developers, have always liked to code and ship cool stuff. AI skyrocketed that to a crazy level. Shipping is so effortless now that I've got this itch to keep interacting with it. At the same time, these micro efforts build up to so many interruptions of my free time that it's not healthy.

    I honestly don't know what the solution is. And I'm not saying it's completely bad, because I really love what I'm capable of doing now. It's kind of a mixed feeling between "I can do so much stuff" and "Oh my god, what am I doing to my personal life?".

    Do you feel the same way? Have you found a way to improve this balance?

  • take

    Specialists vs generalists

    We have always wondered whether we should be specialists or generalists, debating the pros and cons and choosing a path for our careers. Now, with the help of AI, we can increase our productivity by a lot, but more importantly, we can ship anything, even in a technology we don't know well.

    Because of this, more and more companies are changing their organizational structure to be leaner and less fragmented, with extreme cases like one company testing one-person teams.

    That's why I think generalists will do better. They are not limited to a specific skill set or a stack. With the help of AI, they can ship any kind of product or feature while still having the technical knowledge to judge the AI output.

    This doesn't mean you shouldn't have deeper expertise in one area, because that can be an important differentiator. For example, a good software developer with deep knowledge of front-end and design can still ship anything, but will guide AI better toward beautiful and distinct websites.

    So: do not be only a front-end, a back-end, a QA, or a mobile developer. Be a software developer who can comfortably work with the unknown, who is curious and can learn anything, and most importantly, who can tackle any kind of problem.
