<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>SLM on Hi, I&#39;m Muhammad Amal</title>
    <link>https://muhammadamal.my.id/tags/slm/</link>
    <description>Recent content in SLM on Hi, I&#39;m Muhammad Amal</description>
    <generator>Hugo</generator>
    <language>en-us</language>
    <lastBuildDate>Wed, 29 Jan 2025 09:00:00 +0700</lastBuildDate>
    <atom:link href="https://muhammadamal.my.id/tags/slm/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Benchmarking SLMs for Your Use Case, From Lmeval to Custom Suites</title>
      <link>https://muhammadamal.my.id/blog/benchmarking-slms-for-your-use-case-lmeval-to-custom/</link>
      <pubDate>Wed, 29 Jan 2025 09:00:00 +0700</pubDate>
      <guid>https://muhammadamal.my.id/blog/benchmarking-slms-for-your-use-case-lmeval-to-custom/</guid>
      <description>Public leaderboards lie about your task. Build a benchmark that measures what your users actually need.</description>
    </item>
    <item>
      <title>Local RAG with SLMs, Private Knowledge Without the Cloud</title>
      <link>https://muhammadamal.my.id/blog/local-rag-with-slms-private-knowledge-without-cloud/</link>
      <pubDate>Mon, 27 Jan 2025 09:00:00 +0700</pubDate>
      <guid>https://muhammadamal.my.id/blog/local-rag-with-slms-private-knowledge-without-cloud/</guid>
      <description>End-to-end local RAG, no cloud. Embeddings, vectors, retrieval, and grounded generation on a single box.</description>
    </item>
    <item>
      <title>Structured Output and Function Calling on Local SLMs</title>
      <link>https://muhammadamal.my.id/blog/structured-output-and-function-calling-on-local-slms/</link>
      <pubDate>Wed, 22 Jan 2025 09:00:00 +0700</pubDate>
      <guid>https://muhammadamal.my.id/blog/structured-output-and-function-calling-on-local-slms/</guid>
      <description>Get production-grade JSON and tool calls out of 3B models. Constrained decoding, schemas, and what actually works.</description>
    </item>
    <item>
      <title>Fine Tuning SLMs with LoRA and QLoRA, A Hands On Tutorial</title>
      <link>https://muhammadamal.my.id/blog/fine-tuning-slms-with-lora-and-qlora-hands-on/</link>
      <pubDate>Mon, 20 Jan 2025 09:00:00 +0700</pubDate>
      <guid>https://muhammadamal.my.id/blog/fine-tuning-slms-with-lora-and-qlora-hands-on/</guid>
      <description>When prompting plateaus, LoRA and QLoRA take you the next mile. A real fine-tuning walkthrough on consumer GPUs.</description>
    </item>
    <item>
      <title>Serving SLMs at Scale with vLLM, A Production Guide</title>
      <link>https://muhammadamal.my.id/blog/serving-slms-at-scale-with-vllm-production-guide/</link>
      <pubDate>Wed, 15 Jan 2025 09:00:00 +0700</pubDate>
      <guid>https://muhammadamal.my.id/blog/serving-slms-at-scale-with-vllm-production-guide/</guid>
      <description>When Ollama and llama.cpp stop scaling, vLLM is what you reach for. PagedAttention, batching, and the real tradeoffs.</description>
    </item>
    <item>
      <title>llama.cpp Deep Dive, Quantization, GGUF, and Inference Speed</title>
      <link>https://muhammadamal.my.id/blog/llama-cpp-deep-dive-quantization-gguf-inference-speed/</link>
      <pubDate>Mon, 13 Jan 2025 09:00:00 +0700</pubDate>
      <guid>https://muhammadamal.my.id/blog/llama-cpp-deep-dive-quantization-gguf-inference-speed/</guid>
      <description>Where Ollama ends, llama.cpp begins. Quantization, GGUF, KV cache, and squeezing tokens per second.</description>
    </item>
    <item>
      <title>Running SLMs Locally with Ollama, A Step by Step Tutorial</title>
      <link>https://muhammadamal.my.id/blog/running-slms-locally-with-ollama-step-by-step/</link>
      <pubDate>Wed, 08 Jan 2025 09:00:00 +0700</pubDate>
      <guid>https://muhammadamal.my.id/blog/running-slms-locally-with-ollama-step-by-step/</guid>
      <description>Everything I do to ship a local SLM behind Ollama 0.5, from install to a real production endpoint.</description>
    </item>
    <item>
      <title>Small Language Models in January 2025, A Practical Survey</title>
      <link>https://muhammadamal.my.id/blog/slm-landscape-january-2025-practical-survey/</link>
      <pubDate>Mon, 06 Jan 2025 09:00:00 +0700</pubDate>
      <guid>https://muhammadamal.my.id/blog/slm-landscape-january-2025-practical-survey/</guid>
      <description>Where the small language model landscape actually stands in January 2025, from a backend engineer&#8217;s bench.</description>
    </item>
  </channel>
</rss>
