Rag on Hi, I'm Muhammad Amal

Recent content in Rag on Hi, I'm Muhammad Amal
https://muhammadamal.my.id/tags/rag/
Mon, 03 Nov 2025 09:00:00 +0700

RAG Systems for Technical Support Teams in 2025
https://muhammadamal.my.id/blog/rag-systems-for-technical-support-teams-in-2025/
Mon, 03 Nov 2025 09:00:00 +0700
A field-tested walkthrough of building retrieval-augmented generation for L1 through L3 support, with runnable Python, pgvector and Qdrant pipelines, and the failure modes nobody talks about.

Securing RAG Systems Against Data Exfiltration in 2025
https://muhammadamal.my.id/blog/securing-rag-against-data-exfiltration-2025/
Wed, 10 Sep 2025 09:00:00 +0700
Practical controls that stop the most common RAG exfiltration vectors without breaking retrieval quality.

Local RAG with SLMs, Private Knowledge Without the Cloud
https://muhammadamal.my.id/blog/local-rag-with-slms-private-knowledge-without-cloud/
Mon, 27 Jan 2025 09:00:00 +0700
End-to-end local RAG, no cloud. Embeddings, vectors, retrieval, and grounded generation on a single box.

Securing RAG Systems Against Data Exfiltration
https://muhammadamal.my.id/blog/securing-rag-systems-against-data-exfiltration/
Wed, 23 Oct 2024 09:00:00 +0700
How to design RAG systems so that prompt injection and over-eager retrieval don’t become an exfiltration channel.

Evaluating RAG, Beyond Vibes-Based Testing
https://muhammadamal.my.id/blog/rag-evaluation-ragas-trulens-deepeval/
Mon, 26 Feb 2024 09:00:00 +0700
Ragas, TruLens, DeepEval: measuring RAG quality. Faithfulness, context precision, answer relevance. CI integration without LLM-as-judge bills.

Re-ranking and Reciprocal Rank Fusion in RAG Pipelines
https://muhammadamal.my.id/blog/rag-reranking-rrf-cohere-bge/
Wed, 21 Feb 2024 09:00:00 +0700
Cross-encoder rerankers turn top-50 retrieval into clean top-5. Cohere Rerank vs BGE-reranker, latency budgets, where it slots in your RAG pipeline.

Securing RAG, Per-User Document Access Without Re-indexing
https://muhammadamal.my.id/blog/rag-security-access-control-multi-tenant/
Mon, 19 Feb 2024 09:00:00 +0700
Multi-tenant RAG without leaks. Metadata filtering at retrieval, ACL design, audit trails, and prompt-side defenses for what filters miss.

Hybrid Search, BM25 Plus Vectors for Better RAG Recall
https://muhammadamal.my.id/blog/hybrid-search-bm25-vectors-rag/
Wed, 14 Feb 2024 09:00:00 +0700
Pure vector search misses exact-match queries. Hybrid BM25 + dense + RRF closes the gap. Real code, real numbers, real trade-offs.

Chunking Strategies for RAG That Survive Real Documents
https://muhammadamal.my.id/blog/rag-chunking-strategies-real-documents/
Mon, 12 Feb 2024 09:00:00 +0700
Chunking is where RAG quality is won or lost. Semantic, hierarchical, sentence-window strategies and concrete code for documents that break defaults.

Embedding Models in 2024, OpenAI vs Cohere vs Open Source
https://muhammadamal.my.id/blog/embedding-models-2024-openai-cohere-open-source/
Wed, 07 Feb 2024 09:00:00 +0700
text-embedding-3, Cohere v3, bge-m3: which embedding model in 2024. Dimension trade-offs, multilingual, cost. Honest comparison.

Choosing a Vector Database, Pinecone vs Qdrant vs pgvector
https://muhammadamal.my.id/blog/vector-database-pinecone-qdrant-pgvector/
Mon, 05 Feb 2024 09:00:00 +0700
Pinecone serverless, Qdrant v1.7, pgvector 0.5: how to pick. Cost, hybrid search, filtering, ops. Honest trade-offs, no marketing.

Why Naive RAG Fails in Production, A 2024 Reality Check
https://muhammadamal.my.id/blog/naive-rag-failures-production-2024/
Fri, 02 Feb 2024 09:00:00 +0700
Naive RAG breaks in prod. Recall gaps, chunk boundaries, stale data. What the 2024 RAG stack changed and where the demo-to-prod gap still hides.

Putting a RAG Evaluation Pipeline in CI, The Setup I Actually Use
https://muhammadamal.my.id/blog/rag-evaluation-pipeline-ci/
Mon, 20 Nov 2023 09:00:00 +0700
A practical RAG eval setup wired into CI: retrieval and generation metrics, golden questions, and catching silent regressions.

Securing an Internal LLM Chatbot, Threats, Boundaries, and What I Got Wrong
https://muhammadamal.my.id/blog/securing-internal-llm-chatbot-data/
Tue, 14 Nov 2023 09:00:00 +0700
A practical guide to securing internal LLM chatbots: prompt injection, leakage, access control, and the gaps people miss.

Shipping an Internal RAG Chatbot with LlamaIndex 0.8, What Actually Matters
https://muhammadamal.my.id/blog/internal-rag-chatbot-llamaindex/
Thu, 02 Nov 2023 09:00:00 +0700
Lessons from building a production internal RAG chatbot with LlamaIndex 0.8: retrieval design, chunking, and the plumbing that makes it work.