Llm on Hi, I'm Muhammad Amal

https://muhammadamal.my.id/tags/llm/
Recent content in Llm on Hi, I'm Muhammad Amal
Last updated: Wed, 10 Sep 2025 09:00:00 +0700

Securing RAG Systems Against Data Exfiltration in 2025
  https://muhammadamal.my.id/blog/securing-rag-against-data-exfiltration-2025/
  Wed, 10 Sep 2025 09:00:00 +0700
  Practical controls that stop the most common RAG exfiltration vectors without breaking retrieval quality.
Advanced Prompt Injection Defenses in 2025, A Practical Guide
  https://muhammadamal.my.id/blog/advanced-prompt-injection-defenses-2025-practical-guide/
  Mon, 01 Sep 2025 09:00:00 +0700
  Layered prompt injection defenses that actually hold up in production, with code, diagrams, and the failure modes nobody talks about.
LLM Red Teaming, Practical Techniques for 2024
  https://muhammadamal.my.id/blog/llm-red-teaming-practical-techniques-2024/
  Wed, 30 Oct 2024 09:00:00 +0700
  How to run an LLM red team that produces actionable findings instead of party tricks, with attack inventory and triage flow.
Securing RAG Systems Against Data Exfiltration
  https://muhammadamal.my.id/blog/securing-rag-systems-against-data-exfiltration/
  Wed, 23 Oct 2024 09:00:00 +0700
  How to design RAG systems so that prompt injection and over-eager retrieval don’t become an exfiltration channel.
Prompt Injection Defenses in LLM Apps, Patterns for 2024
  https://muhammadamal.my.id/blog/prompt-injection-defenses-llm-apps-2024/
  Mon, 07 Oct 2024 09:00:00 +0700
  Hardening patterns for prompt injection across system prompts, tools, and retrieval, with code and policy guidance.
Evaluating RAG, Beyond Vibes-Based Testing
  https://muhammadamal.my.id/blog/rag-evaluation-ragas-trulens-deepeval/
  Mon, 26 Feb 2024 09:00:00 +0700
  Ragas, TruLens, DeepEval: measuring RAG quality. Faithfulness, context precision, answer relevance. CI integration without LLM-as-judge bills.
Why Naive RAG Fails in Production, A 2024 Reality Check
  https://muhammadamal.my.id/blog/naive-rag-failures-production-2024/
  Fri, 02 Feb 2024 09:00:00 +0700
  Naive RAG breaks in prod. Recall gaps, chunk boundaries, stale data. What the 2024 RAG stack changed and where the demo-to-prod gap still hides.
The 2023 LLM Tooling Retrospective, What Actually Changed About My Workflow
  https://muhammadamal.my.id/blog/2023-llm-tooling-retrospective/
  Wed, 27 Dec 2023 09:00:00 +0700
  Which 2023 LLM tools actually earned their place in a senior engineer’s daily workflow, and which got dropped.
LLM Vendor Risk, A Failover Playbook After the OpenAI Weekend
  https://muhammadamal.my.id/blog/llm-vendor-risk-failover-strategy/
  Thu, 30 Nov 2023 09:00:00 +0700
  A failover playbook for LLM apps after the OpenAI weekend: multi-provider routing, abstraction layers, and what’s worth doing.
LangChain LCEL vs LlamaIndex, Picking a Framework in Late 2023
  https://muhammadamal.my.id/blog/langchain-lcel-vs-llamaindex-routing/
  Tue, 28 Nov 2023 09:00:00 +0700
  Picking between LangChain LCEL and LlamaIndex in late 2023: orchestration vs retrieval, when to use each, and where they overlap.
Claude 2.1 vs GPT-4 Turbo, A Side-by-Side at 100K Context
  https://muhammadamal.my.id/blog/claude-2-1-200k-context-comparison/
  Fri, 24 Nov 2023 09:00:00 +0700
  Side-by-side notes on Claude 2.1 200K vs GPT-4 Turbo 128K: long-context recall, document QA, function calling, and production fit.
LLM Observability in Practice, Logs, Traces, and a Useful Dashboard
  https://muhammadamal.my.id/blog/llm-observability-monitoring-dashboard/
  Wed, 22 Nov 2023 09:00:00 +0700
  Practical LLM observability: what to log, what to alert on, and what a useful dashboard for a RAG system looks like.
Putting a RAG Evaluation Pipeline in CI, The Setup I Actually Use
  https://muhammadamal.my.id/blog/rag-evaluation-pipeline-ci/
  Mon, 20 Nov 2023 09:00:00 +0700
  A practical RAG eval setup wired into CI: retrieval and generation metrics, golden questions, and catching silent regressions.
Hybrid Retrieval with pgvector and BM25, A Practical Walkthrough
  https://muhammadamal.my.id/blog/hybrid-retrieval-pgvector-bm25/
  Thu, 16 Nov 2023 09:00:00 +0700
  Building hybrid retrieval on Postgres with pgvector 0.5 and BM25: schema, query, score fusion, and trade-offs vs managed vector DBs.
Securing an Internal LLM Chatbot, Threats, Boundaries, and What I Got Wrong
  https://muhammadamal.my.id/blog/securing-internal-llm-chatbot-data/
  Tue, 14 Nov 2023 09:00:00 +0700
  A practical guide to securing internal LLM chatbots: prompt injection, leakage, access control, and the gaps people miss.
The OpenAI Assistants API in Production, A Cautious Take
  https://muhammadamal.my.id/blog/openai-assistants-api-production-review/
  Fri, 10 Nov 2023 09:00:00 +0700
  An honest production review of the OpenAI Assistants API beta: what it solves, where it falls short, and why I’m cautious about adoption.
Migrating to GPT-4 Turbo, What 128K Context Actually Changes
  https://muhammadamal.my.id/blog/gpt-4-turbo-128k-context-migration/
  Wed, 08 Nov 2023 09:00:00 +0700
  Migrating a production RAG chatbot from gpt-3.5-turbo-16k to GPT-4 Turbo 128K: cost, latency, and when context matters.
Shipping an Internal RAG Chatbot with LlamaIndex 0.8, What Actually Matters
  https://muhammadamal.my.id/blog/internal-rag-chatbot-llamaindex/
  Thu, 02 Nov 2023 09:00:00 +0700
  Lessons from building a production internal RAG chatbot with LlamaIndex 0.8: retrieval design, chunking, and the plumbing that makes it work.
LangChain 0.0.13x, The Framework, the Hype, and the Real Engineering Tradeoffs
  https://muhammadamal.my.id/blog/langchain-framework-intro/
  Thu, 27 Apr 2023 09:00:00 +0700
  A senior engineer’s view of LangChain 0.0.13x - what it actually does, where it earns its complexity, and where you should write the code yourself instead.
Error Handling and Retries for LLM APIs
  https://muhammadamal.my.id/blog/llm-error-handling-retries/
  Fri, 27 Jan 2023 09:00:00 +0700
  OpenAI error handling: transient vs permanent, backoff, fallbacks, keep service up.
LLM Cost Control and Token Budgets
  https://muhammadamal.my.id/blog/llm-cost-control-token-budgets/
  Tue, 24 Jan 2023 09:00:00 +0700
  LLM cost control: budgets, compression, caching, model selection, alerts on runaway.
Streaming Responses from LLM APIs
  https://muhammadamal.my.id/blog/streaming-llm-responses-sse/
  Fri, 20 Jan 2023 09:00:00 +0700
  Stream OpenAI responses via SSE. UX matters, Python + Node patterns, proxy + CDN gotchas.
Few-Shot Prompting and In-Context Learning
  https://muhammadamal.my.id/blog/few-shot-prompting-in-context/
  Tue, 17 Jan 2023 09:00:00 +0700
  Few-shot prompting: 2-3 examples beat long instructions. Cost trade-offs, example selection, where it wins.
Prompt Engineering Basics for Engineers
  https://muhammadamal.my.id/blog/prompt-engineering-basics-engineers/
  Fri, 13 Jan 2023 09:00:00 +0700
  Prompt engineering for engineers: structure, role priming, examples, schema. From 50% to 90% accuracy.
Calling OpenAI from Node.js
  https://muhammadamal.my.id/blog/openai-nodejs-integration-2023/
  Tue, 10 Jan 2023 09:00:00 +0700
  Node + OpenAI in 2023: SDK, prompt templates, Zod validation, p-retry, production patterns.
Calling OpenAI from Python, Patterns and Pitfalls
  https://muhammadamal.my.id/blog/openai-python-patterns-pitfalls/
  Fri, 06 Jan 2023 09:00:00 +0700
  Python + OpenAI in 2023: SDK, prompt templates, JSON parsing, retries, async, production pitfalls.
Why Every Backend Needs an LLM Integration in 2023
  https://muhammadamal.my.id/blog/why-llm-integration-backend-2023/
  Tue, 03 Jan 2023 09:00:00 +0700
  Why backend engineers should integrate LLMs in 2023. Real cases beyond chatbots, OpenAI today, realistic path.