Ai on Hi, I'm Muhammad Amal

https://muhammadamal.my.id/tracks/ai/
Recent content in Ai on Hi, I'm Muhammad Amal
Mon, 26 Feb 2024 09:00:00 +0700

Evaluating RAG, Beyond Vibes-Based Testing
https://muhammadamal.my.id/blog/rag-evaluation-ragas-trulens-deepeval/
Mon, 26 Feb 2024 09:00:00 +0700
Ragas, TruLens, DeepEval: measuring RAG quality. Faithfulness, context precision, answer relevance. CI integration without LLM-as-judge bills.

Re-ranking and Reciprocal Rank Fusion in RAG Pipelines
https://muhammadamal.my.id/blog/rag-reranking-rrf-cohere-bge/
Wed, 21 Feb 2024 09:00:00 +0700
Cross-encoder rerankers turn top-50 retrieval into clean top-5. Cohere Rerank vs BGE-reranker, latency budgets, where it slots in your RAG pipeline.

Securing RAG, Per-User Document Access Without Re-indexing
https://muhammadamal.my.id/blog/rag-security-access-control-multi-tenant/
Mon, 19 Feb 2024 09:00:00 +0700
Multi-tenant RAG without leaks. Metadata filtering at retrieval, ACL design, audit trails, and prompt-side defenses for what filters miss.

Hybrid Search, BM25 Plus Vectors for Better RAG Recall
https://muhammadamal.my.id/blog/hybrid-search-bm25-vectors-rag/
Wed, 14 Feb 2024 09:00:00 +0700
Pure vector search misses exact-match queries. Hybrid BM25 + dense + RRF closes the gap. Real code, real numbers, real trade-offs.

Chunking Strategies for RAG That Survive Real Documents
https://muhammadamal.my.id/blog/rag-chunking-strategies-real-documents/
Mon, 12 Feb 2024 09:00:00 +0700
Chunking is where RAG quality is won or lost. Semantic, hierarchical, sentence-window strategies and concrete code for documents that break defaults.

Embedding Models in 2024, OpenAI vs Cohere vs Open Source
https://muhammadamal.my.id/blog/embedding-models-2024-openai-cohere-open-source/
Wed, 07 Feb 2024 09:00:00 +0700
text-embedding-3, Cohere v3, bge-m3: which embedding model in 2024. Dimension trade-offs, multilingual, cost. Honest comparison.

Choosing a Vector Database, Pinecone vs Qdrant vs pgvector
https://muhammadamal.my.id/blog/vector-database-pinecone-qdrant-pgvector/
Mon, 05 Feb 2024 09:00:00 +0700
Pinecone serverless, Qdrant v1.7, pgvector 0.5: how to pick. Cost, hybrid search, filtering, ops. Honest trade-offs, no marketing.

Why Naive RAG Fails in Production, A 2024 Reality Check
https://muhammadamal.my.id/blog/naive-rag-failures-production-2024/
Fri, 02 Feb 2024 09:00:00 +0700
Naive RAG breaks in prod. Recall gaps, chunk boundaries, stale data. What the 2024 RAG stack changed and where the demo-to-prod gap still hides.

LLM Vendor Risk, A Failover Playbook After the OpenAI Weekend
https://muhammadamal.my.id/blog/llm-vendor-risk-failover-strategy/
Thu, 30 Nov 2023 09:00:00 +0700
A failover playbook for LLM apps after the OpenAI weekend: multi-provider routing, abstraction layers, and what’s worth doing.

LangChain LCEL vs LlamaIndex, Picking a Framework in Late 2023
https://muhammadamal.my.id/blog/langchain-lcel-vs-llamaindex-routing/
Tue, 28 Nov 2023 09:00:00 +0700
Picking between LangChain LCEL and LlamaIndex in late 2023: orchestration vs retrieval, when to use each, and where they overlap.

Claude 2.1 vs GPT-4 Turbo, A Side-by-Side at 100K Context
https://muhammadamal.my.id/blog/claude-2-1-200k-context-comparison/
Fri, 24 Nov 2023 09:00:00 +0700
Side-by-side notes on Claude 2.1 200K vs GPT-4 Turbo 128K: long-context recall, document QA, function calling, and production fit.

LLM Observability in Practice, Logs, Traces, and a Useful Dashboard
https://muhammadamal.my.id/blog/llm-observability-monitoring-dashboard/
Wed, 22 Nov 2023 09:00:00 +0700
Practical LLM observability: what to log, what to alert on, and what a useful dashboard for a RAG system looks like.

Putting a RAG Evaluation Pipeline in CI, The Setup I Actually Use
https://muhammadamal.my.id/blog/rag-evaluation-pipeline-ci/
Mon, 20 Nov 2023 09:00:00 +0700
A practical RAG eval setup wired into CI: retrieval and generation metrics, golden questions, and catching silent regressions.

Hybrid Retrieval with pgvector and BM25, A Practical Walkthrough
https://muhammadamal.my.id/blog/hybrid-retrieval-pgvector-bm25/
Thu, 16 Nov 2023 09:00:00 +0700
Building hybrid retrieval on Postgres with pgvector 0.5 and BM25: schema, query, score fusion, and trade-offs vs managed vector DBs.

Securing an Internal LLM Chatbot, Threats, Boundaries, and What I Got Wrong
https://muhammadamal.my.id/blog/securing-internal-llm-chatbot-data/
Tue, 14 Nov 2023 09:00:00 +0700
A practical guide to securing internal LLM chatbots: prompt injection, leakage, access control, and the gaps people miss.

The OpenAI Assistants API in Production, A Cautious Take
https://muhammadamal.my.id/blog/openai-assistants-api-production-review/
Fri, 10 Nov 2023 09:00:00 +0700
An honest production review of the OpenAI Assistants API beta: what it solves, where it falls short, and why I’m cautious about adoption.

Migrating to GPT-4 Turbo, What 128K Context Actually Changes
https://muhammadamal.my.id/blog/gpt-4-turbo-128k-context-migration/
Wed, 08 Nov 2023 09:00:00 +0700
Migrating a production RAG chatbot from gpt-3.5-turbo-16k to GPT-4 Turbo 128K: cost, latency, and when context matters.

Shipping an Internal RAG Chatbot with LlamaIndex 0.8, What Actually Matters
https://muhammadamal.my.id/blog/internal-rag-chatbot-llamaindex/
Thu, 02 Nov 2023 09:00:00 +0700
Lessons from building a production internal RAG chatbot with LlamaIndex 0.8: retrieval design, chunking, and the plumbing that makes it work.

LangChain 0.0.13x, The Framework, the Hype, and the Real Engineering Tradeoffs
https://muhammadamal.my.id/blog/langchain-framework-intro/
Thu, 27 Apr 2023 09:00:00 +0700
A senior engineer’s view of LangChain 0.0.13x - what it actually does, where it earns its complexity, and where you should write the code yourself instead.

Chroma 0.3, The Local-First Vector Database for Notebook-Scale Prototyping
https://muhammadamal.my.id/blog/chroma-local-prototyping/
Mon, 24 Apr 2023 09:00:00 +0700
Chroma 0.3 for notebook-scale semantic search prototyping - embedded mode, persistence, LangChain integration, and when it’s time to graduate to a real database.

Weaviate 1.18 and Hybrid Search, When Keyword and Vector Search Are Both Right
https://muhammadamal.my.id/blog/weaviate-hybrid-search/
Thu, 20 Apr 2023 09:00:00 +0700
Hybrid search with Weaviate 1.18 - combining BM25 with vector similarity, tuning the alpha parameter, and when hybrid actually beats pure vector retrieval.

Milvus 2.2 in Production, Self-Hosting the Heavyweight Open-Source Vector Database
https://muhammadamal.my.id/blog/milvus-self-hosted-production/
Mon, 17 Apr 2023 09:00:00 +0700
Practical guide to Milvus 2.2 in production - architecture, Helm install, index selection, and the operational gotchas you’ll hit running it at scale.

Building Semantic Search From Scratch, A Production Walkthrough
https://muhammadamal.my.id/blog/semantic-search-from-scratch/
Thu, 13 Apr 2023 09:00:00 +0700
End-to-end semantic search build covering ingestion, chunking, embedding, indexing, and serving with Pinecone, OpenAI, and FastAPI - the parts that actually break in production.

Embedding Models in 2023, ada-002, sentence-transformers, and What Actually Matters
https://muhammadamal.my.id/blog/embedding-models-deep-dive/
Mon, 10 Apr 2023 09:00:00 +0700
Comparing text-embedding-ada-002 and sentence-transformers for semantic search, covering dimensions, cost, latency, and the quality tradeoffs that matter in practice.

Pinecone in Production, Pod Sizing, Upserts, and the Cost Math That Surprises Teams
https://muhammadamal.my.id/blog/pinecone-getting-started-pod-based/
Thu, 06 Apr 2023 09:00:00 +0700
A practical Pinecone walkthrough covering pod selection, batched upserts, metadata filtering, and the cost math most teams miss on their first deployment.

The Vector Database Landscape in 2023, Pinecone, Milvus, Weaviate, and Chroma Compared
https://muhammadamal.my.id/blog/vector-databases-landscape-2023/
Mon, 03 Apr 2023 09:00:00 +0700
A senior engineer’s honest comparison of Pinecone, Milvus, Weaviate, and Chroma in April 2023, covering architecture, pricing, and when each makes sense.

Error Handling and Retries for LLM APIs
https://muhammadamal.my.id/blog/llm-error-handling-retries/
Fri, 27 Jan 2023 09:00:00 +0700
OpenAI error handling: transient vs permanent, backoff, fallbacks, keep service up.

LLM Cost Control and Token Budgets
https://muhammadamal.my.id/blog/llm-cost-control-token-budgets/
Tue, 24 Jan 2023 09:00:00 +0700
LLM cost control: budgets, compression, caching, model selection, alerts on runaway.

Streaming Responses from LLM APIs
https://muhammadamal.my.id/blog/streaming-llm-responses-sse/
Fri, 20 Jan 2023 09:00:00 +0700
Stream OpenAI responses via SSE. UX matters, Python + Node patterns, proxy + CDN gotchas.

Few-Shot Prompting and In-Context Learning
https://muhammadamal.my.id/blog/few-shot-prompting-in-context/
Tue, 17 Jan 2023 09:00:00 +0700
Few-shot prompting: 2-3 examples beat long instructions. Cost trade-offs, example selection, where it wins.

Prompt Engineering Basics for Engineers
https://muhammadamal.my.id/blog/prompt-engineering-basics-engineers/
Fri, 13 Jan 2023 09:00:00 +0700
Prompt engineering for engineers: structure, role priming, examples, schema. From 50% to 90% accuracy.

Calling OpenAI from Node.js
https://muhammadamal.my.id/blog/openai-nodejs-integration-2023/
Tue, 10 Jan 2023 09:00:00 +0700
Node + OpenAI in 2023: SDK, prompt templates, Zod validation, p-retry, production patterns.

Calling OpenAI from Python, Patterns and Pitfalls
https://muhammadamal.my.id/blog/openai-python-patterns-pitfalls/
Fri, 06 Jan 2023 09:00:00 +0700
Python + OpenAI in 2023: SDK, prompt templates, JSON parsing, retries, async, production pitfalls.

Why Every Backend Needs an LLM Integration in 2023
https://muhammadamal.my.id/blog/why-llm-integration-backend-2023/
Tue, 03 Jan 2023 09:00:00 +0700
Why backend engineers should integrate LLMs in 2023. Real cases beyond chatbots, OpenAI today, realistic path.

IP, Licensing, and AI-Generated Code
https://muhammadamal.my.id/blog/ai-code-licensing-ip/
Wed, 21 Dec 2022 09:00:00 +0700
AI code legal: training data, Copilot lawsuit, commercial use, compliance guidance.

Beyond Copilot, Tabnine, Codeium, Amazon CodeWhisperer
https://muhammadamal.my.id/blog/ai-coding-tools-2022/
Mon, 19 Dec 2022 09:00:00 +0700
AI coding tools 2022: Tabnine, Codeium, CodeWhisperer. Privacy, perf, language coverage.

Codespaces + Copilot, Cloud Dev Loops
https://muhammadamal.my.id/blog/codespaces-copilot-cloud-dev/
Fri, 16 Dec 2022 09:00:00 +0700
Codespaces + Copilot: cloud dev environments. When they earn cost; AI workflow.

Copilot for Tests, TDD or Anti-TDD?
https://muhammadamal.my.id/blog/copilot-for-tests-tdd/
Wed, 14 Dec 2022 09:00:00 +0700
Copilot for tests: where it helps, where it misses, TDD compatibility.

Pair Programming With an AI Assistant
https://muhammadamal.my.id/blog/ai-pair-programming-2022/
Mon, 12 Dec 2022 09:00:00 +0700
Pair programming with AI: what works, what doesn’t, vs human pairing dynamic.

Reviewing AI-Suggested Code
https://muhammadamal.my.id/blog/reviewing-ai-suggested-code/
Fri, 09 Dec 2022 09:00:00 +0700
Reviewing AI code: checklist, failure modes, why not to trust, vs human-written.

Prompt-Style Comments to Steer Copilot
https://muhammadamal.my.id/blog/copilot-prompt-style-comments/
Wed, 07 Dec 2022 09:00:00 +0700
Prompt-style comments steer Copilot. Patterns that improve quality, with examples.

What Copilot Is Good At (and What It Isn't)
https://muhammadamal.my.id/blog/what-copilot-is-good-at/
Mon, 05 Dec 2022 09:00:00 +0700
Copilot’s strengths + failures: granular. Excels at, fakes well, net-negative categories.

A Year With GitHub Copilot in Production
https://muhammadamal.my.id/blog/a-year-with-github-copilot-in-production/
Fri, 02 Dec 2022 09:00:00 +0700
Honest year with Copilot: where it accelerates, where it misleads, real workflow shift.