Blog — Hassan Raza | Full-Stack & AI Engineering Insights

📊

AI Engineering July 6, 2026 11 min read

How to Build LLM Observability Without Adding Another Paid Tool: Cost Tracking, Quality Monitoring, and Failure Alerts in Python

60-line LLMObserver class: per-call cost tracking at $2.50 per 1M GPT-4o input tokens, quality threshold alerts, MongoDB traces, Slack webhooks. No Langfuse Docker setup, no Helicone proxy risk. Two production patterns: 161-call pipeline and 10-tool SaaS.

⚡

SaaS Development July 5, 2026 11 min read

How to Add Real-Time AI Streaming to a Next.js App Router Application (Gemini Flash + Server-Sent Events)

8-second spinner → first word in under 2 seconds. Same API cost. Two approaches: Vercel AI SDK (streamText + useCompletion, 3-line server change) and raw SSE (ReadableStream + 4 headers including X-Accel-Buffering: no). Plus the Golden Rule: streaming breaks if you for-await the stream before returning the Response.

🛑

Cost & Optimization July 4, 2026 11 min read

How to Prevent AI API Cost Explosions in Production: Budget Guards, Spending Alerts, and Hard Limits (Python)

$47K in 11 days — they had Helicone, not enforcement. The CostGuard class, 4 budget guards ($203→$14), TOKEN_BUDGET=12000, Redis cache TTL=86400s, INTER_BATCH_SLEEP=2.0s, and per-user quotas that block calls before they happen.

🤖

LLM & Agents July 3, 2026 11 min read

How to Build a Production AI Agent in Python Without LangChain, CrewAI, or AutoGen (The 80-Line Loop)

161 GPT-4o calls, 4 hours, $14 per run, zero frameworks. The 80-line tool-calling loop, sequential pipeline with MongoDB checkpointing, and the 3 production additions — cost guards, rate limiting, crash recovery — that no framework installs for you.

💳

SaaS Development July 2, 2026 12 min read

How to Implement Stripe Metered Billing for an AI SaaS in 2026 (The New Meter API — Not the Deprecated Usage Records)

Every pre-2026 tutorial is broken — usage_type: 'metered' without a meter throws an error since API 2025-03-31.basil. Complete 5-object TypeScript setup, idempotent meterEvents.create(), webhook handler, and the 0.7% billing fee gotcha for $14/report AI billing.

⚡

Cost & Optimization July 1, 2026 11 min read

Gemini 3.5 Flash vs 2.5 Flash: Real Performance and Cost Comparison for Production AI Tools (2026)

Gemini 3.5 Flash costs 5× more per input token than 2.5 Flash, so it is not a straight upgrade. Real math from 10 AI tools at $20–60/month, the 17–19s TTFT gotcha, caching economics, and when to migrate.

🏢

Industry Solutions June 30, 2026 11 min read

How Small Businesses Can Use AI to Compete With Enterprise (Without the Enterprise Budget)

Enterprise AI: $22,800-102,000/year. Custom AI: $840-10,680/year. $14 per 1,725-page report, $60/month for 10 content tools, 80% of support tickets automated — honest limits on compliance and scale.

📁

Freelancing June 29, 2026 11 min read

How to Turn Your Side Project Into a Freelance Portfolio That Wins AI Clients

Six-part writeup structure, honest complexity ($203→$14, 1,725 pages), five client questions your portfolio must answer, and why technical blog posts outperform GitHub repos for AI freelance work.

💼

Freelancing June 28, 2026 11 min read

How to Land Your First AI SaaS Client as a Freelance Developer (Without a Big Portfolio)

One production side project beats five client gigs. Two-layer portfolio description, $203→$14 as proof, 15-minute cold outreach template, LinkedIn strategy, and the discovery project that closes the first engagement.

🚀

Personal Insights June 27, 2026 11 min read

Why I Build AI Products Solo in 2026 (And Why the Timing Has Never Been Better)

4-5 engineers in 2020, 1 developer in 2026. LLM APIs replaced ML teams. Server Actions collapsed roles. Honest case for solo AI SaaS — and what it still doesn't solve.

🧠

Personal Insights June 26, 2026 11 min read

From Zero to Two AI SaaS Products: What 12 Months of Solo Building Taught Me

Python vs TypeScript. $203 first AI run. 1-week retrofit. Two stacks, 12 months solo — stack selection, cost estimation, timeline realism, and mistakes I made twice.

🗄️

Software Development June 25, 2026 12 min read

How to Set Up Prisma Database Migrations in a Production Next.js App on Vercel

3 gotchas: migrate dev hangs in CI, a missing prisma generate breaks the build, and the connection pooler blocks DDL without directUrl. The fixes: a postinstall script, directUrl in schema.prisma, and an updated build command.

💰

Cost & Optimization June 24, 2026 11 min read

GPT-4o vs Gemini 2.5 Flash: Real Cost Comparison for a Production AI SaaS (2026)

Real invoice numbers: GPT-4o $203→$14 per report (161 calls, 4 optimizations). Gemini Flash $20–60/month for 7 tools. 33× cheaper — when it matters and when GPT-4o wins.

📱

Use Cases June 23, 2026 12 min read

How to Build an Automated Social Media Content Pipeline With AI (Creator Economy)

One ContentBrief → weekly strategy → 7 Instagram posts + YouTube scripts + Facebook ads in parallel. Zod platform constraints, Gemini generateObject, ~30s API time, ~20 min review, less than $0.01 per week.

🎧

Use Cases June 22, 2026 12 min read

How to Automate Customer Support With AI: From Chatbot to Smart Ticket Routing

80% repetitive tickets — 3-tier routing (≥0.82 auto-respond, 0.65–0.82 draft, escalate below), pgvector FAQ knowledge base, Claude Haiku classification, sentiment-first hard rules. Complete Python implementation.

🛒

Use Cases June 21, 2026 13 min read

How to Build an AI Product Description Generator for E-Commerce (With Real Code)

5 formats in one Gemini call — Zod input/output schemas, generateObject service, 4-gate Server Action, batch processor (5 concurrent, 1200ms delay). Full Next.js TypeScript tutorial with real code.

🏠

Use Cases June 20, 2026 11 min read

AI for Real Estate: What a Developer Would Build for a Real Estate Business in 2026

5 custom AI tools — listing descriptions (1–2 days), lead response (2–3 days), property reports, social pipeline — with build times, $5–150/month API costs, and real example outputs. Not subscriptions. A build plan.

🤖

LLM & Agents June 19, 2026 12 min read

How to Use the Claude API for Structured JSON Output in Production (2026)

~95% prompt-only vs ~99% forced tool_use — Python + TypeScript/Zod implementations, 3 production edge cases, Claude vs Gemini vs GPT-4o comparison. claude-sonnet-4-6, tool_choice forced JSON Schema.

🔍

LLM & Agents June 18, 2026 13 min read

How to Build a RAG Pipeline Without LangChain (Pure Python + OpenAI, Production-Ready)

~200 lines pure Python — chunk_text, embed_texts (text-embedding-3-small, 1536 dims), pgvector ivfflat, 0.7 similarity threshold, GPT-4o temperature 0.1. No LangChain, full debug control.

🔧

LLM & Agents June 17, 2026 13 min read

The Tool Registry Pattern: How to Build a Scalable Multi-Tool AI Platform

10 tools, 1 ToolConfig array: sidebar, routing, status badges, command palette, and stats auto-generate. 1-week retrofit cost, 6-step tool pattern, getToolsGroupedForNav(), zero hardcoded links.

🌙

Software Development June 16, 2026 12 min read

How to Implement Dark Mode in Next.js App Router With next-themes (2026 Guide)

next-themes ^0.4.6 — 3 gotchas (hydration flash, Providers pattern, mounted check), suppressHydrationWarning, class strategy, HSL CSS variables, WCAG 15.8:1 contrast, sidebar-stays-dark.

⌘

Software Development June 15, 2026 12 min read

How to Build a Command Palette in Next.js With shadcn/ui Command Component

Cmd+K in Next.js 16: 10-tool registry, cmdk ^1.1.1, the value prop fuzzy search trick, a useCommandPalette hook, layout-level mounting, AI badges on 7 tools, and 3 groups in under 100 lines of TypeScript.

🔒

SaaS Development June 14, 2026 13 min read

How to Build a Multi-Tenant SaaS With Prisma and PostgreSQL in Next.js

6 Prisma models, userId on every query, AES-256-GCM credentials, Server Action 4-gate security — and the findMany() without userId mistake that leaks every customer's data silently.

🚀

AI Engineering June 13, 2026 13 min read

How to Deploy FastAPI + Celery on Render: A Production Setup Guide

5 Render services from one render.yaml — OOM gotcha (Standard plan for 2 GB PDF worker), WeasyPrint Docker deps, sync: false for secrets, Redis via Upstash, cold-start fix for Stripe webhooks.

📧

Software Development June 12, 2026 12 min read

How to Email Large AI-Generated Files Using SendGrid and Cloud Storage

14 MB AI PDFs via SendGrid + Vercel Blob — upload to cloud first, attach for desktop, download link for everyone. Hybrid delivery, bundle reattach across Celery workers, non-fatal error handling.

⚔️

AI Engineering June 11, 2026 13 min read

FastAPI vs Next.js: Which Backend Should You Use for an AI SaaS in 2026?

Same developer, two stacks — 141–190 min Celery pipeline vs 10 Server Action tools under 30s. Real trade-offs on background jobs, AI ecosystem, MongoDB vs PostgreSQL, deployment, and type safety.

⏱️

AI Engineering June 10, 2026 13 min read

How to Build an Async AI Pipeline That Runs for Hours Without Timing Out

30-second server timeout vs 141–190 minute AI jobs — fire-and-forget FastAPI + Celery, HTTP 202 Accepted, soft_time_limit=21,600s, polling, SSE, email delivery, MongoDB checkpointing.

💳

SaaS Development June 9, 2026 13 min read

How to Integrate Stripe Payments Into a FastAPI Python SaaS That Triggers AI Jobs

checkout.session.completed → raw-byte signature verification → always-200 → idempotency guard → update-before-dispatch → Celery queue routing → 1,725-page PDF delivery. The parts most tutorials skip.

💸

Cost & Optimization June 8, 2026 12 min read

How I Reduced GPT-4o API Costs From $203 to $14 Per Order (Real Production Numbers)

$203 first test run → $14 production. Same gpt-4o, 161 calls, 8,620 answers — token budgets, smaller batches, Redis pre-computation, 2.0s inter-batch sleep. The counterintuitive fix that added 49 calls and cut $60.

📄

Software Development June 7, 2026 13 min read

How to Generate a 1,720-Page PDF Programmatically With Python and WeasyPrint

1,725 pages from 173 WeasyPrint chunks — CHUNK_SIZE=10, OOM guard at 30 sections, pypdf merge, bundled fonts, emoji-safe templates, and two Docker gotchas that break production PDFs.

⚙️

AI Engineering June 6, 2026 13 min read

How to Build a Celery + FastAPI + Redis Task Queue for AI Jobs in Production

161 GPT-4o calls per order, 5 dedicated Celery workers, Stripe webhook dispatch — task_acks_late, reject_on_worker_lost, prefetch=1, section-level MongoDB checkpointing for crash recovery.

🏗️

Personal Insights June 5, 2026 12 min read

The Exact Tech Stack I Use to Ship AI SaaS Products Fast in 2026

Next.js 16, PostgreSQL, Prisma, Gemini 2.5 Flash, Zod, Vercel — the production stack behind 9 AI tools and 280 TypeScript files. Every choice justified, Gemini 33× cheaper than GPT-4o.

💼

Freelancing & Business June 4, 2026 12 min read

How to Price AI Development Projects as a Freelance Developer in 2026

4 project types, 6 hidden AI complexity factors, and a 6-step framework. Agency $80k–$120k vs solo $35k–$55k — why most freelance AI developers undercharge by 3× and how to fix it.

📱

Industry Solutions June 3, 2026 11 min read

How Affiliate Marketers Can Use AI to Create Content 10x Faster (With Real Tool Examples)

Creator Dropp's 9 AI tools for Facebook, Instagram, and YouTube — page setup, policy-aware ad copy, faceless posts with 28 hashtags, and full video scripts. Real examples and ~90% time reduction math.

🧠

Personal Insights June 2, 2026 12 min read

What I Learned Building a 9-Tool AI SaaS From Scratch — Real Numbers and Honest Mistakes

280 TypeScript files, 58,600 lines, 9 AI tools, 6 affiliate networks, built solo in ~3 months. The 4 decisions that saved weeks, the 4 mistakes that cost days, and what it actually costs to run per month in 2026.

⏱️

AI Engineering June 1, 2026 12 min read

How to Build a Background Job System for Long-Running Tasks in Next.js on Vercel

CRON_SECRET-protected hourly sync from six affiliate APIs: incremental lastSyncedAt via SyncLog, per-network failure isolation, trustHost: true for NextAuth on Vercel, and most runs finishing in under fifteen seconds against Pro's sixty-second ceiling.

📊

SaaS Development May 31, 2026 13 min read

How to Aggregate Sales Data from Multiple APIs Into One Dashboard

Six affiliate networks, one NetworkAdapter interface, idempotent Prisma upsert on a three-field composite key, hourly Vercel Cron with per-network failure isolation—dashboard reads Postgres, never live APIs.

🔒

Software Development May 30, 2026 12 min read

How to Add Role-Based Access Control to Next.js With NextAuth v5

proxy.ts replaces middleware.ts, jwt/session callbacks carry role, inactive users blocked at login, and two-layer admin enforcement—every NextAuth v5 RBAC gotcha that breaks v4 assumptions.

🔗

SaaS Development May 29, 2026 13 min read

How to Build a Smart Link Tracker with UTM Analytics in Next.js

/go/[alias] public redirects with fire-and-forget click logging, SHA-256 IP hashing, dual UTM storage, Prisma Link/Click/Conversion models, and a thirty-day Recharts dashboard—without Bitly per-click fees.

🔑

Software Development May 28, 2026 12 min read

How to Encrypt and Store Third-Party API Keys in a Next.js SaaS (AES-256-GCM)

Six affiliate integrations, zero plaintext tokens in Postgres: AES-256-GCM with twelve-byte IVs, sixteen-byte auth tags, an ENCRYPTION_KEY that never appears in a NEXT_PUBLIC variable, Prisma upserts, and decrypted keys that live only in RAM for the duration of the hourly cron run.

🔐

Software Development May 27, 2026 12 min read

How to Build Privacy-Safe Click Analytics Without Storing PII

SHA-256 salted IPs, zero cookies, six UTM slots, fire-and-forget Prisma writes from a Next.js /go route—and the analytics capabilities you trade away to stay GDPR-defensible.

📊

SaaS Development May 26, 2026 13 min read

Building a Sales Dashboard That Aggregates 6 Affiliate Networks

Adapter pattern integrations for ClickBank, Digistore24, BuyGoods, MaxWeb, JVZoo, Hotmart plus AES-256-GCM credentials, Postgres normalization, hourly Vercel Cron—and why Network #7 stays a fifty-line adapter.

🎨

Software Development May 25, 2026 13 min read

How I Built a Canva-Style Ad Editor With Fabric.js in Next.js

Fabric.js 5 in Next.js 16 — layers, undo/redo, snap guides, 4 preset ad sizes, and 2× export. Two gotchas every developer hits: SSR and no built-in history.

🛡️

Cost & Optimization May 24, 2026 12 min read

How I Rate-Limited an AI SaaS for $0 — And What It Cost Me Later

A $0 in-memory sliding window for 9 AI tools in 30 minutes — and what breaks when Vercel spins up a second instance. Includes the Upstash migration path.

⚙️

LLM & Agents May 24, 2026 13 min read

How I Built a Multi-Step AI Wizard With Next.js Server Actions

The 6-step pattern behind 9 AI tools on a Next.js 16 SaaS — Zod schema, Server Action pipeline, Gemini integration, and a 5-step Facebook Page Starter wizard.

🤖

Industry Solutions May 22, 2026 12 min read

How to Integrate AI Into Your Business Without Hiring a Full Team

I built 9 AI-powered tools that replaced a copywriter, designer, and social media manager for an affiliate SaaS. Here's how any business can integrate AI without hiring ML engineers.

💰

AI Engineering May 20, 2026 12 min read

How to Build Profitable AI Products — From $200 API Bills to $8 Unit Economics

Real strategies to control OpenAI costs at scale — from $200 API bills in testing to $8-15 per customer. Token budgets, batch optimization, and pre-computation patterns for 75%+ margins.

⚡

AI Engineering May 20, 2026 11 min read

How to Build Crash-Safe Long-Running AI Jobs — Lessons from Generating 1,720-Page PDFs

Production patterns for AI jobs that run for hours — idempotent task design, crash recovery with MongoDB, batching strategies, and cost control for large-scale LLM pipelines.

🚀

AI Engineering May 21, 2026 14 min read

Production-Ready LLM Apps: Batch Processing, Async Patterns & Scaling

How Pulse Clarity scales AI PDF reports from 15-second Life Clarity to 1,720-page horoscopes — product-specific Celery queues, GPT-4o batching, and pre-computation on FastAPI + Render.

Ideas, Systems & Lessons Learned

Blog statistics: 49 articles published, 591 minutes of content, 9 categories. Reader value: 12-minute average read time, 82% of articles include production code, and 26 of 49 articles cite real production metrics including dollar figures, page counts, and API call volumes.