
Vibe Coding to Production

Mastering Cursor + AI

Anshuman Biswas, Ph.D.

17th October 2025

What We'll Cover

1 Pair-programming with AI

  • Cursor IDE & Claude Code
  • Patterns for prompting
  • Advanced features (agent mode, file/folder context)

2 Production engineering

  • Think tokens & context windows
  • Generate tests first, iterate fast
  • Observability, security, ops

How Agentic AI is Reshaping the Art of Software Development

  • Natural-language intent → production code
  • Agents, MCP, and the conductor-developer
  • Promise, perils, and practical guardrails
Example intents: “build a user dashboard” · “fix this login bug” · “optimize the query”

What is Vibe Coding?

Paradigm shift: from line-by-line code to conversational direction of an AI agent.

  • Developer focuses on the what (vision, UX, outcomes)
  • Agent handles the how (syntax, boilerplate, glue)
  • Role evolves: manager/director/prompter + rigorous validator

Karpathy: “fully give in to the vibes… forget that the code even exists.”


Origin: “English is the Hottest New Programming Language”

  • 2021–22: Copilot & ChatGPT normalize AI-assisted coding
  • 2023: “English is the hottest new programming language” (Karpathy)
  • Feb 2025: Karpathy coins vibe coding

How AI “Thinks”: Next-Token Prediction

  • Text → tokens → predict next most-likely token
  • Powerful but probabilistic → hallucinations happen
Example: “Create a Python function” → tokens [Create][ a][ Python][ func][tion] → the LLM predicts the next token.
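The prediction step can be sketched as a toy lookup. The distribution below is invented for illustration, not real model output:

```javascript
// Toy next-token prediction: given the prefix "Create a Python function",
// pick the most probable continuation from a hand-made distribution.
// These probabilities are illustrative, not from any real model.
const nextTokenProbs = { ' that': 0.41, ' which': 0.22, ' to': 0.18, ' banana': 0.01 };

function predictNext(probs) {
  // Greedy decoding: take the argmax of the distribution.
  return Object.entries(probs).reduce((best, cur) => (cur[1] > best[1] ? cur : best))[0];
}

console.log(predictNext(nextTokenProbs)); // → " that"
```

Real models sample from such a distribution at every step, which is why the same prompt can yield different (and occasionally hallucinated) output.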

Tokens & Context Windows

  • A token ≈ ¾ of a word (roughly 4 chars on average)
  • Context window = how many tokens the model can "see" at once
  • Claude 3.5: 200k tokens ≈ ~500 pages of text
  • Plan for ~30–40% overhead from code structure, formatting
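A back-of-envelope budget check can be sketched in a few lines. The 4-chars-per-token ratio and 35% overhead are the rough figures from this slide, not a real tokenizer:

```javascript
// Rough token budgeting — NOT a real tokenizer; ~4 chars/token is an average.
function estimateTokens(text, overheadPct = 35) {
  const base = Math.ceil(text.length / 4); // ~4 chars per token (assumption)
  return Math.ceil((base * (100 + overheadPct)) / 100); // pad for code structure/formatting
}

// Does a blob of code + prompt fit in a 200k-token window?
function fitsContext(text, windowTokens = 200_000) {
  return estimateTokens(text) <= windowTokens;
}
```

For real budgets, use the model provider's tokenizer; this heuristic is only for quick sanity checks before pasting a large file into a prompt.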

“Tokens to Think” & Prompt Patterns

  • Externalize reasoning: “think step-by-step”, “show your work”
  • Turn vague prompts into structured tasks
Bad →  "What's 17*28?"
Good → "What's 17*28? Think step-by-step and show each partial sum."
Bad prompt → “512” (wrong). Good prompt → 10×28=280; 7×20=140; 7×8=56 → 476.

Tooling Spectrum: Assistant → Agent

Low agency → high agency: tab-completion assistants → in-IDE chatbots → autonomous agents

Why AI-assisted coding?

Speed

Write skeleton code & tests in minutes, not hours.

Focus

Offload boilerplate so you think about logic & design.

Explore

Try new patterns, libraries, languages with confidence.

Meet the Tools

Cursor IDE

  • VS Code fork with AI at core
  • In-editor chat, Cmd+K inline edits
  • Multi-file awareness via @ mentions

Claude Code

  • Terminal-native agent from Anthropic
  • Autonomous "agent mode" for multi-step tasks
  • Built on Claude 3.5 Sonnet

Cursor Basics

  • Chat: ask questions, request code snippets
  • Cmd+K: inline code generation or modification
  • @files, @folders: attach context explicitly
  • @web, @docs: bring in external knowledge
  • Agent mode: let Cursor autonomously handle multi-step requests

Claude Code Basics

  • CLI-first: runs in your terminal, integrates with your workflow
  • Tool Use: reads/writes files, runs commands, searches code
  • Agent mode: breaks complex tasks into sub-tasks automatically
  • Long context: understands large codebases (200k+ tokens)

Agent Mode Deep Dive

What it does

  • Autonomously plans multi-step workflows
  • Reads files, runs tests, iterates on failures
  • Can self-correct and adapt to new info

When to use

  • Large refactors or feature additions
  • End-to-end debugging sessions
  • Exploratory tasks across many files
Loop: plan → execute step → on error, revise and retry
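The plan/execute/retry loop might look like the following skeleton. All tool names (`plan`, `execute`, `revise`) are hypothetical stand-ins for whatever the agent runtime actually provides:

```javascript
// Skeleton of an agent loop: plan the task, execute each step, and on error
// revise the step and retry a bounded number of times. Interface is assumed.
function agentLoop(task, tools, maxRetries = 2) {
  const steps = tools.plan(task); // break the task into ordered steps
  for (const step of steps) {
    let result = tools.execute(step);
    for (let attempt = 0; !result.ok && attempt < maxRetries; attempt++) {
      tools.revise(step, result.error); // self-correct using the failure info
      result = tools.execute(step);
    }
    if (!result.ok) throw new Error(`step failed: ${step}`);
  }
  return 'done';
}
```

The bounded retry count is the practical guardrail: without it, an agent can loop indefinitely on a step it cannot fix.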

Prompt Patterns

Be specific

"Add a validateEmail function that returns true if valid."

Use context

"@readme Generate a setup script that matches our docs."

Iterate

"Now make it handle edge cases: empty string, special chars."

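The “be specific” prompt above might yield something like this sketch (a pragmatic regex check, not full RFC 5322 validation):

```javascript
// Hypothetical result of the validateEmail prompt: returns true if the input
// looks like local@domain.tld. Deliberately pragmatic, not RFC-complete.
function validateEmail(email) {
  if (typeof email !== 'string') return false; // non-strings are invalid
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email.trim());
}
```

The “iterate” prompt then hardens exactly this kind of code: here, empty strings and non-string input already return false instead of throwing.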

Claude & Cursor: Practical Tips

  • Claude: keep a living CLAUDE.md; use /plan for steps; /diff for summaries; spawn subagents for tests/docs.
  • Cursor: codify a meta-prompt in cursor.rules; Ask (⌘/Ctrl+K) for focused edits; inline diff → accept hunks → stage → PR.
  • Prefer small, reviewable chunks; gate with tests + CI.

The Command Line Titans

  • Claude Code: local-first, deep project context
  • Codex CLI: cloud sandboxes, tight GitHub integration
  • Gemini CLI: huge context, multimodal, strong GCP links

Codex CLI in Practice

$ codex run --repo . --task "refactor auth module"
▶ plan → edit → test → patch → PR
Ephemeral sandbox; reproducible artifacts
  • Workflow: trigger from issues/PRs → plan → edits → tests → patch artifact → auto-PR
  • Strengths: speed, concurrency, GitHub-native
  • Habits: acceptance criteria; tests enforced; review diffs like a human PR

MCP: Why Agents Need a “USB-C Port”

LLMs are powerful but isolated—no built-in access to your files, repos, DBs, or cloud APIs.

Model Context Protocol (MCP) standardizes tool access so agents can work in your world.


MCP Servers: Real Examples

  • Context7: local project graph / embeddings (search, “open file by intent”)
  • GitHub: issues/PRs, reviews, repo content
  • Jira: fetch issues, change status, log work
  • Cloudflare: KV, DNS, pages, workers deploy
  • AWS: S3, STS, CloudWatch logs, SSM run-cmd
  • Sentri (example): internal secrets/feature flags

Agents discover tools/resources from each server at runtime.


Add an MCP Server (example config)

Client config (JSON) — add servers to your agent host (e.g., VS Code, CLI, or desktop host):

{
  "mcpServers": {
    "context7": {
      "command": "context7-mcp",
      "args": ["--stdio"],
      "env": { "C7_PROJECT_ROOT": "${workspaceFolder}" }
    },
    "github": {
      "command": "mcp-github",
      "args": ["--stdio"],
      "env": { "GITHUB_TOKEN": "${env:GITHUB_TOKEN}" }
    },
    "jira": {
      "command": "mcp-jira",
      "args": ["--stdio"],
      "env": { "JIRA_URL": "...", "JIRA_TOKEN": "..." }
    },
    "cloudflare": {
      "command": "mcp-cloudflare",
      "args": ["--stdio"],
      "env": { "CF_API_TOKEN": "..." }
    },
    "aws": {
      "command": "mcp-aws",
      "args": ["--stdio"],
      "env": { "AWS_PROFILE": "prod" }
    },
    "sentri": {
      "command": "mcp-sentri",
      "args": ["--stdio"],
      "env": { "SENTRI_URL": "...", "SENTRI_KEY": "..." }
    }
  }
}

Call flow — discovery and invocation (JSON-RPC under the hood):

// 1) list tools on 'context7'
{ "method":"tools/list", "params":{ "server":"context7" } }

// 2) call a semantic search tool
{
  "method":"tools/call",
  "params":{
    "server":"context7",
    "name":"search",
    "arguments":{ "query":"open the file that creates EKS nodegroups" }
  }
}

// 3) act on result: open file, edit, create PR via GitHub server

Exact method names vary by implementation; the pattern is standard.

The Upside: Velocity

  • Near-instant prototyping
  • Democratizes building (intent > syntax)
  • Focus on architecture/UX & problem selection

The Downside: Risks

  • Security flaws can ship fast
  • License contamination
  • Black-box maintainability; hallucinations

Responsible Vibe Coding

  • Guide, don’t follow: clear, single-task prompts
  • Human-in-the-loop: rigorous code review
  • Automated scans: SCA + SAST/DAST
  • Checkpoints: branch early, commit often
  • Avoid epistemic debt: ensure shared understanding

The Developer → The Conductor

Not obsolete—elevated. Architect, prompt-engineer, validator, conductor of specialized agents.


Tests First Philosophy

"Generate failing tests before writing implementation. Then iterate until they pass."

Why?

  • Clarity: forces you to define expected behavior upfront
  • Speed: AI can quickly scaffold test cases from requirements
  • Confidence: a green test suite is objective evidence the change is ready to ship

Practical Example: Blog Feature

Goal: Add a "Featured Post" toggle to blog admin

  1. Prompt: "Add a is_featured boolean column to Posts table."
  2. Prompt: "Generate a migration script + rollback."
  3. Prompt: "Add a toggle in the admin UI, wire to backend."
  4. Prompt: "Write tests for the feature flag logic."
  5. Run tests, iterate on failures.
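Step 2 might produce a migration like this sketch. Knex is an assumed tool here; adapt the idiom to whatever migration runner your stack uses:

```javascript
// Hypothetical Knex migration for the is_featured flag (step 2's prompt),
// with the rollback the prompt also asked for.
const migration = {
  up: (knex) =>
    knex.schema.alterTable('posts', (t) => {
      t.boolean('is_featured').notNullable().defaultTo(false); // new flag column
    }),
  down: (knex) =>
    knex.schema.alterTable('posts', (t) => {
      t.dropColumn('is_featured'); // rollback: drop the column
    }),
};

module.exports = migration; // exported for the migration runner
```

Always review generated migrations for defaults and nullability before running them; a bad default on a large table can lock it in production.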

Observability & Debugging

  • Logging: structured logs (JSON) for easy parsing
  • Metrics: track latency, error rates, throughput
  • Tracing: distributed traces for microservices
  • AI help: "Analyze these logs and suggest root cause"
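A minimal structured logger can be sketched in a few lines (the field names are illustrative, not a standard):

```javascript
// Structured JSON logging: one JSON object per line, trivially parseable
// by log pipelines (and by an AI when you ask it to find a root cause).
function log(level, msg, fields = {}) {
  const entry = { ts: new Date().toISOString(), level, msg, ...fields };
  console.log(JSON.stringify(entry));
  return entry; // returned for tests and in-process inspection
}

log('info', 'request handled', { route: '/posts', latency_ms: 42 });
```

Structured entries make the “analyze these logs” prompt far more effective: the model gets clean key/value pairs instead of free-text lines to parse.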

Security & Secrets Management

  • Never commit secrets: use .env, .gitignore
  • Vault services: AWS Secrets Manager, HashiCorp Vault, etc.
  • Rotate credentials: automate rotation, audit access
  • AI prompt: "Review this code for hardcoded secrets"
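Reading credentials from the environment instead of source is a one-liner to enforce. The dotenv usage is an assumption; any secret-injection mechanism (vault agent, CI secrets) works the same way:

```javascript
// Fail fast when a required secret is missing from the environment.
// Populate process.env via .env + dotenv locally, or a secret manager in prod.
// require('dotenv').config(); // uncomment if using dotenv

function requireEnv(name) {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required env var: ${name}`);
  return value;
}

// Usage (names are illustrative):
// const dbUrl = requireEnv('DATABASE_URL');
```

Failing at startup beats failing mid-request, and the error names the missing variable without ever printing its value.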

CI/CD Integration

Continuous Integration

  • Run tests on every commit
  • Linting, type checks, security scans
  • Block merges if tests fail

Continuous Deployment

  • Auto-deploy to staging on merge
  • Smoke tests in staging
  • Manual or auto-promote to prod
Commit → CI → CD → Prod
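The CI half of this pipeline could be sketched as a GitHub Actions workflow (a config sketch; the step names and npm scripts are illustrative):

```yaml
# .github/workflows/ci.yml — illustrative CI gate: lint + tests on every push,
# blocking merges on failure (enforce via branch protection rules).
name: ci
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm run lint
      - run: npm test
```

The same gate applies to AI-generated code: nothing merges without passing the suite.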

Deployment Best Practices

  • Blue-Green: run two identical environments; switch traffic between them for instant cutover and rollback
  • Canary releases: roll out to 5% of users first
  • Rollback plan: always have a quick revert strategy
  • Health checks: automated smoke tests post-deploy

Playwright Example

"Write an end-to-end test: user creates a project, adds a task, marks it done."
// e2e/project.spec.js
import { test, expect } from '@playwright/test';

test('user workflow: create project → add task → complete', async ({ page }) => {
  await page.goto('/projects');
  await page.getByRole('button', { name: /new project/i }).click();
  await page.getByRole('textbox', { name: /name/i }).fill('Demo');
  await page.getByRole('button', { name: /create/i }).click();
  await page.getByRole('link', { name: 'Demo' }).click();
  await page.getByRole('textbox', { name: /title/i }).fill('First task');
  await page.getByRole('button', { name: /add/i }).click();
  await page.getByRole('checkbox', { name: /First task/i }).check();
  await expect(page.getByRole('listitem', { name: /First task/i }))
    .toHaveClass(/completed/);
});

Common Pitfalls

  • Over-reliance: don't skip code review—AI can be wrong
  • Context bloat: too many files → model gets confused
  • Vague prompts: "make it better" → unclear results
  • Ignoring tests: shipping untested AI code is risky

DEMO_TIME


Q&A / Thank You