AI Product

Low Risk

by @sickn33Verified Source

4.4448 installsv1.0.0Updated May 25, 2026

How to Use

Run in Claude Code terminal

Step 1: Add Marketplace

/plugin marketplace add sickn33/antigravity-awesome-skills

Step 2: Install Plugin

/plugin install ai-product@antigravity-awesome-skills

About

Every product will be AI-powered. The question is whether you'll

name: ai-product description: Every product will be AI-powered. The question is whether you'll build it right or ship a demo that falls apart in production. risk: safe source: vibeship-spawner-skills (Apache 2.0) date_added: 2026-02-27

AI Product Development

Every product will be AI-powered. The question is whether you'll build it right or ship a demo that falls apart in production.

This skill covers LLM integration patterns, RAG architecture, prompt engineering that scales, AI UX that users trust, and cost optimization that doesn't bankrupt you.

Principles

LLMs are probabilistic, not deterministic | Description: The same input can give different outputs. Design for variance. Add validation layers. Never trust output blindly. Build for the edge cases that will definitely happen. | Examples: Good: Validate LLM output against schema, fallback to human review | Bad: Parse LLM response and use directly in database
Prompt engineering is product engineering | Description: Prompts are code. Version them. Test them. A/B test them. Document them. One word change can flip behavior. Treat them with the same rigor as code. | Examples: Good: Prompts in version control, regression tests, A/B testing | Bad: Prompts inline in code, changed ad-hoc, no testing
RAG over fine-tuning for most use cases | Description: Fine-tuning is expensive, slow, and hard to update. RAG lets you add knowledge without retraining. Start with RAG. Fine-tune only when RAG hits clear limits. | Examples: Good: Company docs in vector store, retrieved at query time | Bad: Fine-tuned model on company data, stale after 3 months
Design for latency | Description: LLM calls take 1-30 seconds. Users hate waiting. Stream responses. Show progress. Pre-compute when possible. Cache aggressively. | Examples: Good: Streaming response with typing indicator, cached embeddings | Bad: Spinner for 15 seconds, then wall of text appears
Cost is a feature | Description: LLM API costs add up fast. At scale, inefficient prompts bankrupt you. Measure cost per query. Use smaller models where possible. Cache everything cacheable. | Examples: Good: GPT-4 for complex tasks, GPT-3.5 for simple ones, cached embeddings | Bad: GPT-4 for everything, no caching, verbose prompts

Patterns

Structured Output with Validation

Use function calling or JSON mode with schema validation

When to use: LLM output will be used programmatically

import { z } from 'zod';

const schema = z.object({ category: z.enum(['bug', 'feature', 'question']), priority: z.number().min(1).max(5), summary: z.string().max(200) });

const response = await openai.chat.completions.create({ model: 'gpt-4', messages: [{ role: 'user', content: prompt }], response_format: { type: 'json_object' } });

const parsed = schema.parse(JSON.parse(response.content));

Streaming with Progress

Stream LLM responses to show progress and reduce perceived latency

When to use: User-facing chat or generation features

const stream = await openai.chat.completions.create({ model: 'gpt-4', messages, stream: true });

for await (const chunk of stream) { const content = chunk.choices[0]?.delta?.content; if (content) { yield content; // Stream to client } }

Prompt Versioning and Testing

Version prompts in code and test with regression suite

When to use: Any production prompt

// prompts/categorize-ticket.ts export const CATEGORIZE_TICKET_V2 = { version: '2.0', system: 'You are a support ticket categorizer...', test_cases: [ { input: 'Login broken', expected: { category: 'bug' } }, { input: 'Want dark mode', expected: { category: 'feature' } } ] };

// Test in CI const result = await llm.generate(prompt, test_case.input); assert.equal(result.category, test_case.expected.category);

Caching Expensive Operations

Cache embeddings and deterministic LLM responses

When to use: Same queries processed repeatedly

// Cache embeddings (expensive to compute) const cacheKey = embedding:${hash(text)}; let embedding = await cache.get(cacheKey);

if (!embedding) { embedding = await openai.embeddings.create({ model: 'text-embedding-3-small', input: text }); await cache.set(cacheKey, embedding, '30d'); }

Circuit Breaker for LLM Failures

Graceful degradation when LLM API fails or returns garbage

When to use: Any LLM integration in critical path

const circuitBreaker = new CircuitBreaker(callLLM, { threshold: 5, // failures timeout: 30000, // ms resetTimeout: 60000 // ms });

try { const response = await circuitBreaker.fire(prompt); return response; } catch (error) { // Fallback: rule-based system, cached response, or human queue return fallbackHandler(prompt); }

RAG with Hybrid Search

Combine semantic search with keyword matching for better retrieval

When to use: Implementing RAG systems

// 1. Semantic search (vector similarity) const embedding = await embed(query); const semanticResults = awai

Compatible Tools

Claude CodeCursor

AI Product

About

name: ai-product description: Every product will be AI-powered. The question is whether you'll build it right or ship a demo that falls apart in production. risk: safe source: vibeship-spawner-skills (Apache 2.0) date_added: 2026-02-27

AI Product Development

Principles

Patterns

Structured Output with Validation

Streaming with Progress

Prompt Versioning and Testing

Caching Expensive Operations

Circuit Breaker for LLM Failures

RAG with Hybrid Search

Compatible Tools

Tags

Related Skills

RAG Engineer

"orchestrate-batch-refactor"

Docx Official

Azure AI Agents Persistent Java

Azure Search Documents Ts

Agent Framework Azure AI Py