Recsys Pipeline Architect

Low Risk

by @affaan-m | Verified Source

4528 installs | v1.0.0 | Updated May 25, 2026

How to Use

Run in Claude Code terminal

Step 1: Add Marketplace

/plugin marketplace add affaan-m/ECC

Step 2: Install Plugin

/plugin install ecc@ecc

About

Design composable recommendation, ranking, and feed pipelines using the six-stage Source→Hydrator→Filter→Scorer→Selector→SideEffect framework popularized by xAI's open-sourced For You algorithm. Use this skill whenever the user is building any system that picks

recsys-pipeline-architect

A spec-and-scaffold skill for building composable recommendation, ranking, and feed pipelines. It encodes the six-stage pattern — Source → Hydrator → Filter → Scorer → Selector → SideEffect — popularized by xAI's open-sourced For You algorithm (Apache 2.0). This skill is an independent reimplementation of the pattern (MIT) — no code copied from the original.

Upstream: https://github.com/mturac/recsys-pipeline-architect

When to Use

User wants to build any system that picks "the top K items for a user/context"
User asks "how should I rank X" or describes a feed/personalization problem
User has a scoring function and needs the pipeline plumbing around it
User wants to migrate from a single relevance score to multi-action prediction with tunable weights
User is wrapping an LLM/ML scorer and needs filters, hydrators, side-effects, and a runnable scaffold in their stack (TypeScript / Go / Python)
Triggers: "recommendation system", "feed algorithm", "ranking pipeline", "for you feed", "candidate pipeline", "content recommender", "pipeline architecture for recsys", "RAG retrieval reranker"

When NOT to Use

Model architecture work (transformer design, two-tower retrieval, embedding training) — this skill is plumbing around the model, not the model itself
Pure ML training pipelines — the scoring function is the user's responsibility
Operating a deployed pipeline (monitoring, autoscaling) — out of scope

The six-stage framework

| # | Stage | Job | Parallel? |
|---|---|---|---|
| 1 | Source | Fetch candidates from one or more origins | Yes: multiple sources run in parallel |
| 2 | Hydrator | Enrich each candidate with metadata needed for filtering and scoring | Yes: independent hydrators run in parallel |
| 3 | Filter | Drop candidates that should never be shown (blocked, expired, duplicate, ineligible) | Sequential: each filter sees fewer items |
| 4 | Scorer | Assign each surviving candidate one or more scores | Sequential: later scorers see earlier scores |
| 5 | Selector | Sort by final score, return top K | Single op |
| 6 | SideEffect | Cache served IDs, log impressions, emit events, update counters | Async: must never block the response |
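The stage table above can be sketched as a single function. This is a minimal illustration, not the upstream scaffold: `Candidate`, `run_pipeline`, and every demo stage (`followed`, `trending`, `add_age`, `dedupe`, `drop_stale`, `freshness`, `log_served`) are hypothetical names, and the ages are made-up data. It assumes filters take and return a list so that a duplicate filter can see the whole batch.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Candidate:
    id: str
    meta: dict = field(default_factory=dict)
    score: float = 0.0


def run_pipeline(ctx, sources, hydrators, filters, scorers, side_effects, k, pool):
    # 1. Source: fetch from every origin in parallel, then merge the batches.
    candidates = [c for batch in pool.map(lambda s: s(ctx), sources) for c in batch]
    # 2. Hydrator: independent enrichers run in parallel (each writes its own meta key).
    list(pool.map(lambda h: h(candidates, ctx), hydrators))
    # 3. Filter: sequential, so each filter sees fewer items than the one before.
    for keep in filters:
        candidates = keep(candidates, ctx)
    # 4. Scorer: sequential, so later scorers can read earlier scores.
    for score in scorers:
        score(candidates, ctx)
    # 5. Selector: sort by final score, take the top K.
    top = sorted(candidates, key=lambda c: c.score, reverse=True)[:k]
    # 6. SideEffect: submitted to the pool and never waited on before returning.
    for effect in side_effects:
        pool.submit(effect, [c.id for c in top], ctx)
    return top


# --- Hypothetical demo stages with made-up data ---
def followed(ctx):
    return [Candidate("a"), Candidate("b")]

def trending(ctx):
    return [Candidate("b"), Candidate("c"), Candidate("d")]

def add_age(cands, ctx):
    ages = {"a": 2, "b": 30, "c": 5, "d": 400}  # invented ages in hours
    for c in cands:
        c.meta["age_h"] = ages[c.id]

def dedupe(cands, ctx):
    seen, out = set(), []
    for c in cands:
        if c.id not in seen:
            seen.add(c.id)
            out.append(c)
    return out

def drop_stale(cands, ctx):
    return [c for c in cands if c.meta["age_h"] <= ctx["max_age_h"]]

def freshness(cands, ctx):
    for c in cands:
        c.score = 1.0 / (1.0 + c.meta["age_h"])

served_log = []

def log_served(ids, ctx):
    served_log.append(ids)

with ThreadPoolExecutor(max_workers=4) as pool:
    top = run_pipeline(
        {"user": "u1", "max_age_h": 48},
        sources=[followed, trending],
        hydrators=[add_age],
        filters=[dedupe, drop_stale],
        scorers=[freshness],
        side_effects=[log_served],
        k=2,
        pool=pool,
    )
```

Leaving the `with` block waits for outstanding side effects only so the demo can inspect `served_log`; a long-lived service would keep the pool open and return `top` immediately.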

Why this exact order

Sources before hydration: know what candidates exist before paying to enrich them
Hydration before filtering: many filters need metadata the source did not provide
Filtering before scoring: scoring is the expensive stage; drop the ineligible first
Scorer chain (not single scorer): real systems compose ML scoring + diversity reranking + business rules
Selector after scoring: keeps scoring deterministic and cacheable
SideEffects last and async: side effects must never block the user response
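The last point in the list above is the easiest to get wrong in practice. One way to keep side effects off the response path, sketched here with `asyncio`, is to schedule them as background tasks and return without awaiting them. `serve`, `log_impressions`, and the `sink` list are hypothetical stand-ins for a real handler and a real event store.

```python
import asyncio


async def log_impressions(served_ids, sink):
    await asyncio.sleep(0.05)  # stands in for a slow network write
    sink.extend(served_ids)


def _report_failure(task):
    # A failed side effect is reported, never raised into the request path.
    if not task.cancelled() and task.exception() is not None:
        print("side effect failed:", task.exception())


async def serve(ranked_ids, k, sink, background):
    top = ranked_ids[:k]
    task = asyncio.create_task(log_impressions(top, sink))
    background.add(task)  # hold a strong reference so the task is not garbage-collected
    task.add_done_callback(background.discard)
    task.add_done_callback(_report_failure)
    return top  # the response does not wait for the write


async def demo():
    sink, background = [], set()
    top = await serve(["a", "b", "c"], 2, sink, background)
    logged_at_response = list(sink)  # snapshot taken when the response is ready
    await asyncio.gather(*background)  # only the demo (or a shutdown hook) waits
    return top, logged_at_response, sink


top_ids, logged_at_response, final_log = asyncio.run(demo())
```

The snapshot taken at response time is empty while the final log holds the served IDs, which is the property the ordering rule asks for.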

Workflow when invoked

Walk the user through these eight steps:

1. Clarify the use case (one round, three questions): items being ranked? input context? language/runtime?
2. Identify the candidate sources: usually in-network (followed/owned/subscribed) + out-of-network (ML retrieval / trending / similar-to-liked)
3. List required hydrations: for each filter and scorer, what data does it need that the source did not provide?
4. List the filters: duplicate, self, age, block/mute, previously-served, eligibility. Order matters: cheap before expensive.
5. Design the scorer chain: primary (ML) → combiner (multi-action with weights) → diversity → business rules
6. Selector: sort descending by final score, take top K (or stratified mix for in-network/out-of-network)
7. SideEffects: cache served IDs, emit impression events, update counters, log analytics, all fire-and-forget
8. Generate the scaffold in the user's stack
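The Selector step mentions two options, plain top K and a stratified mix. A rough sketch of both follows. `Scored`, `select_top_k`, `select_stratified`, and the `in_network_share` parameter are hypothetical names, and the quota-with-backfill policy is only one possible reading of "stratified mix".

```python
from dataclasses import dataclass


@dataclass
class Scored:
    id: str
    score: float
    in_network: bool


def select_top_k(items, k):
    # Plain selector: sort descending by final score, take the first K.
    return sorted(items, key=lambda c: c.score, reverse=True)[:k]


def select_stratified(items, k, in_network_share):
    # Assumed policy: reserve a share of the K slots for in-network items,
    # give the rest to out-of-network items, then backfill any shortfall.
    ranked = sorted(items, key=lambda c: c.score, reverse=True)
    in_quota = round(k * in_network_share)
    in_net = [c for c in ranked if c.in_network][:in_quota]
    out_net = [c for c in ranked if not c.in_network][: k - len(in_net)]
    chosen = in_net + out_net
    if len(chosen) < k:  # one stratum ran short: fill from the best remaining items
        taken = {c.id for c in chosen}
        chosen += [c for c in ranked if c.id not in taken][: k - len(chosen)]
    return sorted(chosen, key=lambda c: c.score, reverse=True)


# Made-up scores for illustration.
items = [
    Scored("a", 0.9, True),
    Scored("b", 0.8, True),
    Scored("c", 0.7, True),
    Scored("d", 0.4, False),
    Scored("e", 0.3, False),
]
plain = [c.id for c in select_top_k(items, 4)]
mixed = [c.id for c in select_stratified(items, 4, in_network_share=0.5)]
```

With these numbers the plain selector fills three of four slots from in-network items, while the stratified one holds two slots for out-of-network candidates even though they score lower.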

Key trade-offs to surface (don't default silently)

1. Single score vs multi-action prediction

Single score: train one model to predict relevance. To change behavior → retrain.
Multi-action: predict P(action) for many actions (read, like, share, skip, report), combine with weights at serving time. To change behavior → change weights. No retraining.

The X For You system uses multi-action with both positive and negative weights. Recommend multi-action when the user expects to tune frequently.
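The multi-action idea reduces to a weighted sum of predicted action probabilities. The sketch below assumes the model already returns one probability per action; the weight values and the `combine` helper are illustrative and not taken from any production system.

```python
def combine(action_probs, weights):
    # Weighted sum of P(action); actions without a weight contribute nothing.
    return sum(weights.get(action, 0.0) * p for action, p in action_probs.items())


# Hypothetical per-candidate predictions from the user's own model.
predictions = {
    "x": {"read": 0.5, "like": 0.2, "share": 0.1, "skip": 0.2, "report": 0.0},
    "y": {"read": 0.6, "like": 0.3, "share": 0.1, "skip": 0.1, "report": 0.05},
}

# Illustrative weights: positive for engagement, negative for skip and report.
strict = {"read": 1.0, "like": 2.0, "share": 3.0, "skip": -1.0, "report": -10.0}
lenient = {**strict, "report": -1.0}  # same model output, softer report penalty


def rank(weights):
    return sorted(predictions, key=lambda cid: combine(predictions[cid], weights), reverse=True)


ranking_strict = rank(strict)
ranking_lenient = rank(lenient)
```

Under `strict`, candidate `x` ranks first because `y` carries a small report probability; under `lenient` the order flips. The model output is identical in both cases, so the behavior change comes from serving-time weights alone.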

