
Shopping agent with Mastra + samesake

This builds the apps/ecommerce-assistant example: a shopping assistant that handles requests like “I like vintage clothes under $200”, follow-ups that remember the budget, aggregations like “which brand lists the most shoes?”, and multi-collection brand questions — all by letting an LLM call into a samesake search layer.

It’s the Weaviate Query Agent e-commerce recipe, rebuilt with samesake for retrieval and Mastra for the agent, plus an MCP server so any MCP client (Claude Desktop, Cursor, another app) can use the same tools.

Mastra Agent (gpt-4.1-mini)
├── searchProducts       → samesake.search (hybrid: FTS + vector, NLQ budget parsing)
├── searchBrands         → samesake.search (brand hierarchy / country / rating)
├── countProductsByBrand → SQL GROUP BY (which brand lists the most shoes)
└── averagePrice         → SQL AVG (avg price for a brand)

Postgres + pgvector
└── project_qa_ecommerce.c_products / c_brands (samesake, namespaced)

MCPServer (stdio) → exposes all four tools + the agent itself

Three pieces, three jobs:

  • samesake is the retrieval layer. You declare your catalog in TypeScript; it compiles to Postgres tables with full-text search and vector embeddings, and gives you one search() call that blends them.
  • Mastra is the agent. An LLM that you hand a set of tools (functions). It decides which tool to call, with what arguments, and turns the results into an answer. It also keeps conversation history.
  • MCP is the plug. Wrap your tools in an MCP server and any MCP-speaking client can drive them — you build the capability once.

The agent never talks to Postgres directly. It calls tools; the tools call samesake (or SQL). That separation is the whole design.

A standalone Bun app — no framework needed.

// package.json (deps)
"@samesake/core": "^1.2.0", // inside this monorepo the app uses "workspace:*"; published is ^1.2.0
"@samesake/server": "^1.2.0",
"@mastra/core": "^1.42.0",
"@mastra/mcp": "^1.10.0",
"@ai-sdk/openai": "^3.0.71",
"ai": "^6.0.184",
"postgres": "^3.4.9",
"zod": "^4.4.3"

You’ll need a Postgres database with pgvector (Neon, Supabase, or local), plus a DATABASE_URL, a GEMINI_API_KEY (embeddings + NLQ), and an OPENAI_API_KEY (the agent). Put them in apps/ecommerce-assistant/.env.

The recipe has two datasets — clothing items and brands — so we declare two samesake collection()s. Field descriptions matter: they’re what the agent leans on, so we mirror them into the tool descriptions later.

src/samesake.ts
import { z } from "zod";
import { collection, f, Channels } from "@samesake/core";
import { createMatcher } from "@samesake/server";
import { geminiEmbed, geminiGenerate } from "./providers.ts";

export const PROJECT = "qa-ecommerce";
export const PRODUCTS = "products";
export const BRANDS = "brands";

export const products = collection(PRODUCTS, {
  fields: {
    name: f.text({ searchable: true }),
    description: f.text({ searchable: true }),
    brand: f.text({ searchable: true, filterable: true, facet: true }),
    category: f.text({ filterable: true, facet: true }),
    subcategory: f.text({ filterable: true, facet: true }),
    collection: f.text({ filterable: true, facet: true }),
    price: f.number({ filterable: true, facet: "range", budget: true }),
    image_url: f.text(),
  },
  embeddings: {
    doc: { source: "$name $description $brand $category $subcategory", model: "gemini-embedding-2", dim: 1536 },
  },
  search: {
    channels: [
      Channels.fts({ fields: ["name", "description", "brand"], weight: 1 }),
      Channels.cosine({ embedding: "doc", weight: 1 }),
    ],
    combiner: "rrf",
    // NLQ turns "less than $200" into a hard max_price filter.
    nlq: {
      enable: true,
      semanticRewrite: true,
      schema: z.object({ semantic_query: z.string(), max_price: z.number().optional() }),
    },
  },
});

// brands: name, parent_brand, description, country, avg_customer_rating, foundation_year
// (same shape; see the repo for the full definition)
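
To make the combiner: "rrf" setting concrete, here is a minimal sketch of reciprocal rank fusion. It assumes the usual formulation (each channel adds weight / (k + rank) per document, with k = 60); samesake's actual implementation may differ, and the document ids are hypothetical.

```typescript
// Reciprocal rank fusion (RRF): every channel contributes weight / (k + rank)
// for each document it returns; contributions are summed per document.
// Assumption: k = 60 and multiplicative channel weights. Not samesake's source.
type RankedList = { weight: number; ids: string[] }; // ids ordered best-first

function rrf(lists: RankedList[], k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const { weight, ids } of lists) {
    ids.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical ids: "jacket" ranks well in both channels, so it comes out on top.
const fused = rrf([
  { weight: 1, ids: ["boots", "jacket", "scarf"] }, // full-text channel
  { weight: 1, ids: ["jacket", "dress", "boots"] }, // vector channel
]);
console.log(fused.map((hit) => hit.id)); // ["jacket", "boots", "dress", "scarf"]
```

RRF only looks at ranks, so the full-text and cosine scores never need to be put on a common scale before they are blended.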

The budget: true on price plus the nlq block is what lets a shopper say “under $200” in plain English and get a hard filter, not a fuzzy vibe.
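
As a rough illustration of what "hard filter" means here, the sketch below maps an NLQ result in the shape of the schema above onto a search request. The helper toSearchRequest and the sample NLQ output are hypothetical; the $lte filter shape is borrowed from the search_products tool, and how samesake applies the parsed budget internally is an assumption.

```typescript
// Shape the NLQ step is asked to return (mirrors the zod schema in the collection).
type ParsedQuery = { semantic_query: string; max_price?: number };

type SearchRequest = { q: string; filters: Record<string, unknown> };

// Hypothetical helper: turn the parsed query into a request with a price ceiling.
function toSearchRequest(parsed: ParsedQuery): SearchRequest {
  const filters: Record<string, unknown> = {};
  if (typeof parsed.max_price === "number") {
    // Rows above the cap are excluded outright rather than ranked lower.
    filters.price = { $lte: parsed.max_price };
  }
  return { q: parsed.semantic_query, filters };
}

// Hypothetical NLQ output for "I like vintage clothes under $200".
const request = toSearchRequest({ semantic_query: "vintage clothes", max_price: 200 });
console.log(JSON.stringify(request));
// {"q":"vintage clothes","filters":{"price":{"$lte":200}}}
```

The price constraint leaves the text that gets embedded and matched, so "under $200" cannot skew the semantic ranking.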

@samesake/server ships no AI SDK — you pick the models. This app uses Gemini for embeddings and NLQ:

// src/providers.ts (abridged)
export const geminiEmbed: EmbedFn = async ({ text, dim }) => {
  // POST to gemini-embedding-2:embedContent, outputDimensionality = dim
};
export const geminiGenerate: GenerateFn = async ({ prompt, system, schema }) => {
  // POST to gemini-3.1-flash-lite:generateContent with responseJsonSchema = schema
};

// src/samesake.ts (continued)
let _matcher: ReturnType<typeof createMatcher> | null = null;

export function getMatcher() {
  if (_matcher) return _matcher;
  _matcher = createMatcher({
    databaseUrl: process.env.DATABASE_URL!,
    apiKey: process.env.API_KEY ?? "dev-key-please-change",
    migrate: "eager",
    embed: geminiEmbed,
    generate: geminiGenerate,
  });
  return _matcher;
}
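
The abridged geminiEmbed above leaves out the HTTP call. One way it might look is sketched below, under several assumptions: the EmbedFn signature (an object with text and dim in, a number array out) is guessed from the snippet, the request and response shapes follow the public Gemini embedContent REST endpoint, and the fetch function and API key are passed in so the sketch stays testable. This is not the repo's implementation.

```typescript
// Minimal fetch shape so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Assumed signature; check the EmbedFn type that samesake actually expects.
type EmbedFn = (input: { text: string; dim: number }) => Promise<number[]>;

const MODEL = "gemini-embedding-2"; // model id taken from the collection definition

// Hypothetical factory: returns an embed function bound to a key and a fetch.
function makeGeminiEmbed(apiKey: string, fetchImpl: FetchLike): EmbedFn {
  return async ({ text, dim }) => {
    const res = await fetchImpl(
      `https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:embedContent`,
      {
        method: "POST",
        headers: { "content-type": "application/json", "x-goog-api-key": apiKey },
        body: JSON.stringify({ content: { parts: [{ text }] }, outputDimensionality: dim }),
      },
    );
    if (!res.ok) throw new Error(`embedContent failed with status ${res.status}`);
    const data = (await res.json()) as { embedding?: { values?: number[] } };
    const values = data.embedding?.values;
    // Guard against a vector that would not fit the pgvector column.
    if (!values || values.length !== dim) throw new Error("unexpected embedding shape");
    return values;
  };
}
```

In the app this would presumably be wired up as makeGeminiEmbed(process.env.GEMINI_API_KEY!, fetch) and passed to createMatcher as embed.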

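Two search tools, two aggregation tools. The split that matters is search vs. aggregate: search returns a ranked, limited list of hits, which cannot answer "how many" or "what's the average", so those questions go to SQL over the full table.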

// src/tools.ts (search, abridged)
// Not shown in this abridged listing: searchBrands, toProduct, sql(),
// PRODUCTS_TABLE and PRODUCT_CATEGORIES.
import { z } from "zod";
import { createTool } from "@mastra/core/tools";
import { getMatcher, PROJECT, PRODUCTS } from "./samesake.ts";

export const searchProducts = createTool({
  id: "search_products",
  description:
    "Semantic + keyword search over the catalog. Prices are in USD; pass max_price to cap the budget (hard filter).",
  inputSchema: z.object({
    query: z.string(),
    max_price: z.number().optional(),
    category: z.enum(PRODUCT_CATEGORIES).optional(),
    limit: z.number().int().min(1).max(50).default(8),
  }),
  execute: async ({ query, max_price, category, limit }) => {
    // Only add the filters the model actually supplied.
    const filters: Record<string, unknown> = {};
    if (typeof max_price === "number") filters.price = { $lte: max_price };
    if (category) filters.category = category;
    const result = await getMatcher().search(PROJECT, PRODUCTS, { q: query, filters, limit });
    return { count: result.hits.length, products: result.hits.map(toProduct) };
  },
});

// src/tools.ts (aggregations, abridged)
export const countProductsByBrand = createTool({
  id: "count_products_by_brand",
  description: "Count items per brand, ranked. Pass `category` (e.g. shoes -> Footwear) or omit for the whole catalog.",
  inputSchema: z.object({ category: z.enum(PRODUCT_CATEGORIES).optional() }),
  execute: async ({ category }) => {
    // The table name is a trusted constant; the category value is sent as a bound parameter.
    const rows = await sql()`
      SELECT brand, count(*)::int AS count FROM ${sql().unsafe(PRODUCTS_TABLE)}
      WHERE ${category ? sql()`category = ${category}` : sql()`true`}
      GROUP BY brand ORDER BY count DESC`;
    return { top: rows[0] ?? null, by_brand: rows };
  },
});

export const averagePrice = createTool({
  id: "average_price",
  description: "Average price (USD) and item count for a brand across the whole catalog.",
  inputSchema: z.object({ brand: z.string() }),
  execute: async ({ brand }) => { /* SELECT count(*), avg(price) WHERE brand = $1 */ },
});

export const tools = { searchProducts, searchBrands, countProductsByBrand, averagePrice };
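
A small in-memory sketch of why counts go through SQL rather than through search results: tallying brands over a limited hit list can name a different winner than tallying over every row. The mini-catalog and the stand-in hit list are hypothetical; nothing here calls samesake or Postgres.

```typescript
type Item = { name: string; brand: string; category: string };

// Hypothetical mini-catalog: brand A lists three shoes, brand B lists two.
const catalog: Item[] = [
  { name: "Runner", brand: "A", category: "Footwear" },
  { name: "Loafer", brand: "A", category: "Footwear" },
  { name: "Boot", brand: "A", category: "Footwear" },
  { name: "Sneaker", brand: "B", category: "Footwear" },
  { name: "Sandal", brand: "B", category: "Footwear" },
  { name: "Coat", brand: "B", category: "Outerwear" },
];

// In-memory equivalent of GROUP BY brand ORDER BY count DESC, over the rows it is given.
function countByBrand(items: Item[]): { brand: string; count: number }[] {
  const counts = new Map<string, number>();
  for (const item of items) counts.set(item.brand, (counts.get(item.brand) ?? 0) + 1);
  return Array.from(counts.entries())
    .map(([brand, count]) => ({ brand, count }))
    .sort((a, b) => b.count - a.count);
}

const shoes = catalog.filter((item) => item.category === "Footwear");

// Stand-in for a search with limit = 3 where relevance happens to favour brand B.
const topHits = shoes.filter((item) => item.brand === "B").concat(shoes.slice(0, 1));

const fromAllRows = countByBrand(shoes); // A: 3, B: 2
const fromTopHits = countByBrand(topHits); // B: 2, A: 1
console.log(fromAllRows[0].brand, fromTopHits[0].brand); // A B
```

Counting over the hit list would report brand B as the leader, which is why "which brand lists the most shoes?" is routed to count_products_by_brand.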

src/agent.ts
import { Agent } from "@mastra/core/agent";
import { openai } from "@ai-sdk/openai";
import { tools } from "./tools.ts";

export const ecommerceAgent = new Agent({
  id: "ecommerce-assistant",
  name: "ECommerceAssistant",
  instructions:
    "You are a friendly e-commerce shopping assistant. Help the user find products, compare options, " +
    "and answer questions about brands. Recommend specific items with names, brands and prices. " +
    "Use search_products (max_price for budgets), search_brands for brand details, and the aggregation " +
    "tools for counts/averages. Prices are in USD.",
  model: openai("gpt-4.1-mini"),
  tools,
});

// Keep history so "same budget as before" resolves against earlier turns.
export class ECommerceAssistant {
  private history: { role: "user" | "assistant"; content: string }[] = [];

  constructor(private agent = ecommerceAgent) {}

  async chat(message: string) {
    this.history.push({ role: "user", content: message });
    const res = await this.agent.generate(this.history);
    this.history.push({ role: "assistant", content: res.text });
    return res.text;
  }
}

agent.generate(history) takes the whole message array, so multi-turn memory is just an array you append to.
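
The append-only pattern can be checked without an LLM. The sketch below swaps the real agent for a hypothetical stub (echoAgent) that only reports how many messages it received, so the growth of the history across turns is visible; AgentLike and Conversation are illustrative names, not Mastra types.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Minimal stand-in for the agent: anything with generate(messages) -> { text }.
interface AgentLike {
  generate(messages: Message[]): Promise<{ text: string }>;
}

// Hypothetical stub: reports how many messages it was given instead of calling a model.
const echoAgent: AgentLike = {
  async generate(messages) {
    return { text: `saw ${messages.length} message(s)` };
  },
};

class Conversation {
  private history: Message[] = [];
  private agent: AgentLike;

  constructor(agent: AgentLike) {
    this.agent = agent;
  }

  async chat(message: string): Promise<string> {
    this.history.push({ role: "user", content: message });
    const res = await this.agent.generate(this.history); // the whole array, every turn
    this.history.push({ role: "assistant", content: res.text });
    return res.text;
  }
}

async function demo(): Promise<string[]> {
  const convo = new Conversation(echoAgent);
  const first = await convo.chat("I like vintage clothes under $200");
  const second = await convo.chat("What about some nice shoes, same budget?");
  return [first, second]; // ["saw 1 message(s)", "saw 3 message(s)"]
}
```

On the second turn the stub sees three messages (user, assistant, user), which is how the real agent gets to see the earlier "$200" when asked for the "same budget".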

src/mcp.ts
import { MCPServer } from "@mastra/mcp";
import { tools } from "./tools.ts";
import { ecommerceAgent } from "./agent.ts";

export const mcpServer = new MCPServer({
  id: "samesake-ecommerce",
  name: "Samesake E-commerce Assistant",
  version: "1.0.0",
  tools,
  agents: { ecommerceAgent }, // becomes an `ask_ecommerceAgent` tool
});

if (import.meta.main) await mcpServer.startStdio();

Register it with any MCP client:

{
  "mcpServers": {
    "samesake-ecommerce": {
      "command": "bun",
      "args": ["run", "--cwd", "/abs/path/to/apps/ecommerce-assistant", "mcp"]
    }
  }
}

The seed pulls the same demo data the recipe uses — the public weaviate/agents datasets on Hugging Face — via the datasets REST API, applies the collections, pushes the rows, and builds the indexes.
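
For a sense of what reading from the datasets REST API involves, here is a sketch that builds the paged request URLs for the Hugging Face dataset viewer's rows endpoint, which serves at most 100 rows per call. The config name "query-agent-ecommerce", the split "train", and the helper pageUrls are assumptions for illustration; the repo's seed script may page differently.

```typescript
const ROWS_API = "https://datasets-server.huggingface.co/rows";
const PAGE_SIZE = 100; // the rows endpoint returns at most 100 rows per request

// Hypothetical helper: build the page URLs needed to read `total` rows of one config.
function pageUrls(dataset: string, config: string, split: string, total: number): string[] {
  const urls: string[] = [];
  for (let offset = 0; offset < total; offset += PAGE_SIZE) {
    const length = Math.min(PAGE_SIZE, total - offset); // the last page may be short
    urls.push(
      `${ROWS_API}?dataset=${encodeURIComponent(dataset)}&config=${encodeURIComponent(config)}` +
        `&split=${encodeURIComponent(split)}&offset=${offset}&length=${length}`,
    );
  }
  return urls;
}

// Roughly 448 products means five requests: four full pages and one of 48 rows.
const productPages = pageUrls("weaviate/agents", "query-agent-ecommerce", "train", 448);
console.log(productPages.length); // 5
```

Each response carries the records under rows[].row, so a seed step would fetch every URL, collect those objects, and hand them to the collection for embedding and indexing.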

  1. Install and configure. Fill apps/ecommerce-assistant/.env (DATABASE_URL, GEMINI_API_KEY, OPENAI_API_KEY).

    bun install
  2. Seed (~448 products + 104 brands; the embedding step is the slow part — see the rate-limit note).

    bun run --cwd apps/ecommerce-assistant seed
  3. Run the conversation.

    bun run --cwd apps/ecommerce-assistant demo
  4. Or serve over MCP.

    bun run --cwd apps/ecommerce-assistant mcp

The demo drives the recipe’s conversation:

  • “I like vintage clothes under $200” → a curated list of items with prices (semantic search + budget filter).
  • “What about some nice shoes, same budget?” → footwear under $200, the budget carried from the previous turn.
  • “Which brand lists the most shoes?” → a real GROUP BY over the Footwear category.
  • “Does Loom & Aura have parent/child brands, what countries, and the average price?” → a brand lookup plus an average — two collections, one answer.

The agent picked the tools; your app code just declared the catalog and wired the models. That’s the point.