Chatbot — Gemini 2.5 Flash
Vercel AI SDK v5+ over @google/genai (unified Google GenAI SDK). Streaming SSE. Provider-agnostic — can swap to Anthropic/OpenAI without rewriting tool calls. Hardcoded PO whitelist becomes a feature flag.
Corrected by audit (May 26):
- @google/generative-ai is DEPRECATED (Nov 30, 2025). SDK releases after June 24, 2026 strip deprecated modules. Use @google/genai.
- Pin Vercel AI SDK ≥ v5 (shipped July 2025). v5 has the UIMessage vs ModelMessage split, native SSE, and tools use inputSchema/outputSchema (not parameters/result). v6 in late 2025.
- Plan for Gemini retry/fallback (Gemini 3 Pro INVALID_ARGUMENT reports in production).
- OpenRouter as fallback proxy is cheap insurance; multi-provider hedging recommended.
Library pick
Vercel AI SDK (ai + @ai-sdk/google) over calling @google/genai directly:
- Provider-agnostic streamText/tool(): fallback to Anthropic on Gemini rate-limit.
- First-class Next 15 App Router streaming via useChat + toUIMessageStreamResponse() (the v5 name for v4's toDataStreamResponse()).
- Known issue: vercel/ai #6589. Gemini 2.5 thinking-token shape sometimes breaks streaming function calls. Pin a known-good minor version + test on every bump. Downgrade an individual tool to direct @google/genai calls if it bites (not the deprecated @google/generative-ai).
# Vercel AI SDK v5+ + the unified Google GenAI SDK
# (NOT @google/generative-ai — that is deprecated as of Nov 30, 2025)
# Provider packages are versioned separately from the core: ai@5 pairs with @ai-sdk/google@2
pnpm --filter @spacehub/web add "ai@^5" "@ai-sdk/google@^2" "@google/genai" zod
pnpm --filter @spacehub/api add "ai@^5" "@ai-sdk/google@^2" "@google/genai" zod
Architecture
Chat lives in two places:
- Next.js route handler /app/api/chat/route.ts: handles streaming for web users (best DX with useChat).
- Hono /v2/chat: same logic, for Flutter mobile (SSE). Shares tool definitions via packages/chatbot.
Tool implementations import @spacehub/db directly and run inside RLS context (set app.user_id + app.owner_id from the session before query).
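The RLS step can be pictured as two transaction-local settings written before any tool query runs. A minimal sketch, assuming the values are applied with Postgres's set_config(name, value, is_local) and that the policies read them back with current_setting; the RlsContext type and rlsStatements helper are hypothetical and are not the actual @spacehub/db/rls API.

```typescript
// Hypothetical per-request RLS context, mirroring the session fields named above.
interface RlsContext {
  userId: string;
  ownerId: string;
}

// A parameterized statement that any Postgres driver can execute.
interface Statement {
  text: string;
  values: string[];
}

// Build the statements to run first inside the transaction.
// Passing `true` as the third argument scopes each value to the current
// transaction, so a pooled connection does not carry one user's identity
// over to the next request.
function rlsStatements(ctx: RlsContext): Statement[] {
  return [
    { text: "select set_config('app.user_id', $1, true)", values: [ctx.userId] },
    { text: "select set_config('app.owner_id', $1, true)", values: [ctx.ownerId] },
  ];
}

const stmts = rlsStatements({ userId: "u-1", ownerId: "o-9" });
for (const s of stmts) {
  console.log(s.text, s.values);
}
```

A policy on an owner-scoped table would then compare its owner_id column against current_setting('app.owner_id'), which is why the settings must be written before the first query in the transaction.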
Route handler — Next.js
// apps/web/src/app/api/chat/route.ts
import { convertToModelMessages, stepCountIs, streamText } from "ai";
import { google } from "@ai-sdk/google";
import { getSession } from "@/lib/session";
import { tools } from "@spacehub/chatbot/tools";
import { persistTurn } from "@/lib/chat-persist";

export async function POST(req: Request) {
  const session = await getSession();
  if (!session) return new Response("Unauthorized", { status: 401 });

  // useChat posts UIMessage[]; streamText expects ModelMessage[] (the v5 split).
  const { messages, conversationId } = await req.json();

  const result = streamText({
    model: google("gemini-2.5-flash"),
    system: `You are a Spacehub assistant for ${session.user.name},
operating on behalf of property owner ${session.org.name}.
Use the provided tools to answer with real data. Reply in Mongolian by default,
switch to English if the user does.`,
    messages: convertToModelMessages(messages),
    tools: tools(session), // tools closed over the session for RLS scoping
    // Without stopWhen the run ends after the first step, i.e. right after a tool call.
    // The limit of 5 steps is an assumption; tune it.
    stopWhen: stepCountIs(5),
    maxOutputTokens: 2048, // v5 name for maxTokens
    temperature: 0.3,
    onFinish: ({ text, usage }) => persistTurn(conversationId, session.user.id, text, usage),
  });

  return result.toUIMessageStreamResponse(); // v5 name for toDataStreamResponse()
}
Tools (port the 10 from v1)
// packages/chatbot/src/tools.ts
import { tool } from "ai";
import { z } from "zod";
import { and, eq, gte, lte, sql } from "drizzle-orm";
// Assumed export locations for db, the tables, and the Session type; adjust to the real package layout.
import { db, contracts, invoices } from "@spacehub/db";
import { withRls } from "@spacehub/db/rls";
import type { Session } from "@spacehub/auth";

export function tools(session: Session) {
  // Every tool runs its queries through withRls with this context.
  const rls = { userId: session.user.id, ownerId: session.org.id, role: session.user.role };

  return {
    getActiveContracts: tool({
      description: "List currently active contracts (rentals) for the user's organization. Filter by property optional.",
      inputSchema: z.object({ // v5: inputSchema replaces parameters
        propertyId: z.string().optional(),
        limit: z.number().min(1).max(50).default(20),
      }),
      execute: async ({ propertyId, limit }) => withRls(db, rls, (tx) =>
        tx.query.contracts.findMany({
          where: and(
            eq(contracts.status, "active"),
            propertyId ? eq(contracts.propertyId, propertyId) : undefined,
          ),
          limit,
        })),
    }),
    checkRentPayments: tool({
      description: "Summarize rent payment status: paid, overdue, total amounts for a period.",
      inputSchema: z.object({
        periodStart: z.string().describe("YYYY-MM-DD"),
        periodEnd: z.string().describe("YYYY-MM-DD"),
      }),
      execute: async ({ periodStart, periodEnd }) => withRls(db, rls, async (tx) => {
        const rows = await tx.select({
          status: invoices.status,
          // count(*) is a bigint, which drivers return as a string; cast so the type matches.
          count: sql<number>`count(*)::int`,
          total: sql<string>`sum(total)`,
          paid: sql<string>`sum(paid_amount)`,
        }).from(invoices)
          .where(and(
            eq(invoices.kind, "rent"),
            gte(invoices.periodStart, periodStart),
            lte(invoices.periodEnd, periodEnd),
          ))
          .groupBy(invoices.status);
        return rows;
      }),
    }),
    // Remaining eight tools are still stubs to be ported from v1.
    getOccupancyInfo: tool({ /* property occupancy stats */ }),
    listOverdueInvoices: tool({ /* overdue list */ }),
    getContractDetail: tool({ /* by id */ }),
    getCustomerHistory: tool({ /* by customer id */ }),
    listPendingEbarimtPushes: tool({ /* outbox status */ }),
    getMonthlyRevenue: tool({ /* aggregated */ }),
    findRoom: tool({ /* search by unit number or feature */ }),
    summarizeBankRecon: tool({ /* unmatched txn count */ }),
  };
}
Persist history
// packages/db/src/schema/chat.ts
import { integer, jsonb, pgTable, text, timestamp, uuid } from "drizzle-orm/pg-core";

export const chatConversations = pgTable("chat_conversations", {
  id: uuid().primaryKey().defaultRandom(),
  userId: uuid("user_id").notNull(),
  ownerId: uuid("owner_id").notNull(),
  title: text(),
  // Column names are given explicitly so they stay snake_case regardless of the casing config.
  createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
  lastMessageAt: timestamp("last_message_at", { withTimezone: true }).notNull().defaultNow(),
});

export const chatMessages = pgTable("chat_messages", {
  id: uuid().primaryKey().defaultRandom(),
  conversationId: uuid("conversation_id").notNull().references(() => chatConversations.id, { onDelete: "cascade" }),
  role: text({ enum: ["user", "assistant", "tool"] }).notNull(),
  content: jsonb().notNull(), // text or tool-call/tool-result payloads
  tokensIn: integer("tokens_in"),
  tokensOut: integer("tokens_out"),
  modelVersion: text("model_version"),
  createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
});
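The token usage reported in onFinish maps onto the tokens_in / tokens_out columns. A rough sketch of that mapping, assuming the AI SDK v5 usage field names (inputTokens, outputTokens); toMessageRow and the row type are hypothetical, and the real persistTurn signature may differ.

```typescript
// Usage as reported by AI SDK v5; a field is undefined when the provider omits it.
interface Usage {
  inputTokens?: number;
  outputTokens?: number;
}

// Hypothetical insert shape matching the chat_messages columns above.
interface ChatMessageRow {
  conversationId: string;
  role: "user" | "assistant" | "tool";
  content: unknown;
  tokensIn: number | null;
  tokensOut: number | null;
  modelVersion: string | null;
}

// Turn one finished assistant reply into a row ready for insertion.
function toMessageRow(
  conversationId: string,
  text: string,
  usage: Usage,
  modelVersion: string | null,
): ChatMessageRow {
  return {
    conversationId,
    role: "assistant",
    content: { type: "text", text },
    // Missing counts are stored as NULL rather than 0 so dashboards can tell them apart.
    tokensIn: usage.inputTokens ?? null,
    tokensOut: usage.outputTokens ?? null,
    modelVersion,
  };
}

console.log(toMessageRow("c-1", "hello", { inputTokens: 120, outputTokens: 8 }, "gemini-2.5-flash"));
```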
Cost control
- Implicit caching (free in Gemini 2.5): keep system prompt + tool definitions at the start of every request, user input at end. 90% input-cost reduction on cache hits. Min cache 1024 tokens.
- Explicit caching for >5k-token preambles (when context includes "tenant + property + contract" snapshot). ~1h TTL, cache-read = 10% base input price.
- maxOutputTokens (maxTokens before v5): 2048 for chat, 4096 for summarization tools.
- Fallback chain: gemini-2.5-flash → gemini-2.5-flash-lite on rate-limit → scripted reply on 429-after-retry.
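The fallback chain can be sketched independently of any SDK. The version below is an illustration under assumptions: withFallback and ModelCall are hypothetical names, a rate limit is recognized by a status of 429 on the thrown error, and backoff delays between retries are left out for brevity.

```typescript
// A model invocation; in the real handler this would wrap streamText or generateText.
type ModelCall = (modelId: string) => Promise<string>;

// Treat any thrown value carrying status 429 as a rate limit.
function isRateLimit(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { status?: number }).status === 429;
}

// Try each model in order, retrying on rate limits, then fall back to a scripted reply.
async function withFallback(
  modelIds: string[],
  call: ModelCall,
  scriptedReply: string,
  retriesPerModel: number = 1,
): Promise<{ modelId: string | null; text: string }> {
  for (const modelId of modelIds) {
    for (let attempt = 0; attempt <= retriesPerModel; attempt++) {
      try {
        return { modelId, text: await call(modelId) };
      } catch (err) {
        // Only rate limits trigger retry or fallback; other errors surface to the caller.
        if (!isRateLimit(err)) throw err;
      }
    }
  }
  // Every model stayed rate-limited after its retries.
  return { modelId: null, text: scriptedReply };
}
```

Called as withFallback(["gemini-2.5-flash", "gemini-2.5-flash-lite"], call, scripted), it returns which model answered, so the value can be stored in model_version. A production version would add a backoff between attempts.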
Whitelist → feature flag
v1 has hardcoded PO whitelist for chatbot. v2: feature_flags table + Postgres-cached env-overrides. Per-owner toggle, default off. Surface in admin UI.
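A sketch of how the flag lookup could resolve, assuming an env override wins over the table and that a missing row means off. The FeatureFlagRow shape, the "chatbot" flag name, and the "on"/"off" override values are all hypothetical.

```typescript
// Hypothetical row shape for the feature_flags table.
interface FeatureFlagRow {
  ownerId: string;
  flag: string;
  enabled: boolean;
}

// envOverride: "on" forces the chatbot on, "off" forces it off, anything else is ignored.
function isChatbotEnabled(
  ownerId: string,
  rows: FeatureFlagRow[],
  envOverride?: string,
): boolean {
  if (envOverride === "on") return true;
  if (envOverride === "off") return false;
  const row = rows.find((r) => r.ownerId === ownerId && r.flag === "chatbot");
  // No row for this owner means the feature stays off (per-owner toggle, default off).
  return row !== undefined ? row.enabled : false;
}

const rows: FeatureFlagRow[] = [{ ownerId: "o-1", flag: "chatbot", enabled: true }];
console.log(isChatbotEnabled("o-1", rows), isChatbotEnabled("o-2", rows));
```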
API surface
POST /v2/chat # streaming SSE (Hono); { messages, conversationId? }
GET /v2/chat/conversations # user's history
GET /v2/chat/conversations/{id}/messages
DELETE /v2/chat/conversations/{id}
# Next.js web uses /api/chat (RSC integration with useChat hook)
Build steps
- Schemas + RLS (chat_conversations, chat_messages).
- packages/chatbot with shared tool definitions.
- Hono /v2/chat SSE handler (port for Flutter).
- Next.js /api/chat route handler + <Chatbot/> client component using useChat.
- Persist on stream finish (onFinish callback).
- Feature flag for owner allow-list.
- Cost monitoring: dashboard panel per owner (tokens in/out, $ estimate).
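For the $ estimate in the cost panel, a possible calculation from stored token counts. This is a sketch: Pricing and estimateCostUsd are hypothetical names, prices are passed in rather than hardcoded, and cachedInputRatio stands for the fraction of the input price billed on cache hits (0.1 under the cache-read figure quoted in Cost control).

```typescript
// Prices are supplied by the caller; none are hardcoded here.
interface Pricing {
  inputPerMTok: number; // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
  cachedInputRatio: number; // share of the input price charged for cached tokens, e.g. 0.1
}

// Estimate the USD cost of a batch of requests from token counts.
// cachedTokensIn is the portion of tokensIn that was served from cache.
function estimateCostUsd(
  tokensIn: number,
  cachedTokensIn: number,
  tokensOut: number,
  p: Pricing,
): number {
  const freshTokensIn = Math.max(tokensIn - cachedTokensIn, 0);
  const inputCost = freshTokensIn * p.inputPerMTok;
  const cachedCost = cachedTokensIn * p.inputPerMTok * p.cachedInputRatio;
  const outputCost = tokensOut * p.outputPerMTok;
  return (inputCost + cachedCost + outputCost) / 1e6;
}

// Placeholder prices for illustration only.
const demo: Pricing = { inputPerMTok: 1, outputPerMTok: 2, cachedInputRatio: 0.1 };
console.log(estimateCostUsd(1e6, 5e5, 1e6, demo).toFixed(2));
```

Note that chat_messages as sketched stores only tokens_in and tokens_out; splitting out cached tokens would need an extra column or a conservative estimate that treats all input as uncached.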
Open questions
- Flutter SSE vs polling? SSE works on Flutter via eventsource; easier than WS. Use SSE.
- Tool execution scope: any user can ask "show overdue invoices"; RLS already prevents cross-owner data. Confirm no extra guardrail needed.
- System prompt customization per owner? Defer. Single prompt for v2 launch.
- Cost cap per owner per month? Hard cutoff or soft alert? Recommend alert at 80%, soft cap at 100% with admin override.
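The recommended cap behavior (alert at 80%, soft cap at 100%, admin override) could reduce to one small function. A sketch with hypothetical names; treating a cap of 0 or less as "no cap configured" is an assumption.

```typescript
type BudgetStatus = "ok" | "alert" | "capped";

// Classify an owner's month-to-date spend against their monthly cap.
function budgetStatus(
  spentUsd: number,
  capUsd: number,
  adminOverride: boolean = false,
): BudgetStatus {
  if (capUsd <= 0) return "ok"; // assumption: no cap configured means unlimited
  const ratio = spentUsd / capUsd;
  if (ratio >= 1) {
    // Soft cap: an admin override keeps the chatbot running but the alert stays visible.
    return adminOverride ? "alert" : "capped";
  }
  return ratio >= 0.8 ? "alert" : "ok";
}

console.log(budgetStatus(50, 100), budgetStatus(85, 100), budgetStatus(100, 100));
```

The chat handlers would check this before calling the model and return a scripted reply when the status is "capped".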