Where does data come from?
AgentLoop is deliberately agnostic about which turns get
logged for review — that's a product decision you make. The SDK's
log_turn() accepts an arbitrary signals dict;
whatever reaches it ends up in the review queue. Here are the patterns
most teams combine.
1. Explicit user feedback
```python
# Only log if the user took the time to say the answer was bad.
loop.log_turn_if(
    user_clicked_thumbs_down,
    question=q,
    agent_response=a,
    user_id=user.id,
    signals={"thumbs_down": True},
)
```
2. Low model confidence
```python
# If you can score confidence (logprobs, self-rating, etc.)
response = call_llm(q)
if response.confidence < 0.6:
    loop.log_turn(
        question=q,
        agent_response=response.text,
        signals={"confidence": response.confidence},
    )
```
3. Domain heuristics in your own code
```python
# Examples — pick the ones that match your domain:

# User asked twice in a row about the same thing
if is_rephrase_of_previous(question, history):
    loop.log_turn(question=q, agent_response=a, signals={"rephrase": True})

# Agent punted ("contact support")
if "contact support" in answer.lower() or "i don't know" in answer.lower():
    loop.log_turn(question=q, agent_response=a, signals={"agent_punted": True})

# Agent made a factual claim worth auditing (price, quote, promise)
if mentions_price_or_quote(answer):
    loop.log_turn(question=q, agent_response=a, signals={"factual_claim": True})
```
4. Downstream outcome signals
```python
# Support agent said "try restarting"; user came back with same issue.
if user_returned_with_same_issue(session_id, within_hours=2):
    loop.log_turn(
        question=original_question,
        agent_response=original_answer,
        session_id=session_id,
        signals={"recurrence": True, "gap_hours": 1.5},
    )

# Coding agent suggested a change that got reverted fast.
if suggestion_reverted(suggestion_id, within_minutes=10):
    loop.log_turn(
        question=q,
        agent_response=a,
        signals={"reverted_quickly": True},
    )
```
5. Random sampling (or: log everything)
```python
import random

# Sample 5% of all turns unconditionally.
if random.random() < 0.05:
    loop.log_turn(question=q, agent_response=a, signals={"sample": True})

# Or log every single one — only safe for low-volume agents
# (< a few hundred turns/day). Unreviewed turns TTL out at 30 days.
loop.log_turn(question=q, agent_response=a)
```
Recommended stack
A mature AgentLoop integration usually combines 2–3 of these:
- Always: explicit user feedback (1)
- Usually: a handful of domain heuristics (3)
- Sometimes: downstream outcome signals (4) — if you have the telemetry
- Rarely: low confidence alone (2) — combine with others
- Rarely: log-everything (5) — only early-stage or regulated
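One way to wire a stack like this up is a small helper that evaluates each trigger and returns a combined signals dict, so the turn is logged once with everything that fired. This is a hypothetical sketch, not part of the AgentLoop SDK: `collect_signals`, its parameters, and the 5% sample rate are all assumptions, and the caller is expected to evaluate its own heuristics and pass in the results.

```python
import random

def collect_signals(*, thumbs_down=False, is_rephrase=False,
                    sample_rate=0.05, rng=random.random):
    """Hypothetical helper: merge several logging triggers into one
    signals dict. Returns an empty dict when nothing fired."""
    signals = {}
    if thumbs_down:          # pattern 1: explicit user feedback
        signals["thumbs_down"] = True
    if is_rephrase:          # pattern 3: domain heuristic
        signals["rephrase"] = True
    if rng() < sample_rate:  # pattern 5: random sampling
        signals["sample"] = True
    return signals

# Log the turn only when at least one trigger fired:
signals = collect_signals(thumbs_down=True)
if signals:
    pass  # loop.log_turn(question=q, agent_response=a, signals=signals)
```

Injecting `rng` keeps the sampling branch deterministic in tests; in production the default `random.random` is fine.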
Feedback widget
Every response can include a feedback_url. Embed it as
a link or iframe in your UI — when the user clicks it, they see a
minimal form. The submission arrives as an annotation in the
dashboard, no login required (the URL is HMAC-signed with the
annotation context).
The feedback URL is signed identically by the Python and JavaScript SDKs. A URL signed in one language validates in the other — feedback flows can cross language boundaries safely.
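Cross-language validation works because HMAC-SHA256 over the same canonical string with the same key yields the same digest everywhere. The sketch below illustrates the principle only; the canonical payload format, query parameter names, and key handling here are assumptions, not the SDKs' actual scheme.

```python
import hashlib
import hmac

SECRET = b"shared-secret"  # assumption: both SDKs hold the same key

def sign_feedback_url(base_url, annotation_id, user_id):
    # Hypothetical canonical form; the real SDKs define their own.
    ctx = f"{annotation_id}:{user_id}"
    sig = hmac.new(SECRET, ctx.encode(), hashlib.sha256).hexdigest()
    return f"{base_url}?ctx={ctx}&sig={sig}"

def verify(ctx, sig):
    expected = hmac.new(SECRET, ctx.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(expected, sig)
```

A digest produced this way in JavaScript (e.g. Node's `crypto.createHmac("sha256", key)`) is byte-identical, which is what lets a URL signed in one language validate in the other.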