A Coding Implementation to Design a Stateful Tutor Agent with Long-Term Memory, Semantic Recall, and Adaptive Practice Generation

In this tutorial, we build a fully stateful personal tutor agent that moves beyond short-lived chat interactions and learns continuously over time. We design the system to persist user preferences, track weak learning areas, and selectively recall only the relevant past context when responding. By combining durable storage, semantic retrieval, and adaptive prompting, we demonstrate how an agent can behave more like a long-term tutor than a stateless chatbot. Throughout, we keep the agent self-managed, context-aware, and able to improve its guidance without requiring the user to repeat information.

!pip -q install "langchain>=0.2.12" "langchain-openai>=0.1.20" "sentence-transformers>=3.0.1" "faiss-cpu>=1.8.0.post1" "pydantic>=2.7.0"


import os, json, sqlite3, uuid
from datetime import datetime, timezone
from typing import List, Dict, Any
import numpy as np
import faiss
from pydantic import BaseModel, Field
from sentence_transformers import SentenceTransformer
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.outputs import ChatGeneration, ChatResult


DB_PATH="/content/tutor_memory.db"
STORE_DIR="/content/tutor_store"
INDEX_PATH=f"{STORE_DIR}/mem.faiss"
META_PATH=f"{STORE_DIR}/mem_meta.json"
os.makedirs(STORE_DIR, exist_ok=True)


def now(): return datetime.now(timezone.utc).isoformat()


def db(): return sqlite3.connect(DB_PATH)


def init_db():
   c=db(); cur=c.cursor()
   cur.execute("""CREATE TABLE IF NOT EXISTS events(
       id TEXT PRIMARY KEY,user_id TEXT,session_id TEXT,role TEXT,content TEXT,ts TEXT)""")
   cur.execute("""CREATE TABLE IF NOT EXISTS memories(
       id TEXT PRIMARY KEY,user_id TEXT,kind TEXT,content TEXT,tags TEXT,importance REAL,ts TEXT)""")
   cur.execute("""CREATE TABLE IF NOT EXISTS weak_topics(
       user_id TEXT,topic TEXT,mastery REAL,last_seen TEXT,notes TEXT,PRIMARY KEY(user_id,topic))""")
   c.commit(); c.close()

We set up the execution environment, import all required libraries, and define core paths along with utility functions for time handling and database connections. We also create the SQLite schema that will persist conversation events, extracted memories, and weak-topic mastery across sessions. This establishes the durable foundation that the rest of the system relies on.
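
As a quick sanity check, a minimal sketch like the following lists the tables SQLite created (it assumes init_db has already run in the current session; IF NOT EXISTS makes the call idempotent):

init_db()
c = db(); cur = c.cursor()
cur.execute("SELECT name FROM sqlite_master WHERE type='table'")
print([r[0] for r in cur.fetchall()])  # expect: ['events', 'memories', 'weak_topics']
c.close()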

class MemoryItem(BaseModel):
   kind:str
   content:str
   tags:List[str]=Field(default_factory=list)
   importance:float=Field(0.5,ge=0,le=1)


class WeakTopicSignal(BaseModel):
   topic:str
   signal:str
   evidence:str
   confidence:float=Field(0.5,ge=0,le=1)


class Extracted(BaseModel):
   memories:List[MemoryItem]=Field(default_factory=list)
   weak_topics:List[WeakTopicSignal]=Field(default_factory=list)


class FallbackTutorLLM(BaseChatModel):
   @property
   def _llm_type(self)->str: return "fallback_tutor"
   def _generate(self, messages, stop=None, run_manager=None, **kwargs)->ChatResult:
       last=messages[-1].content if messages else ""
       content=self._respond(last)
       return ChatResult(generations=[ChatGeneration(message=AIMessage(content=content))])
   def _respond(self, text:str)->str:
       t=text.lower()
       if "extract_memories" in t:
           out={"memories":[],"weak_topics":[]}
           if "recursion" in t:
               out["weak_topics"].append({"topic":"recursion","signal":"struggled",
                                         "evidence":"User indicates difficulty with recursion.","confidence":0.85})
           if "prefer" in t or "i like" in t:
               out["memories"].append({"kind":"preference","content":"User prefers concise explanations with examples.",
                                       "tags":["style","preference"],"importance":0.55})
           return json.dumps(out)
       if "generate_practice" in t:
           return "\n".join([
               "Targeted Practice (Recursion):",
               "1) Implement factorial(n) recursively, then iteratively.",
               "2) Recursively sum a list; state the base case explicitly.",
               "3) Recursive binary search; return index or -1.",
               "4) Trace fibonacci(6) call tree; count repeated subcalls.",
               "5) Recursively reverse a string; discuss time/space.",
               "Mini-quiz: Why does missing a base case cause infinite recursion?"
           ])
       return "Tell me what you're studying and what felt hard; I’ll remember and adapt practice next time."


def get_llm():
   key=os.environ.get("OPENAI_API_KEY","").strip()
   if key:
       from langchain_openai import ChatOpenAI
       return ChatOpenAI(model="gpt-4o-mini",temperature=0.2)
   return FallbackTutorLLM()

We define the Pydantic models that formalize how memories and weak-topic signals are represented, plus a deterministic fallback chat model that is used whenever no OPENAI_API_KEY is set. This guarantees the agent can still convert raw conversation text into structured, validated memory even without an external API.
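
To exercise these pieces in isolation, here is a minimal sketch, assuming the classes above are already defined in the session; it invokes the fallback model directly and validates a memory item through Pydantic:

demo_llm = FallbackTutorLLM()
reply = demo_llm.invoke([HumanMessage(content="extract_memories\n\nI like short answers; recursion felt hard.")])
print(json.loads(reply.content))  # structured JSON even without an API key

item = MemoryItem(kind="preference", content="Prefers short answers", importance=0.7)
print(item.model_dump())  # Pydantic v2 validation and serialization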

EMBED_MODEL="sentence-transformers/all-MiniLM-L6-v2"
embedder=SentenceTransformer(EMBED_MODEL)


def load_meta():
   if os.path.exists(META_PATH):
       with open(META_PATH,"r") as f: return json.load(f)
   return []


def save_meta(meta):
   with open(META_PATH,"w") as f: json.dump(meta,f)


def normalize(x):
   # L2-normalize rows so inner-product search behaves as cosine similarity.
   n=np.linalg.norm(x,axis=1,keepdims=True)+1e-12
   return x/n


def load_index(dim):
   if os.path.exists(INDEX_PATH): return faiss.read_index(INDEX_PATH)
   return faiss.IndexFlatIP(dim)


def save_index(ix): faiss.write_index(ix, INDEX_PATH)


EXTRACTOR_SYSTEM = (
   "You are a memory extractor for a stateful personal tutor.\n"
   "Return ONLY JSON with keys: memories (list of {kind,content,tags,importance}) "
   "and weak_topics (list of {topic,signal,evidence,confidence}).\n"
   "Store durable info only; do not store secrets."
)


llm=get_llm()
init_db()


dim=embedder.encode(["x"],convert_to_numpy=True).shape[1]
ix=load_index(dim)
meta=load_meta()

We set up embedding-based semantic memory: a SentenceTransformer encoder, a FAISS inner-product index over L2-normalized vectors, and a JSON metadata file that maps index positions back to memory records. This enables relevance-based recall rather than blindly reloading all past context.
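
Because every vector is L2-normalized before entering IndexFlatIP, the inner product behaves as cosine similarity. A minimal sketch to confirm that intuition (the two sentences are arbitrary examples):

a = normalize(embedder.encode(["recursion needs a base case"], convert_to_numpy=True).astype("float32"))
b = normalize(embedder.encode(["how recursive calls terminate"], convert_to_numpy=True).astype("float32"))
print((a @ b.T).item())  # near 1.0 for related texts, near 0.0 for unrelated ones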

def log_event(user_id, session_id, role, content):
   c=db(); cur=c.cursor()
   cur.execute("INSERT INTO events VALUES (?,?,?,?,?,?)",
               (str(uuid.uuid4()),user_id,session_id,role,content,now()))
   c.commit(); c.close()


def upsert_weak(user_id, sig:WeakTopicSignal):
   c=db(); cur=c.cursor()
   cur.execute("SELECT mastery,notes FROM weak_topics WHERE user_id=? AND topic=?",(user_id,sig.topic))
   row=cur.fetchone()
   # Nudge mastery by ±0.10 scaled by extractor confidence; other signals leave it unchanged.
   delta=(-0.10 if sig.signal=="struggled" else 0.10 if sig.signal=="improved" else 0.0)*sig.confidence
   if row is None:
       mastery=float(np.clip(0.5+delta,0,1)); notes=sig.evidence
       cur.execute("INSERT INTO weak_topics VALUES (?,?,?,?,?)",(user_id,sig.topic,mastery,now(),notes))
   else:
       mastery=float(np.clip(row[0]+delta,0,1)); notes=(row[1]+" | "+sig.evidence)[-2000:]
       cur.execute("UPDATE weak_topics SET mastery=?,last_seen=?,notes=? WHERE user_id=? AND topic=?",
                   (mastery,now(),notes,user_id,sig.topic))
   c.commit(); c.close()


def store_memory(user_id, m:MemoryItem):
   mem_id=str(uuid.uuid4())
   c=db(); cur=c.cursor()
   cur.execute("INSERT INTO memories VALUES (?,?,?,?,?,?,?)",
               (mem_id,user_id,m.kind,m.content,json.dumps(m.tags),float(m.importance),now()))
   c.commit(); c.close()
   v=embedder.encode([m.content],convert_to_numpy=True).astype("float32")
   v=normalize(v); ix.add(v)
   meta.append({"mem_id":mem_id,"user_id":user_id,"kind":m.kind,"content":m.content,
                "tags":m.tags,"importance":m.importance,"ts":now()})
   save_index(ix); save_meta(meta)

We implement the write path of the agent's memory. Every turn is logged to the events table, weak-topic mastery is nudged up or down according to the extracted signal and its confidence, and each new memory is stored both in SQLite and in the vector index. Keeping the FAISS index and its metadata in lockstep is what makes semantic recall reliable across sessions.
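
Here is a minimal sketch of the mastery update, using a throwaway "scratch_user" ID so the demo below stays clean: a "struggled" signal with confidence 0.8 shifts mastery by -0.10 * 0.8 = -0.08, taking a new topic from the 0.5 baseline down to 0.42.

upsert_weak("scratch_user", WeakTopicSignal(topic="recursion", signal="struggled",
                                            evidence="sample run", confidence=0.8))
c = db(); cur = c.cursor()
cur.execute("SELECT topic, mastery FROM weak_topics WHERE user_id=?", ("scratch_user",))
print(cur.fetchall()); c.close()  # ≈ [('recursion', 0.42)]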

def extract(user_text)->Extracted:
   msg="extract_memories\n\nUser message:\n"+user_text
   r=llm.invoke([SystemMessage(content=EXTRACTOR_SYSTEM),HumanMessage(content=msg)]).content
   try:
       d=json.loads(r)
       return Extracted(
           memories=[MemoryItem(**x) for x in d.get("memories",[])],
           weak_topics=[WeakTopicSignal(**x) for x in d.get("weak_topics",[])]
       )
   except Exception:
       # Fall back to an empty extraction if the model returns malformed JSON.
       return Extracted()


def recall(user_id, query, k=6):
   if ix.ntotal==0: return []
   q=embedder.encode([query],convert_to_numpy=True).astype("float32")
   q=normalize(q)
   scores, idxs = ix.search(q,k)
   out=[]
   for s,i in zip(scores[0].tolist(), idxs[0].tolist()):
       if i<0 or i>=len(meta): continue
       m=meta[i]
       if m["user_id"]!=user_id or s<0.25: continue
       out.append({**m,"score":float(s)})
   # Rerank: weight raw similarity by stored importance before returning.
   out.sort(key=lambda r: r["score"]*(0.6+0.4*r["importance"]), reverse=True)
   return out


def weak_snapshot(user_id):
   c=db(); cur=c.cursor()
   cur.execute("SELECT topic,mastery,last_seen FROM weak_topics WHERE user_id=? ORDER BY mastery ASC LIMIT 5",(user_id,))
   rows=cur.fetchall(); c.close()
   return [{"topic":t,"mastery":float(m),"last_seen":ls} for t,m,ls in rows]


def tutor_turn(user_id, session_id, user_text):
   log_event(user_id,session_id,"user",user_text)
   ex=extract(user_text)
   for w in ex.weak_topics: upsert_weak(user_id,w)
   for m in ex.memories: store_memory(user_id,m)
   rel=recall(user_id,user_text,k=6)
   weak=weak_snapshot(user_id)
   prompt={
       "recalled_memories":[{"kind":x["kind"],"content":x["content"],"score":x["score"]} for x in rel],
       "weak_topics":weak,
       "user_message":user_text
   }
   gen = llm.invoke([SystemMessage(content="You are a personal tutor. Use recalled_memories only if relevant."),
                     HumanMessage(content="generate_practice\n\n"+json.dumps(prompt))]).content
   log_event(user_id,session_id,"assistant",gen)
   return gen, rel, weak


USER_ID="user_demo"
SESSION_ID=str(uuid.uuid4())


print("✅ Ready. Example run:\n")
ans, rel, weak = tutor_turn(USER_ID, SESSION_ID, "Last week I struggled with recursion. I prefer concise explanations.")
print(ans)
print("\nRecalled:", [r["content"] for r in rel])
print("Weak topics:", weak)

We orchestrate the full tutor interaction loop, combining extraction, storage, recall, and response generation. We update mastery scores, retrieve relevant memories, and dynamically generate targeted practice. This completes the transformation from a stateless chatbot into a long-term, adaptive tutor.
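
To verify persistence end to end, a minimal follow-up sketch: a second turn in a brand-new session should surface any stored memories whose similarity to the new query clears the recall threshold, without the user restating them.

followup, rel2, weak2 = tutor_turn(USER_ID, str(uuid.uuid4()), "Can you give me more recursion practice?")
print(followup)
print("Recalled this time:", [r["content"] for r in rel2])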

In conclusion, we implemented a tutor agent that remembers, reasons, and adapts across sessions. We showed how structured memory extraction, long-term persistence, and relevance-based recall work together to overcome the “goldfish memory” limitation common in most agents. The resulting system continuously refines its understanding of a user’s weaknesses and proactively generates targeted practice, demonstrating a practical foundation for building stateful, long-horizon AI agents that improve with sustained interaction.

