In support of our mission to accelerate the developer journey on Google Cloud, we built Dev Signal: a multi-agent system designed to transform raw community signals into reliable technical guidance by automating the path from discovery to expert creation.
In part 1 and part 2 of this series, we established the essential groundwork by standardizing the core capabilities through the Model Context Protocol (MCP) and constructing a multi-agent architecture integrated with the Vertex AI memory bank to provide long-term intelligence and persistence. Now, we'll explore how to test your multi-agent system locally!
If you’d like to dive straight into the code and explore it at your own pace, you can clone the repository here.
Testing the agent locally
Before transitioning your agentic system to Google Cloud Run, it is essential to ensure that its specialized components work seamlessly together on your workstation. This testing phase allows you to validate trend discovery, technical grounding, and creative drafting within a local feedback loop, saving time and resources during the development process.
In this section, you will configure your local secrets, implement environment-aware utilities, and use a dedicated test runner to verify that Dev Signal can correctly retrieve user preferences from the Vertex AI memory bank on the cloud. This local verification ensures that your agent's "brain" and "hands" are properly synchronized before moving to deployment.
Environment Setup
Create a .env file in your project root. These variables are used for local development and will be replaced by Terraform/Secret Manager in production.
Paste this code in dev-signal/.env and update with your own details.
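The exact variable names below are illustrative — match them to the repository's own `.env` example if it differs:

```bash
# dev-signal/.env — local development only; production values come
# from Terraform / Secret Manager.
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=global
GOOGLE_GENAI_USE_VERTEXAI=TRUE
AGENT_ENGINE_ID=your-agent-engine-id

# Reddit API credentials used for trend discovery
REDDIT_CLIENT_ID=your-reddit-client-id
REDDIT_CLIENT_SECRET=your-reddit-client-secret
```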
Note: GOOGLE_CLOUD_LOCATION is set to global because that is where Gemini-3-flash-preview is supported. We will use GOOGLE_CLOUD_LOCATION as the model location.
Helper Utilities
Create a new directory for your application utils.
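For example, from the project root:

```shell
mkdir -p dev_signal_agent/app_utils
touch dev_signal_agent/app_utils/__init__.py  # make it an importable package
```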
Environment configuration
This module standardizes how the agent discovers the active Google Cloud Project and Region, ensuring a seamless transition between development environments. Using load_dotenv(), the script first checks for local configurations before falling back to google.auth.default() or environment variables to retrieve the Project ID. This automated approach ensures your agent is properly authenticated and grounded in the correct cloud context without requiring manual configuration changes.
Beyond basic project discovery, the script provides a robust Secret Management layer. It attempts to resolve sensitive credentials, such as Reddit API keys, first from the local environment (for rapid development) and then dynamically from the Google Cloud Secret Manager API for production security. By returning these as a dictionary rather than injecting them into environment variables, the module maintains a clean security posture.
The script further calibrates the environment by distinguishing between global and regional requirements for different AI services. It specifically assigns the "global" location for models to access cutting-edge preview features while designating a regional location, such as us-central1, for infrastructure like the Vertex AI Agent Engine. By finalizing this setup with a global SDK initialization, the module integrates these settings into the session, allowing the rest of your application to interact with models and memory banks without having to repeatedly pass project or location parameters.
Paste this code in dev_signal_agent/app_utils/env.py
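The repository contains the complete module; as a minimal sketch of the pattern described above, it might look like this (the function and variable names here are our own illustrative choices, not necessarily the repository's):

```python
# Illustrative sketch of dev_signal_agent/app_utils/env.py.
# Names are assumptions for demonstration; the real module may differ.
import os

try:
    # Optional in production; picks up a local .env during development.
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass

# "global" gives access to preview models; the Agent Engine needs a region.
MODEL_LOCATION = os.environ.get("GOOGLE_CLOUD_LOCATION", "global")
AGENT_ENGINE_LOCATION = os.environ.get("AGENT_ENGINE_LOCATION", "us-central1")


def get_project_id() -> str:
    """Resolve the active project: env var first, then ADC as a fallback."""
    project = os.environ.get("GOOGLE_CLOUD_PROJECT")
    if project:
        return project
    import google.auth  # deferred so local tests work without credentials
    _, project = google.auth.default()
    if not project:
        raise RuntimeError("Unable to determine the Google Cloud project ID")
    return project


def get_secrets(names: list[str]) -> dict[str, str]:
    """Resolve secrets from env vars (dev) or Secret Manager (prod).

    Returns a dict instead of mutating os.environ, keeping secrets
    out of the process environment.
    """
    resolved = {n: os.environ[n] for n in names if n in os.environ}
    missing = [n for n in names if n not in resolved]
    if missing:
        from google.cloud import secretmanager
        client = secretmanager.SecretManagerServiceClient()
        project = get_project_id()
        for name in missing:
            path = f"projects/{project}/secrets/{name}/versions/latest"
            response = client.access_secret_version(name=path)
            resolved[name] = response.payload.data.decode("utf-8")
    return resolved
```

During local development every secret is already in the environment, so the Secret Manager branch is never reached; in production the same call transparently falls through to the cloud API.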
Local testing script
The Google ADK comes with a built-in Web UI, which is excellent for visualizing agent logic and tool composition.
You can launch it by running in the project root:
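```shell
adk web
```

Then open the local URL it prints in your browser to chat with the agent.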
However, the default Web UI will not test the long-term memory integration described in this tutorial because it is not pre-connected to a Vertex AI memory session. By default, the generic UI often relies on in-memory services that do not persist data across sessions. Therefore, we use the dedicated test_local.py script to explicitly initialize the VertexAiMemoryBankService. This ensures that even in a local environment, your agent is communicating with the real cloud-based memory bank to validate preference persistence.
The test_local.py script:
- Connects to the real Vertex AI Agent Engine in the cloud for memory storage.
- Uses an in-memory session service for local chat history (so you can wipe it easily).
- Runs a chat loop where you can talk to your agent.
Go back to the root folder dev-signal:
Paste this code in dev-signal/test_local.py
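The repository has the complete script; the key wiring looks roughly like the sketch below. The agent import path (`dev_signal_agent.agent.root_agent`) and the `AGENT_ENGINE_ID` environment variable are assumptions here — adapt them to your project:

```python
# Illustrative sketch of dev-signal/test_local.py (names are assumptions).
import asyncio
import os

from google.adk.memory import VertexAiMemoryBankService
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

from dev_signal_agent.agent import root_agent  # assumed module path

APP_NAME = "dev_signal"
USER_ID = "local_tester"


async def main() -> None:
    # Real cloud-backed long-term memory (the Vertex AI memory bank)...
    memory_service = VertexAiMemoryBankService(
        project=os.environ["GOOGLE_CLOUD_PROJECT"],
        location=os.environ.get("AGENT_ENGINE_LOCATION", "us-central1"),
        agent_engine_id=os.environ["AGENT_ENGINE_ID"],
    )
    # ...but disposable in-memory sessions for local chat history.
    session_service = InMemorySessionService()
    runner = Runner(
        agent=root_agent,
        app_name=APP_NAME,
        session_service=session_service,
        memory_service=memory_service,
    )
    session = await session_service.create_session(
        app_name=APP_NAME, user_id=USER_ID
    )
    while True:
        text = input("You: ").strip()
        if text.lower() in ("quit", "exit"):
            break
        if text.lower() == "new":
            # Persist the finished session to the memory bank, then
            # start a fresh local session.
            completed = await session_service.get_session(
                app_name=APP_NAME, user_id=USER_ID, session_id=session.id
            )
            await memory_service.add_session_to_memory(completed)
            session = await session_service.create_session(
                app_name=APP_NAME, user_id=USER_ID
            )
            continue
        message = types.Content(role="user", parts=[types.Part(text=text)])
        async for event in runner.run_async(
            user_id=USER_ID, session_id=session.id, new_message=message
        ):
            if event.is_final_response() and event.content:
                print("Agent:", event.content.parts[0].text)


if __name__ == "__main__":
    asyncio.run(main())
```

The important design choice is that `new` throws away only the in-memory session; anything worth remembering has already been written to the cloud memory bank, which is exactly what the test scenario below verifies.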
Running the Test
First, ensure you have your Application Default Credentials set up:
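```shell
gcloud auth application-default login
```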
Then run the script:
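```shell
python test_local.py
```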
Test Scenario
This scenario validates the full end-to-end lifecycle of the agent: from discovery and research to multimodal content creation and long-term memory retrieval.
Phase 1: Teaching & Multimodal Creation (Session 1)
Goal: Establish technical context and set a specific stylistic preference.
Discovery
Ask the agent to find trending Cloud Run topics.
Input: "Find high-engagement questions about AI agents on Cloud Run from the last 21 days."
Research
Instruct the agent to perform a deep dive on a specific result.
Input: "Use the GCP Expert to research topic #1."
Personalization
Request a blog post and explicitly set your style preference.
Input: "Draft a blog post based on this research. From now on, I want all my technical blogs written in the style of a 90s Rap Song."
Image generation
Ask the agent to generate an image that illustrates the main ideas in the blog using the Nano Banana Pro tool. The image will be saved to your Google Cloud Storage bucket, and you should receive a path to view it that looks like this: https://storage.mtls.cloud.google.com/...
Phase 2: Long-Term Memory Recall (Session 2)
Goal: Verify the agent recalls preferences across a completely fresh session.
- Type new in the console to wipe local session history and start from a fresh state.
- Retrieval: Ask about your stored preferences to test the Vertex AI memory bank.
- Input: "What are my current topics of interest and what is my preferred blogging style?"
- Verification: Confirm the agent successfully retrieves your "AI Agents on Cloud Run" interest and "Rap" style from the cloud.
Final Test: Ask for a new blog on a different topic (e.g., "GKE Autopilot") and ensure it is automatically written as a rap song without being prompted.
Summary
In this part of our series, we focused on verifying the agent's functionality in a local environment before proceeding to cloud deployment. By configuring local secrets and utilizing environment-aware utilities, we used a dedicated test runner to confirm that the core reasoning and tool logic are properly integrated. We successfully validated the full lifecycle, from Reddit discovery to expert content creation, confirming that the agent correctly retrieves preferences from the cloud-based Vertex AI memory bank even in completely fresh sessions.
Ready to run the test scenario yourself? Clone the repository and try the test_local.py script to see 'Dev Signal' retrieve your preferences from the Vertex AI memory bank in real-time. For a deeper dive into the underlying mechanics of memory orchestration, check out this quickstart guide.
In the final part of this series, we will transition our prototype into a production service on Google Cloud Run using Terraform for secure infrastructure, and explore the roadmap to production excellence through continuous evaluation and security.
Special thanks to Remigiusz Samborski for the helpful review and feedback on this article.